
The Origin of Information

 
Information theory studies the quantification, storage, and communication of information. It was originally proposed by Claude E. Shannon in 1948 to find fundamental limits on signal processing and communication operations such as data compression, in a landmark paper entitled "A Mathematical Theory of Communication" (quoted from Wikipedia).
 
 
 
What counts as a source or a destination of information, however, is not a subject of that theory. Later, information appears naturally in AI theory as the sensory input to an intelligent agent.
 


 

The picture shows an "intelligent information cycle": an intelligent agent perceives information from its environment via sensors, analyzes it, and decides how to act on the environment; the actions are performed via actuators, and the cycle goes on continuously. The same model is used to describe responsiveness to stimuli, one of the fundamental characteristics of living things. Others are:

  • metabolism (catabolism vs anabolism, digestion, respiration and excretion, ...)
  • homeostasis
  • cellular organization (includes cell specialization in multicellular organisms)
  • reproduction (sexual and asexual)
  • growth and development (from fertilized egg to embryo, and on, to fully grown individual)
  • healing, repair and regeneration (response to acute injuries: mechanical, thermal, pH, ..., or chronic ones: aging, oxidative damage)
  • heredity and evolution
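
The sense, decide, act cycle described above can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions: the `Environment` class, the single temperature variable, the target of 20 degrees and the one-degree step are all hypothetical and are not taken from any AI textbook or library.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    """Hypothetical toy environment: a single temperature value."""
    temperature: float


def sense(env: Environment) -> float:
    """Sensor: read the current temperature from the environment."""
    return env.temperature


def decide(percept: float, target: float = 20.0) -> float:
    """Analysis: choose a one-degree step towards the target, or no change."""
    if percept > target:
        return -1.0
    if percept < target:
        return 1.0
    return 0.0


def act(env: Environment, action: float) -> None:
    """Actuator: apply the chosen change to the environment."""
    env.temperature += action


def run_cycle(env: Environment, steps: int) -> list[float]:
    """Repeat sense -> decide -> act and record the temperature after each step."""
    history = []
    for _ in range(steps):
        percept = sense(env)
        action = decide(percept)
        act(env, action)
        history.append(env.temperature)
    return history


history = run_cycle(Environment(temperature=23.0), steps=5)
print(history)  # [22.0, 21.0, 20.0, 20.0, 20.0]
```

The agent only responds to what its sensor reports, so the same loop also serves as a crude picture of responsiveness to stimuli.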

These characteristics are connected with a model that describes another important role that information plays in biology: the functioning of self-reproducing molecular machines. The model is depicted in the following diagrams:

The self-reproducing machine S takes raw material (nutrients) N from the environment and duplicates itself, producing waste W as a side effect. A more detailed description:

An accurate self-reproducer (top) comprises the replicator R (blue outline) and the vehicle V (green outline), which contains the copier C and the constructor B. In the copy phase, C copies the replicator R; C[R] (red outline) acts as a constructor. In the construction phase, B executes the recipe in R to build a vehicle from generic resources N; B[R] (red outline) acts as a constructor. Finally (bottom), the copy of R and the newly constructed vehicle form the offspring. The pictures and the description of the process are from Constructor theory of life by Chiara Marletto. The idea itself, however, is older, and goes back to Theory of self-reproducing automata by John von Neumann.
On pages 84-87 of von Neumann's work, one can see that he identifies another, separate part of the vehicle (not shown separately in the picture) that is responsible for coordinating the copy and construction phases: it "turns on" the copier and the constructor at the appropriate point in time, and assembles the results of both phases into a new instance of the self-reproducer. None of these parts of the vehicle that are responsible for self-reproduction can be altered without serious risk of malfunction. That is, if the part of the replicator from which these parts are constructed is mutated, the error will probably be fatal, i.e. self-reproduction will not be able to continue.
There is another (significant) part of the vehicle that von Neumann mentions, which is also absent from the detailed picture because it is responsible for other functions of the self-reproducer, not related to self-reproduction itself. That is actually the only part that can be altered by a mutation of the replicator while preserving some chance that the change survives into the next generations. Whether that happens is determined by natural selection, which favors beneficial mutations.
Note that the copy process differs from the construction process: the copied information is identical to the original, whereas the construction performed by a programmable constructor, as here in the construction phase, decompresses the information stored in the replicator in a blind and precise manner, very much as a computer carries out the instructions of a program that determines its behaviour during execution. See, for example, DNA seen through the eyes of a coder by Bert Hubert.
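To make the difference between copying and constructing more tangible, here is a minimal sketch of the two phases in Python. It is my own toy model, not code from Marletto's or von Neumann's work: the `Cell` class, the recipe string "CB" and the way parts are labelled are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical self-reproducer: replicator R plus the vehicle built from it."""
    replicator: str   # R: the recipe, here a string of one-letter instructions
    vehicle: tuple    # V: the parts constructed by following R


def copy(replicator: str) -> str:
    """Copier C: the output is identical to the input, symbol by symbol."""
    return "".join(symbol for symbol in replicator)


def construct(replicator: str, resources: list) -> tuple:
    """Constructor B: blindly execute R, turning raw resources N into parts."""
    parts = []
    for instruction in replicator:
        raw = resources.pop()                  # consume one unit of N
        parts.append(f"{raw}->{instruction}")  # the part named by the instruction
    return tuple(parts)


def reproduce(parent: Cell, resources: list) -> Cell:
    """Coordinator: run the copy and construction phases, then assemble the offspring."""
    new_replicator = copy(parent.replicator)
    new_vehicle = construct(new_replicator, resources)
    return Cell(new_replicator, new_vehicle)


resources = ["N"] * 10
parent = Cell("CB", construct("CB", resources))
child = reproduce(parent, resources)
print(child == parent, len(resources))  # True 6
```

The copier never interprets the recipe, while the constructor never adds anything that is not in it; the `reproduce` function plays the role of the coordinating part that von Neumann describes.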
Concretely, the abstract model refers to a cell, which is a self-reproducer. The copier comprises two enzymes: DNA helicase, a motor protein that unwinds the two DNA strands, and DNA polymerase, which synthesizes the new strands by adding nucleotides that complement each strand. The constructor consists of RNA polymerase, which performs transcription, and the ribosome, which performs translation. One idea is that in this process information always flows from DNA via RNA (the product of transcription) to protein (the product of translation), and never in the opposite direction; this is called the central dogma of molecular biology. The opposing idea, proposed by James Alan Shapiro under the name natural genetic engineering, allows the cell (that is, its proteins) to intervene in the DNA code at certain moments, prior to transcription and translation. The paradigm shift is that DNA is not the ROM of the cell, subject only to random changes, but rather its read-write memory.
We identify the cell with its proteins because these molecules are what the cell is made of: they are its most active part and its major building block. Once proteins are synthesized from DNA, cells (and their organelles, as another layer of organization and complexity) are built from proteins, tissues are built from cells, organs are made of tissues, and organisms are made of organs. The whole organism is thus described by one molecule, which is a supreme compression of information, considering the complexity of any organism. Speaking purely abstractly, if an object Y can be constructed by a programmable constructor solely from the information contained in an object X, by blindly and precisely following its instructions without introducing any novel information in the process of construction, i.e. without any inventiveness or creativity, then the information contained in X is sufficient to describe Y, and it can only undergo decompression in the process.
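The one-way flow from DNA via RNA to protein can be illustrated with a very small Python sketch. It rests on strong simplifications that I am assuming for the sake of the example: the codon table holds only 4 of the 64 codons of the standard genetic code, and transcription is modelled as replacing T with U on the coding strand, whereas real RNA polymerase reads the complementary template strand.

```python
# Deliberately tiny subset of the standard genetic code (4 of 64 codons).
CODON_TABLE = {
    "AUG": "M",   # methionine, also the start codon
    "UUU": "F",   # phenylalanine
    "GGC": "G",   # glycine
    "UAA": None,  # stop codon
}


def transcribe(coding_strand: str) -> str:
    """Simplified transcription: the mRNA equals the coding strand with U for T."""
    return coding_strand.replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA three letters at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:   # stop codon: the ribosome releases the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGTTTGGCTAA")
protein = translate(mrna)
print(mrna, protein)  # AUGUUUGGCUAA MFG
```

Note that the sketch has no function that goes from protein back to DNA, which is exactly what the central dogma claims and what natural genetic engineering calls into question.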
Note that the information circulating in the "intelligent cycle" differs from that circulating in the "self-reproduction" cycle. The former serves a living entity by increasing its chances of survival and reproduction in interaction with its environment: survival not only of the entity, but also of the species, and of genes within the gene pool of the species. The latter serves as instructions to the molecular machines that build the living entity. What is the connection between these two types of information? The obvious link is that living entities choose their sexual partners for mating, and that in order to reproduce they need to survive to the point of maturity. Both of these affect the content of the information circulating in the self-reproductive cycle, and both are obviously affected by the processing of information in the intelligent cycle. But is that all? According to Shapiro this question is open, and he does not seem to be alone, considering the number of scientists listed on the Third Way of Evolution site.
If cell intelligence exists, can it influence the information content of DNA? In single-celled organisms, the intelligence of the cell is all the intelligence such an organism can possibly possess, but in multicellular ones it must be subordinated to the higher levels of intelligent organization that exist in such organisms. Is there a way for information from the higher levels to propagate to the lower levels, i.e. from the level of the organism to the level of the cell? Can that information affect DNA, for example when poor psychological health weakens the immune system and leads to cancer, or when cortisol and catecholamines released under mental stress generate free radicals that can damage DNA? And what is information (of both types), ontologically speaking? How did it appear in the first place, and is this the same question as that of abiogenesis?
How does novel information arise in the self-reproductive cycle of evolution? Without the possibility of the intelligent cycle intervening, one possible way is the random action of various agents (radiation, chemical agents, viruses, ...) that damage (alter) the DNA information. Such damage must not be repaired before being passed on to the next generation, i.e. it must be a germline mutation, and it must prove beneficial by helping the next generation survive. Perhaps there are ways that fall somewhere in between and are strictly neither of these, but what exactly are they? Another obvious link between the intelligent and self-reproductive information cycles is that living entities seem to be genetically programmed to understand the implicit goal of life (the previously mentioned survival and reproduction in a harsh environment), and to successfully create and carry out strategies to accomplish that ultimate goal by decomposing it into a number of intermediate goals during their lives. In that sense, information definitely flows from the self-reproductive cycle to the intelligent cycle. When I first read in Marletto's paper that von Neumann was the inventor of the vehicle <--> replicator paradigm, which permits only a flow of information according to the central dogma of molecular biology and treats organisms as mere passive transporters of genes, I was somewhat surprised, because I had always thought that Richard Dawkins invented that view; he is merely the best-known proponent of the idea. Francis Crick himself coined the term central dogma, using the word "dogma" to emphasize that there was no experimental proof that the idea is 100% true.
If we go back to the beginning, we can see how Shannon's information theory defines information. First of all, it is clear from the diagram that there is an essential difference between information, flowing from its source, and the noise superposed on it, which comes from a stochastic source. If you arrive at a crossroads without knowing which way to go (left or right) to reach your final destination, you have a 50% chance of choosing correctly by tossing a coin instead of asking for that one bit of information. If there are n such crossroads on your way, your chance of guessing the whole route is one half to the power of n. Hence the quantity of information I (in bits) that you need equals the negative binary logarithm of the probability p of guessing correctly, i.e. I = -log2(p). The more information is needed, the smaller the chance of obtaining it randomly; and vice versa, the more information content a signal carries, the smaller the chance that it came from a stochastic source. So what is a source of information that is not stochastic? It obviously is an informed agent, and the question is whether that is the same thing as an intelligent agent. It is definitely not an information medium, because a medium still requires an agent to store the information in it: you could not get your answers from the signs at the crossroads if there were no people to write the needed information and place the signs there. The precise answer depends on the precise definition of intelligence; after all, if we want to establish a precise relation between information and intelligence, we need precise definitions of both terms. Fortunately, there are people who work on the problem of defining intelligence; see Universal Intelligence: A Definition of Machine Intelligence by Shane Legg and Marcus Hutter. So the question is: if information is a fundamental physical quantity, is it also what discriminates between living and non-living things?
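The crossroads arithmetic from the paragraph above is easy to check numerically. The following Python snippet is just a sketch; the function names are mine, and it assumes a fair coin at every crossroads.

```python
import math


def guess_probability(n_crossroads: int) -> float:
    """Chance of guessing every turn correctly with a fair coin: (1/2)^n."""
    return 0.5 ** n_crossroads


def information_bits(p: float) -> float:
    """Information needed, in bits, when the chance of guessing is p: -log2(p)."""
    return -math.log2(p)


for n in (1, 3, 10):
    p = guess_probability(n)
    print(n, p, information_bits(p))
```

For n crossroads the result is n bits, one bit per left-or-right decision, which is exactly the relation I = -log2(p) with p equal to one half to the power of n.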
And if man succeeds in producing artificial life based on knowledge of how things work with respect to information, what does that actually prove? Does it prove that some higher level of intelligence produced natural life, and some even higher level produced those producers, and so on, ad infinitum? Or is it rather the case that information exists in the non-living world, at some quantum level, even though it does not inform anyone (no intelligent agent), that natural life began when the first self-reproducing molecular machines were constructed spontaneously, and that there was nothing before that which resembled intelligence of any kind? To understand that, we have to return to constructor theory. Here is another paper, The philosophy of constructor theory by David Deutsch. It explains the paradigm shift he proposes: physical laws are expressed as counterfactual statements about which tasks (transformations of the state of substrates) are possible and which are not, and within the possible there is room for discovering laws of motion that lead from initial states to final states, as in the prevailing conception of physics. A constructor is an object that performs a task, transforming the input state of a substrate into its output state, without being changed itself in the process. Obviously, this is a generalization of the notion of a catalyst or an enzyme, or of a fully automated car factory. There is an algebra of tasks that formalizes operations on tasks, since tasks are the central notion of the theory, and there is a notion of a programmable constructor, which can be depicted this way:
The difference between a programmable constructor and a simple constructor, which can perform only elementary tasks, is that the latter works spontaneously, on the basis of physical laws alone, without requiring information, while the former additionally requires information in the form of a program that provides instructions for its work. Both the copier C and the constructor B are, in conjunction with the replicator R, programmable constructors that use the information stored in R as a program. The input substrates are the generic resources N, and the output is the replicator R (in the copy phase) and the vehicle V (in the construction phase), plus waste W.
The task of abiogenesis is to construct self-reproducing machines, that is, to allow them to arise from generic resources only, via natural selection. The proof presented in Marletto's paper that this is actually possible is insightful and logical, but it is also a bit wordy. It bears directly on the objections of ID (intelligent design) proponents to the claim that chemical evolution certainly filled the gap between the great complexity of the cell (which they call irreducible) and the much lesser complexity of the molecules that constitute the cell and are not living objects by themselves. In other words, how did such molecules manage to assemble in the proper way to start functioning as a living thing? These objections are not that irrational, since there is actually not much living evidence of subcellular (not cellular) organisms besides viruses (http://math.ucr.edu/home/baez/subcellular.html), and viruses, although they possess DNA or RNA, depend totally on cellular organisms to reproduce. Of course, it is entirely possible and very likely that before the cell appeared there was a whole world of such organisms that were able to reproduce by themselves, and that they are all extinct now. It is still a great puzzle, which we are fortunate to have, because what is science without mysteries to be solved? For completeness' sake, we can give an honorable mention to prions, another entity that appears to be half living, but these are just misfolded proteins, with no DNA of their own. There is a huge dispute about abiogenesis between the ID community and the community of mainstream biologists, and the word "dispute" is a mild description of the state of affairs.
Proving that abiogenesis could have occurred spontaneously from physical laws that do not include "design" is still only half of the problem. The other half is that intelligence is totally ignored as a fundamental characteristic of living things, as is its potential role in evolution. A great deal of the logic in Marletto's proofs is based on the replicator <--> vehicle paradigm, in which the vehicle is not allowed to affect the replicator purposefully; if that assumption is false, the whole logic is shaky, or at least somewhat different with respect to evolution. Do we have to exclude the possibility of extrapolating a conclusion from the intelligent cycle to the self-reproductive cycle? That conclusion is that an intelligent agent is the only possible source of information. Or is that conclusion also wrong? It is clear that any change in an environment (such as "it started to rain") or any state of the environment (such as "the temperature is 30 degrees Celsius") that has nothing to do with intelligent agents, which may not be present in the environment at all, can become a source of information iff there is an intelligent agent that can perceive it as such via its sensors. Nevertheless, information can be communicated via an information channel only from one intelligent agent to another, by exchanging meaningful messages that both understand and interpret in the same way; everything else is noise, and the second "f" in "iff" may be important. For if information is an input that converts an unknown variable into a known one, as a result of measurement or observation, there always needs to be an intelligent agent to whom something is known or unknown.
Another question: do these two information cycles sufficiently describe life in an abstract way, or is there another ingredient missing, such as consciousness? After all, an intelligent agent does not have to be conscious. There is a mathematical theory of consciousness developed by Miranker and Zuckerman, Mathematical Foundations of Consciousness, but this is a topic for another blog.




