1. Introduction
Many people are wondering whether biology, including humans, will find benefit or harm in new developments in information, including artificial intelligence (AI) [1,2,3,4,5,6,7]. Such a question appears to identify biology and information as separate things. However, since the middle of the last century, it has been pointed out that biology frequently (some would say always) is based on various types of information [8,9,10,11,12,13,14,15,16,17,18,19,20]. Despite the pervasiveness of information in biology, biologists have been slow to incorporate approaches used for non-biological information, or to recognize when information-related approaches are already used frequently in biology, such as the most commonly used frequency-sensitive biodiversity index, the Shannon index [
21]. This article discusses important parallels and connections, pointing out the possibility that biological and non-biological information might be components of a unified process of information evolution, which has been proceeding since before biology existed. This process could tentatively be named ‘PanEvolution’ or ‘Pan-Evo’. Such naming would recognize that all information, inside or outside biology, is involved in the same four basic processes [
20] (Figure 1):
- Innovation
- Transmission and replication, including the random processes therein
- Adaptation, and processes such as selection that often produce adaptation
- Movement
A similar set of common processes has been proposed for ecology and evolution by Vellend [22], and for cultural evolution by Muthukrishna [23]. Often several of these processes will be operating simultaneously, at one or more levels, each of which may incorporate biological and/or non-biological aspects, such as molecules, individuals, populations, species, and ecosystems [22,24,25,26,27,28,29]. Of course, ecosystems include not only biology, but all the physical aspects of the world. This article will first present ideas about information, life and intelligence, then consider each of the four processes above, showing commonalities for biological and non-biological information. Next, there will be discussion of how these four processes might result in speciation, biological and non-biological. Finally, this article will examine how artificial intelligence could possibly benefit or harm biology, including humans, stressing that AI might not spell doom for humans, if AI actually becomes intelligent.
2. Information, Life, and Intelligence
Of course, one must set out what is meant by ‘information’ and ‘life’. It turns out that both relate to order (versus disorder) and to a related concept called ‘entropy’; what is more, there is only a rather blurry distinction between ‘information’ and ‘life’. After discussing that, we will consider ‘intelligence’.
The important point for this article is that any ordered arrangement, whether biological or not, can be considered to potentially contain information. ‘Information’ is related to entropy, which can be thought of as the potential of any collection of objects or symbols to code a message. In Shannon’s paper [
30] that many regard as the origin of information theory, his title makes it clear that a message must be communicated (“The mathematical theory of communication”). Thus the measure he uses in that paper, derived from thermodynamics, is a measure of the potential for some assemblage to carry information, but its informativeness depends upon decoding by a recipient of some kind. In other words, informativeness depends not only upon what ordered arrangements can be made from the subunits, but also on the availability of a person or system that can make use of the ordered arrangement – receive knowledge, or change its state in some other way. Let us look at four groups of fourteen possibly informative letters, and a measure of each group’s order or disorder, called Shannon entropy (in natural log scale) [
30].
- oooooooooooooo Entropy = zero
- ooooooohhhhhhh Entropy = 0.69
- oooyhaerlelhwu Entropy = 2.11
- hellohowareyou Entropy = 2.11
The first group has little potential to code information, and its entropy is low. The other groups have higher entropy and higher potential to code information. The group with the second lowest entropy could likely code only a small range of messages, in English or any other system. The two highest entropy groups can potentially code many messages, and contain the same letters, in different orders. For a recipient who speaks English, the last group actually does encode a message “hello how are you”, but that group has the same entropy as the other high-entropy group, which is just nonsense. Thus, it is not entirely satisfactory to take entropy as a measure of information, because anything’s informativeness always depends upon that thing’s ability to be interpreted (and possibly used) by a recipient [
8]. For the purposes of this article, ‘information’ will be taken to denote some kind of order that can potentially code for a meaning, to an appropriate recipient, who has not necessarily been identified yet.
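The entropy values quoted for the four letter groups can be checked with a short calculation. The following is a minimal sketch, assuming that each letter is treated as an independent symbol and that entropy is computed from the observed letter frequencies in natural log units; the function name `shannon_entropy` is illustrative only.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy (natural log) of the letter frequencies in `text`."""
    counts = Counter(text)
    n = len(text)
    total = -sum((c / n) * math.log(c / n) for c in counts.values())
    # max() avoids reporting -0.0 when only one kind of letter is present
    return max(0.0, total)


groups = ["oooooooooooooo", "ooooooohhhhhhh", "oooyhaerlelhwu", "hellohowareyou"]
for group in groups:
    print(f"{group}  entropy = {shannon_entropy(group):.2f}")
```

The last two groups give the same value because the calculation uses only letter frequencies, not letter order, which is exactly why entropy alone cannot separate the readable message from the nonsense string.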
Next, what is the definition of ‘life’? Often this definition hinges on maintenance and transmission of order - roughly the opposite of entropy. Again, there are many definitions, although many researchers consider that life is defined by the ability to use energy and materials to create and maintain some ordered form that can reproduce itself with variation [
31]. However, it will be seen below that many things that are conventionally regarded as non-living have all these functions, often simultaneously. So is there a boundary between living and non-living, and if so where is it? Viruses are often said to be on this boundary, because although they can indeed use energy and materials to create and maintain some ordered form that can reproduce itself with variation, they can only do this within a host cell. That seems to be a rather arbitrary criterion, because some other parasites can also do those things only inside a host cell, including bacteria such as the rickettsias and chlamydiae.
How to apply these notions to Pan-Evo, which necessarily will contain a wide variety of types of object - molecule, individual, ecosystem, etc - plus the variants within each category, such as isomers of molecules, different species in an ecosystem, or different variants within a species? Dealing with this array of possibilities requires definition and quantification of the units (eg individual) [
29,
32,
33]. One approach to this definition comes from assembly theory; in this theory, each object can be characterized by the assembly index, which is the minimum number of steps needed to construct the object. In the past it was proposed that assembly theory be used to distinguish biological and non-biological entities, but this is not widely accepted, and now the index, and the number of copies for each type, might be used to describe evolution in biology or outside it [
34]. However, note that two very different types of object might have identical assembly indices, so it is necessary to add the relationships between the objects, as defined by whether there are common components in the assembly pathway, and how many are shared [
35]. Examples of pathways might be a molecule’s non-biological synthesis mechanism (natural or artificial), or part of a biological phylogenetic tree, as shown in
Figure 2. In
Figure 2 it can be seen that the number of steps is an inadequate summary of the differences between the entities produced. For example, in
Figure 2a, starting from the same component there are two steps in the assembly tree of A and also for C, but the intermediate component is not shared, so that A and C would not be classified as being the same chemical simply because their assembly paths are the same length. Similarly, in
Figure 2b, the number of steps might need to be adjusted by upweighting some steps that are chemically more challenging (transversions); in addition, the alternative pathways and back-mutation show that the number of steps is not a complete summary of the assembly tree.
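The point that pathway length alone is an inadequate summary can be illustrated with a toy calculation. This is only a sketch under stated assumptions: the pathways, the object names (`A`, `B`, `C`) and the intermediates (`x1`, `x2`) are hypothetical, and the code simply counts steps along a given pathway rather than computing a true assembly index (which requires finding the minimum number of steps).

```python
# Hypothetical pathways: each list runs from a common starting component,
# through intermediates, to the final object.
pathways = {
    "A": ["start", "x1", "A"],
    "B": ["start", "x1", "B"],
    "C": ["start", "x2", "C"],
}


def steps(name: str) -> int:
    """Number of construction steps along the listed pathway."""
    return len(pathways[name]) - 1


def shared_steps(a: str, b: str) -> int:
    """Number of construction steps that two pathways have in common."""
    edges_a = set(zip(pathways[a], pathways[a][1:]))
    edges_b = set(zip(pathways[b], pathways[b][1:]))
    return len(edges_a & edges_b)


for pair in [("A", "B"), ("A", "C")]:
    a, b = pair
    print(a, b, "steps:", steps(a), steps(b), "shared:", shared_steps(a, b))
```

In this toy case all three objects are two steps from the start, yet A shares a step with B and none with C, so the shared-step count carries information that the step count does not.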
Finally, what is ‘intelligence’, and could it occur inside or outside life? The Oxford Dictionary defines it as “the ability to acquire and apply knowledge and skills”, and if we substitute ‘information’ for ‘knowledge’, we will see below that some living and non-living systems can do this. However, not all systems display intelligence, even in humans, as set out in a paper discussing ‘Natural Stupidity’ [
4]. Some think it likely that AI will never develop the same quality of intelligence that humans have, being already set on a course to become very ‘intelligent’ at some tasks, but appalling at other tasks that we regard as simple [
36]. This mimics the way that individual humans vary in their ability to display intelligence in different settings.
3a. Innovation in Biology
Biological variation derives from mutations in the DNA, with mechanisms ranging from single-base (A, T, G, C) substitutions, deletions or insertions to larger rearrangements, plus variation that does not involve base changes, called ‘epigenetic’, as well as changes that do not directly relate to DNA, such as behavioral variation [
37,
38]. Whether each of these is transmissible is discussed later. Secondarily, but usually much faster than mutation, innovation can occur through recombination: the exchange of information by physical breakage and reunion of the DNA string of information, to unite variants that were previously on separate DNA molecules (or ‘haplotypes’). In relatively short stretches of DNA (eg a megabase), recombination rates are low, so haplotypes can persist and sometimes may have great adaptive significance [
20].
Mutation models could also be employed as approximations for the production of novel variants at other biological levels. For example, at the ecological level, some models (eg single nucleotide substitution, or stepwise mutation) might be used as models of speciation that occurs by the alteration of a single character, such as the ‘magic traits’ discussed in the speciation literature [
20,
26,
39]. On the other hand, models that assume that every new variant is unique (eg the infinite alleles model) might be more appropriate for species that occur via multiple changes that accumulate during a period when two parts of a single species’ range are separated by a barrier [
20,
26,
39].
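The contrast between these two families of mutation model can be sketched in a few lines. The code below is an illustrative toy, not a model taken from the cited works: states are plain integers, the stepwise model moves the state up or down by one, and the infinite alleles model gives every mutation a state that has never occurred before. The function names and the number of mutations are assumptions made for the example.

```python
import itertools
import random


def stepwise_lineage(n_mutations: int, rng: random.Random) -> list:
    """Stepwise mutation: each mutation moves the state up or down by one."""
    state, history = 0, [0]
    for _ in range(n_mutations):
        state += rng.choice((-1, 1))
        history.append(state)
    return history


def infinite_alleles_lineage(n_mutations: int) -> list:
    """Infinite alleles: each mutation creates a state never seen before."""
    fresh = itertools.count(1)
    return [0] + [next(fresh) for _ in range(n_mutations)]


rng = random.Random(1)  # fixed seed so that the run is repeatable
sw = stepwise_lineage(1000, rng)
ia = infinite_alleles_lineage(1000)
print("distinct states, stepwise:", len(set(sw)))
print("distinct states, infinite alleles:", len(set(ia)))
```

Under the stepwise model the lineage revisits earlier states, so a single character can flip back and forth; under the infinite alleles model every change is unique, which is closer to divergence by accumulation of many differences.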
4b. Transmission and Replication outside Biology
The modes of transmission and replication just described also happen to ordered structures outside biology, especially by autocatalysis, which is when the product(s) of the reaction speed the reaction, and thus catalyze their own synthesis (
Figure 3 [
45]). Autocatalysis can occur either by the molecule acting as a template, or by other means, and of course is dependent upon availability of the raw materials. A huge range of molecular types, and mixtures thereof, can show autocatalysis [
46]. Such non-biological replication can also involve more than one molecule. Intriguingly, such interactions may have been involved in the origin of the genetic code; aptamers are nucleic acids that bind specific target molecules, and one such aptamer includes the codon for arginine and binds preferentially to arginine [
47].
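The dependence of autocatalysis on both the product and the raw material can be illustrated with the simplest textbook scheme, A + X -> 2X, in which the rate is proportional to the concentrations of both A and X. The sketch below is not the reaction of Figure 3; the rate constant, starting concentrations and time step are arbitrary assumptions chosen only to show the characteristic slow start, rapid rise and levelling off.

```python
def autocatalysis(a0=1.0, x0=0.001, k=1.0, dt=0.01, steps=2000):
    """Euler integration of A + X -> 2X with rate k*[A]*[X].

    Returns the concentration of the autocatalyst X at each time step.
    """
    a, x = a0, x0
    trajectory = [x]
    for _ in range(steps):
        rate = k * a * x      # the product X speeds up its own formation
        a -= rate * dt        # raw material A is consumed
        x += rate * dt        # autocatalyst X is produced
        trajectory.append(x)
    return trajectory


traj = autocatalysis()
print("start:", traj[0], "end:", round(traj[-1], 3))
```

The curve is sigmoid: growth is slow while X is rare, fastest at intermediate concentrations, and stops when the raw material runs out.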
Also, artificial intelligence is moving towards achieving fully autonomous replication, with no human involvement (except perhaps provision of energy), plus the mutation and recombination discussed in the previous section. Electronic information can be programmed to use any of the replication modes above, and we see this replication everywhere, including in the computer malware that we all seek to avoid. For example, in Generative Adversarial Networks, if one system has been devised or trained to make certain decisions (eg “this is/is not a picture of a bus”), a second electronic system can, after slightly variable replication, be assessed against the first system on its ability to make the same decisions, and after many repetitions of this process the second system can become equally good at making such decisions, though perhaps with some differences in the underlying code [
48]. This rapid, massive process of trial-error-replication gives AI enormous power at many tasks.
5b. Adaptation outside Biology
Non-biological autocatalytic systems can show adaptation, in which there is competition between similar chemicals produced by factors such as radiation. Those that persist or reproduce better become more numerous. An example is the reaction shown in
Figure 3, where an autocatalytic chemical can be altered by irradiation, to make a new chemical whose autocatalysis outcompetes its ancestor chemical [
45].
As seen above in the discussion of generative adversarial networks, some software development now uses a form of adaptation: accepting or rejecting random alterations to the information in the code, based upon autonomous training that, with the other three basic processes, leads to ‘evolutionary programming’ and ‘neural networks’ [
40] including ‘Artificial Intelligence’ [
54,
55]. Hopefully the information that is available for training will be as unbiased as possible, but bias could be provided deliberately or inadvertently by humans, who are not known for their impartiality, as set out in a paper discussing ‘Natural Stupidity’ [
4]. For example, Google’s ‘AI Overviews’ tool has recently recommended adding glue to pizza, and eating rocks [
56]. One might hope that software engineers would be sufficiently logical to avoid such pitfalls, but it is worth noting that for one or two decades, a system of software engineers plus marketers gave us a major operating system in which, to stop it, you clicked on a button labelled ‘start’; there are many other such examples. Recently, there has been a plea for engineers designing electronic information systems to make greater use of long-standing methods by which biological systems handle information [
57]. For example, it is expected that communications systems involving satellites and terrestrial transmitters would operate better if they used a type of selection amongst varying signals: transmission would be prioritized for signals with high signal-to-noise ratio [
58].
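The idea of accepting or rejecting random alterations, mentioned at the start of this section, can be shown with a very small hill-climbing sketch. This is not a generative adversarial network, nor the method of any cited work; it is a toy in which the ‘code’ is a string of bits, the ‘environment’ is a hypothetical target string, and a single-bit alteration is kept only if it improves the match.

```python
import random


def evolve(target, generations=2000, seed=0):
    """Accept or reject single random alterations according to fitness."""
    rng = random.Random(seed)
    current = [rng.randint(0, 1) for _ in target]

    def fitness(candidate):
        # number of positions that match the (hypothetical) target
        return sum(c == t for c, t in zip(candidate, target))

    history = [fitness(current)]
    for _ in range(generations):
        mutant = list(current)                   # replication
        i = rng.randrange(len(mutant))
        mutant[i] = 1 - mutant[i]                # innovation: one random change
        if fitness(mutant) > fitness(current):   # selection
            current = mutant
        history.append(fitness(current))
    return current, history


target = [1, 0] * 10
best, history = evolve(target)
print("initial fitness:", history[0], "final fitness:", history[-1])
```

Fitness never decreases in this scheme, because harmful alterations are discarded; real evolutionary programming typically uses populations, recombination and noisier selection.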
Non-living systems that have some variants that are better at converting local resources into replicates might not only compete with other non-biological systems, but also compete with living systems [
6,
7,
59]. Whether this is likely will be discussed later.
6b. Movement outside Biology
Outside biology, it is also common to see movement of structures that show some order, and thus have information-coding potential. This occurs at scales from molecular diffusion through winds and currents up to tectonic plate movements and astronomical processes. AI-related entities can also move, such as robotic soccer players, self-driving cars, and rovers on the surface of the Moon or Mars. Of course, information can also move through means such as wires or radio transmissions, without any physical movement of structures.
Reaction–diffusion models typically include the change in space and time of the concentration of one or more chemical substances, due to the interplay of chemical transformations and diffusion; the usefulness of these models has been demonstrated in one-dimensional and two-dimensional experiments [
61]. Reaction–diffusion models can also be useful descriptions of some processes involving biological information-carrying entities, such as development of phenotypes [
62], ecological invasions and epidemics [
61], and transmission of information in nerve pulses [
63].
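As a concrete illustration of a reaction–diffusion model, the sketch below integrates a one-dimensional equation of the Fisher type, in which a quantity u spreads by diffusion and grows locally by a logistic reaction term. The choice of this particular equation, and all parameter values (grid size, diffusion coefficient `D`, growth rate `r`, time step), are assumptions made for the example rather than values from the cited studies.

```python
def fisher_wave(n=100, steps=200, D=1.0, r=1.0, dx=1.0, dt=0.1):
    """Explicit finite-difference integration of du/dt = D*u_xx + r*u*(1-u)."""
    u = [0.0] * n
    u[0] = 1.0  # the substance (or population) starts at the left edge only
    for _ in range(steps):
        new = u[:]
        for i in range(n):
            left = u[i - 1] if i > 0 else u[i]        # no-flux boundary
            right = u[i + 1] if i < n - 1 else u[i]   # no-flux boundary
            diffusion = D * (left - 2 * u[i] + right) / dx ** 2
            reaction = r * u[i] * (1 - u[i])
            new[i] = u[i] + dt * (diffusion + reaction)
        u = new
    return u


u = fisher_wave()
front = sum(v > 0.5 for v in u)  # number of grid cells behind the front
print("cells behind the front:", front)
```

The result is a travelling front: cells near the origin are saturated, distant cells are still empty, and the boundary between them advances over time, which is the pattern used to describe invasions, epidemics and some chemical waves.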
7a. Speciation in Biology
Speciation is thought to often involve all of the four processes: innovation, transmission/replication, adaptation, and movement. Typically, there will be generation of variant information, via mutations or via recombination producing new combinations of existing variants. The frequency of these variants will be modified by stochastic processes, and possibly by adaptive changes. At the same time, there may be movement to and from different environments, where different variants might be adaptive. The process is usually considered to take place over multiple generations [
39]. But how we assess whether speciation has been achieved is subject to much debate; Mayr’s Biological Species Concept (also sometimes called the reproductive or isolation concept) characterizes species as “groups of actually or potentially interbreeding natural populations, which are reproductively isolated from other such groups” [
64]. Such species might also be called a group of individuals that preferentially affect each other’s heritable information by transmission between individuals, with less emphasis on affecting information by other methods such as competition, cooperation or predation. There are many alternative definitions of a species, most of which attempt to focus on one or more of the logical consequences of the divisions envisaged by Mayr, such as discontinuity of the distributions of physical or genetic characteristics [
39]. Mayr’s definition is avoided because diagnosing species with it requires an overwhelmingly large testing effort. Considering only eukaryotes, there are potentially ~5 ± 3 million species [
65]. If we tested each of those 5 million species against its nine most closely related potential species, then tens or hundreds of millions of tests would be needed, each test following two generations of hundreds of crosses between the potential species, plus a similar number of control crosses within potential species. This is clearly not achievable – a review found fewer than 1000 such investigations in eukaryotes [
66]. Therefore, most species definitions attempt to avoid crosses and use some easier method to approximate Mayr’s idea of a species [
39]. Also, some authors avoid the species concept altogether, preferring to discuss ‘taxa’, ‘evolutionarily significant units’ or ‘management units’ [
67,
68].
It is worth pointing out that there is similar difficulty of defining species outside the eukaryotes, where there are several other ways of facilitating or limiting exchange of information between groups of microorganisms. Lan and Reeves [
69] echo Mayr by saying that for microbial taxonomics “a major factor in maintenance of species specificity ... is the existence of a recombination barrier between species”, and then their article discusses many alternative methods of delineating microbial species, all of which are also frequently used in eukaryotes [
69].
7b. Speciation outside Biology
It is always difficult to think about the future – it is said that in the 1940s, IBM’s president thought that in the long term, there would be a global market for a total of about five computers [
70]. Moreover, at present some of the most-developed systems in information, large language models (LLMs, eg ChatGPT), are still susceptible to hallucination, which, like human hallucination, is when the LLM produces text that is “nonsensical, or unfaithful to the provided source input” [
71]. Nevertheless, we have consistently seen IT developments overcome barriers, and outperform positive predictions. Thus it is likely that non-biological information will evolve to become increasingly sophisticated at autonomous innovation, replication and transmission, adaptation and movement. Given that these four functions underlie all biological evolution, including biological speciation, it is worth asking how Pan-speciation might occur within the non-biological realm, or between biology and non-biology.
How could we extend the species concept beyond the biological realm? And what should we call such a process when defined both inside and outside biology? Note that the words ‘species’ and ‘speciation’ are not exclusive to biology, being used in chemistry, geology etc, and there is a general need to identify the limits to the units that we are discussing [
33]. However, perhaps a new name is needed to encompass the related processes that happen within and outside biology, and I suggest ‘Panspeciation’ and ‘Panspecies’. Rather than trying to extend each competing version of the species concept within biology, one might extend Mayr’s general species concept in the following way: that members of the same panspecies should be able to influence other members of the same panspecies more by exchange of information than by other processes such as competition, whereas members of different panspecies influence members of other panspecies more by processes such as competition, cooperation or predation rather than exchange of information. As with the species-definition within biology, the experimental work necessary to validate panspeciation might be inaccessible for some reason, in which case there are similar approaches to avoid that experimental work. For example, the idea of a panspecies might be related to compatibility of particular groups of software and hardware, whether generated by humans or by AI [
72]. Also, the panspecies definition might be based on the assembly theory mentioned previously, so that a panspecies would be an ensemble of objects that have more steps in common with each other in their construction than they have with objects outside the panspecies (
Figure 2).
8. Integration of Non-Biological and Biological
There have been some explicit attempts to integrate biological and non-biological information. Often these rely upon the idea that certain systems of equations will give a common currency to be used across biological and non-biological evolution and ecology. Examples abound. One is the commonality of biological and automaton self-replication, including importance of replication of the information [
73,
74]. Another example is the similarity of ‘drift’ in gas or electron mechanics to ‘selection’ in biology (despite the fact that the term ‘drift’ is used for a non-directional process in population genetics) [
75,
76]. Another example is the similarity in form between Boltzmann’s thermodynamic equation and Shannon’s information equation [
51], which is appealing because biological processes always have some underlying energetic basis [
77]. However, simply showing that two equations have a similar form does not mean that they represent the same thing [
78]. In a similar vein, there have also been attempts to unify the analysis of biological processes across scales from genes to ecosystems [
22,
25,
26,
27], and possibly this might be extended into non-biological systems, with appropriate caution.
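For readers who wish to see the formal similarity mentioned above, the two expressions are commonly written as follows. These are the standard textbook forms, given here as an aid rather than as the notation of the cited works; $p_i$ is the probability of state (or symbol) $i$, and $k_B$ is Boltzmann’s constant.

```latex
\begin{align}
S &= -k_B \sum_{i} p_i \ln p_i && \text{(thermodynamic entropy, Gibbs form)} \\
H &= -\sum_{i} p_i \log p_i && \text{(Shannon entropy)}
\end{align}
```

When all $W$ states are equally likely, $p_i = 1/W$ and the first expression reduces to Boltzmann’s $S = k_B \ln W$. The two expressions differ only by the constant $k_B$ and the base of the logarithm, which is why the similarity is appealing, but as noted above, a shared form does not by itself show that the two quantities represent the same thing.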
In particular, O’Connor et al. [
8] attempt to integrate the analysis of information, energy, and materials, and suggest that biology is an emergent property of information processing systems, at multiple scales. They stress that behavior of the system will depend upon the available information, and that units of selection might extend beyond the biological individual.
An obvious point of commonality is the way that biological processes might have evolved out of non-biological processes. Autocatalysis appears from industrial application through to possible pre-biotic evolution [
79]. Autocatalytic RNA molecules can undergo mutation, recombination, and selection [
80]. Transfer RNAs are not limited to protein translation; they also have other functions, particularly in viruses, where they can be involved in replication and reverse transcription, and can act as telomeres (chromosome ends) [81]. Other non-RNA self-replicators are being used to build de novo life [82]; for example, catalytic DNA is more stable than RNA, and so is favored industrially [
83].
9a. Possible Benefits to Biology Including Humans
There has been much recent discussion of artificial intelligence (AI) and whether it might cause a ‘singularity’, perhaps even human extinction [
3]. Of course, if humans are actually intelligent (though see [
4]) they will devise effective regulation to limit societal harms of AI [
1,
2,
84]. Irrespective of such regulation, AI might even help humans, if AI actually becomes intelligent.
It has been pointed out that the interaction of heritable evolutionary changes and cultural changes might have given rise to humans’ extraordinary ability to cooperate with non-kin in huge groupings, leading to our present dominant status; in particular, the different transmission methods of genes and culture might have allowed such groups to maintain cultural cohesion without genetic relatedness [
85]. Of course, everything in current information technology is part of human culture, or a derivative thereof, so this symbiotic relationship might continue to prosper. If we accept that life derived from information’s evolution, but that we have so far mainly focused on understanding evolution of living systems, we might see benefits in broadening our investigation to the evolution of all information, including the possible future.
Information change in biology and in the physical world might both be managed for our benefit. An example of this would be to manage two evolutionary challenges: delaying adaptation of pathogens, pests, etc, and speeding adaptation of valued organisms [86]. However, note that this will require careful regulation, because of the ever-present possibility of disagreement between individual and public good [86], as well as the possibility that AI might not help us, but instead hinder us, if AI is non-intelligent. Other possible benefits include the use of artificial replicators to build de novo life [
79,
82], and the use of assembly theory for drug discovery [
35], and information theory to predict the likely phenotypic effects of novel mutations before they are manifest in the next generation, either natural or in-vitro [
87]. Finally, information-based analyses can assist our analysis and forecasting of natural processes at many levels [
20,
40,
51,
88].
9b. Possible Threats to Biology Including Humans
AI might pose several possible threats to biology, including humans; these include human-AI coevolution, and the direct competition that is discussed in the next section. An example of potential human-AI coevolution would be if AI takes over more tasks, such as your car identifying that its brakes need fixing and driving itself off to get repaired. (Hopefully the AI will not decide that you want the cheapest brakes!) As a result, there might be selection against maintaining brain parts or functions that do things AI can do for us. This selection against certain brain functions or regions might be driven by the energetic expense of running a large brain [
89], and possible risks to mother and child during the birth of a baby with a large head [
90,
91]. Thus what is initially a mutually beneficial arrangement might gradually result in increased dependency of humans upon AI, especially given the attention-getting strategies of AI [
59]. However, note that currently undesirable allelic variants (eg for small brain) are usually recessive and rare, and therefore it could take thousands of generations for such selection to change allele proportions very much.
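The slow response of a rare recessive variant to selection can be illustrated with the standard one-locus, two-allele recursion from population genetics. In the sketch below, only homozygotes for the recessive allele gain a fitness advantage `s`, the population is treated as infinitely large and randomly mating, and the starting frequency (0.01), target frequency (0.02) and selection coefficient (0.01) are hypothetical values chosen purely for illustration.

```python
def generations_to_reach(q0, q_target, s, max_generations=1_000_000):
    """Generations for a favoured recessive allele to rise from q0 to q_target.

    Genotype fitnesses: recessive homozygote 1 + s, all other genotypes 1.
    """
    q, generations = q0, 0
    while q < q_target and generations < max_generations:
        mean_fitness = 1 + s * q * q          # only homozygotes gain the advantage
        q = q * (1 + s * q) / mean_fitness    # allele frequency after selection
        generations += 1
    return generations


g = generations_to_reach(q0=0.01, q_target=0.02, s=0.01)
print("generations needed to double the allele frequency:", g)
```

With these assumed values, merely doubling the frequency of the allele takes several thousand generations, because a rare recessive allele is almost always carried in heterozygotes, where selection cannot act on it.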
This article does not intend to extensively discuss what regulations are currently needed for AI, but there are some obvious ones, such as banning lethal AI, and a number of measures to increase trustworthiness in AI, including requiring each piece of information to have prominent identification of the humans or machines that generated it, which would limit malign influence in politics or elsewhere [
36]. This regulation will of course be anthropocentric, and will only be achievable if humans themselves behave intelligently [
4]. Thus there is a need for education of humans, to better understand how we should be evaluating information and its sources, as well as improving cooperation. This will minimize the effect of human stupidity [
4] and avoid producing artificial simulated stupidity (ASS). We are, or can be, in charge of this process, and need to act [
23].
10. AI, Competition, and Panspeciation
In biological evolution, distinct lineages often evolve to occupy different environments [
39,
92]. Note that humans and machines are best suited to vastly different environments (
Figure 4). Humans require things that damage computers and other machines: copious water, oxygen, and carbon dioxide for their food plants. Also, machines can be manufactured to tolerate a much wider range of temperatures and radiation than humans. Thus, environments most favorable to humans are present on much of Earth, whereas environments more favorable to machines are scarce on Earth, but abundant elsewhere, such as the Moon and Mars. So, if AI is actually intelligent, competition might be minimized because machines might principally seek different environments to humans [
5]. Of course, we might continue other human/machine interactions that do not need physical proximity.
As machines improve at operating their own design and production, the information that previously resided in humans’ minds might diverge into two groups with independent transmission of information, some still transmitted by humans, some entirely independently transmitted - would we call that ‘incipient panspeciation’? As stated above, we often say that two groups belong to different species if their internal interactions have greater emphasis on transmission of information whereas between-group interactions have greater emphasis on competition, cooperation or predation. Diagnosis of separate species status is especially likely if the groups also display environmental segregation [
39,
92], as just postulated. If this comes to pass, then humans and machines might show relatively little competition because of their different favored environments (
Figure 4). Thus spatially separated coexistence might be possible, depending on the variation in humans and AI – both the differentiation between the average members of each group, and the variation within each group, which also affects competition and coexistence [
93,
94].
11. Conclusions
If one chooses to use the word ‘panspeciation’ to describe the splitting of information into that carried by humans and that carried by AI and its successors, then we are currently in the early stages of the first speciation-like event involving Homo sapiens for many thousands of years. But it would not be particularly hostile to humans, if AI is actually intelligent.
At present, the only likely source of Artificial Intelligence is human brains and human-run organizations. For various reasons, this situation may not last, but it means that we currently have the capacity to limit any harm, if we wish to do so. In particular, it will be important to be quite sure of what we mean by ‘intelligence’, and to make sure that it is possessed not only by AI but also by those humans who manage AI, so that both can see the benefits of partitioning the environment, rather than competing.