Preprint
Article

Information Gradient among Nucleotide Sequences of Essential RNAs in an Evolutionary Perspective


A peer-reviewed article of this preprint also exists.

Submitted: 24 May 2024

Posted: 24 May 2024

Abstract
We hypothesize that the first ancestral "ante-cell" molecular structures, i.e., the first RNAs and peptides, which gradually transformed into real cells once the Earth had cooled sufficiently for organic molecules to appear, have left traces in the RNAs and genes of present-day cells. We propose a circular RNA that could have been one of these ancestral structures, and whose vestigial pentameric subsequences would mark the evolution from the key moment when ante-cells began to join the living world. In particular, we look, in present-day RNAs (ribosomal or messenger) that play an important role in cell metabolism, for traces of the proposed primitive structure in the form of pentamers (or longer fragments) belonging to its nucleotide sequence. The results can be summarized as a gradient in the occurrence of such pentamers, with a high frequency for the most vital functions (protein synthesis, nucleic acid synthesis, cell respiration, etc.). This gradient is also visible between organisms, from the oldest (Archaea) to the most recent (Eukaryotes) in the evolution of species.
Keywords: 
Subject: Biology and Life Sciences  -   Ecology, Evolution, Behavior and Systematics

1. Introduction

The oldest geological samples containing indisputable traces of life date back approximately 3.7 billion years [1]. We have no trace of earlier life forms from before the slow cooling of the Earth. To explain how life appeared, we propose a two-stage scenario: i) acquisition of the organic molecular structures typical of currently living cells, and ii) acquisition of the elementary functions necessary for the survival of the first living cells. These functions could have appeared in ante-cells, i.e., systems with two fundamental properties: 1) cooperation between the first peptides and RNAs that favored the reproduction of both, and 2) a capacity to carry out redox transformations using basic substrates (methane, oxygen, hydrogen sulphide, etc.).
Following the discovery by Stanley Miller that small organic molecules (notably amino acids) could appear spontaneously under conditions thought to resemble the Earth's atmosphere some 4 billion years ago [2], intensive research has gone into determining the origin of life [3,4]. A huge literature has accumulated, with contributions ranging from biology to astrophysics (see for instance [5,6,7,8,9,10,11,12,13,14,15,16,17]). Two main streams of interpretative theories have been followed, usually referred to as "DNA first" and "RNA first" [18].
However, although a century has passed since the pioneering article by Alexander Oparin [19], the ambitious project of developing a mixture of reagents capable of changing into a living system [20,22] is far from being achieved. Nevertheless, a nucleic oligomer called AL (for Archetypal Loop [23,24,25]) has been proposed that could have marked the transition from ante-cells to real cells. We propose to look, in the nucleotide sequences of RNAs (ribosomal or messenger) that play an important role in the metabolism of current cells, for traces of this primitive structure in the form of pentamers (or longer fragments) belonging to its nucleotide sequence. The results can be summarized as a gradient in the occurrence frequency of such pentamers, with a high frequency for the most vital functions (protein synthesis, nucleic acid synthesis, cell respiration, etc.). This gradient is also visible between organisms, from the oldest (Archaea) to the most recent (Eukaryotes) in the evolution of species.
In a first section, we recall the properties of the archetypal AL structure. In a second section, we systematically search for traces of pentamers originating from the nucleotide sequence of AL in RNAs (rRNAs and mRNAs) of organisms ranging from the most ancient (Archaea) to the most recent (Eukaryotes), and we show the existence of a frequency gradient of these pentamers. Finally, in the Discussion section, we show the coherence of our approach using phylogenetic results from the literature and those from a new classification based on the Maxwell© classifier [25].

2. Results

2.1. First Works on the Origin of Life

According to Louis Pasteur [26,27] and others, no form of life, including microbial life, appeared in an adequate nutrient medium in the absence of any source of contamination. These experimenters could not perform an infinite number of experiments, and a rare event could very well have escaped them. Therefore, these experiments do not prove that the probability of a “non-living → living” transition is strictly zero, but only that it is very low. However, living beings, including the least evolved, are the seat of complex cross-regulated processes. How could inert matter change directly, as if by the wave of a magic wand, into such a finely regulated system? In short, the rule that "a living being can only arise from an already living being, while the “non-living → living” transition is impossible" appears to be inescapable. That said, this was the case in Pasteur's time, i.e., at a time when the Earth had become relatively calm; it leaves open the possibility that something crucial happened on the nascent Earth.
The solar system, including our Earth, arose from the gravitational collapse of a gigantic cosmic cloud of gas and dust approximately 4.568 billion years ago. This is taken as time zero of the age of the solar system. What followed was a complicated series of violent events lasting about one billion years, and then a cooling period more favorable to the appearance of life [28].

2.2. Ante-Cell Structures

During this cooling period, ante-cell structures would have begun to involve organic molecules, at first simple and then more and more complex. After perhaps up to several hundred million years of such functioning, one or more of these ante-cells would have acquired the organic equipment and the structure of real cells. The steps considered in the classical theories (DNA first or RNA first) are suitable for explaining how this would have happened. Using genomic sequences from multiple species [29,30,31,32], we have proposed the AL RNA ring [23,24,25] as an example of a molecular structure marking the transition from purely mineral behavior to the involvement of organic molecules.

2.3. The AL RNA Ring as an Ancestor of the Ribosome?

In the primary structure of the tRNA loops of Archaea, the same short sequences recur in many species, such as the ancient Archaea studied in Figure 1, with the following motifs: TGGTA for the D-loop, CTGCCA for the anticodon loop and TTCAA for the T-loop [33,34,35,36,37]. Examples of such short sequences are given in Figure 1 and Table 2.
From Figure 1 and Table 2, one can see that the 19-nucleotide consensus sequence defined from the tRNA-Gly loops, 5’-AAUGGUACUGCCAUUCAAG-3’, adopts a hairpin secondary structure once AUG (a start codon) is added at its 5’ end, giving a 22-nucleotide sequence. The hairpin form is shown in Figure 2 (right) and the ring form in Figure 2 (left). This sequence, which we call AL (for Archetypal Loop or ALpha ring), has some optimal combinatorial properties which suggest that it could have played a catalytic role in primitive peptide genesis [33,34].
The choice of the tRNA-GlyGCC comes from the fact that its anticodon-loop is the most frequent in archaeal tRNA-Gly [39] (see 11 tRNA-GlyGCC in Table 2 and 246 more in Supplementary Material of [29,30]). The combinatorial properties result from opposite constraints in a min-max optimization problem and can be summarized as follows [34]:
i) If the ring form has to play a reactive role with respect to the amino acids in order to facilitate their polymerization (as a “marriage agency”), taking advantage of the affinities between amino acids and the codons and anticodons of their specific synonymy classes in the genetic code [40], the ring has to be short. If the ring has to survive in a stable hairpin form with minimal free energy, it must present a self-complementarity between its two halves.
ii) Among the rings satisfying the principle “to be as short as possible and containing at least one codon of each amino-acid”, there is no solution for a length below 22 nucleotides. For the length 22, 29520 among 176 × 10^11 (= 4^22, to three significant figures) solutions contain only one repeated codon AUN, N being G for 52% of the solutions.
iii) From these 25 rings, 19 encompass both a start codon and the stop codon UGA.
iv) Through calculation of several distances (e.g., circular Hamming distance, permutation distance and edit distance), the ring AL (Figure 2 left) exhibits a minimum average distance as compared to the others. Only this sequence thus acts as the barycenter of the set of the 18 others.
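Properties ii) and iii) can be checked directly on the AL ring. The following Python sketch is an illustration and not the authors' code: it builds the 22-nucleotide ring by prefixing AUG to the consensus sequence given above, reads the 22 circular triplets, and translates them with the standard genetic code. The helper names (`circular_kmers`, `circular_hamming`) are hypothetical, and the circular Hamming distance is assumed here to be the minimum Hamming distance over all rotations, a definition the text does not spell out.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def circular_kmers(ring, k):
    """Return the len(ring) overlapping k-mers read around a circular sequence."""
    extended = ring + ring[:k - 1]
    return [extended[i:i + k] for i in range(len(ring))]

def circular_hamming(a, b):
    """Assumed definition: minimum Hamming distance between a and any rotation of b."""
    if len(a) != len(b):
        raise ValueError("rings must have the same length")
    return min(
        sum(x != y for x, y in zip(a, b[r:] + b[:r]))
        for r in range(len(b))
    )

# AL ring: AUG added at the 5' end of the 19-nt consensus (written with T instead of U).
CONSENSUS = "AAUGGUACUGCCAUUCAAG"
AL_RING = ("AUG" + CONSENSUS).replace("U", "T")

codons = circular_kmers(AL_RING, 3)
amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}
repeated = [c for c, n in Counter(codons).items() if n > 1]

print(len(AL_RING), len(amino_acids), repeated, "TGA" in codons)
```

Read this way, the ring carries one codon for each of the 20 amino acids, the stop codon UGA, and AUG as its only repeated codon, which is consistent with properties ii) and iii).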
From detailed studies [35,36,37], it appeared that pentamers or hexamers from AL could be remnants of the progressive construction of the nucleic acids involved in ribosome building and of the mRNAs of enzymes involved in ATP metabolism. Thus, following the concentration of these small sequences in many primitive species (Archaea or Bacteria) during evolution makes it possible to obtain information on what could have happened during a long period for which we have no fossils or any other source of data. The study now continues toward the role of these small RNAs in the progressive construction of the ancestors of the current ribosome [41,42].
This is done using the pentamer proximity to AL, denoted PpAL, of a given nucleotide sequence. PpAL equals twice pAL, where pAL is the difference between the observed and expected numbers of pentamers common to AL and the sequence, expressed in standard deviations. The pentamers are chosen from a set of 9 (ATTCA, TTCAA, TCAAG, CAAGA, AAGAT, AGATG, GATGA, ATGAA, TGAAT) located at the head of the hairpin form of AL and easy to split off from the rest of the hairpin. This proximity supports some current quantitative and qualitative phylogenies proposed in the absence of any identified root. This root could be the AL ring, as a possible LUCARN (Last Universal Common Ancestor RNA). In Table 3 and Figure 3, we give the PpAL for functions and molecules of 4 species; the PpAL ranking respects both the gradient of seniority (for species) and that of necessity (for functions).
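The PpAL computation can be sketched in a few lines of Python. The text does not state how the expected count and its standard deviation are obtained; this sketch assumes ne = 9N/4^5 (9 target pentamers among the 1024 possible ones, N being the value in the N column of Table 3) and se = sqrt(ne). Under these assumptions it reproduces the ne (se) and PpAL values of Table 3 to within rounding. The function names are hypothetical.

```python
import math

# The 9 pentamers at the head of the AL hairpin, as listed in the text.
AL_HEAD_PENTAMERS = frozenset({
    "ATTCA", "TTCAA", "TCAAG", "CAAGA", "AAGAT",
    "AGATG", "GATGA", "ATGAA", "TGAAT",
})

def count_al_pentamers(sequence, pentamers=AL_HEAD_PENTAMERS):
    """Observed count: overlapping 5-nt windows of the sequence found in the set."""
    seq = sequence.upper().replace("U", "T")
    return sum(seq[i:i + 5] in pentamers for i in range(len(seq) - 4))

def expected_count(n):
    """Assumption: each position matches one of the 9 pentamers with probability 9/4**5."""
    return len(AL_HEAD_PENTAMERS) * n / 4 ** 5

def ppal(n_obs, n):
    """PpAL = 2 * pAL, with pAL = (observed - expected) / standard deviation."""
    ne = expected_count(n)
    se = math.sqrt(ne)  # assumption: Poisson-like spread, se = sqrt(ne)
    p_al = (n_obs - ne) / se
    return 2.0 * p_al

# rprotein L18, Homo sapiens row of Table 3: no = 13, N = 552.
print(round(ppal(13, 552), 1))  # 7.4
```

For a real sequence, `count_al_pentamers` would supply the observed count passed as `n_obs`; whether the authors count overlapping windows in exactly this way is not stated in the text.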

2.4. A Potential Role of AL for Ante-Cells to Join the Organic World

The AL structure, which has two conformations (ring and hairpin) and is a possible vestige of the construction of ribosomes, already provides data on this period without fossils, and we may hope to obtain more by searching for new nucleic oligomers through the study of other nucleic acids and/or other living species [34,35,36,37,38].
One of the major problems of evolution after the appearance of autocatalytic molecular systems centered on a primitive nucleic oligomer, ancestor of the tRNAs and ribosomal RNAs [39,40], is the appearance of the cell membrane, which isolates these primitive systems from the outside world and promotes their survival. The small peptides resulting from the interaction between the RNA ring AL and amino acids can give rise to a diffusion of polymerizing peptide material in the vicinity of the initial nucleic structure. We can imagine that such structures segment the space, prefiguring cellular proto-membranes [41]. Similarly, the interaction between nucleo-peptide structures and lipids synthesized in the primitive atmosphere can lead to a space-separating proteo-lipidic organization that reinforces the initial peptide proto-membranes, in the manner of present-day trans-membrane peptides fixed in phospholipid membrane pores or acting as rafts in a lipid medium locally organized like a liquid crystal [42,43,44,45,46,47,48,49,50,51].

3. Discussion

In [52,53], we used a new classifier called Maxwell©, which is reversible (able to make its reasoning explicit and explain it at every clustering step) and agnostic (using no a priori semantic knowledge of the data). Figure 4 shows the PpAL for the whole genomes of Methanomada Archaea clustered by the Maxwell© classifier [53], together with a phylogenetic tree [54] annotated with the PpAL (in red), which respects the phylogenetic ranking. Figure 5 shows the same phenomenon for the PpAL of the whole genomes of halophilic Archaea. This lends consistency to the classification method based on the Maxwell© classifier and supports the relevance of a proximity to AL based on only 5 pentamers (AGATG, GATGA, GTGGC, TGGCC, GGCCT, taken from the extremities of one of the 18 hairpins satisfying conditions i) to iii) of the AL construction), a proximity that roughly respects the biological phylogeny previously obtained by clustering the Methyltransferase mRNA sequences of the Archaea species concerned [55].
Figure 6 shows the AL pentamer proximity of the whole genome of extreme halophilic Archaea belonging to the Methanococcus genus (from the Methanococcaceae family [56]); this proximity agrees with the clustering obtained using the Maxwell© classifier. Only Thermococcus sibiricus presents a negative proximity, probably because it is the target of a virus whose genetic sequence has contaminated its whole genome [57,58].
The agreement among a classification based on selected mRNAs of ribosomal proteins, an agnostic classification based on the entire genome, and calculations of proximity to AL, which reveal relics of AL fragments in the genomes of current species in relation to their seniority, reinforces the idea that the AL structure could have participated, as a primordial RNA, in the mechanisms involved at the origin of life.

4. Materials and Methods

We use several public databases (Table 1) to identify the most frequent sequences of the tRNA loops in different realms of life, to calculate the free energy of some hairpin structures, and to count the occurrences of pentamer motifs in gene sequences.

5. Conclusions

In the ante-cell scenario, there is no straightforward transition from an inert mixture of chemical substances to a living cell. Life on Earth could have appeared through a progressive acquisition of the molecular characteristics of real cells, from the moment the Earth had cooled enough for organic molecules to exist there. There is no evidence that ante-cells really existed; however, structures such as the RNA ring AL come at the right time to mark the crucial moment when the processes involved would have reached the frontier between the organic world and the realm of life: RNA rings could have co-existed in equilibrium with DNA hairpins having the same nucleotide sequence, inside lipidic vesicles also containing amino acids and sugars [59,60]. The ante-cell scenario is far from answering all the questions that one can ask about life and its origin, but at least it underlines the need to take into account the progressive acquisition of some characteristics of life and to propose a logical way for it to occur.
We propose, as a follow-up to this work, to systematically explore many other species in the kingdoms of archaea, bacteria and eukaryotes, to confirm whether the vital functions examined in this article and others presumed less important for survival, present the same gradient of proximity to the RNA ring AL, according to their supposed ancient (for species) and necessary (for functions) character.

Author Contributions

Conceptualization, J.D.; methodology, J.G. and J.D.; investigation, H.B., M.J., M.B., H.R., C.B., J.G. and J.D.; data curation, H.B., M.B., H.R. and C.B.; writing – original draft preparation, M.J. and J.D.; writing – review and editing, H.B., M.J., M.B., H.R., C.B., J.G. and J.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All the data are obtained from readily available public databases that are cited in the text.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Nutman, A.P.; Bennett, V.C.; Friend, C.R.L.; Van Kranendonk, M.J.; Chivas, A.R. Rapid emergence of life shown by discovery of 3,700-million-year-old microbial structures. Nature 2016, 537, 535–538.
  2. Miller, S.L. A production of amino acids under possible primitive Earth conditions. Science 1953, 117, 528–529.
  3. Darwin, C.; Kebler, L. On the Origin of Species by Means of Natural Selection, or, The Preservation of Favoured Races in the Struggle for Life. J. Murray: London, UK, 1859.
  4. Dyson, F. Origins of life. Cambridge University Press: Cambridge, UK, 1999.
  5. Björn, L.O. Stratospheric ozone, ultraviolet radiation, and cryptogams. Biological Conservation 2007, 135, 326–333.
  6. Khodachenko, M.L.; Lammer, H.; Lichtenegger, H.I.M.; Grießmeier, J.-M.; Holmström, M.; Ekenbäck, A. The role of intrinsic magnetic fields in planetary evolution and habitability: the planetary protection aspect. Proceedings of the International Astronomical Union 2008, 4, 283–294.
  7. Fox, G.E. Origin and evolution of the ribosome. Cold Spring Harb. Perspect. Biol. 2010, 2, a003483.
  8. Schrum, J.P.; Zhu, T.F.; Szostak, J.W. The origins of cellular life. Cold Spring Harb. Perspect. Biol. 2010, 2, a002212.
  9. Trevors, J.T.; Saier, M.H. Three Laws of Biology. Water, Air, and Soil Pollution 2010, 205, 87–89.
  10. Lovett, R.A. Tidal heating shrinks the “goldilocks zone.” Nature 2012, 485, 10601.
  11. Alberts, B.; Johnson, A.; Lewis, J.; Morgan, D.; Raff, M.; Roberts, K.; Walter, P. Molecular Biology of the Cell. W.W. Norton & Company, New York, USA, 2017.
  12. Ramirez, R. A More Comprehensive Habitable Zone for Finding Life on Other Planets. Geosciences 2018, 8, 280.
  13. Cech, T.R.; Steitz, J.A.; Atkins, J.F. RNA Worlds: New Tools for Deep Exploration; Cold Spring Harbor Laboratory Press, New York, USA, 2019.
  14. Kahana, A.; Schmitt-Kopplin, P.; Lancet, D. Enceladus: First Observed Primordial Soup Could Arbitrate Origin-of-Life Debate. Astrobiology 2019, 19, 1263–1278.
  15. Osinski, G.R.; Cockell, C.S.; Pontefract, A.; Sapers, H.M. The Role of Meteorite Impacts in the Origin of Life. Astrobiology 2020, 20, 1121–1149.
  16. Siraj, A.; Loeb, A. Breakup of a long-period comet as the origin of the dinosaur extinction. Scientific Reports 2021, 11, 3803.
  17. Vlachakis, D.; Chrousos, G.; Eliopoulos, E. On the origins of life: A molecular and a cellular journey driven by genentropy. International Journal of Epigenetics 2021, 1, 7.
  18. Demongeot, J.; Thellier, M. Primitive oligomeric RNAs at the origins of life on earth. Int. J. Mol. Sci. 2023, 24, 2274.
  19. Oparin, A. The origin of life, translation by Ann Synge. In: Bernal, J. D. (ed.), The origin of life. Weidenfeld & Nicolson, London, 1967, pp. 199–234.
  20. Blain, J.C.; Szostak, J.W. Progress towards synthetic cells. Ann. Rev. Biochem. 2014, 83, 615–640.
  21. Joyce, G.F.; Szostak, J.W. Protocells and RNA Self-Replication. Cold Spring Harb. Perspect. Biol. 2018, 10, a034801.
  22. Ding, D.; Zhou, L.; Giurgiu, C.; Szostak, J.W. Kinetic explanation for the sequence biases observed in the nonenzymatic copying of RNA templates. Nucleic Acids Res. 2022, 50, 35–45.
  23. Demongeot, J. Au sujet de quelques modèles stochastiques appliqués à la biologie. Modélisation et simulation. tel-00286222. Université Joseph-Fourier: Grenoble, 1975.
  24. Demongeot, J. Sur la possibilité de considérer le code génétique comme un code à enchaînement. Revue de Biomaths 1978, 62, 61–66.
  25. Gardes, J.; Maldivi, C.; Boisset, D.; Aubourg, T.; Demongeot, J. An unsupervised classifier for the whole genome phylogenies, the Maxwell© tool. IJMS 2023, 24, 16278.
  26. Pennetier, G. Un débat scientifique : Pouchet et Pasteur (1858-1868). Actes du Museum d’Histoire Naturelle de Rouen. J. Girieud imprimeur, Rouen, France, 1907.
  27. Pasteur, L. Sur les corpuscules organisés qui existent dans l’atmosphère, examen de la doctrine des générations spontanées. Leçon professée à la Société clinique de Paris, le 19 mai 1861. Imprimerie C. Lahure, Paris, France, 1862.
  28. Meunier, A. La Naissance de la Terre. Dunod, Paris, France, 2014.
  29. GtRNAdb. Available online: http://lowelab.ucsc.edu/GtRNAdb/Lafri3/Lafri3-align.html (accessed on 23 March 2024).
  30. Bioinf. Available online: http://trna.bioinf.uni-leipzig.de/DataOutput/Result (accessed on 23 March 2024).
  31. tRNAviz. Available online: http://trna.ucsc.edu/tRNAviz/ (accessed on 23 March 2024).
  32. NCBI. Available online: https://www.ncbi.nlm.nih.gov/nucleotide?cmd=search (accessed on 23 March 2024).
  33. Demongeot, J.; Besson, J. Code génétique et codes à enchaînement I. C. R. Acad. Sc. III 1983, 296, 807–810.
  34. Demongeot, J.; Moreira, A. A circular RNA at the origin of life. J. Theor. Biol. 2007, 249, 314–324.
  35. Demongeot, J.; Norris, V. Emergence of a "Cyclosome" in a primitive network capable of building "infinite" proteins. Life 2019, 9, 51.
  36. Demongeot, J.; Moreira, A.; Seligmann, H. Negative CG dinucleotide bias: An explanation based on feedback loops between Arginine codon assignments and theoretical minimal RNA rings. Bioessays 2021, 43, 2000071.
  37. Demongeot, J.; Thellier, M. Primitive oligomeric RNAs at the origins of life on Earth. IJMS 2023, 24, 2274.
  38. Root-Bernstein, R.; Kim, Y.; Sanjay, A.; Burton, Z.F. tRNA evolution from the proto-tRNA minihelix world. Transcription 2016, 7, 153–163.
  39. Mohanta, T.K.; Mishra, A.K.; Hashem, A.; Abd Allah, E.F.; Khan, A.L.; Al-Harrasi, A. Construction of anti-codon table of the plant kingdom and evolution of tRNA selenocysteine (tRNASec). BMC Genomics 2020, 21, 804.
  40. Moghadam, S.A.; Preto, J.; Klobukowski, M.; Tuszynski, J.A. Testing amino acid-codon affinity hypothesis using molecular docking. Biosystems 2020, 198, 104251.
  41. Lee, M.T. Biophysical characterization of peptide–membrane interactions. Advances in Physics 2018, 3, 1.
  42. Fiore, M.; Strazewski, P. Prebiotic Lipidic Amphiphiles and Condensing Agents on the Early Earth. Life 2016, 6, 17.
  43. Lazcano, A.; Oro, J.; Miller, S.L. Primitive Earth environments: organic syntheses and the origin and early evolution of life. Precambrian Research 1983, 20, 259–282.
  44. Jordan, S.F.; Nee, E.; Lane, N. Isoprenoids enhance the stability of fatty acid membranes at the emergence of life potentially leading to an early lipid divide. Interface Focus 2019, 9, 20190067.
  45. Cohen, Z.R.; Todd, Z.R.; Wogan, N.; Black, R.A.; Keller, S.L.; Catling, D.C. Plausible Sources of Membrane-Forming Fatty Acids on the Early Earth: A Review of the Literature and an Estimation of Amounts. ACS Earth and Space Chemistry 2023, 7, 11–27.
  46. Kraft, M.L.; Weber, P.K.; Longo, M.L.; Hutcheon, I.D.; Boxer, S.G. Phase separation of lipid membranes analyzed with high-resolution secondary ion mass spectrometry. Science 2006, 313, 1948–1951.
  47. Raine, D.J.; Norris, V. Lipid domain boundaries as prebiotic catalysts of peptide bond formation. J. Theor. Biol. 2007, 246, 176–185.
  48. Deamer, D. The Role of Lipid Membranes in Life's Origin. Life 2017, 7, 5.
  49. Lancet, D.; Zidovetzki, R.; Markovitch, O. Systems protobiology: Origin of life in lipid catalytic networks. J. R. Soc. Interface 2018, 15, 20180159.
  50. Dalai, P.; Sahai, N. Mineral-Lipid Interactions in the Origins of Life. Trends Biochem Sci. 2019, 44, 331–341.
  51. Subbotin, V.; Fiksel, G. Exploring the Lipid World Hypothesis: A Novel Scenario of Self-Sustained Darwinian Evolution of the Liposomes. Astrobiology 2023, 23, 344–357.
  52. Gardes, J.; Maldivi, C.; Boisset, D.; Aubourg, T.; Demongeot, J. An unsupervised classifier for the whole genome phylogenies, the Maxwell© tool. IJMS 2023, 24, 16278.
  53. Demongeot, J.; Gardes, J.; Maldivi, C.; Boisset, D.; Boufama, K.; Touzouti, I. Genomic phylogeny by Maxwell®, a new classifier based on Burrows-Wheeler transform. Computation 2023, 11, 158.
  54. Adam, P.; Borrel, G.; Brochier-Armanet, C. The growing tree of Archaea: new perspectives on their diversity, evolution and ecology. ISME J. 2017, 11, 2407–2425.
  55. Ouellette, M.; Gogarten, J.P.; Lajoie, J.; Makkay, A.M.; Papke, R.T. Characterizing the DNA Methyltransferases of Haloferax volcanii via Bioinformatics, Gene Deletion, and SMRT Sequencing. Genes 2018, 9, 129.
  56. Mardanov, A.V.; Ravin, N.V.; Svetlitchnyi, V.A.; Beletsky, A.V.; Miroshnichenko, M.L.; Bonch-Osmolovskaya, E.A.; Skryabin, K.G. Metabolic versatility and indigenous origin of the archaeon Thermococcus sibiricus, isolated from a siberian oil reservoir, as revealed by genome analysis. Appl. Environ. Microbiol. 2009, 75, 4580–4588.
  57. Gorlas, A.; Koonin, E.V.; Bienvenu, N.; Prieur, D.; Geslin, C. TPV1, the first virus isolated from the hyperthermophilic genus Thermococcus. Environ. Microbiol. 2012, 14, 503–516.
  58. Nikolic, N.; Smole, Z.; Krisko, A. Proteomic properties reveal phyloecological clusters of Archaea. PLoS One 2012, 7, e48231.
  59. Gavette, J.V.; Stoop, M.; Hud, N.V.; Krishnamurthy, R. RNA-DNA Chimeras in the Context of an RNA World Transition to an RNA/DNA World. Angew. Chem. Int. Ed. Engl. 2016, 55, 13204–13209.
  60. Todd, Z.R.; Cohen, Z.R.; Catling, D.C.; Keller, S.L.; Black, R.A. Growth of Prebiotically Plausible Fatty Acid Vesicles Proceeds in the Presence of Prebiotic Amino Acids, Dipeptides, Sugars, and Nucleic Acid Components. Langmuir 2022, 38, 15106–15112.
Figure 1. Consensus content of proto-tRNA minihelices of tRNA-GlyGCC from 24 Archaea of [38].
Figure 2. Left: ring form of AL sequence. Right: hairpin form of AL sequence.

Figure 3. A: Mean AL pentamer proximity (in blue) for the four species of Table 3 [29,30]. B: AL ring.
Figure 4. Left: Maxwell© clustering of Archaea. Right: pAL proximity (in red) of Methanomada and Thermococci on the Archaea phylogenetic tree based on ribosomal proteins (L7-L12, L30, S4) [54].
Figure 5. Haloferacaceae A: phylogenetic tree obtained using Methyltransferase mRNA sequences (after [55]) with indication (in red) of a proximity to AL based on 5 pentamers; B: consensus secondary structure tRNA-Gly [32]; C: consensus primary sequence tRNA-Gly [29,30].
Figure 6. A: Indication of pAL (in red) for extreme halophilic Archaea on a phylogenetic tree obtained using a non-redundant set of proteomic features [58]; B: Methanococcus phylogeny obtained using the Maxwell© classifier and based on the whole genome.
Table 2. Some species whose tRNA-GlyGCC primary sequences have the same constant motifs in their loops [29,30].
Species Articulation D-loop Anticodon-loop Ty-loop
Methanococcus maripaludis GCGGCTTTGATGTAG ACTGGTATCATACGGCCCTGCCACGGCCGACACCCGGGTTCAAATCCCGGAGGCCGCA
Methanococcus vannielii GCGGCTTTGATGTAG ACTGGTATCATACGGCCCTGCCACGGCCGACACCCGGGTTCAAATCCCGGAGGCCGCA
Methanococcus voltae GCGGCCTTGATGTAG TGGTATCATACGGCCCTGCCACGGCCGATACCCGGGTTCAAATCCCGGAGGCCGCA
Methanocaldococcus jannaschii GCGGCCTTGGTGTAG CCTGGTAACACACGGGCCTGCCACGCCCGGACCCCGGGTTCAAATCCCGGAGGCCGCA
Halorhabdus utahensis GCGACGGTGGTGTAGTGGTATCACAGGACCCTGCCACGGTCCTAACCCGAGTTCAAATCTCGGCCGTCGCA
Petromyzon marinus (lamprey) GCATCGGTGGTTCAGTGGTAGAAATCTCGCCTGCCACGCGGGAGGCCCGGGTTCAATTCCCGGCCGATGCA
Danio rerio (zebrafish) ACATTGGTGGTTCAGTGGTAGATTTCTCGCCTGCCACGTGGGAGGCCCGGGTTCAATTCCCGGCCAATGCA
Strongylocentrotus purpuratus GCATTGGTGGTTCAGTGGTAGAATTCTCGCCTGCCACGCGGGGGACCCGGGTTCAATTCCCGGCCAATGCA
Loxodonta africana (elephant) GCATTGGTGGTTCAGTGGTAGAATTCTCGCCTGCCACGTGGGAGGCCTGGGTTCAATTCCCAGCCAGTTCT
Callithrix jacchus (marmoset) GCATGGGTGGTTCAGTGGTAGAATTCTCGCCTGCCACGCGGGAGTCCTGGGTTCAATCCCCGGCCCACGCA
Arabidopsis thaliana GCACCAGTGGTCTAGTGGTAGAATAGTACCCTGCCACGGTACAGACCCGGGTTCAATTCCCGGCTGGTGCA
Medicago truncatula GCACCAGTGGTCTAGTGGTAGAATAGTACCCTGCCACGCTACAGACCCGGGTTCAATTCCTGGCTGGTGCA
Table 3. AL pentamer proximity (PpAL) for Homo sapiens (HS), Saccharomyces cerevisiae (SC), Methanococcus voltae (Mv) and isolate (Mmi).
Molecule | Species | PpAL = 2pAL/s | no | N | ne (se) | pAL | Mean PpAL
rprotein L18 | HS | 7.4 | 13 | 552 | 4.9 (2.2) | 3.7s | 8.3
rprotein L18 | SC | 10 | 16 | 557 | 4.9 (2.2) | 5s |
rprotein L18 | Mv | 6 | 12 | 581 | 5.1 (2.26) | 3s |
rprotein L18 | Mmi | 9.8 | 16 | 578 | 5 (2.25) | 4.9s |
rRNA 5S | HS | 2 | 2 | 117 | 1 (1) | 1s | 10.6
rRNA 5S | SC | 2 | 2 | 117 | 1 (1) | 1s |
rRNA 5S | Mv | 26.2 | 14 | 112 | 0.9 (1) | 13.1s |
rRNA 5S | Mmi | 12.2 | 7 | 111 | 0.98 (0.99) | 6.1s |
Gly-tRNA ligase | HS | 8.8 | 36 | 2015 | 17.7 (4.2) | 4.4s | 11.05
Gly-tRNA ligase | SC | 13 | 45 | 2000 | 17.6 (4.2) | 6.5s |
Gly-tRNA ligase | Mv | 11 | 37 | 1751 | 15.4 (3.9) | 5.5s |
Gly-tRNA ligase | Mmi | 11.4 | 37 | 1721 | 15.1 (3.9) | 5.7s |
DNA polymerase | HS | 5.8 | 21 | 1286 | 11.3 (3.36) | 2.9s | 12.43
DNA polymerase | SC | 9.5 | 36 | 1895 | 16.6 (4) | 4.75s |
DNA polymerase | Mv | 16.2 | 59 | 2472 | 21.7 (4.6) | 8.1s |
DNA polymerase | Mmi | 18.2 | 30 | 749 | 6.6 (2.6) | 9.1s |
Translocase | HS | 5.4 | 42 | 3178 | 27.9 (5.3) | 2.7s | 14.5
Translocase | SC | 26.8 | 130 | 4856 | 42.7 (6.53) | 13.4s |
Translocase | Mv | 19.6 | 76 | 2969 | 26 (5.1) | 9.8s |
Translocase | Mmi | 6 | 22 | 1325 | 11.6 (3.4) | 3s |
ATPase | HS | 7 | 29 | 1755 | 15.4 (3.9) | 3.5s | 15.3
ATPase | SC | 12.8 | 42 | 1850 | 16.26 (4) | 6.4s |
ATPase | Mv | 13.6 | 21 | 608 | 5.34 (2.3) | 6.8s |
ATPase | Mmi | 27.8 | 97 | 2978 | 26.2 (5.1) | 13.9s |
Helicase | HS | 12.6 | 48 | 2255 | 19.8 (4.45) | 6.3s | 18.15
Helicase | SC | 26.6 | 95 | 3029 | 26.6 (5.16) | 13.3s |
Helicase | Mv | 15.2 | 57 | 2465 | 21.7 (4.65) | 7.6s |
Helicase | Mmi | 18.2 | 60 | 2236 | 19.6 (4.4) | 9.1s |
tRNA-Gly | HS | 81 | 18 | 22 | 0.19 (0.44) | 40.5s | 84.4
tRNA-Gly | SC | 76.4 | 17 | 22 | 0.19 (0.44) | 38.2s |
tRNA-Gly | Mv | 85.5 | 19 | 22 | 0.19 (0.44) | 42.8s |
tRNA-Gly | Mmi | 94.6 | 21 | 22 | 0.19 (0.44) | 47.3s |
AL | | 100 | 22 | 22 | 0.19 (0.44) | 50s | 100
Table 1. Web sites for data sources, accessed on 15 April 2023 [29,30,31,32].
tRNA database: http://lowelab.ucsc.edu/GtRNAdb/Lafri3/Lafri3-align.html ; http://trna.bioinf.uni-leipzig.de/DataOutput/Result
Secondary structure: http://trna.ucsc.edu/tRNAviz/
Gene sequence: https://www.ncbi.nlm.nih.gov/nucleotide?cmd=search
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2024 MDPI (Basel, Switzerland) unless otherwise stated