The complex 3D structures of modern ordered proteins are the result of lengthy molecular evolution. What, then, can one say about the structures of primordial proteins? The chances that the first polypeptides to appear in the primordial soup of the primitive Earth had unique 3D structures are negligibly slim. Instead, such polypeptides were, with very high probability, intrinsically disordered. Some known facts provide indirect support for this hypothesis. Although the Earth formed about 4.5 billion years ago and became cool enough to potentially spawn life around 4.2 billion years ago, the first fossils are dated to 3.85 billion years ago, which raises the question of what was happening in the intervening years. At the beginning of the 20th century, Alexander I. Oparin (1894-1980) [58] and John Burdon Sanderson Haldane (1892-1964) [59] proposed a model that constitutes a cornerstone of the theory of molecular evolution, according to which some organic molecules could have been synthesized spontaneously from the gases of the primitive Earth's atmosphere. Such abiotic production of organic molecules would require a reducing atmosphere and an ample supply of energy in the form of lightning and/or ultraviolet light. The validity of this idea was demonstrated thirty years later, when Stanley Lloyd Miller (1930-2007) and Harold Clayton Urey (1893-1981) conducted the elegant experiments now deservedly known as the Miller-Urey experiments. They placed simple compounds such as water vapor, hydrogen, methane, and ammonia, which were believed to represent the major components of the atmosphere of the primordial Earth, into a closed system and ran a continuous electric current through it to simulate the lightning storms believed to be common on the early Earth. This resulted in the appearance of various organic molecules, including some amino acids [60, 61]. Importantly, only about half of the modern amino acids were synthesized in these Miller-Urey experiments [60, 61], suggesting that the first proteins on Earth may have contained only a few different amino acids.
In line with these considerations, the biosynthetic theory of genetic code evolution suggests that the genetic code evolved from a simpler form that encoded fewer amino acids [62], likely in parallel with the invention of biosynthetic pathways for new and chemically more complex amino acids [63]. Peculiarities of the redundancy of the standard genetic code, in which 20 amino acids are encoded by 64 codons, lend some support to this hypothesis. Although the redundant codons encoding a given amino acid may differ in any of their three positions, only the third position of some codons can be fourfold degenerate, i.e., a position at which all possible nucleotide changes are synonymous and do not change the amino acid. If these peculiarities of the modern genetic code reflect its evolution, then it is likely that a doublet code preceded the triplet code and that the third position was not used at all in the early genetic code. Such an early code would have used 4×4=16 codons and could therefore have encoded at most 16 amino acids, or fewer if a termination codon is taken into account [64], which suggests that evolutionarily old and new amino acids can potentially be discriminated. These and many other observations were used by Edward N. Trifonov to propose the following consensus order of appearance of the 20 amino acids on the evolutionary scene: G/A, V/D, P, S, E/L, T, R, N, K, Q, I, C, H, F, M, Y, and W [65]. This scale can be examined from the viewpoint of protein intrinsic disorder, where residues can be ranked by their order-promoting (or foldability) potential [10, 13, 24, 25, 26, 27, 28]. Three scales rank the tendencies of amino acid residues to promote order or disorder, each listed here from the most order-promoting to the most disorder-promoting residue: the Top-IDP scale (W, F, Y, I, M, L, V, N, C, T, A, G, R, D, H, Q, K, S, E, and P) [23], the DisProt-based scale (C, W, Y, I, F, V, L, H, T, N, A, G, D, M, K, R, S, Q, E, and P) [66], and the scale based on the average number of contacts per residue in ordered proteins (W, F, V, I, L, M, V, C, H, R, T, Q, N, S, K, E, D, A, P, and G) [67].
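The codon-degeneracy argument made earlier in this paragraph can be checked directly against the standard genetic code. The following Python sketch is only an illustration and is not taken from the cited works. It assumes the standard code (NCBI translation table 1, written in the conventional TCAG enumeration), and the function name `fourfold_degenerate_doublets` is a hypothetical helper introduced here. It counts the possible doublets and identifies those for which the third base never changes the encoded amino acid.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons enumerated in
# TCAG order with the third base varying fastest; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 triplets to its amino acid (or stop signal).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def fourfold_degenerate_doublets(table):
    """Return {doublet: amino acid} for doublets whose third base is irrelevant."""
    result = {}
    for b1, b2 in product(BASES, repeat=2):
        encoded = {table[b1 + b2 + b3] for b3 in BASES}
        if len(encoded) == 1:  # all four third-position variants are synonymous
            result[b1 + b2] = encoded.pop()
    return result


# A doublet code has 4 x 4 = 16 possible codons.
doublets = ["".join(d) for d in product(BASES, repeat=2)]
fourfold = fourfold_degenerate_doublets(CODON_TABLE)

print(len(doublets))              # 16
print(len(fourfold))              # 8
print(sorted(fourfold.values()))  # ['A', 'G', 'L', 'P', 'R', 'S', 'T', 'V']
```

Under this reading, half of the 16 doublets already specify an amino acid without any help from the third position, and all eight of these amino acids are among the first ten residues of the consensus order quoted above.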
Figure 1 compares these three disorder-propensity scales with the amino acid novelty scale proposed by Trifonov and shows that older residues (e.g., G, D, E, P, and S) typically have a strong tendency to be disorder-promoting, whereas many newer amino acids (e.g., C, W, Y, and F) tend to be order-promoting.
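The trend described for Figure 1 can be illustrated with a rank correlation. The Python sketch below is not the analysis behind the figure. It assumes that the position of a residue in each list quoted above can be used as its rank, and that residues joined by a slash in the novelty scale (e.g., G/A) share an averaged rank. The helper names `ranks` and `spearman` are introduced here for illustration only. Only the Top-IDP and DisProt-based scales are used.

```python
from statistics import mean

# Scales as quoted in the text. Novelty: oldest residue first.
# Disorder scales: most order-promoting residue first.
TRIFONOV_ORDER = "G/A, V/D, P, S, E/L, T, R, N, K, Q, I, C, H, F, M, Y, W"
TOP_IDP = "W, F, Y, I, M, L, V, N, C, T, A, G, R, D, H, Q, K, S, E, P"
DISPROT = "C, W, Y, I, F, V, L, H, T, N, A, G, D, M, K, R, S, Q, E, P"


def ranks(scale):
    """Map residue -> rank; residues joined by '/' share an averaged rank (assumption)."""
    out, position = {}, 1
    for group in scale.split(","):
        members = group.strip().split("/")
        shared = position + (len(members) - 1) / 2
        for residue in members:
            out[residue] = shared
        position += len(members)
    return out


def spearman(scale_a, scale_b):
    """Spearman coefficient, computed as the Pearson correlation of the ranks."""
    ra, rb = ranks(scale_a), ranks(scale_b)
    if set(ra) != set(rb):
        raise ValueError("scales must cover the same residues")
    residues = sorted(ra)
    x = [ra[r] for r in residues]
    y = [rb[r] for r in residues]
    mx, my = mean(x), mean(y)
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    var_x = sum((p - mx) ** 2 for p in x)
    var_y = sum((q - my) ** 2 for q in y)
    return cov / (var_x * var_y) ** 0.5


rho_top_idp = spearman(TRIFONOV_ORDER, TOP_IDP)
rho_disprot = spearman(TRIFONOV_ORDER, DISPROT)
print(f"novelty vs Top-IDP: {rho_top_idp:.2f}")
print(f"novelty vs DisProt: {rho_disprot:.2f}")
```

With these assumptions both coefficients come out negative: residues that appear early in the novelty order tend to sit late in the order-to-disorder lists, which is the same direction as the trend described for Figure 1. The treatment of tied residues is a choice made for this sketch and slightly affects the numerical values.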
There are also other facts that lend further support to this idea. Since the primordial Earth was likely hotter during the early stages of evolution than it is today, more stable codon-anticodon interactions (in the absence of additional stabilizing interactions) would have been favored under these early conditions [65]. Therefore, the thermostability of codons (measured as the melting enthalpies, in kcal/mol, of the dinucleotide stacks corresponding to the first and second codon positions [68]) should correlate at least to some extent with the amino acid novelty scale.
Figure 3A shows that such a correlation is indeed observed: early amino acids are typically encoded by more thermostable codons. Furthermore, Figure 3B shows that codon thermostability is inversely correlated with the order-promoting potential of amino acids, with disorder-promoting residues being encoded by more thermostable codons. Another angle is provided by residue buriability, a quantitative measure of the driving force for the burial of an amino acid residue in proteins, which thereby contributes to the conformational stability of ordered proteins [69]. Figure 3C shows that codon thermostability is inversely related to the buriability of the residues encoded by these codons, whereas Figure 3D illustrates a correlation between the buriability and novelty of residues: old residues tend to be less buriable, whereas high buriability is characteristic of new residues. Finally, Figure 3E shows that disorder-promoting residues are less buriable than order-promoting residues.
Taken together, these observations indicate that the primordial polypeptides were intrinsically disordered, since evolutionarily old amino acids, being encoded by more thermostable codons, were less buriable and mostly disorder-promoting. Although it is rather unlikely that these disordered primordial polypeptides possessed high catalytic activity [70], they undoubtedly played important roles in the origin of life and were crucial players in early evolution as well. In fact, according to the RNA world theory, the evolution of enzymatic activity involved a transfer of catalytic power from catalytic RNAs (known as ribozymes) to ribonucleoproteins (RNPs) and only then to proteins [72]. An exceptionally illustrative example of a ribozyme is the ribosome, an RNA enzyme that catalyzes the formation of peptide bonds during protein translation and that was described as “a creature with a hundred of waggly tails”, since its stability is supported by numerous ribosomal proteins, most of which are disordered in the unbound state and fold upon binding to the ribosome [71]. On these premises, in the organism that was the first to invent protein synthesis, the first proteins would have been IDPs with some nonspecific RNA chaperone activities rather than specific catalysts [70, 73]. However, in the RNA world, where misfolding-prone RNA [74, 75] was used for both information storage and catalysis [76], the presence of such disordered RNA chaperones would have been highly beneficial to their carriers, providing them with a significant selective advantage. Furthermore, the transfer of enzymatic activity from RNAs (ribozymes) to proteins was a logical evolutionary step, determined by the higher stability of protein structures compared with RNA structures and by the dramatically greater variability of the physicochemical properties of amino acids compared with those of nucleotides. Since a stable structure is an important prerequisite for the proper spatial arrangement of catalytic residues, which is needed for efficient catalysis [77], transferring catalytic activities to proteins generated strong evolutionary pressure toward proteins with well-folded structures.