1. Introduction
Oryza, a genus of the Gramineae, not only provides food for half of the world's population but is also used as a model plant by the scientific community. The genus Oryza contains about 27 species and is remarkable for the diverse ecological adaptations of its species. It comprises 11 genome types in total: six diploid (AA, BB, CC, EE, FF, and GG) and five tetraploid (BBCC, CCDD, HHJJ, HHKK, and KKLL) [1, 2]. For thousands of years, rice has been grown in uniquely sustainable, high-output agroecosystems. Relative to the genera containing other cereals, Oryza occupies a distinct phylogenetic position in a separate subfamily, the Ehrhartoideae [3]. Species of Oryza and of closely related genera have been studied extensively, either because of agronomically useful traits for rice genetic improvement (wild species of Oryza and Porteresia) or because of their economic value as food (Zizania) and forage (Leersia) [4]. The origin and diversification of this tribe, in particular the origin of the genus Oryza and its divergence, remain largely unclear [5].
Protein synthesis takes place on ribosomes in all living cells. Although universally conserved, ribosomes differ substantially across species [6, 7]. Some of this variation is found in the ribosomal RNAs (rRNAs), which serve as the central interface for hundreds of proteins. These proteins, the ribosome assembly factors (AFs) and ribosomal proteins (RPs), vary considerably in length and sequence among species [8, 9, 10]. Interestingly, in comparison to prokaryotic rRNAs, eukaryotic rRNAs carry species-specific expansion segments (ESs), which are hot spots of variation among eukaryotic species [11, 12]. Furthermore, ES and ES-like segments that are not common in eukaryotic species have also been reported in prokaryotes, and heterogeneous rRNAs with differential expression of sequences have been observed in various eukaryotic species, e.g., Plasmodium, zebrafish, mice, and Homo sapiens [10, 13].
The duplication of single genes or of whole genomes, followed by functional diversification of the duplicated genes, is an important process leading to evolutionary innovation [14]. The large proportion of genes with no recognizable homolog in the genome of the diploid Oryza sativa ssp. indica suggests that the rice genome was duplicated one or several times during evolution [15] and that Oryza sativa, like vertebrates [16] and most other angiosperms [17], may be a degenerate, ancient polyploid. The contribution of hybridization, autopolyploidy, or allopolyploidy to the origin of the rice genome cannot yet be estimated, because rice genes show unique gradients in GC content, codon usage, and amino acid usage that confound genome analyses based on comparisons of protein-encoding regions [18]. In some cases, however, the contribution of interspecies hybridization to the origin of a particular species can be traced by inferring the evolutionary history of its conserved rRNA genes. The highly conserved 16S-18S and 23S-28S rRNA genes are usually arranged head to tail in tandemly repeated units and evolve in a concerted manner. Within most species, individual copies have nearly identical sequences owing to homogenization processes, which allows them to be treated as single-copy genes in a phylogenetic context. Following interbreeding between two different species, xenologous rRNA genes from one parent are usually epigenetically silenced because of nucleolar dominance and are probably deleted later. Alternatively, xenologous rRNA genes may be converted to one sequence type [19] or, as in Quercus robur and Quercus petraea, parental rDNA families may even be maintained in the nucleus [20].
Genetic distance measures the evolutionary change between sequences from different organisms; in its simplest form, it is calculated for a pair of sequences by counting the number of nucleotides or amino acids that differ between them (Morgenstern et al. 2015). A small genetic distance between two sequences may indicate a recent common ancestor, but it is also consistent with a slower rate of sequence change and a more ancient common ancestor. Evolutionary rates depend on a combination of factors: generation time, population size, metabolic rate, the efficacy of DNA repair, and the degree to which mutations are beneficial or deleterious, all of which may vary among species. Often paleontology or biogeography can provide a date for one or more points in a phylogeny, which is then used to “calibrate” the timescale for the rest of the phylogeny [21]. Sequences sampled at different times can also be used to estimate evolutionary rates, but this requires a faster evolutionary rate and/or negligible sampling times [22]. Dating evolutionary events therefore usually relies on a phylogeny in which all branches are assumed to evolve at the same rate, following a strict molecular clock of sequence change [23], in order to calculate the time of origin of the organisms and relate it to their fossil record.
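The counting definition of genetic distance given above is often called the p-distance. The following is only a minimal sketch of that idea, assuming two sequences that are already aligned to the same length and skipping gap positions as missing data. The function name `p_distance` and the toy sequences are hypothetical illustrations and are not taken from this study.

```python
def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: a site with a gap in either sequence is treated as missing.
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Hypothetical toy sequences (not real rDNA data).
toy_a = "ACGTACGTAC"
toy_b = "ACGTTCGT-C"
print(p_distance(toy_a, toy_b))
```

Because this raw proportion does not correct for multiple substitutions at the same site, it tends to underestimate the true amount of change between distantly related sequences, which is one reason model-based distances are generally preferred for deep divergences.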
We are interested in the evolution of rRNA genes, which, owing to the uneven distribution of crossing over, evolve coordinately within but independently between species [24]. Here, we functionally characterized unusual ribosomal gene sequences from Oryza species that show great diversity but are absent from the currently available genome sequence. We also estimated the divergence between the rRNA gene families in rice. Moreover, we observed that rDNA homogenization in rice is incomplete, which can be explained by the introgression of distinct rRNA gene families into ancestral rice before speciation. Furthermore, our study showed that the large subunit (LSU) rRNA genes of all families are expressed depending on the N fertilization of the plants.
2. Materials and Methods
DNA and RNA extraction
Oryza officinalis and 17
Oryza species,
O. alta (ACC105143, with IRGC accession number),
O. australiensis (ACC100882),
O. glaberrima (TOG5674),
O. grandiglumis (ACC105669),
O. granulata (ACC102119),
O. latifolia (ACC100190),
O. longiglumis (ACC105148),
O. longistaminata,
O. malampuzhaensis (ACC105329),
O. minuta (ACC101141),
O. nivara (ACC105763),
O. officinalis (ACC101399),
O. punctata (ACC105690),
O. rhizomatis (ACC105432),
O. ridleyi (ACC100821),
O. rufipogon (ACC106423),
O. sativa subsp. indica (IR36), were selected for further analysis. Plants, including a three-way hybrid of Zea mays, were grown in unfertilized rice field soil. For some experiments, plants were fertilized with 2 g NH4NO3 per kg soil. DNA and rRNA preparations from ribosomes were extracted from fresh leaves or roots kept in liquid nitrogen according to standard methods [25].
Gene sequencing
28S rRNA genes corresponding to positions 184 to 1101 of the Saccharomyces cerevisiae 28S rRNA gene were amplified by PCR with forward primer 28f1 (5´-GAC CCC AGG TCA GGC GGG ACT ACC-3´) and reverse primer 28r1 (5´-GCT ATC CTG AGG GAA ACT TCG GAG G-3´) and sequenced with the ALF Express (Pharmacia, Uppsala, Sweden) (SI-1). For specific amplification of a partial RDF2 rRNA tandem unit, we used primers NS7 and 98rev. PCR conditions were as follows: initial denaturation at 95°C for 2 min; 30 cycles of denaturation at 95°C for 1 min, annealing at 60°C for 1 min, and extension at 72°C for 1 min; and a final extension at 72°C for 10 min. PCR products were cloned into the TOPO vector (Cat. No. K4500-40, Invitrogen Co.) according to the manufacturer's instructions and sequenced with standard primers. Reverse transcription PCR (RT-PCR) on total RNA or ribosomal RNA was done with RT-PCR beads (Amersham Pharmacia Biotech, Freiburg, Germany) following the manufacturer's protocol. Reverse transcription was carried out with primer 25r1. Direct sequencing of cDNA PCR products was carried out with forward primer 283fCy5 (5´-GCA (AG) CC CAA AT (CT) (AG) GG (ACT) G (AG) TAA AC-3´) and reverse primer 406rCy5 (5´-CA (AC) GCA CT (GCT) TTT GAC TCT CTT TTC-3´). For specific amplification of cDNA products of the RC1, RC2, and RC3 rDNA families, the following primer pairs were used for nested PCR: 98f2 (5´-CGC ACC GTT CGA ACT GTA GTC-3´) and 98r2 (5´-GTT ACA GCG TGG CAC CCC AAG G-3´), 914f (5´-GCC CAA CGT GAA AAT CGG GCA G-3´) and 914r (5´-GTA TCA CTT TGA GCC TCC ACC-3´), and 931f (5´-CCC TAA TAA CCG AAT TGT AGT CTG G-3´) and 931r (5´-CTA GAT GGT TCG ATT GGT CTC ATC C-3´), respectively (SI-1). PCR conditions were as given above. Additionally, genomic libraries of O. officinalis and O. longistaminata were constructed in the lambda ZAP Express vector (Stratagene, San Diego, CA) and screened with 5'-digoxigenin-labeled primer 28f1.
The obtained 94 sequences were deposited in GenBank under accession numbers AF363835 to AF363923, AF375550, AF375551, AY097331, and AY097327 to AY097329.
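The sequencing primers 283fCy5 and 406rCy5 given above contain degenerate positions, written as alternative bases in parentheses. As an illustrative sketch only, the following shows how such a notation can be expanded into the set of concrete oligonucleotides it represents. The function name `expand_degenerate` is hypothetical, and the parser assumes the parenthesis notation used in this section rather than IUPAC ambiguity codes.

```python
import itertools
import re


def expand_degenerate(primer: str) -> list[str]:
    """Expand a primer written with '(AG)'-style alternatives into all variants."""
    compact = primer.replace(" ", "")
    # Each token is either a parenthesized group of alternatives or a single base.
    tokens = re.findall(r"\(([ACGT]+)\)|([ACGT])", compact)
    choices = [group if group else base for group, base in tokens]
    return ["".join(combo) for combo in itertools.product(*choices)]


# Forward primer 283fCy5 as written in the text (5' to 3', without end labels).
primer_283f = "GCA (AG) CC CAA AT (CT) (AG) GG (ACT) G (AG) TAA AC"
variants = expand_degenerate(primer_283f)
print(len(variants))   # number of distinct oligonucleotides in the mix
print(variants[0])     # first variant, taking the first base of each group
```

With four two-fold and one three-fold degenerate positions, the forward primer corresponds to 2 × 2 × 2 × 3 × 2 = 48 distinct 23-mers.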
Probe labeling (RC and RDF) and northern hybridization
Three specific probes, RCp1, RCp2, and RCp3, were obtained by digesting 28S rDNA clones (AF363841, AF363902, and AF363882, representing RC1, RC2, and RC3, respectively) with Ahd I or Bsm I, yielding fragments of 225, 261, and 276 bp, respectively. Likewise, RDF1, RDF2, and RDF3 were each digested with Ahd I or Bsm I, yielding fragments of 225 to 276 bp corresponding to nucleotides 7-232, 7-268, or 7-283 of the equivalent PCR products. All fragments were labeled with digoxigenin by random-primed labeling. The labeled fragments correspond to variable regions B10-B18 of the secondary structure model of the Saccharomyces cerevisiae 28S rRNA gene. The RDF2 18S probe is a primer end-labeled with digoxigenin at both the 3' and 5' ends and corresponds in length to the sequence (SI-2B). Probes were hybridized at high stringency and visualized with CDP-Star (Roche, Mannheim, Germany) on X-ray film according to the manufacturer's instructions.
Terminal restriction fragment length polymorphism (T-RFLP)
PCR products were obtained from the cDNA templates with 5'-Cy5-labeled primer 98f2. PCR products were purified after agarose gel electrophoresis, digested with restriction enzymes, and analyzed on an ALF Express automatic sequencer (Pharmacia, Uppsala, Sweden).
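In T-RFLP, only the fragment that carries the 5' fluorescent label is detected, so each enzyme yields one terminal fragment length per sequence type. The sketch below illustrates this in silico under simplifying assumptions: complete digestion, a label at the 5' end of the top strand, and only the first recognition site from the labeled end being relevant. The recognition sites used are the standard ones for Pvu II (CAG^CTG) and Bcl I (T^GATCA); the function name and the toy amplicon are hypothetical and are not sequences from this study.

```python
# Recognition site and cut position within the site (top strand, 5' to 3').
ENZYMES = {
    "PvuII": ("CAGCTG", 3),  # blunt cutter: CAG^CTG
    "BclI": ("TGATCA", 1),   # T^GATCA
}


def terminal_fragment_length(amplicon: str, enzyme: str) -> int:
    """Length of the 5'-labeled terminal fragment after complete digestion."""
    site, cut_offset = ENZYMES[enzyme]
    position = amplicon.upper().find(site)
    if position == -1:
        # No site present: the labeled fragment is the uncut amplicon.
        return len(amplicon)
    return position + cut_offset


# Hypothetical toy amplicon containing one site for each enzyme.
toy_amplicon = "ACGTACGTAA" + "CAGCTG" + "TTTT" + "TGATCA" + "GG"
for name in ENZYMES:
    print(name, terminal_fragment_length(toy_amplicon, name))
```

Sequence variants that gain or lose a recognition site shift the detected terminal fragment length, which is what makes the fingerprint pattern informative about which gene copies are expressed.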
Sequence data analysis
DNA sequences were aligned with the Ribosomal Database Project II 28S alignment tool (http://rdp.cme.msu.edu/cgis/seq_align.cgi?su=LSU) and the MultAlin alignment tool (http://prodes.toulouse.inra.fr/multalin/multalin.html). The alignment producing the shortest tree was used for (equally weighted) maximum parsimony, neighbor-joining, and maximum likelihood analyses with PAUP version 4.0b2a [5]. Parsimony jackknifing (1000 replicates) with TBR branch swapping, 10 random sequence additions, and gaps treated as missing data was used to estimate the internal support of clades. Neighbor-joining trees were inferred from maximum likelihood distances with 1000 bootstrap resamplings. Maximum likelihood trees (1000 replicates) were estimated by a heuristic search procedure using the GTR+G+I model with a four-category discrete gamma approximation of among-site rate variation.
Average distances were estimated with the software package PHYLTEST, version 2.0 [
26]. The gamma distribution shape parameter alpha was estimated from the data set using PAUP version 4.0 [
27].
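The bootstrap resampling mentioned above draws alignment columns with replacement to create pseudo-replicate alignments, each of which is then analyzed in the same way as the original. The following minimal sketch shows only the resampling step and is not the PAUP implementation; the function name, the toy alignment, and the fixed random seed are assumptions made for illustration.

```python
import random


def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Resample alignment columns with replacement, keeping the taxa unchanged."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()
    # Draw as many column indices as there are sites, with replacement.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


# Hypothetical toy alignment (not real rDNA data).
toy_alignment = {"taxon_A": "ACGTAC", "taxon_B": "ACGTTC", "taxon_C": "ATGTTC"}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(1000)]
print(len(replicates))
```

The support for a clade is then the fraction of replicate trees in which that clade is recovered; tree building itself is left to the phylogenetics software.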
4. Discussion
The relationships of species within the genus Oryza, which consists of 16 diploid and 7 allotetraploid species, could not be resolved in the phylogenetic tree constructed from 28S rRNA genes. In the diploid species O. longistaminata, O. officinalis, and O. australiensis, from Africa, Asia, and Australia, respectively, three rDNA families, RC1, RC2, and RC3, were detected within the same species. As in Quercus, the intra- and interspecific sequence differences within the clades were usually very low, which was explained for the two oak species by ongoing homogenization within but not between rDNA families [19]. The same pattern is seen in rice. Some of the sequences on these branches, namely those from O. officinalis, O. sativa, and O. alta, were identical to sequences from other species (O. longistaminata, O. longiglumis, and O. glaberrima, respectively), which might be explained by species interbreeding or by very recent speciation. In contrast to the Quercus species, which often hybridize with each other [36], natural crosses between wild rice species are unknown [37]. However, natural hybridization is common between wild rice and the worldwide cultivated O. sativa [38]. Species interbreeding can be excluded as the reason for very similar or even identical 28S rRNA gene sequences in different species of Oryza, because comparative genomic mapping with AFLP markers has previously shown that all rice species diverged independently from each other [39]. Recent speciation can be excluded because phylogenetic analyses of the AFLP markers clearly indicated that the evolution of Oryza followed a polyphyletic pattern with eight monophyletic groups of species, and that the species with identical sequences were already separated from each other early in evolution [40]. Furthermore, the finding of RC1 homologs in the draft genome sequences of O. sativa ssp. indica [41] and japonica [42] confirmed the presence of RC1 in this species. A BLASTN comparison of the 28S sequence AF363845 of the RC1 rDNA family from O. sativa L. ssp. indica IR36 with the draft genome sequence of the same subspecies gave a hit with 99.6% identity over the 851 nucleotides queried in contig 21607 (http://btn.genomics.org.cn/blast/blast.php?name=rice), and a hit with 98% identity in contig HTC1232751301.1.1 when the sequence was queried against the genome sequence of O. sativa L. ssp. japonica (http://www.tmri.org/index.html). No other high-scoring hits (>90% identity) were found, suggesting that RC1 genes are either not high-copy genes or have not yet been sequenced or assembled. However, because of the high signal strength observed in Southern hybridization (Figure 2) at short exposure times (<5 min), the possibility that the RC1-3 genes are single-copy genes can be excluded.
Sequence analyses of a partial PCR-amplified RDF2 tandem unit from O. officinalis (SI-2B) identified a unique truncated 18S rRNA gene in standard head-to-tail orientation upstream of the RDF2 18S rRNA gene. We found two distinct indels (insertions/deletions) in the P-site required for tRNA binding: one 48-base-pair (bp) deletion and one 14 bp insertion (SI-2A). Southern (SI-2C) and Northern (SI-2D) hybridization with a specific probe revealed that this unique truncated 18S rDNA is widespread in rice and occurs even in maize. This result indicates that this unique truncated 18S rRNA gene was already present in the last common ancestor of maize, rice, and all other cereals 50 Mya, supporting the ancient origin of the rDNA families.
These results showed unambiguously that all three rDNA families are present in the diploid genomes of O. officinalis, O. longistaminata, O. glaberrima, O. sativa, and O. rufipogon. Genomic Southern hybridization also indicated that the rDNA families are more widespread in the genus Oryza than the initial PCR results suggested. The high diversity of 28S rRNA genes is consistent with a high variability in the positions of rDNA loci on the chromosomes in the genus Oryza [43]. This variability may be even higher than expected from fluorescence in situ hybridization because, for example, in O. officinalis only three 28S rDNA clusters, on chromosomes 4, 9, and 11, have been located [43], as opposed to the four highly diverged 28S rDNA sequences detected in our study.
Our results also showed that the rDNA families are ancient and predate the species split in the genus Oryza, because 28S gene sequences of several rice species were distributed among several rDNA families. The very low intraspecific sequence variability (Figure 1) within RC1-3, in combination with the high signal strength in Southern hybridization (Figure 2), suggests high rates of concerted evolution within each rDNA family in spite of a large sequence divergence among families. In the rRNA multigene family of eukaryotes, arrays of paralogous gene copies with low sequence divergence are typically homogenized by concerted evolution to nearly identical sequences within a species. Therefore, the presence of highly diverged copies within a multigene family would indicate that recombination between arrays is somehow suppressed, which may be attributed to a high sensitivity of the mismatch repair system to high sequence divergence [44] or to a non-telomeric location on the chromosome [45]. This would suggest that diverged members of rRNA multigene families within a species are always xenologous, because a paralogous origin would be difficult to explain otherwise. Accordingly, the most plausible explanation for the evolutionary origin of the divergent rDNA families in rice, as in Quercus [19], is that they were brought together by an ancient hybridization event.
To obtain a clue about the timing of this evolutionary event, we tried to detect the xenologous rDNA families in another grass. The well-established phylogeny of the grass family and molecular-clock-based estimates indicate that maize and rice share their last common ancestor with the vast majority of the grass family and that this ancestor lived 50 My ago [46]. These evolutionary relationships make maize an excellent candidate for evaluating the distribution of the rDNA families of rice within grasses. Rice species with several identical rDNA families in their genome, such as O. officinalis, O. longistaminata, and O. australiensis, and possibly also O. grandiglumis, are restricted to Asia, Africa, Australia, and South America, respectively [47]. Chang suggested that, because of this allopatric distribution, speciation in the genus Oryza might have been driven by the successive breakup of Gondwanaland, which would place the origin of grasses in the Jurassic period (~140-210 My ago) (Table 1), much earlier than current estimates based on the first appearance of fossil grass pollen grains at the end of the Cretaceous (55-70 Mya) [48]. Monocots are probably not well preserved in the fossil record of the early Cretaceous (125-115 million years ago) [49], which raises the possibility that, as in vertebrate evolution [50], divergence times may significantly precede the appearance of the relevant groups in the fossil record. The high statistical support for the branching nodes G within the rDNA families of maize and rice (Figure 1) allows us to infer that maize and Oryza species existed before the successive break-ups of Gondwanaland (SI-3). A Gondwanaland ancestry of rice is therefore possible based on our data.
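The dating argument above rests on a calibrated molecular clock. As a rough sketch of the arithmetic, assuming a strict clock in which the pairwise distance d between two lineages accumulates as d = 2rT (rate r per lineage, time T since divergence), a rate can be derived from one dated split and applied to other distances. The 50 My maize-rice calibration is taken from the text; the distance values and function names below are hypothetical placeholders, not results of this study.

```python
def clock_rate(calibration_distance: float, calibration_age_my: float) -> float:
    """Substitutions per site per My per lineage, from one dated split (d = 2rT)."""
    return calibration_distance / (2.0 * calibration_age_my)


def divergence_time_my(distance: float, rate: float) -> float:
    """Time since divergence in My under a strict clock (T = d / 2r)."""
    return distance / (2.0 * rate)


# Hypothetical numbers for illustration only.
maize_rice_distance = 0.05   # assumed pairwise distance at the calibration node
maize_rice_age_my = 50.0     # calibration age used in the text
family_distance = 0.10       # assumed distance between two rDNA families

rate = clock_rate(maize_rice_distance, maize_rice_age_my)
print(divergence_time_my(family_distance, rate))
```

Under these assumptions, a distance twice as large as the calibration distance implies a split twice as old as the calibration node; if rates differ among lineages, such estimates can be strongly biased.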
5. Conclusions
Natural crosses between wild species of Oryza with different genome types are unknown. However, natural hybridization is common between wild species of rice and the worldwide cultivated O. sativa. Several lines of evidence indicate that the distribution of the rRNA families in rice species was not a consequence of interspecific hybridization after speciation of the genus Oryza. (i) Comparative genomic fingerprints strongly support an independent evolution of Oryza species [51]. (ii) Before the onset of the worldwide distribution of O. sativa about 300 years ago, the geographic isolation of many rice species and the short-lived pollen, at least in Oryza sativa, were constraints that made cross-continental interspecific hybridization within the genus Oryza by long-range transport of pollen unlikely. (iii) Chromosomal incompatibilities between rice species, such as between O. sativa and O. officinalis or O. granulata, either produce completely male-sterile hybrids in sexual crosses with O. sativa or totally prevent species interbreeding. Nonetheless, the identity of partial LSU rDNA sequences affiliated with different rRNA families from different species is striking. If we consider the long span of genetic isolation after continent separation, the presence of identical rRNA gene sequences in different species appears to be unlikely but could be explained by the exchange of rRNA families through natural species interbreeding.
Author Contributions
Conceptualization, Xiyu Tan and Guixiang Peng; methodology, Muhammad Sajjid and Xiyu Tan; software, Sidra Kaleem; validation, Mehmood Jan, Raheel Munir and Arif Ali Khattak; formal analysis, Abid Ali Abbas; investigation, Muhammad Afzal and Xiyu Tan; resources, Yihang Chen and Xiaolin Wang; data curation, Muhammad Afzal; writing (original draft preparation), Xiyu Tan; writing (review and editing), Muhammad Afzal, Sidra Kaleem and Muhammad Sajjid; visualization, Muhammad Afzal; supervision, Muhammad Afzal and Zhiyuan Tan; project administration, Muhammad Afzal and Zhiyuan Tan; funding acquisition, Zhiyuan Tan. All authors have read and agreed to the published version of the manuscript.
Figure 2.
Southern hybridization showed the occurrence of the rDNA families RC1, RC2 and RC3 in rice genome with specific probes RCp1, RCp2 and RCp3, respectively; 1, 2, 3, 4, 5, 6 indicated genomic DNA from O. officinalis, O. sativa subsp. indica IR36, O. longistaminata, O. rufipogon from Nepal, O. rufipogon from IRRI and O. glaberrima digested with Bse RI and Xcm I, separately; agarose gel showed the electrophoresis of digested genomic DNA; M, lambda DNA/Pst I Marker, 29 fragments (size in bp): 11501, 5077, 4749, 4507, 2838, 2556, 2459, 2443, 2140, 1986, 1700, 1159, 1093, 805, 514, 468, 448, 339, 264, 247, 216, 211, 200, 164, 150, 94, 87, 72, 15.
Figure 3.
Northern hybridization showed the co-dominant expression of rDNA families RC1, RC2 and RC3 in rice plants. A, B, C, total RNA extracted from O. officinalis with specific probe RCp1, RCp2 and RCp3 hybridization, respectively; +N, -N indicated O. officinalis grown in soil fertilized and not fertilized. Agarose gels (upper panel) and Northern blots (lower panel).
Figure 5.
28S rDNA-based cDNA sequence printouts (A) and corresponding terminal restriction fragment length polymorphism (T-RFLP) fingerprint patterns (B) showed the differential expression of 28S rRNA genes in rDNA family RC2. 28S rRNA genes extracted from roots and leaves of O. officinalis grown in soil fertilized (+N) and unfertilized (-N) with combined nitrogen. For T-RFLP analyses restriction sites of Pvu II, Bcl I and Bst XI were underlined in the sequence printout.
Table 1.
Dates of divergence of Oryza 28S rDNA lineages RC1, RC2 and RC3.