1. Introduction
Polyploidy, the whole genome duplication (WGD) event, has been a fundamental and recurrent force in the evolutionary history of angiosperms, profoundly influencing genome structure, function, and organismal complexity [
1,
2]. By doubling or multiplying genome content, polyploidy provides a reservoir for genetic innovation, leading to phenotypic diversification and adaptation [
3,
4]. One key realization in the past decade is the profound impact of polyploidy on transcriptomic landscapes [
5,
6,
7]. Studies have reported substantial transcriptomic rewiring in polyploid species, encompassing biased homoeolog expression, condition-specific gene regulation, and transgressive expression levels. These transcriptional changes often underlie phenotypic and adaptive variations upon polyploidy formation. However, despite extensive research on transcriptome and proteome [
8], the consequences of polyploidy at the translational level remain largely unexplored.
Ribosome profiling, known as Ribo-seq, is a high-throughput sequencing technique that provides a comprehensive snapshot of active translation within a cell by mapping the positions of ribosomes on mRNA molecules [
9]. This approach involves the isolation and deep sequencing of mono-ribosome-protected mRNA fragments, often referred to as ribosome footprints. By analyzing the distribution and abundance of these footprints, researchers can quantitatively assess translation rates, identify translation start sites, and uncover the full repertoire of translated open reading frames (ORFs) [
10]. Beyond its utility in identifying translated regions, Ribo-seq enables the study of translational regulation, including the impact of
cis-regulatory elements, ribosome pausing, proteins, and environmental cues on protein synthesis [
11,
12,
13]. Moreover, by comparing Ribo-seq data with transcriptomic data, translational efficiency (TE), a metric that reflects the coupling of transcription and translation could be measured [
14,
15].
The application of Ribo-seq has rapidly expanded across various plants systems. In
Arabidopsis, for example, global translatome profiling was performed in conjunction with RNA sequencing, revealing tightly regulated translation with poor correlation with transcriptional changes during immune induction, leading to the discovery of novel regulators of the immune response [
16]. Similarly, translational reprogramming was demonstrated a crucial layer of stress responses in crop species including rice (
Oryza sativa) [
17,
18], potato (
Solanum tuberosum) [
14], citrus (
Citrus reticulata) [
19], and maize (
Zea mays) [
15]. Beyond its role in stress responses, Ribo-seq has been instrumental in investigating allele-specific translation in hybrids, revealing the intricate interplay between genetic variation and translational regulation [
13,
20]. These studies have highlighted the potential of translational regulation to contribute to heterosis, where hybrid offspring exhibit superior traits compared to their parents. While some progress has been made in understanding the impact of polyploidy on translational regulation in plants, as exemplified by studies in soybean (
Glycine max) [
21], the role of translational control on gene expression evolution accompanying allopolyploidization remains largely underexplored.
Cotton, a globally significant fiber crop, represents a model system for studying allopolyploidization [
22]. The allopolyploid cotton genome was formed approximately 1-2 million years ago, through hybridization and subsequent genome doubling of A- and D-genome diploid progenitors. Previous research on cotton gene expression evolution accompanying allopolyploidization have primarily focused on transcriptomics [
23,
24] and proteomics [
25,
26], with limited insights from translational regulation. To address this gap and elucidate the impact of allopolyploidization on translational regulation, we conducted a comprehensive analysis of gene expression at both the transcriptional and translational levels in allopolyploid cotton
G. hirsutum (AD
1), and the representatives of its progenitor diploid species
G. arboreum (A
2) and
G. raimondii (D
5). To expand our comparative analysis, we incorporated public RNA-seq and Ribo-seq data from another allopolyploid species
G. barbadense (AD
2) [
27] for comparison. Initially, we compared the correlation and expression variances between transcriptomic and translational datasets, revealing a dominant correlation and larger expression variances in transcriptomes, demonstrating greater differential ranges in transcriptomic data. Subsequently, we compared differential gene expression patterns between transcript and translation levels, finding nearly no genes with completely opposite regulatory trends between these two levels, with only 5.46% showing consistent regulatory trends, primarily enriched in terpene functional genes associated with plant stress resistance. We then compared TE and found that the TE of genes showed a significant negative correlation with transcript levels and that genes with high TE were enriched for specific functions, which was also affected by allopolyploidy. Finally, comparison of expression between homologous A and D subgenomes, allopolyploidy was found to reduce homoeologs expression bias genes at the translational level, and a significant D subgenomic expression dominance at the translational level was detected in AD
2. In conclusion, the results of this study provide insights into the effects of allopolyploidy on gene expression and translation in cotton.
3. Results
3.1. Transcriptional and Translational Profiling of Diploid and Allopolyploid Cottons
A total of 183 million, 130 million, and 226 million Ribo-seq reads were generated for
G. arboreum (A
2),
G. raimondii (D
5), and
G. hirsutum (AD
1) seedling leaves, respectively (
Table S2). After removal of low-quality reads and rRNA, an average of 18 million clean reads remained per sample, representing 30.8% of the total. The average unique mapping rate ranged from11.2% to 39.6%, tetraploid (AD
1) having the lowest rate most likely due to its higher level of sequence homology compared to diploids A
2 and D
5. To facilitate direct comparisons between parental diploids and allopolyploids, we combined the clean reads of A2 and D5 into a ‘Combined Diploid’ dataset (A
2D
5) and used it for subsequent analyses, which exhibited 14.5%-16.5% unique mapping rates, comparable to those of AD
1.
To validate Ribo-seq data quality, we examined RPF (Ribosome Protected Fragment) sizes, which ranged from 26 to 32 nucleotides (
Figure S1), consistent with known cotton Ribo-seq data [
27]. The RPFs exhibited a periodicity of 3 nucleotides (
Figure 1A), a typical characteristics of ribosome profiling [
43]. Pairwise Pearson correlation coefficients were calculated between all samples to examine data reproducibility between replicated experiments, including a previously published dataset from
G. barbadense (AD
2) leaves [
27]. One AD
1 sample (AD
1-2) with relatively low correlation coefficients (R=0.85-0.86) (
Figure 1B) and poor clustering in principal component analysis (PCA) (
Figure 1C) with other replicates, was excluded from subsequent analyses.
For RNA-seq, a total of 131 million, 143 million, and 155 million read pairs were obtained from A
2, D
5, and AD
1 cotton leaf samples, respectively (
Table S3). The clean read pairs obtained per sample ranged 39 to 57 million, with an average unique mapping read rate of 87.4% (
Table S3). Diploid data from A
2 and D
5 were combined into the A
2D
5 dataset as for the Ribo-seq data. Pearson correlation coefficients ranging from 0.79 to 0.99 were observed between transcriptome and translatome data (
Figure 1B). Notably, expression variances were higher in transcriptomes than in translatomes (
Figure 1D; variance of log10(TPM): 0.914 vs 0.686), potentially indicating a wider expression range in mRNA levels than translated proteins.
3.2. Concordantly Decreased Transcription and Translation Accompanying Allopolyploidization
To study gene expression changes during allopolyploidization, we conducted differential gene expression analyses between diploid and tetraploid cotton using RNA-seq and Ribo-seq data (criterion: |log2fold change| ≥ 1 and FDR < 0.05). We identified significant differences as follows (
Figure 2A): comparing AD
1 to A
2D
5, 11,707 (15.6%, 5,711 up-regulated and 5,996 down-regulated) transcriptional changes and 1,622 (2.1%, 472 up-regulated and 1150 down-regulated) translational changes were identified. Comparing AD2 to A2D5, more DEGs were identified at both levels, with 21,506 (28.7%, 10,773 up-regulated and 10,733 down-regulated) transcriptional changes and 6,803 (9.1%, 2,293 genes up-regulated and 4510 down-regulated) translational changes. Between AD
1 and AD
2, 14,370 (19.2%, 6919 up-regulated and 7451 down-regulated) transcriptional changes and 2148 (2.9%, 1259 up-regulated and 889 down-regulated) translational changes were identified. Concordantly, fewer DEGs were identified at the translational level by Ribo-seq. Regarding the four inter-ploidy contrasts (i.e., AD
1 vs A
2D
5 and AD
2 vs A
2D
5 at either transcriptional or translational levels), we identified a total of 7,686 genes exhibiting allopolyploidy changes common to both tetraploids AD1 and AD2 (
Figure 2B, numbers in red). Of these, 189 genes showed common changes only at the translational level, 7,078 genes showed changes only at the transcriptional level, and 419 genes showed changes at both levels. Considering only those exhibiting conserved up- or down-regulated changes relative to A
2D
5, allopolyploid-specific regulation was inferred for 7,393 genes, which were further categorized into eight groups (A-H) based on their expression changes in Ribo-seq and RNA-seq (each being increased, decreased, or no changes) (
Figure 2C).
We specifically focus on the categories exhibiting allopolyploid-specific regulation at the translational level, including categories A, B, C, F, G, and H (582 genes, 7.9%). Among these, category C (53, 0.7%) and F (352, 4.6%) exhibited concordant regulatory directions between transcription and translation in allopolyploids relative diploids, with a prominent bias toward decreased expression (more F than C genes). Category F was enriched for GO terms associated with terpenoid synthesis and metabolism, indicating a negative impact of allopolyploidization on terpenoids at both transcriptional and translational levels. Categories C was not enriched for any GO terms.
Few genes were found exhibiting opposite regulatory directions between transcription and translation, as classified to category A (0 genes) or H (2 genes), nor enriched for GO terms.
Category B (21 genes, 0.3%) and G (155 genes, 2%) include genes with unchanged transcription levels but higher or lower translation in allopolyploids, respectively. Only category G was enriched for GO terms related to ribosome function and protein import into chloroplasts.
3.3. Impact of Allopolyploidy on Translational Efficiency
To isolate the effect of translational change independent of transcriptional changes, we calculated the translation efficiency (TE) values for genes, defined as the ratio of translational to transcriptional expression levels (i.e., log2(RPF/mRNA)). A gene with TE = 0 is considered to have consistent transcriptional and translational products; TE ≠ 0 indicates regulation at the translational level, with TE > 0 suggesting higher and TE < 0 suggesting lower translation relative to transcription.
We detected a total of 65,001 transcriptionally expressed genes (RNA-seq read counts above 1 in at least one biological replicate) in A
2D
5, with 9,810 (15.1%) having TE < 0 and 7,274 (11.2%) having TE > 0. In AD
1, there were 62,343 transcriptionally expressed genes, with 8,664 (13.9%) having TE < 0 and 5,845 (9.4%) having TE < 0 and. In AD
2, there were 63,871 transcriptionally expressed genes, with 9,231 (14.5%) having TE < 0 and 4,645 (7.3%) having TE > 0 (
Figure 3A). These results indicate that at least one-fifth of genes exhibit translational level regulation, with a general trend of more down-regulation than up-regulation of TE. Interestingly, the magnitude of translational level regulation was slightly reduced in allopolyploids compared to diploids.
To explore the relationship between TE and mRNA levels, we analyzed their correlations in each genotype. TE was significantly negatively correlated with transcriptional expression in A
2D
5, AD
1, and AD
2, with Pearson’s R of -0.2, -0.2, and -0.4, respectively (
Figure 3B). This suggested that, similar to observations in maize and potato [
14,
15], the mRNA translation efficiency and ultimately protein synthesis capacities are limited when mRNA levels are high.
We further analyzed the genes with the highest TE (> 6) and the lowest TE (< -6) in in A
2D
5 (1,062 and 801 genes), AD
1 (753 and 978) and AD
2 (808 and 1,936) (
Figure 3C). GO enrichment analysis revealed that genes with TE > 6 in A
2D
5 were primarily enriched for energy metabolism-related terms, including ATPase activity, proton-transporting ATP synthase complex, ribonucleoside triphosphate metabolic process, and photosynthesis (
Figure 3D). Similar GO terms were enriched in AD
1 and AD
2, with additional enrichment for RNA polymerase activity-related terms in AD
2. This suggests that energy metabolism-related genes tend to be translated with high efficiency to increase protein production, a trend consistent before and after allopolyploidy.
For genes with TE < -6 (
Figure 3D), A
2D
5 showed enriched GO terms related to cellular component biogenesis and ciliary were predominantly. In contrast, both AD
1 and AD
2 were predominantly enriched for stomatal opening, cyclin-dependent protein, and GO terms related to serine/threonine protein kinase. This suggests that allopolyploidy has a profound impact on genes with low TE.
3.4. Reduced Amount of Homoeolog Expression Biases at the Translational Level
To investigate the subgenomic contribution to gene expression, 22,889 ortho-homoeolog groups (OGs) were previously inferred between the At and Dt homoeologs in the AD1 reference genome. At the transcriptional level, we observed 23%-37% of OGs exhibiting homeolog expression bias (HEB): A2D5 had 5,285 (23%, 2,574 A-bias vs 2,711 D-bias), AD1 had 5,237 (22.9%, 2,576 vs 2,661), and AD2 had 8,407 (36.7%, 4,165 vs 4,242), with no significant imbalance towards either the A- or D-subgenome. At the translational level, the number of OGs exhibiting HEB decreased to 3%-9%: A2D5 had 2,224 (9.7%, 1,115 vs 1,109), AD1 had 1,013 (4.4%, 484 vs 529), and AD2 had 562 (2.5%, 244 vs 318), with only AD2 showing significantly more D-biases of translational expression.
When comparing TE, there were 1,076 (4.7%, 391 vs 685), 625 (2.7%, 320 vs 305), and 547 (2.4%, 237 vs 310) significant differences between At and Dt in A2D5, AD1, and AD2 respectively. Both A2D5 and AD2 showed significant D -biased translational level regulation.
These results indicated that allopolyploidization may have reduced the subgenomic differences at the translational level, leading to fewer biases at this level. Additionally, a D-genome dominant translational regulation was observed in AD2, may reflecting species-specific translational regulation with respect to subgenomic contribution.
Table 1.
Homoeolog expression bias (HEB) genes on transcriptional and translational levels.
Table 1.
Homoeolog expression bias (HEB) genes on transcriptional and translational levels.
|
|
A2D5 |
AD1 |
AD2 |
|
|
Number |
Proportion |
Number |
Proportion |
Number |
Proportion |
RNA |
Total |
5285 |
23.09% |
5237 |
22.88% |
8407 |
36.73% |
A>D |
2574 |
11.25% |
2576 |
11.25% |
4165 |
18.20% |
A<D |
2711 |
11.84% |
2661 |
11.63% |
4242 |
18.53% |
p-value* |
0.0595 |
0.2402 |
0.401 |
Ribo |
Total |
2224 |
9.72% |
1013 |
4.43% |
562 |
2.46% |
A>D |
1115 |
4.87% |
484 |
2.11% |
244 |
1.07% |
A<D |
1109 |
4.85% |
529 |
2.31% |
318 |
1.39% |
p-value* |
0.8988 |
0.1574 |
0.001799 |
TE |
Total |
1076 |
4.70% |
625 |
2.73% |
547 |
2.39% |
A>D |
391 |
1.71% |
320 |
1.40% |
237 |
1.04% |
A<D |
685 |
2.99% |
305 |
1.33% |
310 |
1.35% |
p-value* |
2.20E-16 |
0.5485 |
0.001801 |