1. Introduction
From the development of the oocyte to the embryo, many dynamic and complicated processes take place as the zygote undergoes several rapid rounds of division and produces a mass of cells within the zona pellucida [
1,
2]. During these dynamic stages, differential gene expression in individual cells is a key determinant of cellular differentiation, function, and physiology [
3,
4]. In recent years, several studies have documented the key developmental processes underlying the formation of blastomeres of 4- and 8-cell embryos of mice [
5], cattle [
6], and goats [
7]. However, the exact mechanism and developmental patterns underlying how the blastomeres of the 2-, 4-, 8-, 16-cell, morula, and blastocyst stages undergo asymmetric division are still unclear. Understanding the molecular mechanism underlying cleavage-stage development is critically important for improving preimplantation genetic diagnosis.
Single-cell RNA sequencing (scRNA-seq) provides an alternative method for studying the cellular heterogeneity of human preimplantation embryos and embryonic stem cells [
8,
9], mouse oocytes [
10], porcine oocyte maturation [
11], and Haimen white goat’s oocyte maturation by generating a readout of the abundance of a transcript within a cell [
12]. Indeed, scRNA-seq applied to mammalian gametes has generated new insights into the composition of transcripts and disease-related functional abnormalities. However, scRNA-seq studies have not thoroughly characterized how the blastomeres of the 2-, 4-, 8-, 16-cell, and especially the morula and blastocyst stages, undergo asymmetric division. Here, we aimed to transcriptionally profile nucleated cells present during the blastomeres of 2-, 4-, 8-, 16-cell and morula stage undergoing asymmetric division to provide a broad profile of composition of transcripts in the cell and their transcriptomes.
Long non-coding RNAs (lncRNAs) are a diverse group of RNAs that are longer than an arbitrary limit of 200 nt and do not encode proteins [
13]. Nevertheless, lnc RNAs can be located in exonic, intronic, and intergenic regions and can regulate gene expression by interacting with other biological macromolecules, such as RNA, DNA, proteins, and other factors, to promote normal cell function [
14,
15]. Compared with the characteristics of protein-coding genes, lncRNAs tend to be less conserved across species and often show lower expression levels and high tissue specificity [
16]. During the development of embryos, some lncRNAs, such as Xist, Tsix, and H19, have significant regulatory functions and can potentially determine the cell’s fate and direction of differentiation during embryogenesis to form different organs or special tissues that contain various cells expressing specific genes [
17,
18,
19].
An increasing number of studies have reported several thousands of annotated lncRNA loci in mammalian oocytes and early embryos [
1,
20]. For example, a total of 2733 novel lncRNAs were found to be expressed in specific developmental stages among 8701 lncRNAs using single-cell sequencing analysis of 124 individual cells from human embryonic stem cell (ESCs) and human preimplantation embryos at different passages. In addition, 5204 novel lncRNAs were obtained from in vivo and somatic cell nuclear transfer mouse preimplantation embryo, suggesting that several lncRNAs and their association with known protein-coding genes might be involved in mouse embryonic development and cell reprogramming [
8,
21]. Another study reported that approximately a quarter of 1600 lncRNA loci expressed during the murine oocyte-to-embryo transition employed promoters from a long terminal repeat retrotransposon class, which exhibited either maternal or zygotic expression and showed signatures of massive expansion in the evolution of the rodent genome [
22]. In bovine early embryos, some lncRNAs play a role in the translational control of target mRNA and are thus important for managing maternal reserves, especially before embryonic genome activation, as these reserves contain the embryonic program [
23]. Despite the fact that various functional lncRNAs are important during embryonic development, the effect of lncRNAs on the blastomeres of the 2-, 4-, 8-, 16-cell, and morula stages of development has not been studied; nevertheless, this subject is in need of more attention and discussion by scientists. lncRNAs play an important role in biological processes, including epigenetic regulation, dose compensation, cell cycle, cell differentiation, proliferation, apoptosis through gene imprinting, chromatin remodeling, transcriptional activation, transcriptional interference, and nuclear splicing regulation.
Here, we conducted an scRNA-seq survey of 24 sheep cells during their development from the oocyte to the blastocyst stages. We then conducted experiments involved in the identification and functional analysis of transcriptome profiles, lncRNAs, single-nucleotide polymorphisms (SNPs) and alternative splicing (AS). Using these different methods, we constructed a profile to characterize cellular and molecular features and systematically analyze the related GO and KEGG profile of cells during sheep development. This large-scale single-cell atlas provides key cellular information and will likely aid clinical studies in the development of more efficient preimplantation genetic diagnosis.
2. Materials and Methods
2.1. Animals and sample collection
All work with animals was completed in accordance and with the approval of the Henan Academy of Agricultural Sciences Institutional Animal Care and Use Committee. Mature sheep were obtained from Hui yuan Sheep Industry Co., Ltd (pu yang, he nan province). We used 15-month-old female sheep (40 kg) for our study. The animals were provided with grass and drinking water and had access to an animal exercise pen. All animals were healthy, showed normal appetite, and had smooth wool. Artificial insemination using semen from one of three fertile bulls was conducted at 12- and 24-hours post-standing heat (Day 0). Donor animals were sacrificed at 30 hours, and in vivo developed oocytes and embryos at the 2- to 16-cell stages were collected by oviductal flushing 2–4 days after estrus. Early morulae, compact morulae, and blastocysts were collected by routine non-surgical uterine flushing on days 5, 6, and 7. All oocytes and embryos were examined and staged under light microscopy. Only morphologically intact embryos meeting the standards of Grade 1 by the International Embryo Transfer Society were included in this study. Embryos were washed twice in D-PBS before being frozen and stored individually in RNAlater (Ambion, Grand Island, NY, USA) in liquid nitrogen.
2.2. RNA isolation, library preparation and sequencing
Firstly, total RNA was isolated from the sheep sample (three biological replicates per sample combined) using single cell RNA kit (Single Cell RNA Purification Kit NGB) was extracted (Norgen Biotek, CA) following the manufacturer's procedure. The single cell RNA quantity and purity were analysis of Bioanalyzer 2100 and RNA 6000 Nano LabChip Kit (Agilent, CA, USA) with RIN number >7.0. Approximately 10 ug of total RNA representing a specific adipose type was subjected to isolate Poly (A) mRNA with poly-T oligoattached magnetic beads (Invitrogen). Following purification, all amplifications were carried out in parallel with positive and no-template controls for quality assurance which using the SMARTer Universal low Input RNA kit (Clontech) for cDNA library. Briefly, RNA is converted to cDNA, and then adapters for Illumina sequencing (with specific barcodes) are added through PCR using only a limited number of cycles. The PCR products are purified (Protocol C), and then ribosomal cDNA is depleted. The cDNA fragments are further amplified with primers universal to all libraries. Lastly, the PCR products are purified once more to yield the final cDNA library. Then, the mRNA is fragmented into small pieces using divalent cations under elevated temperature. Then the cleaved RNA fragments werereverse-transcribed to create the final cDNA library in accordance with the protocol for the mRNASeqsample preparation kit (Illumina, San Diego, USA), the average insert size for the paired-endlibraries was 300 bp (±50 bp). And then we performed the paired-end sequencing on an IlluminaHiseq4000 at the (LC Sceiences,USA) following the vendor’s recommended protocol.
2.3. Quality control and assembly of transcriptome data
Raw data in FASTQ format were first processed through in-house perl scripts. Clean reads were obtained by removing reads containing adapters, reads containing poly-N, and low-quality raw reads. Furthermore, Q20, Q30, and GC contents of the clean data were calculated. All of the downstream sequencing analyses were based on high-quality clean reads. For the RNA-seq data, all clean reads from each sample were aligned to the sheep reference genome (
https://www.ncbi.nlm.nih.gov/genome/83?genome_assembly_id=351950) using TopHat v2.0.949. The distribution of known gene types was analyzed by HTSeq software. The mapped reads of each sample were then assembled by both Scripture (beta2) and Cufflinks (v2.1.1) using a reference-based approach.
2.4. RNA-seq reads mapping
We aligned reads to the UCSC (
http://genome.ucsc.edu/) sheep reference genome using the Tophat package, which initially removes a portion of the reads based on quality information accompanying each read and then maps the reads to the reference genome. Tophat allows multiple alignments to be read (up to 20 by default) and a maximum of two mismatches when mapping the reads to the reference. Tophat builds a database of potential splice junctions and confirms these by comparing the previously unmapped reads against the database of putative junctions.
2.5. Estimation of transcript abundance and differential expression
The mapped reads of each sample were assembled using StringTie. All transcriptomes from the samples were then merged to reconstruct a comprehensive transcriptome using perl scripts. After the final transcriptome was generated, StringTie and Ballgown were used to estimate the expression levels of all transcripts. StringTie was used to determine the expression level of mRNAs by calculating FPKM. Differentially expressed mRNAs and genes were identified if log2 (fold change) >1 or log2 (fold change) <-1 and by statistical significance (p-value <0.05) through the R package Ballgown.
2.6. WGCNA Analysis
The co-expression network analysis was performed using WGCNA (version:1.61) [
24]. First, the soft threshold for network construction was selected. The soft threshold constrains the adjacency matrix to assume a continuous value between 0 and 1 so that the constructed network conforms to the power-law distribution and is closer to the real biological network state. Second, the scale-free network was constructed using the blockwiseModules function, followed by the module partition analysis to identify gene co-expression modules, which groups genes with similar patterns of expression. The modules were defined by cutting the clustering tree into branches using a dynamic tree-cutting algorithm and assigning the modules different colors for visualization [
24]. The module eigengene (ME) of each module was then calculated. ME represents the expression level for each module. Next, the correlation between ME and the clinical trait in each module was calculated, followed by the determination of gene significance.
2.7. Transcript assembly and identification of candidate lncRNAs
First, Cutadapt was used to remove the reads that were contaminated with adaptors [
25], low-quality bases, and undetermined bases. Sequence quality was then verified using FastQC (
http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). We used Bowtie2 (FastQC) and topaht2 to map reads to the genome [
26]. The mapped reads of each sample were assembled using StringTie [
27]. All transcriptomes from the samples were then merged to reconstruct a comprehensive transcriptome using perl scripts. After the transcriptome was generated, StringTie and Ballgown were used to estimate the expression levels of all transcripts. Transcripts that overlapped with known mRNAs and transcripts shorter than 200 bp were discarded. We utilized CPC and CNCI to predict transcripts with coding potential [
28]. All transcripts with CPC scores <-1 and CNCI scores <0 were removed. The remaining transcripts were considered lncRNAs.
2.8. Classification of lncRNAs
The annotated lncRNAs were subdivided into four categories based on their locations relative to the nearest protein-coding genes: (i) lncRNAs that do not overlap protein-coding genes (lincRNAs); (ii) lncRNAs located entirely within a protein-coding locus (intragenic lncRNAs); (iii) lncRNAs partially overlapping a protein-coding gene (overlapping lncRNAs); and (iv) lncRNAs that overlapped exons of a protein-coding transcript on the opposite strand (antisense lncRNAs). Perl scripts were developed to classify these four categories.
2.9. Quantification and differential expression analysis
The relative abundances of both candidate lncRNAs and coding genes in each sample were computed by calculating the FPKM using Cufflinks (v2.1.1). Differentially expressed lncRNAs in comparison groups were identified using the Cuffdiff program. For biological replicates, transcripts with adjusted P-values <0.05 were considered differentially expressed between the two groups.
2.10. Predictions of cis and trans-target genes
To explore the function of lncRNAs, we first predicted the cis and trans-target genes of lncRNAs. We searched for coding, cis-target genes 10k and 100k upstream and downstream, respectively, of candidate lncRNAs and then analyzed their functions. We calculated the expressed correlation between lncRNAs and coding genes with custom scripts and then analyzed their functions through functional enrichment analysis. The trans-target genes and lncRNAs were identified by their expression levels.
2.11. SNP analysis
To further characterize the SNPs, we categorized them as genic or intergenic. Approximately half of the SNPs (47 %) were located in genic regions, and the rest was located in intergenic regions. In addition, nonsynonymous SNPs in the exon region were analyzed to determine whether their amino acid character had changed (e.g., hydrophobic to basic or stop codons), given that compositional changes of the amino acids in proteins can result in changes in structural conformation or enzymatic activities and thus generate phenotypic diversity or critical functional variations. First, 20 amino acids were clustered into several character groups. Non-synonymous SNPs that caused amino acid changes from one group to another were searched. Common SNPs in the eight groups representing each line were then classified.
2.12. AS data collection
Data on 96 melanoma cases with clinicopathological information were obtained to explore changes in AS events. To analyze the AS profiles for each patient, the SpliceSeq tool (version 2.1), a java application, was used to evaluate the splicing patterns of mRNA in the melanoma cohort. The percent spliced in value was calculated to quantify alternative splicing events and ranged between 0 and 1 for seven types of AS events, including exon skip (ES), alternate terminator (AT), and mutually exclusive (ME).
2.13. GO and KEGG enrichment analysis
GO enrichment analysis of the aforementioned groups was conducted using the GOseq R package while correcting for biases in gene length. In addition, KOBAS software and the KEGG database (http://
www.genome.jp/kegg/) were used to analyze the statistical enrichment of target genes of differentially expressed lncRNAs in KEGG pathways. Lower p-values corresponded to more relevant pathways, and corrected p-values of <0.05 were considered significantly enriched by DEGs.
2.14. Statistical analysis
Statistical analysis was performed using SPSS13.0 software. Proportional data were compared using chi-squared analysis or Fisher’s exact tests, and differences were considered significant when p <0.05. The percentage of blastocyst formed represented the number of blastocysts formed divided by the total number of embryos cultured. The percentage of high-quality blastocyst formed represented the number of high-quality blastocysts divided by the total number of blastocysts cultured.
4. Discussion
In recent years, much research has focused on the study of the development of blastomeres of the 4- and 8-cell embryos of mice [
5], cattle [
6], and goats [
7]. However, the exact mechanism and the developmental patterns underlying how the blastomeres of the 2-, 4-, 8-, 16-cell, morula, and blastocyst stages undergo asymmetric division are still unclear. For the first time, we used scRNA-seq to study the transcriptome profiles during sheep development from the oocyte to the blastocyst stages. Our data showed that from the 4-cell to the 8-cell stage, there were no noticeable changes in the transcriptome profile as has been shown for the 4- and 8-cell embryos of mice [
37,
6,
7] (
Figure 1a). However, the first major change was noted between the 8-cell and 16-cell stages, which is similar to the pattern observed in the bovine embryonic genome. The greatest changes in gene expression were observed between the morula and blastocyst stages, which may be explained by a major morula-blastocyst transition that results in expression patterns characteristic of sheep preimplantation development [
11,
38]. Furthermore, 1,092 and 856 DEGs were down- and up-regulated, respectively (
Figure 2).
BTF3 (basic transcription factor 3),
TLR1(toll-like receptor 1) and
SPINT2 (serine peptidase inhibitor, Kunitz type 2) were markedly up-regulated, while
PEX12 (peroxisomal biogenesis factor 12) and
PGK1 (phosphoglycerate kinase 1) were significantly down-regulated between the oocyte and zygote stages. In others reports, BTF3 as one of importance transcription factor was improve in gastric cancer development and progression by enhance transcription [
39]. TLR4 enhances blastocyst attachment to endometrial cells in mice via miR-Let-7a suggested that inflammatory responses are beneficial in the fetomaternal interface and supposedly accelerate the chances for successful implantation. In our study, TLR1 was markedly up expression from oocyte to blastocyst development [
40]. Therefore, the role of inflammatory responses is interesting need further study.GO and KEGG analyses of DEGs were then conducted. At different stages, GO analysis indicated that there was a significant over-representation of elements involved in nuclear speck and cytoplasm. The KEGG analysis revealed that most of the DEGs were primarily involved in the spliceosome, ribosome biogenesis in eukaryotes, and the Epstein-Barr virus infection pathway, RNA transport, ribosome pathway, transferase activity, and others. The above pathways have also been reported in the development of bovine and monkey embryos [
11,
41,
42]. Therefore, the transcriptome profiles during sheep from oocyte to blastocyst development were shown the same pattern.
We then characterized the features of the lncRNAs and their target genes. The lncRNAs in cell samples from the oocyte, fertilized egg, 2-, 4-, 8-, 16-cell, morula, and blastocyst stages had lower expression levels compared with the expression levels observed in protein-coding genes. However, the transcript lengths of lncRNAs were mostly ≥1000 bp, and the mean transcript length of lncRNAs was not significantly different relative to the mean transcript length of protein-coding genes (
Figure 5b). In addition, there were a total of 42 differentially expressed lncRNAs (52 transcript isoforms) (
Figure 6). The major changes in lncRNA expression occurred between the morula and blastocyst stages, of which 19 (23 transcript isoforms) lncRNAs were significantly up-regulated, such as lnc MSTRG.2676, lnc MSTRG.3585, and six (10 transcript isoforms) lncRNAs were down-regulated, such as lnc MSTRG.8262 and lncMSTRG.11966 (
Figure 6). Overall, there were fewer differentially expressed lncRNA transcripts during sheep development compared with the number of differentially expressed transcripts detected in implantation and inter-implantations sites in day-5 pregnant mice [
31]. GO and KEGG analyses of the cis and trans-target genes of the lncRNAs in the eight comparison groups were performed. GO analysis of the cis lncRNA targets revealed the following significantly enriched GO terms: RNA binding, nucleus, and others. The KEGG analysis revealed that the significantly enriched cis pathways were “Huntington’s disease” and “Oxidative phosphorylation” between the morula and blastula groups (corrected p-value <0.05,
Figure 7g and
Supplemental S3-7).
We then studied the distribution of different SNP and indel types (
Table 4). Approximately 10% intergenic and 30% non-coding SNPs have been reported in humans from RNA-seq data [
43]. The high percentages of intergenic SNPs in humans may be partially explained by the incomplete annotation of protein-coding genes and exons in the current version of the rainbow trout reference genome sequence [
36]. We then conducted GO analysis of the associated genes. In the biological process’s category, SNP genes were associated with various cellular processes that were primarily involved in development-related mechanisms, including regulation of the MyD88-dependent toll-like receptor signaling pathway, regulation of metabolic and oxidation-reduction processes, and protein translation. In the molecular function category, SNP-containing genes were associated with binding phosphoprotein, nucleic acid, and actin. In addition, many genes were associated with transferase, motor, oxidoreductase, and structural molecule activities. In the cellular component category, many of the genes were associated with the cytoplasmic compartment, membranes, myosin complex, and extracellular region compartment (
Supplemental S4 and S5), which is consistent with the findings of previous studies [
32].
Finally, we also detected AS events. There are relatively few studies that have reported the numbers of AS events during development from the oocyte to the blastocyst. In this study, splicing events were comprehensively analyzed for 24 single cells based on relevant RNA-seq data. For example, there were a total of 101835 AS events that were detected in 54266 genes, comprising 7003 AE events detected in 2603 genes in the oocyte stage (
Figure 8a). AS can control the transcriptional identity of the maternal transcriptome by the RNA-binding protein, which is essential for the development of fertilized-competent oocytes [
44]. Therefore, our comprehensive analysis of AS suggests that AS plays an important role in sheep development. GO and KEGG analyses showed that there were 13663 genes with one or more AS events at different stages during sheep development based on the UpSet plot (
Figure 10a). The most significant biological process terms were “regulation of transcription, DNA templated,” “transport,” and “proteolysis.” The three most significant cellular component terms were “membrane,” “nucleus,” and “integral component of membrane.” The significant terms for molecular function were “ATP binding,” “nucleotide binding,” and “nucleic acid binding” (
Figure 10b). In a previous study, AS has been shown to play an important role in the protein-coding genes of mouse oocytes and zygotes by RNA-binding protein, and the correct combination of exons through AS ensures that gene isoforms are expressed for the specific context in which they are required [
45,
46]. Furthermore, the “metabolism”, “Genetic information processing” and “human diseases” correlated pathways were mostly genes involved in purine metabolism, RNA transport and pathways in cancer (
Figure 10c). In metabolomic analyses of fetal germ cells in mice on embryonic day (E)13.5 and E18.5, purine metabolism was involved in demonstrate sex- and developmental stage-dependent changes in these processes [
47]. RNA transport in our study were mostly genes involved which also as one of importance AS event is frequent across developmental stages and tissues in embryonic day 8.5, 9.5 and 11.5 mouse embryos and placenta [
48]. Furthermore, We detected 13663 genes with one or more AS events, suggesting that AS may play an important role in sheep development. Furthermore, “ATP binding (1498 genes),” “nucleotide binding (1441 genes),” and “nucleic acid binding (1412 genes)” were the three most enriched categories, suggesting that alternative protein isoforms with distinct functions were expressed. Thus, defects in splicing control might be able to induce losses in function and severe phenotypes and thus require further study[
49].
Figure 1.
Expression profiles during sheep development from the oocyte to the preimplantation stage. (a) Heat map of gene expression profiles correlated with developmental stages. (b) Determination of the soft threshold with the WGCNA algorithm. The approximate scale-free fit index can be attained at the soft-thresholding power of 18. (c) Clustering dendrograms showing 26 modules containing a group of highly connected genes. Each color represents a certain gene module. (d) Principal component analysis. PC1, PC2, and PC3 represent the top three dimensions of the genes showing differential expression among preimplantation blastomeres.
Figure 1.
Expression profiles during sheep development from the oocyte to the preimplantation stage. (a) Heat map of gene expression profiles correlated with developmental stages. (b) Determination of the soft threshold with the WGCNA algorithm. The approximate scale-free fit index can be attained at the soft-thresholding power of 18. (c) Clustering dendrograms showing 26 modules containing a group of highly connected genes. Each color represents a certain gene module. (d) Principal component analysis. PC1, PC2, and PC3 represent the top three dimensions of the genes showing differential expression among preimplantation blastomeres.
Figure 2.
Differentially expressed genes in sheep development from the oocyte to the blastocyst stages.
Figure 2.
Differentially expressed genes in sheep development from the oocyte to the blastocyst stages.
Figure 3.
Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses of differentially expressed genes in sheep development from the oocyte to the blastocyst stages. GO and KEGG analyses between (a) the oocyte and zygote stages, (b) the zygote and 2-cell stages, (c) the 2-cell and 4-cell stages, (d) the 4-cell and 8-cell stages, (e) the 8-cell and 16-cell stages, (f) the 16-cell and morula stages, and (g) the morula and blastocyst stages.
Figure 3.
Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses of differentially expressed genes in sheep development from the oocyte to the blastocyst stages. GO and KEGG analyses between (a) the oocyte and zygote stages, (b) the zygote and 2-cell stages, (c) the 2-cell and 4-cell stages, (d) the 4-cell and 8-cell stages, (e) the 8-cell and 16-cell stages, (f) the 16-cell and morula stages, and (g) the morula and blastocyst stages.
Figure 4.
Pipeline for identification of lncRNAs.
Figure 4.
Pipeline for identification of lncRNAs.
Figure 5.
LncRNA characteristics in sheep development from the oocyte to the blastocyst stages. (a) Comparison of the expression level between lncRNA and protein-coding genes in terms of fragments per kilobase of exon per million fragments mapped (FPKM). The FPKM distribution of lncRNAs in mouse uterus was lower than that of protein-coding genes. (b) Distribution of transcript lengths in the lncRNAs and protein-coding genes. The average size of lncRNA transcripts was generally shorter than that of protein-coding genes. (c) The number of exons in lncRNAs and protein-coding genes. A total of 88.41% of the lncRNAs contained two to four exons, while the majority of protein-coding genes contained more than 10 exons. (d) The number of ORFs identified in the lncRNAs and protein-coding genes using Estscan. As expected, the ORFs of lncRNAs were substantially shorter than the ORFs in protein-coding genes.
Figure 5.
LncRNA characteristics in sheep development from the oocyte to the blastocyst stages. (a) Comparison of the expression level between lncRNA and protein-coding genes in terms of fragments per kilobase of exon per million fragments mapped (FPKM). The FPKM distribution of lncRNAs in mouse uterus was lower than that of protein-coding genes. (b) Distribution of transcript lengths in the lncRNAs and protein-coding genes. The average size of lncRNA transcripts was generally shorter than that of protein-coding genes. (c) The number of exons in lncRNAs and protein-coding genes. A total of 88.41% of the lncRNAs contained two to four exons, while the majority of protein-coding genes contained more than 10 exons. (d) The number of ORFs identified in the lncRNAs and protein-coding genes using Estscan. As expected, the ORFs of lncRNAs were substantially shorter than the ORFs in protein-coding genes.
Figure 6.
The number of differentially expressed lncRNAs in eight comparison groups in sheep development from the oocyte to the blastocyst stages.
Figure 6.
The number of differentially expressed lncRNAs in eight comparison groups in sheep development from the oocyte to the blastocyst stages.
Figure 7.
GO and KEGG analysis with the cis and trans-target genes of lncRNAs. GO and KEGG analyses between (a) oocyte and zygote stages, (b) zygote and 2-cell stages, (c) 2-cell and 4-cell stags, (d) 4-cell and 8-cell stages, (e) 8-cell and 16-cell stages, (f) 16-cell and morula stages, and (g) morula and blastocyst stages.
Figure 7.
GO and KEGG analysis with the cis and trans-target genes of lncRNAs. GO and KEGG analyses between (a) oocyte and zygote stages, (b) zygote and 2-cell stages, (c) 2-cell and 4-cell stags, (d) 4-cell and 8-cell stages, (e) 8-cell and 16-cell stages, (f) 16-cell and morula stages, and (g) morula and blastocyst stages.
Figure 8.
The number of alternative splicing events and associated genes in sheep development from the oocyte to the blastocyst stages. TSS was the most frequent of the eleven types of events. (a) Oocyte stage. (b) Zygote stage. (c) 2-cell stage. (d) 4-cell stage. (e) 8-cell stage. (f) 16-cell stage. (g) Morula stage. (h) Blastocyst stage. AE alternative exon ends, AD alternate donor, AP alternate promoter, AT alternate terminator, ES exon skip, ME mutually exclusive exon, RI retained intron.
Figure 8.
The number of alternative splicing events and associated genes in sheep development from the oocyte to the blastocyst stages. TSS was the most frequent of the eleven types of events. (a) Oocyte stage. (b) Zygote stage. (c) 2-cell stage. (d) 4-cell stage. (e) 8-cell stage. (f) 16-cell stage. (g) Morula stage. (h) Blastocyst stage. AE alternative exon ends, AD alternate donor, AP alternate promoter, AT alternate terminator, ES exon skip, ME mutually exclusive exon, RI retained intron.
Figure 9.
UpSet plot of the three main alternative splicing events at different stages in sheep development from the oocyte to the blastocyst stages. UpSet plot of the interactions of the alternative splicing events associated with genes. (a) UpSet plot of the interactions of SKIP. (b) UpSet plot of the interactions of TSS. (c) UpSet plot of the interactions of TTS. TTS alternative transcription termination site.
Figure 9.
UpSet plot of the three main alternative splicing events at different stages in sheep development from the oocyte to the blastocyst stages. UpSet plot of the interactions of the alternative splicing events associated with genes. (a) UpSet plot of the interactions of SKIP. (b) UpSet plot of the interactions of TSS. (c) UpSet plot of the interactions of TTS. TTS alternative transcription termination site.
Figure 10.
Molecular characteristics of the main alternative splicing events in sheep development from the oocyte to the blastocyst stages. (a) UpSet plot of the main genes at different stages in sheep development from the oocyte to the blastocyst stages. (b) GO analysis. (c) KEGG analysis.
Figure 10.
Molecular characteristics of the main alternative splicing events in sheep development from the oocyte to the blastocyst stages. (a) UpSet plot of the main genes at different stages in sheep development from the oocyte to the blastocyst stages. (b) GO analysis. (c) KEGG analysis.
Table 1.
Numbers of embryos and cells analyzed by single-cell RNA-Seq analysis.
Table 1.
Numbers of embryos and cells analyzed by single-cell RNA-Seq analysis.
Sample |
No. of sample |
No. of single cells |
Oocyte |
3 |
3 |
Zygote |
3 |
3 |
2-cell |
3 |
3 |
4-cell |
3 |
3 |
8-cell |
3 |
3 |
16-cell |
3 |
3 |
Morula |
3 |
3 |
Blastocyst |
3 |
3 |
Total |
24 |
24 |
Table 2.
The numbers of genes detected in sheep from oocyte to blastocyst development.
Table 2.
The numbers of genes detected in sheep from oocyte to blastocyst development.
Stage |
No. of genes (FPKM > 0.1) |
Oocyte |
24729 |
Zygote |
25455 |
2-cell |
26714 |
4-cell |
28948 |
8-cell |
25920 |
16-cell |
24341 |
Morula |
22242 |
Blastocyst |
31157 |
Table 3.
Summary of read filter and alignment.
Table 3.
Summary of read filter and alignment.
Sample |
Raw reads |
Clean reads |
Clean bases |
Error rate (%) |
GC content (%) |
oocyte |
102,383,290 |
86,741,418 |
13.01G |
0.04 |
43.50 |
fertilized egg |
111,921,426 |
87,081,144 |
13.06G |
0.02 |
44.50 |
2-cell |
123,273,652 |
95,679,634 |
14.35G |
0.04 |
44 |
4-cell |
108,522,186 |
96,419,102 |
14.46G |
0.03 |
44 |
8-cell |
142,178,172 |
123,561,104 |
18.53G |
0.02 |
45 |
16-cell |
133,023,048 |
90,477,914 |
13.57G |
0.04 |
43.5 |
morula |
122,556,646 |
90,506,796 |
13.58G |
0.02 |
44.5 |
blastula |
141,488,370 |
105,354,406 |
15.80G |
0.03 |
44 |
Table 4.
Summary of SNP and indel classification for different sets.
Table 4.
Summary of SNP and indel classification for different sets.