Introduction
The way a species interacts with other species and its environment associated with its genetic diversity. A species' ability to adapt to disturbances caused by humans and the natural environment depends on variation as well. The degree of genetic diversity within a species affects that species' capacity to adapt to environmental changes [
1]. The measurement of these variation, however, cannot be limited to the use of traditional neutral markers. The polymorphisms of Major Histocompatibility Complex (MHC) genes are considered to affect the functional plasticity of immune responses to diverse pathogenic stressors, making them excellent candidates to research adaptive evolutionary processes in natural populations [
2,
3]. This trait highlights how the immune system susceptible to environmental stress and how crucial it is for illuminating the mechanisms of adaptive genetic variation required for the long-term survival of species or populations [
4,
5].
The most polymorphic region of the vertebrate genome that evolved under positive and balancing selection is the MHC [
5,
6,
7], a multigene family of the vertebrate adaptive immune system that contains highly polymorphic motifs that are strongly associated with immune response and disease resistance [
8,
9]. MHC genes belong to two main subfamilies, class I and class II, and encode proteins that are necessary for pathogen recognition and presentation to T cells. Functional class II proteins are heterodimers consisting of α and β chains, and DR subclasses are encoded by the
DRA and
DRB genes, respectively [
10]. The amino acid residues that bind directly to the antigen are called antigen binding sites (ABS). ABSs were located at the α1 domain of α1 chain and the β1 domain of β chain, in which MHC polymorphisms in vertebrates mainly occur [
11,
12].
Alpine ungulates are a significant factor in the maintenance of the structure of vegetation and the cycling of nutrients in high-mountain ecosystems, as well as a significant source of food for predators [
13,
14]. However, due to its slow growth, poor rate of reproduction, vulnerability to exploitation by humans, loss of habitat, susceptibility to infectious diseases, and other reasons [
15,
16,
17], Most of them are extremely vulnerable to extinction. Of these, a typical alpine hoofed species of the subfamily Caprinae (family Bovidae), is the Siberian ibex,
Capra sibirica. This species widely habituated in the alpine regions of Central Asia, from northern India through Pakistan and Afghanistan to Russia (Siberia), and eastward to northwest China and western Mongolia [
18]. According to studies, the Siberian ibex, to some extent, suffered threats from various pathogens like lethal bacteria, viruses, ticks, and mites [
19,
20,
21]. Moreover, it shares more than 76 percent of its food with domestic animals in Chinese territory [
22], indicating not only fierce food competition but also a greatly increased risk of becoming infected. Despite the importance of MHC genes for immunological fitness, an assessment of the diversity and occurrence of these genes is still lacking in the Siberian ibex, the globally ‘Near Threatened’ mammal in Central Asia, and locally urgently needs an effective conservation and management programs [
18].
Because of anthropogenic impacts, Siberian ibex populations dropped globally and their range shrunk drastically in the 1970s [
23,
24]. In China, particularly, it had been listed as an endangered species and given Class I protection priority in 1998 [
25]. The population has fortunately started to recover owing to effective conservation and management (creation of protected areas, etc.) by the Chinese government and neighboring nations in recent years, and thus its protection priority was decreased to Class II in 2021 [
26]. Generally, reduced genetic diversity is associated with demographic perturbation. Natural populations of many species that underwent a reduction in size exhibited very limited MHC diversity [
27,
28,
29]. However, both theoretical and empirical studies also showed that a longer timescale of selection maintained higher MHC diversity in a population experienced demographic fluctuations [
30,
31]. It is thus significant to study whether a highly genetically diverged Siberian ibex populations [
32] maintained high MHC diversity during a more than half-century period of recovery.
Therefore, our objectives in this study had three facets. To begin with, we aim to comparatively evaluate the MHC diversity in different Siberian ibex populations in Xinjiang, China, and discuss our results with other species bottlenecked. In addition, we also try to ascertain if the MHC
DRB1 divergence in different populations was in accordance with the results of mitochondrial genes divergence we reported previously [
32]. Finally, to check if the MHC
DRB1 genes in the Siberian ibex that went through population fluctuation resemble the common characteristics of MHC in other vertebrates, such as positive selection, recombination, and trans-species polymorphism, and to clarify the genetic relationships of MHC
DRB1 alleles of the relic species Siberian ibex and its congenerics, including domestic goats. Our results were of importance in understanding the adaptive ability of this species and planning scientific conservation strategies to ensure long-term population development.
Experimental Procedures
The total genomic DNA of fecal samples was extracted using an Omega stool DNA extraction kit (Omega Bio-tec, Georgia, USA), and that of muscle and skin samples was extracted using a Tiangen tissue/blood DNA extraction kit (Tiangen Bio-tec, Beijing, China), following the manufacturer's instructions. After electrophoresis detection, DNA concentration and purity were measured by the Thermo Nanodrop 1000 and stored at 4 ℃ for later use.
Part of MHC class II
DRB1 exon 2 (260 bp, excluding the primer sequences) was PCR amplified using primer pairs of CapDRB1.1F and CapDRB1.2R [
35], because this segment is the most polymorphic region and includes all ABSs necessary for pathogen recognition [
36,
37]. A PCR reaction volume contained 40-150 ng of DNA, 5 pmol of each primer, 12.5 μL of Tiangen's 2 TaqPCR Master Mix, and then adjusted to a final volume of 25 μL with RNase-Free double distilled water. The PCR thermal cycling conditions were as follows: pre-denaturation at 94 ºC for 4 min, followed by 35 cycles of denaturation at 94 ºC for 30 s, annealing at 64 ºC for 30 s, extension at 72 ºC for 45 s, and final extension at 72 ºC for 5 min. The PCR products were verified by 2 % agarose gel electrophoresis and green fluorescence dye imaging under ultraviolet irradiation, and those with the expected band size were sent for Sanger sequencing bidirectionally (Qingke Biology, Xi-an, China).
PCR products assumed to contain more than one sequences were proceeded to cloning and sequencing for allele isolation. PCR products were recovered using a gel recovery kit (Tiangen, Beijing, China), connected to the PMDTM19-T plasmid vector (Takara, Tokyo, Japan), then transformed into Escherichia coli (DH5α) receptive cells. For selection of cells with positive plasmids, the bacteria were grown on LB solid medium containing ampicillin, IPTG and X-gal at 37 ºC overnight. Bacteria containing plasmids with the target PCR product were screened by blue/white selection and direct-colony PCR amplification using M13 forward and reverse primers with the same PCR condition as described earlier. At least 8 clones per sample were bidirectionally sequenced for each individual.
Data Analyses
The nucleotide, amino acid, super type diversity, and pairwise population fixation indices (
FST) of
DRB1 exon 2 for different populations were calculated by DnaSP v.5.10.01 [
39], and the neutral selection was analyzed. MEGA v.6.0 were used to estimate the ratio
ω (
dN/
dS) of non-synonymous (
dN) to synonymous (
dS) substitution rates [
44]; this ratio provides a measure of selective pressure at the level of individual sites [
45]. Values of
ω > 1 indicate positive selection, while
ω = 1 and
ω < 1 indicate neutral evolution and purifying selection, respectively. Values of
dN,
dS and
ω were calculated separately for presumed ABSs deduced according to Reche & Reinherz [
46], non-ABSs, and all sites. HyPhy [
47] implemented in MEGA was used to detect signs of positive selection. In order to examine positive selection across all sites based on maximum likelihood methods, CodeML in PAML 4.9 [
48] was employed as well. The likelihood ratio tests (LRTs) were used to compare the four models: M1a, almost neutral; M2a, positive selection; M7, beta; and M8, beta and
ω, and decide which model best fit our data [
45,
49,
50]. Using LRTs, two nested models (M1a vs. M2a; M7 vs. M8) were compared. Using Bayes Empirical Bayes inference [
51], positively selected locations were found. In addition, using Datamonkey v.2.0 [
52], a web-based server for the HyPhy Package, a mixed-effects model of evolution, MEME [
53] analysis was carried out to find codons that had been subject to positive selection.
Gene recombination analysis of
DRB1 exon 2 sequences was performed in RDP4 [
54]. Specific methods were first used in RDP [
55], GENECONV [
56], MaxChi [
57] and Bootscan [
58], which use default Settings to detect recombination events using Bonferroni correction for multiple comparisons. Recombination events detected by at least three of these methods were then rechecked using all RDP methods available [
55]. In addition, we also use the GARD [
59], provided by the Datamonkey webserver [
60], to detect the signatures of recombination breakpoints. In order to avoid the impact of possible gene replication, conversion, and recombination on phylogenetic analysis, we chose Splitstree4 v.4.14.5 to construct a neighbor network of
DRB1 sequences [
5,
61].
Recombination and Selection on DRB1
It was hardly evident that significant recombination signatures exist in our analyses of the
DRB1 exon 2 sequences of the Siberian ibex. Hence, we used all sequences in the downstream analyses. To evaluate selection pressure, we calculated the
ω ratio of non-synonymous to synonymous substitution rates for positions in the presumed ABSs, non-ABS codons and all codons for three Clades. The
ω ratio value for ABS codons in
C. sibirica DRB1 was greater than one. Our result thus indicates that variation at the ABS codons were generated and maintained by positive selection. Comparatively,
ω value for Clade II was nearly twice of that for clades I and II, indicating that the selection intensity on these clades was different (
Table 3). This was in line with the results of PAML and MEME analyses that provide evidence for positive selections at the single codon level (
Table 4). Likelihood ratio tests (LRT) showed that the M2a and M8 models provided significantly better fits to our data than models without selection (
Table 4). The M2a and M8 models identified 13 and 14 positively selected sites in Clade I, respectively, 13 from each model in Clade III, while only five from each model in Clade II (
Figure 2,
Table 4), with most of the sites occurring in presumed ABS codons. Finally, the MEME analysis showed six codons under positive selection in Clade I, only one in Clade III, and none in Clade II (
Figure 2).
MHC DRB1 Diversity and Divergence
Indirect indicators of the immunological fitness of populations, MHC genes are adaptive genetic markers useful in wild animal populations of concern for protection [
3,
63]. Many species, which went through severe bottlenecks, show very low levels of genetic diversity at the MHC, for example, mountain goats,
Oreamnos Americanus [
64] and Galà pagos penguin,
Spheniscus mendiculus [
65]. Our study on MHC class II
DRB1 exon 2 allowed, for the first time, a comparison of genetic variation among
C. sibirica populations that genetically highly diverged and underwent population reduction in size in Xinjiang, China [
18,
32]. Unexpectedly, we found higher allelic diversity of MHC class II
DRB1 loci in
C. sibirica compared to other congenerics. Likewise, despite a rinderpest epidemic-induced bottleneck, high allelic diversity for the
DRB3 gene was reported for the African buffalo,
Syncerus caffer [
66]. Although the 26 PFAs we detected in 43
C. sibirica individuals (
Figure 1,
Table A1) seem to be lower than the 22 PFAs among 25 samples reported for its domestic counterpart from six different breeds [
67], only seven PFAs were found among 132 individuals of
Capra pyrenaica with two subspecies,
C. p. hispanica and
C. p. victoria [
68]. This high number of alleles is mainly attributed to this species’ sexual segregation and preference for different habitats and diets for both genders [
69,
70]. Though the reduction in size in
C. sibirica may have an impact on the heterozygosity of the MHC
DRB locus, since more than half of the studied individuals (26 out of 43 samples) possess a single PFA (
Table A1).
Individuals of
C. sibirica clade II had low levels of diversity at the allelic, nucleotide, amino acid, and supertype levels relative to those of Clades I and III (
Table A1,
Table 1), indicating that the impact of population declines and/or environmental pathogenic pressures on the different geographic populations was different [
71]. We also cannot exclude the possibility that this difference is due to the low number of samples analyzed; thus, dense sampling is needed for further related studies.
Although we did not find a single allele shared by all three
C. sibirica clades, but found alleles common to two clades. For instance, the alleles
Casi-DRB1*16 (the most frequent one identified in individuals from east and middle Tianshan mountains, and Kunlun Mountains),
Casi-DRB1*17 and
Casi-DRB1*19 were shared by clades III and I, while alleles
Casi-DRB1*02 and
Casi-DRB1*05 were shared by clades III and II (Fig.1 and
Table A1). The radiation of these clades dates back around 6.75 million years ago [
32], indicating preservation of these alleles in
C. sibirica for such a long evolutionary time. Retaining alleles along the evolutionary time is not specific to our study. An allele was conserved in the genus
Meles for nearly 2 million years [
37]; two alleles were even shared by multiple species from different genera in mustelids [
72], diverged more than 11 million years ago [
73]; some MHC allele surprisingly preserved among different family [
5]. This is probably because
C. sibirica populations in Xinjiang, China subjected to same pathogenic burdens for a long evolutionary time, as polymorphism in MHC gene was pathogen driven [
74].
Meantime, we found more population- or clade-specific alleles (
Figure 1,
Table A1), implying high differentiation at this locus. The long radiation time (3.3–6.7 million years) of these populations or clades probably illuminates this phenomenon [
32]. MHC genes also showed genetic differentiation between populations in some mammal species [71, 75-76) The
FST values between Clade I and Clade II and Clade II and Clade III were greater than 0.25 (
Table 2), indicating large genetic differences. What is puzzling is that the
FST value between the Clade I and Clade III was negative. This is because that the differences within populations were greater than the differences between populations [
77]. The negative value of
FST generally interpreted as 0 [
78], which means clade I and III were not differentiated. However, after excluding the shared alleles between these clades in
FST estimation, it turns to be low differentiation (
Table 2). Overall, this in part supports the pattern of mitochondrial DNA [
32].
The number of PFAs identified for our studied individuals shows 1 or 2 loci of the
DRB gene (
Table A1), with a low frequency of 2 loci (5 out of 43 individuals), though. Many species in the genus
Capra, including Alpine ibex (
Capra ibex), Spanish ibex (
Capra pyrenaica), and Himalayan tahr (
Hemitragus jemlahicus), have only one locus of
DRB [
79]. It can be seen that the ancestor of
Capra species is supposed to possess a single locus at the MHC
DRB gene. Despite the small portion of individuals with two loci, they split into clades I and III, respectively. We assume that the one locus likely emerged from the other locus through gene duplication [
6]. Even if no evidence supports the occurrence of recombination events due likely to the shortness of our analytical sequences, intergenic recombination or gene conversion may explain this phenomenon as well [
80], and they might happen twice independently in these two clades. A population genome study on the MHC class II region will help us demonstrate this notion.
Evolution of the DRB1 Gene
Generally, MHC gene polymorphism were generated and retained by gene recombination [
79,
81], gene duplication [
6], balancing selection [
74,
82], and/or positive selection [
37,
74,
82]. In our study, we did not find any significant signature of recombination events, convincing us that gene recombination was not the reason for generation of MHC diversity. Nonetheless, we found more rare alleles than shared or high frequency alleles (
Table A1). This is suggestion of balanced polymorphisms include negative-frequency-dependent selection, where rare alleles are favored. Besides, we also found a notable excess of nonsynonymous over synonymous substitutions at ABSs, in different clades (
Table 3). In our phylogenetic relationship analysis,
DRB1 sequences of
C. sibirica were grouped with the sequences of its counterparts (
Figure 3), suggesting that some alleles are phylogenetically more closely related to the alleles of other species than to those of its own, a typical trans-species polymorphism [
83] which is reported for MHC genes of many species [
5,
37,
74,
82,
84]. All of these were the evidence supporting the presence of long-term balancing selection in the
C. sibirica, considering that the
Capra species were diverged approximately 6 million years ago [
32]. Moreover, the PAML CodeML and MEME analyses identified up to 12 positively selected sites, most of which coincide with the ABSs (
Figure 2 and
Table 4), suggested that the sequence variation of
DRB1 genes was driven by positive selection due to pathogenic burdens [
19,
20,
21]. In sum, our results together indicate that selection was the main force shaping and maintaining
DRB1 gene polymorphism in
C. sibirica.
It is worthy to mention that we as well as observed an exceeded nonsynonymous relative to synonymous substitutions at the none-ABSs in all clades of
C. sibirica (
Table 3), which is in line with the positive selection analyses that showed several positively selected sites out of ABSs (
Figure 2 and
Table 4). This is consistent with the results of Abduriyim et al. [
37] in a species of Canidae. Considering that all MHC studies deduce the ABS locations based on human MHC structure [
46], the actual location of ABSs in the MHC Class II DR β-chain of
C. sibirica, radiation from humans took place as far back as 95 million years [
85], may be different. This leaves an open question if ABSs of MHC molecule in all mammals were overlapped.