Mutation Trajectory of Omicron SARS-CoV-2 Virus

Tomokazu Konishi; Toa Takahashi

doi:10.20944/preprints202402.1283.v1

Submitted:

22 February 2024

Posted:

22 February 2024

You are already at the latest version

Part of the Following Collection

Preprints on COVID-19 and SARS-CoV-2

Abstract

Since 2019, the SARS-CoV-2 virus has caused a global pandemic, resulting in widespread infections and ongoing mutations. Analysing these mutations is essential for predicting future impacts. Unlike influenza mutations, SARS-CoV-2 mutations displayed distinct selective patterns that were concentrated in the spike protein and small ORFs. In contrast to the gradual accumulation seen in influenza mutations, SARS-CoV-2 mutations lead to the abrupt emergence of new variants and subsequent outbreaks. This phenomenon may be attributed to their targeted cellular substances; unlike the influenza virus, which has mutated to evade acquired immunity, SARS-CoV-2 appeared to mutate to target individuals who have not been previously infected. The Omicron variant, which emerged in late 2021, demonstrates significant mutations that set it apart from previous variants. The rapid mutation rate of SARS-CoV-2 has now reached a level comparable to 30 years of influenza variation. The most recent variant, JN.1, exhibits a discernible trajectory of change distinct from previous Omicron variants.

Keywords:

mutations

;

conservation

;

variants

;

host specificity

;

JN.1

;

principal component analysis

Subject:

Biology and Life Sciences - Virology

1. Introduction

COVID-19, caused by the SARS-CoV-2 virus, is an unprecedented infectious disease responsible for numerous cases and fatalities worldwide [1]. The virus has undergone successive mutations [2], and understanding these mutations is crucial for predicting its future behavior.

The purpose of the study is twofold. Firstly, to understand the characteristics of the Omicron variant and how it has evolved. Previously, the WHO has distinguished variants, particularly those with high transmissibility and widespread infections, by assigning them Greek letters starting with Alpha. However, the Omicron variant, which emerged at the end of 2021, has proven to be more transmissible than any previous variant, persisting and dominating the virus's brief history for about half of its duration.

Secondly, to compare the changes in the SARS-CoV-2 virus with those of influenza H1N1 virus. Influenza viruses, which are similarly highly transmissible and cause annual outbreaks, are often equated with COVID-19. However, is this a valid comparison? Or might these have entirely different natures? Those were investigated by using principal component analysis (PCA) [3].

PCA offers several compelling advantages over more prevalent clustering methods. Its primary strength lies in its ability to provide a comprehensive overview. While clusters primarily indicate the proximity between adjacent samples, PCA offers a holistic perspective, revealing the inter-group distances and the degree of divergence exhibited by new variants. Furthermore, PCA elucidates sequential differences between individual variants through PC_item, a level of detail often obscured in clustering analyses. Importantly, PCA results are highly reproducible across calculations, ensuring consistency in outcomes. Additionally, PCA facilitates the straightforward positioning of new samples along the identified axes, a task that necessitates recalibration in clustering methods for each new sample. This attribute proves advantageous for classification tasks.

2. Materials and Methods

2.1. PCA

Firstly, sequences were converted into a computable format using Boolean transformation. Subsequently, the matrix M was constructed by computing the difference between the mean sequence and individual samples. Singular value decomposition was then applied to M to extract principal components (PCs) for both samples and items. This methodology operates under the assumption that accumulated mutations determine the trajectory of sequence alterations. PCA identifies these trajectories and assigns orthogonal axes to them to ensure their independence. The deviation of each mutation from the mean sequence along a specific axis is calculated, resulting in the assignment of a PC_item. Consequently, the PC_sample of a given sample is determined by the PC_item values of the entire sample along that axis, representing its distance from the mean. Samples with similar mutations exhibit comparable values along a particular axis.

Mathematically, singular value decomposition decomposes the matrix data into direction and distance components, which are expressed as M=UDV^*. Here, M is a matrix where each row represents the aligned sequence data, and each column represents a single item—nucleotide base for nucleotide sequences or amino acid residues for amino acid sequences. U and V are unitary matrices representing the direction associated with the sample and item, respectively, and D is a diagonal matrix indicating distance. Conceptually, principal components (PCs) can be understood as a rotation of M by these unitary matrices, or as representing the distance D in the directions U and V; PC_sample = MV = UD, and PC_item= M^*U = VD. The computations were performed using R [4], employing the same PCA methodology and R-code as those outlined in a previous publication [3].

Mutations are categorized across multiple axes, with the significance of each axis intensifying as a substantial number of mutations accumulate. This significance is quantified by the extent to which the axis encompasses the mutations, termed as contribution. Usually, PC_sample is normalized by the number of items and PC_item by the number of samples to facilitate comparison with different sizes [13]. However, in this study, PC_sample was not normalized. Therefore, the numbers represent the changed amino acids as they are.

2.2. Data Curation

Sequence data were obtained from GISAID [5], with weekly systematic selection to minimize country overlap. Alignment was performed using the DECIPHER function [6]. For the PCA of the S proteins, the axes were found by sorting the collection dates and arranging them by month, sampling 30 for each month, to avoid bias due to time of year. For the nucleic acid sequences, axes created previously, when the Omicron variant had just emerged, were used [2]. PC was computed by fitting these axes to individual data points. Detailed calculations, including PC_sample and PC_item, are provided in the supplementary materials. Protein analyses were conducted using translated segments, with influenza virus data sourced from prior publications [7]. R codes for protein analyses are included in this article.

PCA does have a limitation: mutations not aligned with the axes are not visible, even though a higher PC may show many mutations, this may not be the case for all mutations. Therefore, it is essential to consider and examine mutations down to lower PCs. Supplements are available for this purpose.

2.3. 3D Structure of Proteins

For the 3D model of the spike protein, we selected 7DWZ [8] from the RCSB Protein Data Bank [9] due to its closest sequence resemblance and the absence of a His-tag at the C-terminus [10]. Using ATOM data from this structure, we extracted the α-carbon position of each amino acid and connected them with lines using the rgl function in R [11]. Amino acids corresponding to mutations were identified based on significant changes (>0.005) by selecting the axis with the greatest alteration for each from the PC_sample and determining the standard deviation (SD) from the PC_item values along the same axis. All mutations exceeding a large SD across all sequences (>0.036) were included. These thresholds were identified as inflection points in the normal probability plot.

3. Results

3.1. Mutations Observed from Nucleotide Sequence

How has SARS-CoV-2 mutated? Previous variants up to Omicron diverged into three lineages [12]. With the exception of the earliest ones, the trajectory was predominantly unidirectional. However, the emergence of the Omicron variant resulted from a significant mutation in a direction distinct from that of its predecessors (see Figure 1A). Previous mutations are nearly linearly represented in PC2, with Delta appearing positive and Lambda the most negative. In contrast, Omicron is situated at an almost perpendicular deviation from the trajectory, followed by variants delineated by PC1. A higher level of PC indicates a greater magnitude of mutation. The perpendicular deviation is attributed to the emergence of mutations completely unrelated to those observed in prior variants.

Examination of PC1 in chronological order revealed the sudden emergence of the BA.1 variant (Figure 1B). Ideally, such an event should result from a gradual accumulation of mutations; however, due to mutations occurring in unsequenced regions, intermediate stages were unrecorded [2]. Following the date indicated by the dashed vertical line (1st December 2021), the prevalence of preceding variants swiftly declined, giving way to the dominance of BA.1. Some of these variants feature insertions in the spike protein (Spike ins214EPE). Subsequently, the BA.1 outbreak also subsided, making room for other Omicron variants (Figure 1B). Notably, some of the latest JN.1 variants exhibit insertions (Spike ins16MPLF). Although deletions are commonly observed in the short history of SARS-CoV-2 variations, insertions are relatively rare. It is not known which mutations preceded JN.1, but this is closer to earlier variants than Omicrons (Figure 1A). Many of the sequence features exhibited by BA.1 and shared among Omicrons have been lost from JN.1.

A significant disparity existed between the mutational patterns observed for SARS-CoV-2 and influenza. Unlike influenza (Figure S1), mutations in SARS-CoV-2 do not accumulate gradually. Instead, they appear to manifest abruptly and substantially, as depicted in Figure 1, where variants such as BA.1, XBC.1.3, JN.1, and other Omicrons suddenly emerge. This trend is also consistent across the other axes (Figure S3). Additionally, in SARS-CoV-2, the contribution of the primary PC axis is minimal (Figure S2), suggesting that the individual mutations occur independently and lack coherence. In contrast, for the hemagglutinin of the influenza virus, PC1 shows a continuous increase (Figure S1), with a concentration of high contribution rates to higher PC observed (Figure S2).

3.2. Sites of Mutation in the Genome

The standard deviations observed across each nucleotide are shown in Figure 2. When a base is conserved, it registers to zero, and with increased variability, the deviations become larger. Notably, the largest open reading frame (ORF), 1ab, demonstrated relative stability but in some segments, it mutated frequently. Certain ORFs, such as ORF3a and particularly E, exhibited minimal mutability, and a portion of the N region displayed conservative behavior. Conversely, smaller ORFs, such as ORFs M, 6, 7b, and 8, manifested high dynamism. Of these, spike glycoprotein (S) undergoes the most frequent mutations, with nearly all bases exhibiting variation. This is in stark contrast to influenza, where all ORFs mutate at uniform rates [7]. It is noteworthy that some of these mutations are silent and do not alter the encoded amino acids.

3.3. Mutations Observed in Spike Protein

Alterations of the spike proteins are shown in Figure 3, Figures S4 and S5. As the variability in the mutations diminished by excluding silent mutations, the contribution became more focused on PC1 (Figure S2). Notably, BA.1 resembled the previous variants more closely than in the result of the nucleotide sequence. Presumably, this difference was due to when the axes were set; axes used for the nucleotide sequence were less sensitive to new Omicron variants. Similar to nucleotides, protein mutations manifest abruptly, resulting in discrete clusters, as exemplified by the XBC.1.3 (Figure S4). Remarkably, instances of stacking of mutations were infrequent; however, an exception was evident in PC8 (stacking from BA.2 to BQ.1.2, depicted in Figure S4B), indicating that BA.1 did not overlay these alterations. Previous variants also exhibited analogous shifts along this axis; however, mutations in subsequent variants diverged from this pattern. It is worth mentioning that JN.1 occupies a unique position in PCs 3 and 4 because the mutation responsible for its emergence is fundamentally distinct from preceding mutations (Figure 3B). This became visible because of the new axes, but it is noteworthy that the changes have been large enough to occupy such a high level of PCs and some lower PCs (Figure S5).

It is important to note that the range width of PCs is almost the same for influenza (Figure S1) and SARS-CoV-2, although influenza has a higher concentration of contribution to higher PCs. Hemagglutinin is shorter than the S protein by about half the length, so the proportion altered is higher in the former. However, a similar number of amino acid variations were observed. Influenza has accumulated for 35 years, while SARS-CoV-2 is only one-tenth. This indicated that the latter mutated rapidly.

3.4. Mutated Sites in Spike Proteins in the 3D Structure

The positions that have changed in the 3D structure are shown (Figure 4 and Figure S6). In SARS-CoV-2, most mutations are on the outer side of the spike, but some are on the inner side. In particular, as can be seen in 3D, some residues remain unchanged on the outside, although this may be due to the conservative nature of binding to ACE2 (see Figure S6, which can be rotated in and out). In contrast, in hemagglutinin, the inner residues that are thought to be important for subunit binding are not mutated. Many of the outer residues were altered before being replaced by the Pdm09 strain.

4. Discussion

The Omicron variant exhibited significant mutations at sites not previously mutated in earlier strains (Figure 1), resulting in a different direction of change. This likely facilitated infection of previously inaccessible people, leading to a surge in patients. JN.1 similarly emerged with substantial changes at previously unmutated sites (Figure 3), with many mutations inherited from BA.1 lost, indicating a new direction of mutation.

COVID-19 and influenza were quite different in their epidemics. Influenza saw a continuous evolution of a single variant from 1975 to 2009, with mutations accumulating over time. If mutations accumulate in this manner (similar to a random walk), each PC appears as a sin wave, PC1 as a half-wavelength, PC2 as a cycle, PC3 as 1.5 cycles, and so on; lower levels of PCs cover recurrent mutations and frequently replaced mutations. This scenario was observed in influenza [7]. PC1 monotonically increased (Figure S1A), and PC2 demonstrated mutations and reversions, resulting in a clockwise rotation of PC1 and PC2 when compared annually (S1B). Perhaps influenza infects nearly everyone (though only a small portion become symptomatic) due to its ability to bind to common sialic acid receptors in humans. Therefore, there is basically only one variant of influenza that is prevalent each year. Many people will be immune, so the variant cannot spread again the following year. The following year it will be prevalent elsewhere and mutate a little. After a few years, when enough mutations have accumulated, it returns and causes an epidemic again. Figure 4B and Figure S1 show the result of this accumulation after about 35 years.

In contrast, a certain variant of SARS-CoV-2 infects far fewer individuals [14,15]. For instance, the XBC.1.3 strain had a short-lived prevalence with a limited number of cases (Figure 1B). Consequently, infection rates fluctuate, and the number of cases per wave remains substantially smaller relative to the total population. These minor outbreaks characterize individual variants, leading PCA to detect them along one of its axes. Consequently, the contribution of each axis is markedly small (Figure S2). This is likely due to the spike protein of SARS-CoV-2 binding to ACE2, a protein presented by human cells; naturally, ACE2 exhibits polymorphism, affecting the efficiency of viral binding and resulting in multiple strains circulating simultaneously [16,17]. As each variant undergoes mutations independently, these mutations do not necessarily accumulate. New variants appear suddenly, irrespective of previous variants (Figure 1, Figure 2 and Figure 3, S). Consequently, many individuals may lack immunity to a particular mutation, increasing the likelihood of recurrent appearances of the mutation. Moreover, ample room for further mutations remains evident (Figure 4), suggesting a continued potential for this protein to evade immunity.

SARS-Cov-2 mutates so quickly that primers for PCR to detect it are sometimes disabled [10]. Additionally, mRNA vaccines, which were initially effective, quickly lost their efficacy [14]. The rapid and continuous mutations observed especially in the S protein (Figure 2A) show that it is impractical to target the protein that mutates at such a fast pace for detection or immune response purposes. There are health concerns associated with repeated vaccinations [18-20], and this has been confirmed by an increase in IgG4 [21-23]. Alternatively, although the effectiveness of suppressing infection may be lower, a vaccine targeting ORF3a, E, and certain regions of N may prevent severe illness, with longer-lasting efficacy. All ORFs of influenza mutate at the same rate [7], so every viral component could be a target for immunity. Of course, similar effects could be achieved using attenuated viruses derived from animal variants, which could be produced without sophisticated technology [24].

The differences in mutations between influenza and SARS-CoV-2, such as the presence of conserved ORFs and the lack of mutation accumulation, suggest distinct selective pressures governing their evolution. Influenza faces selection pressure aimed at evading acquired immunity, allowing only sufficiently mutated variants to drive subsequent outbreaks. Consequently, mutations accumulate. In contrast, humans had no immunity against SARS-CoV-2; only those who had received the mRNA vaccine were immune to the S protein. Thus, the S protein underwent concentrated mutations. Above all, the change in this protein allowed people with a different ACE2 to be infected, which would have created a selection pressure. So several variants were prevalent at the same time, and new mutations occurred independently of each other. This is probably why the mutations have not accumulated. The protein still has a lot of room for mutation (Figure 4). Likely, epidemics will continue, mainly among people who have not been previously infected. Until a convergence of the epidemic is confirmed, public health measures should not be abandoned.

The JN.1 variant, which emerged relatively recently, demonstrates distinct characteristics compared to earlier iterations of Omicron. Notably, it harbors a mutation not found in either the historical record of Omicron or its precursor variants (Figure 3B). Moreover, JN.1 abolishes many mutations typical of BA.1 and other Omicron variants, showing greater similarity to earlier variants (see Figure 1A). Previous observations have indicated low virulence in early Omicron variants [14], likely due to these mutations. Indeed, an increase in hospitalizations has been observed since the emergence of JN.1 [25], possibly linked to the abolition of these mutations. Some report no significant difference in symptoms [26], while others report significantly higher infectivity [27]. This variant appears to have surpassed previous Omicron outbreaks, following a trajectory similar to that of BA.1 (see Figure 1B). If derivatives of this variant continue to replace Omicron and trigger outbreaks, it may be prudent to designate it as Pi for precautionary purposes.

Supplementary Materials

The following supporting information can be downloaded at the website of this paper posted on Preprints.org. Figure S1: Changes in haemagglutinin; Figure S2: Contribution of PC; Figure S3: PCA of nucleotide sequences; Figure S4: PCA of S protein; Figure S5: PCA of amino acid sequences, S protein; Figure S6: Proteins in freely movable 3D displays; Data: PC of nucleotides, PC of S protein.

Author Contributions

Conceptualization, calculation, and writing, TK; data curation and validation, TT.

Funding

This study was supported by the Akita Prefectural University Student-led Research Program (TT).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data are available from GISAID. The results obtained are available in Supplement. The R code is also available in the cited papers and from Github https://github.com/TomokazuKonishi/direct-PCA-for-sequences.

Conflicts of Interest

The authors declare no conflicts of interest.

References

WHO. Coronavirus disease (COVID-19) Pandemic. Available online: https://www.who.int/emergencies/diseases/novel-coronavirus-2019 (accessed on 23 May).
Konishi, T. Mutations in SARS-CoV-2 are on the increase against the acquired immunity. PLoS ONE 2022, 17, e0271305. [Google Scholar] [CrossRef] [PubMed]
Konishi, T.; Matsukuma, S.; Fuji, H.; Nakamura, D.; Satou, N.; Okano, K. Principal Component Analysis applied directly to Sequence Matrix. Scientific Reports 2019, 9, 19297. [Google Scholar] [CrossRef] [PubMed]
R Core Team. R: A language and environment for statistical computing; R Foundation for Statistical Computing: Vienna, Austria, 2020; Available online: https://cran.r-project.org/ (accessed on 19 February 2024).
Elbe, S.; Buckland-Merrett, G. Data, disease and diplomacy: GISAID's innovative contribution to global health. Glob Chall 2017, 1, 33–46. [Google Scholar] [CrossRef] [PubMed]
Wright, E.S. DECIPHER: harnessing local sequence context to improve protein multiple sequence alignment. BMC Bioinformatics 2015, 16, 322. [Google Scholar] [CrossRef] [PubMed]
Konishi, T. Re-evaluation of the evolution of influenza H1 viruses using direct PCA. Scientific Reports 2019, 9, 19287. [Google Scholar] [CrossRef] [PubMed]
Yan, R.H., Zhang, Y.Y., Li, Y.N., Ye, F.F., Guo, Y.Y., Xia, L., Zhong, X.Y., Chi, X.M., Zhou, Q. S protein of SARS-CoV-2 in the active conformation. Available online: https://doi.org/10.2210/pdb7DWZ/pdb (accessed on 19 February 2024).
Berman, H.M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T.N.; Weissig, H.; Shindyalov, I.N.; Bourne, P.E. The Protein Data Bank. Nucleic Acids Research 2000, 28, 235–242. [Google Scholar] [CrossRef] [PubMed]
Yan, R., Zhang, Y., Li, Y., Ye, F., Guo, Y., Xia, L., Zhong, X., Chi, X., Zhou, Q. Structural basis for the different states of the spike protein of SARS-CoV-2 in complex with ACE2. Cell Res 2021, 31, 717–719. [CrossRef] [PubMed]
Murdoch, D.; Adler, D.; Nenadic, O.; Urbanek, S.; Chen, M.; Gebhardt, A.; Bolker, B.; Csardi, G.; Strzelecki, A.; Senger, A.; et al. rgl: 3D Visualization Using OpenGL. Available online: https://cran.r-project.org/web/packages/rgl/index.html (accessed on 19 February 2024).
Konishi, T. Continuous mutation of SARS-CoV-2 during migration via three routes. PeerJ 2022, 10, e12681. [Google Scholar] [CrossRef]
Konishi, T. Principal component analysis for designed experiments. BMC Bioinformatics 2015, 16 (Suppl. 18), S7. [Google Scholar] [CrossRef] [PubMed]
Konishi, T. A Comparative Analysis of COVID-19 Response Measures and Their Impact on Mortality Rate. COVID 2024, 4, 130–150. [Google Scholar] [CrossRef]
Konishi, T. COVID-19 Epidemics Monitored Through the Logarithmic Growth Rate and SIR Model. J Clin Immunol Microbiol 2022, 3, 1–45. [Google Scholar] [CrossRef]
Suryamohan, K.; Diwanji, D.; Stawiski, E.W.; Gupta, R.; Miersch, S.; Liu, J.; Chen, C.; Jiang, Y.-P.; Fellouse, F.A.; Sathirapongsasuti, J.F.; et al. Human ACE2 receptor polymorphisms and altered susceptibility to SARS-CoV-2. Communications Biology 2021, 4, 475. [Google Scholar] [CrossRef] [PubMed]
Sano, E.; Deguchi, S.; Sakamoto, A.; Mimura, N.; Hirabayashi, A.; Muramoto, Y.; Noda, T.; Yamamoto, T.; Takayama, K. Modeling SARS-CoV-2 infection and its individual differences with ACE2-expressing human iPS cells. iScience 2021, 24, 102428. [Google Scholar] [CrossRef]
Gao, F.X.; Wu, R.X.; Shen, M.Y.; Huang, J.J.; Li, T.T.; Hu, C.; Luo, F.Y.; Song, S.Y.; Mu, S.; Hao, Y.N.; et al. Extended SARS-CoV-2 RBD booster vaccination induces humoral and cellular immune tolerance in mice. iScience 2022, 25, 105479. [Google Scholar] [CrossRef] [PubMed]
Gruell, H.; Vanshylla, K.; Tober-Lau, P.; Hillus, D.; Schommers, P.; Lehmann, C.; Kurth, F.; Sander, L.E.; Klein, F. mRNA booster immunization elicits potent neutralizing serum activity against the SARS-CoV-2 Omicron variant. Nature Medicine 2022. [Google Scholar] [CrossRef] [PubMed]
Harvey, W.T.; Carabelli, A.M.; Jackson, B.; Gupta, R.K.; Thomson, E.C.; Harrison, E.M.; Ludden, C.; Reeve, R.; Rambaut, A.; Peacock, S.J.; et al. SARS-CoV-2 variants, spike mutations and immune escape. Nature Reviews Microbiology 2021. [Google Scholar] [CrossRef] [PubMed]
Irrgang, P.; Gerling, J.; Kocher, K.; Lapuente, D.; Steininger, P.; Habenicht, K.; Wytopil, M.; Beileke, S.; Schäfer, S.; Zhong, J.; et al. Class switch toward noninflammatory, spike-specific IgG4 antibodies after repeated SARS-CoV-2 mRNA vaccination. Science Immunology 2023, 8, eade2798. [Google Scholar] [CrossRef] [PubMed]
Kiszel, P.; Sík, P.; Miklós, J.; Kajdácsi, E.; Sinkovits, G.; Cervenak, L.; Prohászka, Z. Class switch towards spike protein-specific IgG4 antibodies after SARS-CoV-2 mRNA vaccination depends on prior infection history. Scientific Reports 2023, 13, 13166. [Google Scholar] [CrossRef] [PubMed]
Uversky, V.N.; Redwan, E.M.; Makis, W.; Rubio-Casillas, A. IgG4 Antibodies Induced by Repeated Vaccination May Generate Immune Tolerance to the SARS-CoV-2 Spike Protein. Vaccines (Basel) 2023, 11. [Google Scholar] [CrossRef] [PubMed]
Konishi, T. SARS-CoV-2 mutations among minks show reduced lethality and infectivity to humans. PLOS ONE 2021, 16, e0247626. [Google Scholar] [CrossRef] [PubMed]
National Institute of Infectious Diseases, Japan. New Coronavirus Infectious Disease Surveillance News/Weekly Report: Understand the status of outbreak trends. Available online: https://www.niid.go.jp/niid/ja/2019-ncov/2484-idsc/12015-covid19-surveillance-report.html (accessed on 19 February 2024).
Kathy, K. 3 Things to Know About JN.1, the New Coronavirus Strain. Yale Medicine, News. Available online: https://www.yalemedicine.org/news/jn1-coronavirus-variant-covid (accessed on 22 February 2024).
Kaku, Y.; Okumura, K.; Padilla-Blanco, M.; Kosugi, Y.; Uriu, K.; Hinay, A.A., Jr.; Chen, L.; Plianchaisuk, A.; Kobiyama, K.; Ishii, K.J.; et al. Virological characteristics of the SARS-CoV-2 JN.1 variant. The Lancet Infectious Diseases 2024, 24, e82. [Google Scholar] [CrossRef] [PubMed]

Figure 1. Mutations observed from nucleotide sequence. The axes were found at the end of 2021 when BA.1 emerged. (A) PC1 and PC2. BA.1 and Omicron were largely separated by mutations not seen in the previous variants, forming PC1. Some features of BA.1 were inherited by subsequent Omicron variants. However, XBC1.3 and JN.1 have reverted from the shared mutations and are closer to the former variants. (B) Time course of PC1. BA.1 appeared after 1st December 2021, indicated by the dotted line, and the former variants were almost absent. This was then replaced by a population of Omicron variants, which eventually moved to JN.1.

Figure 2. The standard deviation of how much each base has mutated since the beginning of the epidemic: (A) Large ORFs. (B) Small ORFs. In the spike protein, most bases are mutated, especially at the N-terminus. Some smaller ORFs are also highly altered. However, 3a and E were conservative.

Figure 3. Mutations were observed in the amino acid sequence of the S protein. (A) PC1 and 2, note the position of BA.1 (B) PC 3 and 4, see that JN.1 differs from others in the directions presented here.

Figure 4. The site of mutation is represented on the 3D model of the protein: (A) S protein of SARS-CoV-2, (B) haemagglutinin of influenza H1N1. Each protein is made up of the association of three identical subunits, shown in different colours; mutations are shown in blue on one subunit. For more details, see Figure S6, which can be moved.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.