Materials and Methods
This study was approved by the ethics committee of the Directorate General of Health, Duhok, Iraq (application number: 16727). After the cases completed routine SARS-CoV-2 diagnostic testing at the COVID-19 Center and Duhok Burn Hospital (COVID-19 Hospital), the SARS-CoV-2-positive samples were subjected to next-generation sequencing.
Clinical Sample and Processing
Nasopharyngeal swabs were collected from 160 patients with respiratory tract infection in Duhok and received at the COVID-19 Center for Research and Diagnosis, University of Duhok, in December 2021, early in the fifth wave. Viral nucleic acids were extracted with a QIAamp RNA extraction kit according to the manufacturer's instructions. Forty samples were selected for RNA extraction and quantitative detection using the QIAprep& Viral RNA UM Kit for real-time PCR-based virus identification. As required by the sequencing laboratory, the selected samples had Ct values below 20. Following RNA extraction from the viral transport medium using a QIAamp Viral RNA Mini Kit, the samples were shipped on dry ice to the USA for whole-genome sequencing.
Genomic Sequencing
After the samples arrived in the USA, they were tested again for target identification and RNA integrity. Reverse transcription and cDNA synthesis were performed using a Qiagen (Germany) RT kit and nonamer primers. The individual samples were indexed and tagged using an Illumina sample preparation kit. Next-generation sequencing was carried out on Illumina MiSeq instruments (Illumina, San Diego, CA, USA) following the manufacturer's instructions (assembly method: iVar 1.3.1; coverage: 2423.56x). The short reads were aligned to the reference genome (NC_045512.2) and assembled using BWA-MEM. The whole-genome sequences were submitted to the GISAID database and assigned the following accession numbers: EPI_ISL_12604438, EPI_ISL_12604442, EPI_ISL_12604444, EPI_ISL_12604448, EPI_ISL_12604451, EPI_ISL_12604457, EPI_ISL_12604460, EPI_ISL_12604463, EPI_ISL_12604471, EPI_ISL_12604476, EPI_ISL_12604477, EPI_ISL_12604478, EPI_ISL_12604481, EPI_ISL_12604482, EPI_ISL_12604483, EPI_ISL_12604487, EPI_ISL_12604488, EPI_ISL_12604489, EPI_ISL_12604490, EPI_ISL_12604495, EPI_ISL_12604496, EPI_ISL_12604501, EPI_ISL_12604502, EPI_ISL_12604503, EPI_ISL_12604507, EPI_ISL_12604508, EPI_ISL_12604509, EPI_ISL_12604510, EPI_ISL_12604514, EPI_ISL_12604516, EPI_ISL_12604517, EPI_ISL_12604521, EPI_ISL_12604526, EPI_ISL_12604527, EPI_ISL_12604528, EPI_ISL_12604532, EPI_ISL_12604845, EPI_ISL_12604846, EPI_ISL_12604847, and EPI_ISL_12604848.
The genome sizes were 29,488, 29,550, 29,559, 29,562, and 29,565 bp, compared with the Wuhan reference sequence (NC_045512.2).
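As a quick check on the reported sizes, each assembly length can be compared with the reference length. The sketch below is illustrative only: it assumes a reference length of 29,903 bp for NC_045512.2 (a value not stated in the text), and `length_summary` is a hypothetical helper name.

```python
# Length of the reference NC_045512.2 in bp. This value is an assumption
# that is not stated in the text above.
REFERENCE_LENGTH = 29_903

# Genome sizes reported in this study, in bp.
GENOME_SIZES = [29_488, 29_550, 29_559, 29_562, 29_565]


def length_summary(size, reference=REFERENCE_LENGTH):
    """Return (bp missing relative to the reference, percent of reference length)."""
    missing = reference - size
    percent = round(100 * size / reference, 2)
    return missing, percent


if __name__ == "__main__":
    for size in GENOME_SIZES:
        missing, percent = length_summary(size)
        print(f"{size} bp: {missing} bp shorter than the reference ({percent}% of its length)")
```

Under this assumption, all five assemblies are a few hundred bases shorter than the reference, which is what one would expect when deletions and unsequenced genome ends are present.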
Results and Discussions
In early January 2020, a novel coronavirus was identified as the cause of many pneumonia cases of unknown origin that had been reported from China in late December 2019 (16). The virus was later named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and identified as the cause of coronavirus disease 2019 (COVID-19). The virus spread worldwide despite significant efforts to contain the disease in China, and the World Health Organization (WHO) classified COVID-19 as a global pandemic in March 2020 (17). All CoVs have single-stranded RNA genomes of positive polarity, meaning that the base sequence of the genomic RNA has the same orientation as the later messenger RNA (mRNA). CoV genomes are the largest known RNA genomes, measuring 26.4–31.7 kilobases (18).
In April 2022, we uploaded the genome sequences of our isolates, collected between the fourth and fifth waves, to the GISAID database for genome analysis. According to the WHO coronavirus dashboard of reported cases in Iraq, the samples came from the latest (fifth) wave and belonged to the Omicron variant of concern (Pango lineage BA.2), which is identified in the country for the first time in this study and had not been reported in previous studies.
Figure 4.
Daily confirmed cases of SARS-CoV-2 in Iraq between 2021 and 2022, from the WHO Coronavirus dashboard.
To find mutations, sequences from the SARS-CoV-2 isolates were compared with the Wuhan reference sequence (NC_045512.2). The positions of the mutations were then determined using Nextclade (version 2.7.1). Analysis of the Omicron VOC sequences in this study revealed that the S-gene had the most changes, followed by the ORF1ab, N, M, ORF6, ORF3a, ORF9b, and E genes. ORF3a, ORF6, ORF9b, and the E-gene had the fewest mutations.
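The link between a nucleotide position and the amino-acid number reported for a mutation is simple arithmetic once the gene start is known. The following is a minimal sketch, not a description of how Nextclade works internally: it uses the gene coordinates listed in Table 1, treats each gene as one contiguous reading frame beginning at its listed start, and uses the hypothetical helper name `codons_at`.

```python
# Gene coordinates (1-based, inclusive) as listed in Table 1.
GENES = {
    "ORF1a": (266, 13468),
    "ORF1b": (13468, 21555),
    "S": (21563, 25384),
    "ORF3a": (25393, 26220),
    "M": (26523, 27191),
    "ORF6": (27202, 27387),
    "ORF9b": (28284, 28577),
    "N": (28274, 29533),
}


def codons_at(position):
    """Return {gene: codon number} for every listed gene containing `position`.

    Assumes each gene is a single reading frame that begins at its listed
    start coordinate; overlapping genes yield more than one entry.
    """
    hits = {}
    for gene, (start, end) in GENES.items():
        if start <= position <= end:
            hits[gene] = (position - start) // 3 + 1
    return hits


if __name__ == "__main__":
    for pos in (444, 23403, 28311):
        print(pos, codons_at(pos))
```

Position 23403 maps to codon 614 of S (D614G), and position 28311 falls in both ORF9b (codon 10) and N (codon 13), which is why it appears under both genes in Table 1.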
The spike glycoprotein is the primary target for both therapy and diagnosis, and it is the essential protein that defines viral host affinity and pathogenesis. The structural protein encoded by the S-gene, which carried the most mutations, binds host cell receptors and determines the host range (19). The S1 subunit is the most immunodominant viral protein: over 90% of the neutralizing antibodies in COVID-19 convalescent plasma are anti-RBD (20). There are a total of 38 non-synonymous mutations in this gene, including T76I, T95I, G142D, Y145D, L212I, V213G, G339D, R346K, S371L, S373P, S375F, T376A, R408S, K417N, N440K, G446S, S477N, T478K, Q493R, G496S, Q498R, N501Y, Y505H, T547K, D614G, H655Y, N679K, P681H, N764K, D796Y, N856K, Q954H, N969K, and L981F. The deletions del211/211, del3674/3676, del69/70, del142/144, del22/23, del68/70, del142/145, and del211/212 are also present (Table 1 lists del3674/3676 under ORF1a and del22/23 under ORF6). According to reports, the mutation potential of this gene is higher than that of other genomic regions (21). The global prevalence of the S protein D614G variant increased progressively over time, and it was present in about 74% of all known variants in the GISAID SARS-CoV-2 database as of June 25, 2020. D614G, one of the earliest and most common SARS-CoV-2 mutations, was first identified in China on January 24, 2020; it replaces aspartate (D), which has a polar, negatively charged side chain, with glycine (G), which has a nonpolar side chain, at position 614 (22). D614G has been found to enhance the infectivity, transmission rate, and efficiency of cellular entry of SARS-CoV-2 across a wide spectrum of human cell types as a result of positive natural selection (23,24,25).
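Whether a codon change such as GAT > GGT alters the encoded amino acid can be checked against the standard genetic code. The sketch below is an illustration under stated assumptions, not the pipeline used in this study: it builds the standard codon table from scratch, and `classify_substitution` is a hypothetical helper name.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon, alt_codon):
    """Translate both codons and label the change as synonymous or not."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return ref_aa, alt_aa, kind


if __name__ == "__main__":
    # Codon changes taken from Table 1: D614G, N501Y, H655Y.
    for ref, alt in (("GAT", "GGT"), ("AAT", "TAT"), ("CAT", "TAT")):
        print(ref, ">", alt, classify_substitution(ref, alt))
```

For D614G, GAT translates to aspartate (D) and GGT to glycine (G), so the change is classified as non-synonymous, in agreement with Table 1.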
The D614G mutation has been demonstrated to increase the density of S-proteins on the viral surface, increasing infectivity (26). It is present in the majority of circulating VOCs, such as Alpha (B.1.1.7), Beta (B.1.351), Delta (B.1.617.2), Gamma (P.1), and the more recent Delta plus (AY lineage) and Omicron (B.1.1.529) variants (27), and it is associated with elevated rates of transmission, infection, and viral evasion from neutralizing antibodies (28). In this study, some of the other frequently occurring RBD changes that improve ACE2 binding were found, including N501Y, S477N, and E484K; the same substitutions are seen in the majority of VOCs and are associated with greater transmissibility (29). Among the changes in the present VOC, the T478K, Q498R, and Q493K mutations have been shown to increase the electrostatic potential, boosting the RBD-ACE2 binding affinity (29). Immune escape has also been linked to the E484K change (30). A variety of VOCs in previous studies, including B.1.617.2 (E484K/E484Q), B.1.351 (E484K), P.1, and B.1.1.529 (E484A), have been shown to carry E484 replacements (31).
Interestingly, during the evolution of SARS-CoV-2, the S1-NTD has also accumulated many mutations; in the present study these include the deletions del69/70 and del142/144. Similarly, a previous study reported that the frequently deleted regions in the NTD are those at positions 69–70, 141–144, 146, 210, and 243–244, and that most NTD mutations change antigenicity or eliminate epitopes, enabling immune escape (32).
Because S2 is conserved across CoVs and has a low mutation rate, most mutations in it are likely to affect virus entry. It is also less antigenic than S1, perhaps as a result of its substantial N-linked glycosylation, and is consequently not subject to as much selective pressure (33). The BA.1, BA.1.1, and BA.1.18 lineages (Omicron variant) in the present study unexpectedly display several S2 changes, including D796Y, N856K, Q954H, N969K, and L981F. Similarly, a previous study reported that the Omicron variant B.1.1.529 carries the S2 substitutions D796Y, N856K, Q954H, N969K, and L981F (34). The sustained global dissemination of B.1.1.529 suggests that these mutations confer a benefit. However,
Table 1.
The entire genome sequences (40 sequences) were aligned to the SARS-CoV-2 reference genome (NC_045512.2) using Nextclade (version 2.8.0).
| Mutation | Position | Nucleotide change | Code | Amino acid change | Type of mutation |
|---|---|---|---|---|---|
| ORF1a (266...13468) | | | | | |
| | 444 | GTT > GCT | V60A | Valine>Alanine | Non-synonymous SNV |
| | 593 | CAT > TAT | H110Y | Histidine>Tyrosine | Non-synonymous SNV |
| | 670 | AGT > AGG | S135R | Serine>Arginine | Non-synonymous SNV |
| | 1415 | CTT > TTT | L384F | Leucine>Phenylalanine | Non-synonymous SNV |
| | 2790 | ACT > ATT | T842I | Threonine>Isoleucine | Non-synonymous SNV |
| | 2832 | AAG > AGG | K856R | Lysine>Arginine | Non-synonymous SNV |
| | 2883 | TGT > TAT | C873Y | Cysteine>Tyrosine | Non-synonymous SNV |
| | 3896 | GTT > TTT | V1211F | Valine>Phenylalanine | Non-synonymous SNV |
| | 4184 | GGT > AGT | G1307S | Glycine>Serine | Non-synonymous SNV |
| | 4893 | ACA > ATA | T1543I | Threonine>Isoleucine | Non-synonymous SNV |
| | 5007 | ACG > ATG | T1581M | Threonine>Methionine | Non-synonymous SNV |
| | 510-518 | ATG > -TG | del82/84 | del82/84 | Non-frame shift deletion |
| | 519 | ATG > -TG | M85V | Methionine>Valine | Non-synonymous SNV |
| | 6176 | GAT > AAT | D1971N | Aspartic acid>Asparagine | Non-synonymous SNV |
| | 6513-6515 | | del2083/2083 | del2083/2083 | Non-synonymous deletion |
| | 6516 | TTA > -TA | L2084I | Leucine>Isoleucine | Non-synonymous SNV |
| | 7036 | TTA > TTT | L2257F | Leucine>Phenylalanine | Non-synonymous SNV |
| | 7488 | ACT > ATT | T2408I | Threonine>Isoleucine | Non-synonymous SNV |
| | 8393 | GCT > ACT | A2710T | Alanine>Threonine | Non-synonymous SNV |
| | 9344 | CTT > TTT | L3027F | Leucine>Phenylalanine | Non-synonymous SNV |
| | 9474 | GCT > GTT | A3070V | Alanine>Valine | Non-synonymous SNV |
| | 9534 | ACT > ATT | T3090I | Threonine>Isoleucine | Non-synonymous SNV |
| | 9866 | CTT > TTT | L32201I | Leucine>Isoleucine | Non-synonymous SNV |
| | 10029 | ACC > ATC | T3255I | Threonine>Isoleucine | Non-synonymous SNV |
| | 10323 | AAG > AGG | K3353R | Lysine>Arginine | Non-synonymous SNV |
| | 10449 | CCC > CAC | P3395H | Proline>Histidine | Non-synonymous SNV |
| | 11405 | GTC > TTC | V3714F | Valine>Phenylalanine | Non-synonymous SNV |
| | 11285-11293 | | del3674/3676 | del3674/3676 | Non-frame shift deletion |
| | 11537 | ATT > GTT | I3758V | Isoleucine>Valine | Non-synonymous SNV |
| | 12534 | ACT > ATT | T409I | Threonine>Isoleucine | Non-synonymous SNV |
| ORF1b (13468...21555) | | | | | |
| | 13756 | ATA > GTA | I97V | Isoleucine>Valine | Non-synonymous SNV |
| | 14408 | CCT > CTT | P314L | Proline>Leucine | Non-synonymous SNV |
| | 14821 | CCA > TCA | P452S | Proline>Serine | Non-synonymous SNV |
| | 15641 | AAT > AGT | N725S | Asparagine>Serine | Non-synonymous SNV |
| | 15982 | GTA > ATA | V839I | Valine>Isoleucine | Non-synonymous SNV |
| | 16744 | GGT > AGT | G1093S | Glycine>Serine | Non-synonymous SNV |
| | 17410 | GGT > TGT | R1315C | Arginine>Cysteine | Non-synonymous SNV |
| | 18163 | ATA > GTA | I1566V | Isoleucine>Valine | Non-synonymous SNV |
| | 18433 | GAT > CAT | D165H | Aspartic acid>Histidine | Non-synonymous SNV |
| | 19999 | GTT > TTT | V2178F | Valine>Phenylalanine | Non-synonymous SNV |
| | 20003 | GAT > GGT | P2179G | Proline>Glycine | Non-synonymous SNV |
| S (21563...25384) | | | | | |
| | 21765-21770 | TACATG > - - - | del69/70 | del69/70 | Non-synonymous deletion |
| | 21789 | ACT > ATT | T76I | Threonine>Isoleucine | Non-synonymous SNV |
| | 21846 | ACT > ATT | T95I | Threonine>Isoleucine | Non-synonymous SNV |
| | 21987 | GGT > GAT | G142D | Glycine>Aspartic acid | Non-synonymous SNV |
| | 21987-21995 | | del142/144 | del142/144 | Non-frame shift deletion |
| | 21996 | TAC > -AC | Y145D | Tyrosine>Aspartic acid | Non-synonymous SNV |
| | 22194-22196 | AAT > A-- | del211/211 | del211/211 | Non-synonymous deletion |
| | 22197 | TTA > -TA | L212I | Leucine>Isoleucine | Non-synonymous SNV |
| | 22200 | GTG > GGG | V213G | Valine>Glycine | Non-synonymous SNV |
| | 22578 | GCT > GAT | G339D | Glycine>Aspartic acid | Non-synonymous SNV |
| | 22599 | AGA > AAA | R346K | Arginine>Lysine | Non-synonymous SNV |
| | 22673 | T > C | S371L | Serine>Leucine | Non-synonymous SNV |
| | 22674 | C > T | S373P | Serine>Proline | Non-synonymous SNV |
| | 22686 | TCC > TTC | S375F | Serine>Phenylalanine | Non-synonymous SNV |
| | 22688 | ACT > GCT | T376A | Threonine>Alanine | Non-synonymous SNV |
| | 22786 | AGA > AGC | R408S | Arginine>Serine | Non-synonymous SNV |
| | 22813 | AAG > AAT | K417N | Lysine>Asparagine | Non-synonymous SNV |
| | 22882 | AAT > AAG | N440K | Asparagine>Lysine | Non-synonymous SNV |
| | 22898 | GGT > AGT | G446S | Glycine>Serine | Non-synonymous SNV |
| | 23013 | GAA > GCA | E484A | Glutamic acid>Alanine | Non-synonymous SNV |
| | 22992 | AGC > AAC | S477N | Serine>Asparagine | Non-synonymous SNV |
| | 22995 | ACA > AAA | T478K | Threonine>Lysine | Non-synonymous SNV |
| | 23040 | CAA > CGA | Q493R | Glutamine>Arginine | Non-synonymous SNV |
| | 23048 | G > A | G496S | Glycine>Serine | Non-synonymous SNV |
| | 23055 | A > G | Q498R | Glutamine>Arginine | Non-synonymous SNV |
| | 23063 | AAT > TAT | N501Y | Asparagine>Tyrosine | Non-synonymous SNV |
| | 23075 | TAC > CAC | Y505H | Tyrosine>Histidine | Non-synonymous SNV |
| | 23202 | ACA > AAA | T547K | Threonine>Lysine | Non-synonymous SNV |
| | 23403 | GAT > GGT | D614G | Aspartic acid>Glycine | Non-synonymous SNV |
| | 23525 | CAT > TAT | H655Y | Histidine>Tyrosine | Non-synonymous SNV |
| | 23599 | T > G | N679K | Asparagine>Lysine | Non-synonymous SNV |
| | 23604 | CCT > CAT | P681H | Proline>Histidine | Non-synonymous SNV |
| | 23854 | AAC > AAA | N764K | Asparagine>Lysine | Non-synonymous SNV |
| | 23948 | GAT > TAT | D796Y | Aspartic acid>Tyrosine | Non-synonymous SNV |
| | 24130 | ACC > AAA | N856K | Asparagine>Lysine | Non-synonymous SNV |
| | 24424 | CAA > CAT | Q954H | Glutamine>Histidine | Non-synonymous SNV |
| | 24469 | AAT > AAA | N969K | Asparagine>Lysine | Non-synonymous SNV |
| | 24503 | CCT > TTT | L981F | Leucine>Phenylalanine | Non-synonymous SNV |
| ORF3a (25393...26220) | | | | | |
| | 25471 | GAT > TAT | D27Y | Aspartic acid>Tyrosine | Non-synonymous SNV |
| | 26060 | ACT > ATT | T223I | Threonine>Isoleucine | Non-synonymous SNV |
| M (26523...27191) | | | | | |
| | 26530 | GAT > GGT | D3G | Aspartic acid>Glycine | Non-synonymous SNV |
| | 26577 | CAA > GAA | Q19E | Glutamine>Glutamic acid | Non-synonymous SNV |
| | 26709 | GCT > ACT | A63T | Alanine>Threonine | Non-synonymous SNV |
| ORF6 (27202...27387) | | | | | |
| | 27269 | AAA > -AA | K23* | K23* | Non-synonymous SNV |
| | 27266-27268 | TTA > - - - | del22/23 | del22/23 | Non-frame shift deletion |
| ORF9b (28284...28577) | | | | | |
| | 28311 | CCC > TCC | P10S | Proline>Serine | Non-synonymous SNV |
| N (28274...29533) | | | | | |
| | 28881 | AGG > AAA | R203K | Arginine>Lysine | Non-synonymous SNV |
| | 28882 | AGG > AAA | R203K | Arginine>Lysine | Non-synonymous SNV |
| | 28883 | GGA > ACG | G204R | Glycine>Arginine | Non-synonymous SNV |
| | 28311 | CCC > CTC | P13L | Proline>Leucine | Non-synonymous SNV |
| | 28725 | CCT > CTT | P151L | Proline>Leucine | Non-synonymous SNV |
| | 29000 | GGC > AGC | G243S | Glycine>Serine | Non-synonymous SNV |
| | 29005 | CAA > CAC | Q244H | Glutamine>Histidine | Non-synonymous SNV |
| | 29510 | AGT > CGT | S413R | Serine>Arginine | Non-synonymous SNV |
it is not yet known how these alterations may affect the pathogenicity of the virus and the polyclonal mAb response.
The ORF1a/b gene is a key target for SARS-CoV-2 nucleic acid assays. It encodes the non-structural proteins (nsp1–16) required for the replication, maintenance, and repair of the viral RNA genome (35).
Several of these proteins are targets of antiviral drugs used to treat COVID-19, including RdRp (nsp12), the 3-chymotrypsin-like protease (3CLpro, nsp5; also known as the main protease, Mpro), and the papain-like protease (PLpro, nsp3) (36).
The ORF1ab gene encodes an RNA-dependent RNA polymerase and a helicase, both of which are necessary for viral replication (37). The most frequently altered non-structural protein of ORF1ab was nsp3, which carried a deletion (del2083/2083) and the A2710T and K856R variants. The T3255I variant was found in nsp4, and I3758V together with the non-frame shift deletion del3674/3676 was found in nsp6. The transmembrane proteins NSP3, NSP4, and NSP6 modulate host immunity and modify host cell organelles to support virus replication (38). NSP3 is the largest NSP and a crucial component of viral transcription and replication. Transcription and replication of the viral genome take place on host cell membranes (39).
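The assignment of ORF1a mutations to individual non-structural proteins amounts to a range lookup on the polyprotein numbering. The sketch below is illustrative only: the nsp boundaries are assumed from the commonly used annotation of NC_045512.2 rather than taken from this study, only nsp1 to nsp6 are listed, and `nsp_for` is a hypothetical helper name.

```python
# Amino-acid ranges of nsp1-nsp6 on the ORF1a polyprotein (1-based, inclusive).
# These boundaries are an assumption based on the commonly used annotation of
# NC_045512.2; they are not reported in this study.
NSP_RANGES = [
    ("nsp1", 1, 180),
    ("nsp2", 181, 818),
    ("nsp3", 819, 2763),
    ("nsp4", 2764, 3263),
    ("nsp5", 3264, 3569),
    ("nsp6", 3570, 3859),
]


def nsp_for(aa_position):
    """Return the nsp containing an ORF1a amino-acid position, or None if not listed."""
    for name, start, end in NSP_RANGES:
        if start <= aa_position <= end:
            return name
    return None


if __name__ == "__main__":
    # ORF1a positions discussed above: K856R, del2083, A2710T, T3255I, del3674/3676, I3758V.
    for pos in (856, 2083, 2710, 3255, 3674, 3758):
        print(pos, nsp_for(pos))
```

Under the assumed boundaries, positions 856, 2083, and 2710 fall in nsp3, 3255 in nsp4, and 3674 and 3758 in nsp6, consistent with the assignments given in the text.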
The N (nucleocapsid) protein and its gene are important for diagnostics (nucleic acid and antigen detection) and for unique vaccine formulations (40). The protein is essential for viral assembly, budding, and the host cell's response to viral infection. Its function is to maintain the structure of the genome inside the membrane (41). R203K and G204R were the most frequent N-protein mutations in the present study (42). A previous study reported that SARS-CoV-2 variants have improved virulence and transmission due to the R203K and G204R mutations. In addition to spike protein changes, nucleocapsid protein mutations are crucial for the pandemic virus's ability to transmit (43). The N-gene in Omicron contains a significant number of deletions, which have been observed to affect diagnostics, primarily the primer binding of a few commercially available kits. It is not yet clear how these alterations affect viral pathogenicity. The accessory proteins ORF6 and ORF9b, however, inhibit innate immunity, signaling pathways, and interferon (IFN) expression by targeting the mitochondria-associated MAVS adapter (44). A previous study reported that the ORF9b mutation found here was more than 85% prevalent across all Omicron sequences examined (n = 70) (45).
Compared with other VOCs, the high mutation rate of this variant's genome, particularly in the spike protein, may promote viral transmission and immune evasion. In addition, the large number of mutations in the immunogenic epitopes of the spike protein made it necessary to develop novel vaccines that incorporate the Omicron variant as a potential reference strain. In the meantime, more research is needed to determine the infectivity of Omicron and the efficacy of the current vaccines against it.
Figure 5.
3D visualization of the spike glycoprotein (PDB: 6acj, EM 4.2 Angstrom) in complex with the host cell receptor ACE2 in this study.