1. Introduction
The Tick-borne Encephalitis Virus (TBEV) is a pathogen found in large parts of northern Europe and Asia. It is transmitted via bites from infected ticks and can cause systemic and neurological infection with mortality rates ranging from >2 % to 34 % depending on subtype [
1,
2,
3]. Available vaccines protect against severe disease. However, booster doses are recommended every three to five years for healthy individuals below 50 years of age, while elderly and immunosuppressed individuals require a more extended vaccine program. Rare breakthrough infections and vaccine failures occur despite adherence to the vaccination schedule [
4,
5,
6].
TBEV belongs to the family
Flaviviridae, which consists of small, enveloped viruses with positive-sense, single-stranded RNA genomes. The TBEV genome translates to a polypeptide that is further processed into seven non-structural and three structural proteins: the capsid (C) protein, the membrane (M) protein and the envelope (E) protein [
7,
8]. The E protein consists of three domains: domain I (ED I), which stabilizes the protein and contains the E 150-loop, which is important for viral particle maturation [
9,
10]; domain II (ED II), which contains a conserved fusion loop essential for membrane fusion [
11]; and domain III (ED III), which has an immunoglobulin-like fold and serves as a structural target for neutralizing antibodies [
12]. Three linear B-cell epitopes have been identified on the E protein: one located in ED I (aa 162-179) and two in ED II (aa 51-58 and 222-239) [
13]. Structural antibody epitopes, comprising discontinuous segments of the E protein or comprising residues from nearby E proteins in the quaternary structure, are also present [
14,
15,
16].
Starting at N154 (TBEV numbering), ED I contains one conserved consensus site (N-X-T/S (X≠P)) for N-linked glycosylation. The N-linked glycan at N154 has been connected to important functions in viral entry, infectivity, and pathogenesis of TBEV, as well as in other flaviviruses [
17,
18,
19,
20,
21]. Despite its importance, few studies have determined the structure of the N-linked glycan [
22]. The E protein also contains 67 sites for potential O-linked glycosylation, but to our knowledge, there are no data showing O-linked glycosylation. Consequently, we mapped the glycans of the E protein from a clinical isolate of TBEV grown in human adenocarcinomic alveolar basal epithelial (A549) cells. We identified three novel O-linked glycans on the E protein and determined the composition of the N-linked glycan structures at position N154.
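The site definitions used above can be illustrated with a minimal Python sketch: an N-linked sequon is an N followed by any residue except P and then T or S, while every S or T is a potential O-linked site. The function names and the toy sequence are illustrative assumptions; the sequence is not the TBEV E protein, and this is not the software used in the study.

```python
import re

def find_n_sequons(seq):
    """Return 1-based positions of N in N-X-T/S sequons, where X is not proline."""
    # A lookahead is used so that overlapping sequons are all reported.
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", seq)]

def find_o_candidates(seq):
    """Return 1-based positions of every serine or threonine residue."""
    return [i + 1 for i, aa in enumerate(seq) if aa in "ST"]

# Hypothetical toy sequence for illustration only.
toy = "MANGTSPNPSANNTS"
print(find_n_sequons(toy))     # N-linked sequon positions
print(find_o_candidates(toy))  # potential O-linked positions
```

Applied to an E protein sequence, the same scan would list the potential sites discussed in the text; whether a site is actually glycosylated can only be established experimentally.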
3. Results and Discussion
The clinical TBEV isolate, F7203, was grown in the A549 cell line. Extra- and intracellular viral particles were harvested and lysed before purification of the E protein on an immunosorbent column coated with a monoclonal antibody targeting an epitope within ED III. Proteolytic cleavage of the isolated E protein with trypsin yielded peptides covering 70 % of the sequence, which were analyzed for glycan modifications using liquid chromatography-tandem mass spectrometry (LC-MS/MS) (
Figure 1A). The peptide coverage included the linear B-cell epitopes [
13], the consensus site for N-linked glycosylation (N154-T156), and 54 of the 67 possible sites (80 %) for O-linked glycosylation, i.e. serine (S) or threonine (T) residues.
The N-linked consensus site N154-T156 was highly occupied, with 90 % of the peptides carrying N-linked glycan modifications at N154. Of these peptides, 42 % carried an oligomannose type glycan (
Figure 1B), with Man6GlcNAc2 being the most abundant (present on 12 % of analyzed peptides). High degrees of oligomannose structures have been observed on, among others, the human immunodeficiency virus (HIV) envelope protein gp120, which might indicate a low degree of glycan processing [
27]. In gp120, the densely clustered N-linked glycans sterically hinder enzymatic elongation; however, the oligomannose structures of the single N-linked glycan of the E protein might instead be important for correct protein folding during maturation [
28] or for interaction with receptor proteins, regulating infectivity and pathogenesis [
17,
18,
19,
20,
21]. Interestingly, a large fraction of the oligomannose structures contained eight or nine mannose residues. Such structures are the major glycoforms before exit from the endoplasmic reticulum (ER) during glycan synthesis and are subsequently trimmed in the trans-Golgi network (TGN) before elongation and generation of complex-type N-linked glycan structures [
29]. During sample preparation, we harvested both intra- and extracellular viral particles prior to isolation of the E protein. Thus, it is possible that a significant part of the isolated E protein was captured before it was fully processed by the glycosylation machinery. We did not determine the ratio between intra- and extracellular viral particles before isolating the E protein, and it is not possible to determine whether the immunosorbent column introduces a selection bias regarding the glycosylation status of the E protein. In addition, flavivirus particles show large heterogeneity upon release from the infected cell, with respect to both the state of maturation and the internal protein architecture [
30]. This suggests that the protein maturation process within the ER and the TGN is subject to considerable disruption, which could also affect glycosylation.
Complex type N-linked glycan structures were found on 30 % of the peptides, mainly of the composition HexNAc4Hex5Fuc1 (present on 21 % of the analyzed peptides) (
Figure 1B). A fucose residue was identified on 28 % of the complex type glycans, and manual inspection of the MS spectra suggests core fucosylation of the innermost GlcNAc. Core fucosylation is catalyzed by the enzyme fucosyltransferase 8 (FUT8), which has been shown to be upregulated in hepatitis B virus-transfected hepatoma cells [
31,
32], and overexpression of FUT8 correlated with increased binding of hepatitis B virus (HBV)-like particles to the cells [
32]. Since viral infection can alter gene expression in the host cell to create glycan compositions that favor viral spread and replication, the high degree of core fucosylation might be advantageous for the infectivity of TBEV [
33,
34,
35].
Only 5 % of the N-linked glycans carried one or more sialic acid residues, indicating limited access by sialyltransferases. Sialic acid plays an important role in the recognition of pathogens by macrophages via the sialic acid-binding immunoglobulin-like lectin sialoadhesin (Siglec-1), found on macrophage subsets [
29]. Siglec-1 mainly recognizes α2-3 linked sialic acids, and binding leads to increased uptake of viral particles and enhanced infection, as shown for HIV [
29,
36]. It has been shown that TBEV infects macrophages, and a possible connection to Siglec-1 warrants further investigation [
37].
Of the peptides covering the N-linked consensus site, 13 % carried N-linked glycans of paucimannose type (Man3-4GlcNAc2(Fuc1-2)). Paucimannose decorates the E protein when the virus is grown in tick cells [
22] and is commonly present in invertebrates. Vertebrates, however, typically extend the glycan precursor into complex-type N-linked glycans [
38]. Previous studies have also identified paucimannosidic structures in tumor cells [
39,
40], which could explain our findings. Almost 5 % of the N-linked glycans were assigned as “other” in
Figure 1B. These structures may comprise either hybrid-type structures or a combination of N-linked and O-linked structures present on the same peptide. PNGase F treatment prior to reanalysis would remove the N-linked glycans, enabling identification and analysis of O-linked glycoforms. However, the limited quantity of purified E protein at our disposal prevented further differentiation of these glycoforms.
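As a worked illustration of how a glycan composition relates to the mass added to a peptide, the sketch below sums standard monoisotopic residue masses for the compositions named above. Mannose is counted as a hexose (Hex) and GlcNAc or GalNAc as HexNAc. The function name and dictionary layout are illustrative assumptions and do not represent the search software or settings used for the LC-MS/MS analysis.

```python
# Monoisotopic masses (Da) of monosaccharide residues, i.e. the
# monosaccharide minus one water, as incorporated into a glycan chain.
RESIDUE_MASS = {
    "Hex": 162.05282,     # hexose, e.g. mannose or galactose
    "HexNAc": 203.07937,  # N-acetylhexosamine, e.g. GlcNAc or GalNAc
    "Fuc": 146.05791,     # fucose (deoxyhexose)
    "NeuAc": 291.09542,   # N-acetylneuraminic acid (sialic acid)
}

def glycan_mass_shift(composition):
    """Return the mass (Da) a glycan of the given composition adds to a peptide."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())

oligomannose = {"Hex": 6, "HexNAc": 2}            # Man6GlcNAc2
complex_type = {"HexNAc": 4, "Hex": 5, "Fuc": 1}  # HexNAc4Hex5Fuc1
single_hexnac = {"HexNAc": 1}                     # a single GalNAc

for label, comp in [("Man6GlcNAc2", oligomannose),
                    ("HexNAc4Hex5Fuc1", complex_type),
                    ("HexNAc", single_hexnac)]:
    print(f"{label}: +{glycan_mass_shift(comp):.4f} Da")
```

Compositions with the same residue counts give identical mass shifts, which is why composition alone cannot distinguish isomeric structures or linkages.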
Three O-linked glycans were found on the E protein (
Figure 1A). First, manual inspection of the fragment ion spectra suggests a single HexNAc monosaccharide present in the region covering amino acids C74-R94, containing three potential sites for O-linked glycosylation (T76, T81 and T90) (
Figure S1). Our methodological approach did not permit quantification of glycan structures that were present on less than 1 % of the identified peptides. Therefore, we can only conclude that this O-linked glycan is of low abundance in our E protein preparation. Although this site is mainly non-glycosylated, our data imply that the glycan is present under certain conditions; its significance, and the conditions under which it appears, remain unknown. Second, a confirmed O-linked glycan was identified on peptides covering amino acids V143-R160, containing two potential sites for O-linked glycosylation (T147 and S158). These sites are close to the N-linked glycan at N154. Of the total number of analyzed peptides covering T147 and S158, 1.1 % carried a single N-acetylgalactosamine (GalNAc). The presence of an O-linked glycoform was confirmed by a high-intensity oxonium ion at m/z 126 during manual inspection of the ion spectra (
Figure S2). As the O-linked site on peptides covering amino acids V143-R160 is close to the N-linked site N154, this glycan may affect viral entry, infectivity, or pathogenesis, either alone or by interaction with the N-linked glycan. This O-linked glycan is also near one of the three B-cell epitopes (covering amino acids 163-180) [
13]. Our group has previously shown that antibody reactivity to a peptide decorated with a single GalNAc varies depending on the site of the modification and that specific sugar residues can both enhance and reduce antibody reactivity [
41,
42]. The third identified O-linked glycan has two possible sites, S285 or T289. A single GalNAc, as determined by manual inspection of the ion spectra, was detected on peptides covering amino acids S285-K296 (
Figure S3). Like the O-linked glycan at T76, T81 or T90, this glycan is of low abundance (present on less than 1 % of the analyzed peptides).
Next, we compared the amino acid sequences of 762 samples (European, Siberian and Far Eastern TBEV subtypes) deposited in GenBank, of which 396 were full-length E protein sequences classified as European strains (
Table S2 and
Table S3). As expected, the N-linked consensus site N154 was conserved in all samples, with 100 % identity among all 762 analyzed sequences. There was also a high degree of conservation among the amino acids that we identified as potential carriers of O-linked glycan structures. T76, T90 and T289 showed 100 % identity in all sequences, and T147 was replaced with S147 in four cases, thereby still permitting O-linked glycosylation. T81, S158 and S285 showed 94.5 %, 98.8 % and 99.5 % identity among all strains, respectively. Among the European strains, all potential glycosites showed 100 % identity, except for T81 (96.0 %) and S158 (98.7 %). In summary, it appears that most of the potential O-linked glycan sites are conserved among all analyzed E protein sequences. For each O-linked glycan identified, there is at least one potential site with 100 % identity.
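The per-site identity values reported above can be computed along the following lines. This is a minimal sketch that assumes the sequences are already aligned and of equal length, so that a given position refers to the same residue in every sequence; the function names and the toy alignment are illustrative assumptions, not the GenBank data set or the tool used in the study.

```python
from collections import Counter

def residue_counts(aligned, pos):
    """Count the residues found at a 1-based position across aligned sequences."""
    return Counter(seq[pos - 1] for seq in aligned)

def percent_identity(aligned, pos, residue):
    """Percentage of sequences carrying the given residue at the position."""
    return 100.0 * residue_counts(aligned, pos)[residue] / len(aligned)

def percent_o_glycosylatable(aligned, pos):
    """Percentage of sequences with S or T at the position, i.e. still a potential O-linked site."""
    counts = residue_counts(aligned, pos)
    return 100.0 * (counts["S"] + counts["T"]) / len(aligned)

# Hypothetical toy alignment for illustration only.
toy = ["ATNGT", "ASNGT", "ATNGT", "AANGT"]
print(percent_identity(toy, 2, "T"))
print(percent_o_glycosylatable(toy, 2))
```

Reporting the S-or-T fraction alongside strict identity captures cases such as the T147-to-S147 exchange, where the residue changes but the potential for O-linked glycosylation is retained.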
To further investigate the potential impact of the O-linked glycans, we performed structural analysis by submitting the E protein structure (PDB ID: 7QRE [
10]) to the Glycoprotein Builder from GLYCAM Web [
26], after removing the precursor membrane protein fragment and non-standard residues using ChimeraX [
25]. We tested all potential sites for O-linked glycosylation as well as the single site for N-linked glycosylation, by adding single GalNAc residues or the most frequently observed N-linked oligomannose structure (Man6GlcNAc2) to N154, with the E 150-loop in its open configuration (
Figure 2). Sites T90 and T289 could not harbor GalNAc residues in the model due to a low calculated solvent-accessible surface area. Interestingly, T147 and S158 constitute parts of the E 150-loop [
10] and both sites are accessible for addition of GalNAc in this model.
When analyzing site occupancy of O-linked glycans within the E protein of PDB ID 1SVB [
16,
30] where the E 150-loop is in its closed conformation, the T147 site could not harbor the GalNAc residue. A closer inspection of the E 150-loop indicates that it would sterically hinder O-linked glycosylation at site T147 in its closed conformation (
Figure 3 and
Figure S4).
Despite only 1.1 % of the peptides carrying an O-linked glycan close to position N154 (T147 and S158), and an even lower number of peptides carrying O-linked glycans at the other positions within the E protein, this type of post-translational modification may have implications for viral pathogenesis. The high number of viral particles in the blood during TBEV viremia means that a significant number of viral particles may harbor O-linked glycans [
43]. Interestingly, the sequence comparison showed that T147 was highly conserved, but the threonine was exchanged for a serine in four cases, possibly indicating selective pressure to maintain the ability for O-linked glycosylation. Moreover, the prediction tool indicated that T147 is only accessible for the addition of a GalNAc residue when the fusion loop is in the open conformation. We consider three possible scenarios that are compatible with O-linked glycosylation of site T147 or S158: (i) Addition of a GalNAc residue at T147 occurs early in the TGN, which may help stabilize the fusion loop in an open configuration. In this scenario, the GalNAc needs to be cleaved before the fusion loop is closed in a pH-dependent manner [
10]. (ii) Addition of a GalNAc residue occurs at site S158, which may or may not have an impact on the E 150-loop. (iii) We harvested the E protein from infected cells, and it cannot be ruled out that a proportion of the peptides come from misfolded proteins that are destined for degradation. Thus, the glycosylation that we observe at site T147 or S158 may be an artefact. At this stage, we cannot speculate as to which of these scenarios is most likely, but the N-linked glycan at position N154 at the distal tip of the E 150-loop is important for viral infectivity and may affect cell tropism, including crossing of the blood-brain barrier [
17,
21]. The mechanism has however not been fully uncovered, and the impact of the composition of the N-linked glycan and the presence of an adjacent O-linked glycan warrants further investigation.
Moreover, the identified low-abundance O-linked glycans at sites T76, T81 or T90 and at sites S285 or T289 are within or close to ED II. This domain contains the highly conserved fusion loop at aa 98-110 [
44], which mediates membrane fusion and dimer formation by interactions with the precursor of the M protein (prM), and is the main target for cross-reacting antibodies [
45,
46,
47]. Post-translational modification of B-cell epitopes by glycosylation can potentially alter antibody recognition. Since the glycans on viral proteins are synthesized by the host cell glycosylation machinery, they could be considered “self” by the immune system and therefore not induce an immune response. Instead, these large self-like glycans might prevent binding of neutralizing antibodies through steric hindrance, thereby limiting the protective effect of the antibody epitope [
48]. However, small glycans, such as a single GalNAc, can enhance antibody recognition of a peptide. For example, 70 % of the tested sera from patients infected with herpes simplex virus type 2 (HSV-2) have IgG directed toward a heptamer within glycoprotein G, carrying a single GalNAc modification at T504. This serum reactivity is lost when the glycan moiety is absent [
49]. Additionally, depending on the site of the glycan modification, individual sera can confer diverse responses towards the same glycosylated epitope [
41]. Altogether, this implies that glycosylation can influence antigen processing and presentation by antigen-presenting cells (APCs), resulting in a shifted immunodominance where the antibody repertoire is directed towards certain epitopes to a higher degree than others [
50]. Also, many of the vaccines used against TBEV are based on inactivated virus produced in chick embryo fibroblasts (CEF), and the viral particles hence likely carry glycan signatures of the CEF cells. A recombinant vaccine with a glycosylation profile significantly different from the actual glycosylation present on the viral particle might confer lower protection than a vaccine with a matching glycosylation profile [
51]. Thus, the glycosylation profile should be considered when new vaccine candidates are developed.
Author Contributions
Conceptualization, E.K., T.B. and R.N.; Methodology, E.K., E.M., K.N. and A.K.; Data curation, E.M.; Validation, E.K., E.M., K.N. and A.K.; Formal analysis, E.K., E.M., K.N. and A.K.; Investigation, E.K., E.M., K.N. and A.K.; Resources, E.M., K.S., T.B. and R.N.; Writing – Original Draft Preparation, E.K. and R.N.; Writing – Review & Editing, E.K., E.M., K.N., K.S., A.K., T.B. and R.N.; Visualization, E.K., E.M., K.N. and A.K.; Supervision, E.M. and R.N.; Project administration, R.N.; Funding acquisition, R.N.