1. Introduction
An estimated 58 million people are infected with hepatitis C virus (HCV) worldwide with 1.5 million new infections occurring each year [
6]. HCV is a major medical and public health burden: approximately 75% of infections become chronic, which can lead to liver cirrhosis or hepatocellular carcinoma, and it is a leading cause of liver-related deaths. The number of deaths associated with HCV infection in the United States has been increasing, and HCV infection is the primary indication for liver transplantation in the Western world [
7,
8,
9]. The development of direct-acting antivirals (DAAs) has led to cure rates over 90% with HCV treatment, but treatment does not prevent reinfection. Moreover, DAA treatment is inaccessible to many of those infected in both developing and developed countries due to high costs and/or infrastructure limitations [
10,
11,
12]. Additionally, diagnosis of HCV infection often occurs at a late stage after infection due to the asymptomatic “silent nature” of the virus, and successful DAA treatment may not alter the risk for cancer. Reinfection also remains a problem even after successful treatment in subjects with continued at-risk behavior such as injection drug use. For these reasons, the most viable method for controlling HCV infections worldwide is through development of an effective prophylactic vaccine [
13].
Structural characterization of the E2 glycoprotein has provided substantial information on the major antigenic sites that are the targets of broadly neutralizing antibody (bnAb) binding, particularly as it pertains to binding to the primary CD81 receptor binding domain [
14,
15,
16,
17,
18,
19,
20,
21,
22]. More recently, the cryo-EM structural characterization of the E1E2 heterodimer, either as a membrane-extracted E1E2 heterodimer [
23] or as a soluble, secreted E1E2 heterodimer ectodomain [
1], was a major advance in the field to more fully understand the structural antigenic features of this complex molecule. This knowledge will greatly facilitate a structure-based design approach in order to optimize immune responses to the major bnAb antigenic determinants. However, development of a vaccine will be challenging for multiple reasons [
24,
25,
26,
27]. These include the genetic diversity of HCV, with at least seven genotypes that differ by up to 30% in nucleotide sequence and can be further subdivided into over 90 subtypes; flexibility of the conformational regions; glycan shielding of neutralizing epitopes; the presence of immunodominant non-neutralizing “decoy” epitopes; and the tendency for membrane-solubilized E1E2 antigen preparations to form aggregates [
25,
28,
29,
30,
31,
32,
33]. Moreover, direct cell-to-cell transmission of the virus, systemic circulation of virions associated with lipoproteins, and downregulation of major histocompatibility complex (MHC) expression are other mechanisms for the virus to escape protective immunity [
30,
34,
35,
36]. Although immune correlates of protection have yet to be defined for HCV, there is broad agreement that both B and T cell immunity contribute to the control of acute HCV infection [
26,
37,
38]. Thus, the expectation is that an ideal vaccine will elicit high bnAb titers directed against multiple conserved E1E2 epitopes to ensure broad neutralization [
39,
40,
41,
42,
43], in conjunction with cytotoxic and tissue-resident memory T cells, in order to achieve immunity and protection against a high diversity of HCV isolates.
For this review, we will focus on structure-guided approaches to enhance a B cell immune response to E1E2 antigenic determinants, which is a primary component of the host defense against HCV infection. It is worth highlighting that such approaches are more complex than achieving sterilizing immunity, as observed for hepatitis A, B, and E vaccines [
28]. During an acute infection, spontaneous viral clearance occurs in about 25% of individuals and is typically correlated with a robust neutralizing antibody response early in infection [
44,
45,
46,
47]. Reinfection is associated with a shorter course of infection and an increased likelihood of viral clearance compared with primary infection, suggesting that pre-existing immunity is important [
48,
49,
50,
51,
52]. In support of this observation, bnAbs passively administered to humanized mice or chimpanzees protect against HCV infection [
42,
53,
54]. Also, passive immunization with anti-HCV antibodies before HCV challenge prevented infection in animal models [
54,
55,
56,
57,
58]. However, passive immunization of chimpanzees with antibodies from an HCV-infected patient that neutralized infectivity of several HCV genotypes in cell culture only suppressed infection with homologous virus challenge and failed to protect against heterologous virus strains [
55]. Moreover, in a human clinical study using membrane-extracted E1E2 formulated with MF59, an oil-in-water adjuvant, broadly neutralizing antibodies were observed in only a small fraction of patients; most patients had low titers and limited breadth of neutralization [
59,
60]. Therefore, approaches to enhance immunogenicity, such as the development of novel adjuvant systems that enhance both cellular and humoral immunity and the design of nanoparticle platforms that permit multivalent presentation of E1E2, will be essential to achieve immunity against the broad diversity of HCV isolates.
2. HCV E1E2 Diversity
As previously noted in the literature [
24,
35], the HCV E1E2 glycoproteins possess high sequence diversity, which is a major reason that HCV is a challenging vaccine target. This diversity is highlighted in
Figure 1, which shows a phylogenetic tree generated with representative E1E2 sequences for 69 HCV genotypes and subtypes. As can be seen in the figure, there is often 10% or more sequence divergence between subtypes of the same genotype, while greater divergence is observed between genotypes (20-30%). E1E2 sequence variability is not uniform across the glycoprotein residues and domains [
24,
35]; the sequences are punctuated by highly variable regions, including hypervariable regions (HVRs) 1 and 2 and the inter-genotypic variable region (IgVR) in E2. However, other E1E2 regions and sites are highly conserved [
24,
35], including cysteine residues that form known or putative disulfide bonds, and the majority of the N-glycosylation sites (4 sites in E1 and 11 sites in E2 in the genotype 1a H77 strain). Many conserved sites have been confirmed to be important for E1E2 folding and/or function through systematic mutagenesis studies of E2 and E1E2 [
61,
62,
63]. As shown in
Figure 1, currently available experimentally determined E1E2 and E2 glycoprotein structures, which are discussed in greater detail in the next section, account for only a few genotypes and subtypes, although they do represent three of the four globally prevalent subtypes noted by others (1a, 1b, 2a, and 3a) [
64]. Of note, a recently described but currently unreleased cryo-EM structure contains a modified genotype 3a (S52 isolate) E1E2 [
65].
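The divergence figures quoted above (roughly 10% or more between subtypes and 20-30% between genotypes) are pairwise comparisons of aligned sequences. A minimal sketch of that calculation is given below, assuming the sequences have already been aligned to equal length and that gap-containing columns are skipped; the function name and the toy strings are hypothetical illustrations, not real E1E2 sequences or the method used to build Figure 1.

```python
def percent_divergence(seq_a: str, seq_b: str) -> float:
    """Percent of mismatched positions between two pre-aligned sequences.

    Assumption: both inputs come from the same alignment (equal length),
    and columns where either sequence has a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    mismatches = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == "-" or res_b == "-":
            continue  # skip gap columns rather than counting them as differences
        compared += 1
        if res_a != res_b:
            mismatches += 1

    if compared == 0:
        raise ValueError("no comparable (ungapped) positions")
    return 100.0 * mismatches / compared


# Toy, made-up sequences used only to exercise the function.
reference = "ACDEFGHIKL"
close_variant = "ACDEFGHIKV"    # one substitution
distant_variant = "ACNEYGHIRV"  # four substitutions

print(percent_divergence(reference, close_variant))
print(percent_divergence(reference, distant_variant))
```

In practice, the treatment of gaps and the choice of alignment method both affect the reported percentages, so values computed this way are only comparable when those choices are held fixed.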
A key feature of HCV diversity is the virus’s capability to form quasispecies in infected individuals and to actively escape the immune response [
66]. This immune escape was highlighted in a clinical trial of a neutralizing monoclonal antibody, HCV1, that targets the highly conserved AS412 site (residues 412-423) in E2 [
67,
68]. Following liver transplantation and monoclonal antibody therapy, which led to dramatic viral reduction, viral rebound was observed in all treated patients along with rare mutations directly within the epitope, at residues 415 or 417 [
68,
69]. These mutations disrupted HCV1 antibody binding through mutation of a key epitope side chain or shift of a glycosylation site within the epitope [
69]; the latter escape mechanism was also observed for another antibody targeting that E2 site [
70]. Other E1E2 polymorphisms associated with resistance and escape have been noted previously [
35], with certain polymorphisms outside known epitope sites leading to broad viral resistance or sensitivity [
71,
72], possibly in some cases due to effects on E1E2 dynamics and putative “open” and “closed” conformational states [
73].
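One of the escape mechanisms described above, a shift of a glycosylation site within an epitope, can be illustrated by scanning a sequence for N-linked glycosylation sequons, using the commonly cited N-X-S/T motif in which X is any residue except proline. The sketch below makes several assumptions: the function name is hypothetical, the 12-residue peptide and its single substitution are toy inputs, and the numbering from 412 is used purely for illustration. A sequon match indicates only a potential site, not confirmed glycan occupancy.

```python
import re

# N-X-[S/T] with X != P; the lookahead allows overlapping sequons to be found.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence: str, first_residue_number: int = 1) -> list[int]:
    """Return residue numbers of asparagines that start an N-X-S/T sequon."""
    return [m.start() + first_residue_number for m in SEQUON.finditer(sequence)]


# Toy peptide and a hypothetical single substitution (N -> S at the
# sixth position), numbered from 412 for illustration only.
toy_peptide = "QLINTNGSWHIN"
toy_variant = "QLINTSGSWHIN"

print(find_sequons(toy_peptide, 412))
print(find_sequons(toy_variant, 412))
```

In this toy case the substitution removes the original sequon and creates a new one two residues upstream, which is the kind of positional shift that could alter antibody recognition of an epitope.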
Recent observations have highlighted that HCV phenotypic diversity, based on analysis of viral neutralization sensitivity and resistance, often does not directly map onto overall E1E2 sequence identities and known genotypes and subtypes [
74,
75,
76], likely due in part to the importance of polymorphisms noted above. These studies have separately identified representative reference panels of HCV strains spanning levels of sensitivity and resistance that can prospectively be used to perform standardized assessments of antibodies and immune sera, analogous to a commonly used global panel of HIV strains [
77]; however, a single coordinated panel of HCV strains, rather than multiple panels, would be advantageous for the HCV research community.
3. HCV E2 and E1E2 Structure
Antibody responses against the HCV E1E2 glycoprotein have been mapped to various antigenic domains and regions on the E1 and E2 subunits, which include targets of broadly neutralizing, strain-specific, and non-neutralizing antibodies [
78]. Antigenic targets on the E2 subunit have been named using different nomenclatures, including antigenic domains A-E, antigenic regions 1-5, or epitopes I-III (
Figure 2) [
42,
79]. The best-characterized broadly neutralizing antibodies inhibit entry by restricting host CD81 receptor binding and map to a region known as the neutralizing face of E2, which includes overlapping antigenic domains B, D, and E and antigenic region 3 (
Figure 2). A contiguous helical region within the stem of E1 also has been defined as a neutralizing antibody target [
80,
81,
82,
83]. Antigenic regions 4 and 5, defined by neutralizing antibodies AR4A and AR5A, also confer broad recognition across HCV genotypes but, in contrast to other sites, depend on an intact E1E2 heterodimer for antibody recognition [
84].
To date, numerous structures of truncated or modified forms of the E2 subunit ectodomain in complex with antibodies have been determined, including those belonging to genotypes 1a, 1b, 2a, and 6b [
14,
15,
16,
85,
86,
87,
88]. The first of these structures, which utilized a truncated E2 protein spanning residues 412-645, defined the overall architecture of the globular E2 core, which was found to contain a central β-sandwich flanked by front and back layers made of loops and short stretches of secondary structure elements [
14]. While the core structure of E2 has been found to be largely conserved across genotypes despite sequence diversity, some regions of E2 have been observed to adopt conformational differences, even within the same genotype. These include β-hairpin and extended conformations of the AS412 (domain E) epitope and conformational flexibility of the E2 front layer [
18,
80,
81,
87]. Moreover, the CD81 binding site of E2 (residues 418-422 and 520-539) undergoes substantial conformational changes when bound to the large extracellular loop of CD81 [
89]. These findings, coupled with previous studies showing substantial flexibility and functional interplay between the E2 HVR1, the front layer, and the CD81 binding site, underscore the inherent structural plasticity of some regions of E2, which is thought to underlie differences in susceptibility to antibody neutralization [
73,
90,
91].
In order to structurally characterize the HCV E1E2 envelope complex in a more native-like form, cryo-EM has recently become the technique of choice (
Figure 3). Notably, this technique yielded the first structure of a full-length membrane-extracted form of the E1E2 heterodimer of genotype 1a bound by neutralizing antibodies [
23]. This structure resolved the overall architecture of the E1E2 heterodimer, including regions of E1 and the C-terminus of E2, and defined for the first time the AR4 site of vulnerability. Subsequently, our group determined a cryo-EM structure of an engineered soluble form of the E1E2 heterodimer ectodomain (of genotype 1b) that we developed for vaccine efforts by replacing the transmembrane domains of E1 and E2 with a soluble self-assembling coiled-coil scaffold (
Figure 3) [
1]. Both heterodimer structures were found to share a similar E1E2 fold and architecture despite being of different genotypes possessing significant differences in amino acid sequences (22.8% between E1 subunits and 19.6% between E2 subunits), and one being completely liberated from the membrane [
1]. Comparative analysis of the structures revealed several common features of the E1E2 heterodimer. The E1-E2 interface was made up entirely of non-covalent, predominantly hydrophobic interactions. Two conserved E1 N-linked glycans, N196 and N305, contributed substantially to the interface, accounting for roughly a third of the E1 interface with E2 [
1]. Both E1E2 structures also allowed for a portion of the C-terminal domain of E2 to be resolved (residues 646-704), termed the base or bridging domain, which was found to pack against the back layer of the E2 core and against E1, and accounted for roughly 70% of the E2 interface with E1. This domain also contained the epitope of E1E2-specific neutralizing antibody AR4A, which was found to be one of the most highly conserved epitopes defined to date [
1].
Several domains of E1 were also resolved in the cryo-EM structures [
1,
23]. The E1 N-terminal domain (NTD), spanning residues 192-205, was found to consist of two anti-parallel β-strands that packed against variable region 2, post-variable region 3, and the bridging/base domain of E2. The E1 NTD also contained the N-linked glycan at position N196, which, along with N305, was previously found to be critical for the E1-E2 interface and heterodimer integrity [
63,
82,
92]. The E1 core domain, residues 206-255, consisted of a cluster of β-strands and contained two N-linked glycans, N209 and N250. The C-terminal loop (CTL) domain of E1, spanning residues 295-312, contained the N305 glycosylation site that, as noted above, packed against the bridging/base domain of E2 and was critical for heterodimer integrity. The CTL also contained the epitope for E1-specific neutralizing antibodies IGH520/IGH505 and IGH526 [
23,
80,
81,
82,
83]. Missing from either of the structures, however, was the region linking the E1 core and the C-terminal loop, residues 256-294, which has been predicted to contain the putative fusogenic machinery [
93,
94,
95].
Recently, a cryo-EM structure of a full-length membrane-extracted dimer of E1E2 heterodimers has also been reported, although not yet released, indicating that higher-order oligomers may represent the native state of the E1E2 glycoprotein on the HCV virion [
65]. Interestingly, flexible regions of E2 and E1 that were not resolved in previous structures, including parts of HVR1, AS412, and membrane embedded portions of the E2 and E1 stem regions, were reportedly resolved in this structure. Structural definition of these regions will likely provide further insight into underlying structural features of E1E2 that play a role in membrane fusion and possibly those that define phenotypic differences in susceptibility to antibody neutralization [
65].
4. Structure-Based Vaccine Design
To generate an effective vaccine for HCV and overcome the challenges of HCV diversity and immune evasion [
35], recent efforts have increasingly explored the use of reverse vaccinology and structure-based vaccine design [
96,
97] to generate optimized vaccine antigens that will elicit broadly neutralizing antibodies that target key conserved sites on E1E2 [
98]. These follow successful structure-based designs of antigens for other variable or dynamic viruses, including prefusion RSV F [
99] and influenza hemagglutinin stem [
100] antigens, which have both been in clinical trials and, in the case of prefusion RSV F, recently approved for use [
101,
102]. Structure-based HCV antigen designs have included stabilized and scaffolded conserved epitopes [
103,
104,
105], optimized E2 antigens with truncated or removed variable regions [
106,
107] or a targeted proline substitution to stabilize a key epitope [
108], as well as display of E2 on self-assembling protein nanoparticles [
106]; these and other HCV antigen designs and strategies are described in a recent review [
98]. Some E2 and E1E2 design strategies are shown in
Figure 4.
Several recent advances and findings provide possible avenues to pursue in future HCV antigen design efforts. Importantly, the current availability of experimentally determined E1E2 structures, as discussed in the previous section, enables structure-based design of E1E2, rather than E2 alone or individual epitopes as in previous work, to optimize its stability, antigenicity, or other features. Additionally, possible “open” and “closed” states of E2 and E1E2 [
73], or HVR1 entropy [
90], which have been associated with viral neutralization sensitivity, can be utilized to tune E1E2 antigenicity, particularly if sufficient structural and dynamic details underlying those states can be defined. Of relevance, a recently described cryo-EM structure with E1E2 in dimeric form seems to provide details of a preferred conformation of HVR1 which was corroborated by AlphaFold2 structural modeling [
65].
Other (non-structural) rational antigen design approaches represent promising strategies to address HCV diversity and escape. Frumento et al. identified E1E2 ectodomains associated with spontaneous viral clearance and improved neutralizing antibody breadth [
109] that may be useful in a vaccine, in contrast to the H77 glycoprotein sequences that are commonly used in E2 and E1E2 antigens. Another strategy is the use of consensus ectodomains, which was utilized in HCV E2 [
110] and HIV envelope glycoprotein [
111,
112] antigen designs. Finally, an additional means to address HCV diversity could be the use of multiple designed or natural E1E2 antigens in a vaccine that are representative of prevalent genotypes; given the previous success of self-assembling mosaic nanoparticles to display diverse coronavirus spike receptor binding domain antigens [
113], an analogous strategy could be explored for displaying diverse representative HCV antigens.
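The consensus strategy mentioned above can be sketched as a per-column majority vote over an alignment. The example below is a minimal illustration under stated assumptions: the input sequences are already aligned to equal length, gaps are ignored when counting, and ties are broken alphabetically. The function name, the tie-breaking rule, and the toy alignment are hypothetical and are not taken from the cited HCV E2 or HIV envelope design studies, which may weight sequences or handle variable regions differently.

```python
from collections import Counter


def consensus_sequence(aligned_seqs: list[str]) -> str:
    """Build a simple majority-rule consensus from pre-aligned sequences.

    Assumptions: all sequences have equal length; gap characters ('-') are
    not counted; ties are broken alphabetically for determinism.
    """
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all sequences must be aligned to the same length")

    consensus = []
    for column in zip(*aligned_seqs):
        counts = Counter(res for res in column if res != "-")
        if not counts:
            consensus.append("-")  # column contains only gaps
            continue
        # Highest count wins; alphabetical order breaks ties.
        best = min(counts, key=lambda res: (-counts[res], res))
        consensus.append(best)
    return "".join(consensus)


# Toy alignment with made-up residues, used only to exercise the function.
toy_alignment = ["ACDEF", "ACNEF", "GCDEF", "AC-EY"]
print(consensus_sequence(toy_alignment))
```

A plain majority vote treats every input sequence equally, so the result depends heavily on which strains are included; a design effort would likely need to balance the input set across genotypes and subtypes.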
5. Multivalent Delivery Platforms and Considerations
Subunit vaccines are often poorly immunogenic, a phenomenon routinely attributed to their small size, as particulate antigens have long been known to be highly immunogenic [
114,
115,
116,
117,
118]. A recent study by Aung et al. [
119] puts a finer point on this concept. In that study, the authors showed that subunit vaccines were trafficked primarily to the subcapsular sinus or extracellular regions of lymph nodes and subsequently degraded by metalloproteases. Such degradation eliminates conformation-dependent epitopes on the associated antigens and hampers the immune response. Nanoparticle-sized antigens were localized instead to follicular dendritic cells (FDCs), where they remained intact, preserved such conformation-dependent epitopes, and thus elicited a more robust immune response. In light of these recent observations and the historical data, it seems clear that increasing the size of a subunit vaccine is beneficial, and thus a number of strategies have been developed toward that end. One strategy is to employ an adjuvant system that, when formulated with the antigen, produces a nano- or microparticulate vaccine. Common adjuvant systems in use to make particulate subunit vaccines are aluminum salts (Alum), polymers like poly(D,L-lactic-co-glycolic) acid (PLG), oil-in-water emulsions like MF59, liposomes and other vesicles, and micelle-forming adjuvants. Micelles can be particularly advantageous for membrane-anchored antigens, as the formulation process creates rosettes of antigens covering the exterior of the micelle. These kinds of formulations have been used in vaccines for influenza (Flublok) and SARS-CoV-2 (NVX-CoV2373) [
120,
121,
122,
123,
124,
125,
126]. A second strategy to develop particulate vaccines is to construct virus-like particles (VLPs). VLPs are non-infectious but more closely mimic the native virion than subunit vaccines and other particulate platforms by using the viral structural proteins to self-assemble in a similar manner to the virus. Engerix-B (hepatitis B virus) and Gardasil (human papillomavirus) are two prominent VLP-based vaccines. A third strategy for increasing the size of a subunit vaccine is the use of protein-based nanoparticles. These nanoparticle platforms are typically naturally occurring or engineered protein shells that allow multivalent display of subunit vaccines on the exterior [
127,
128,
129,
130,
131,
132,
133]. These assemblies can be formed in cis, where the subunit vaccine and nanoparticle protomer are expressed as a single open reading frame and assembly yields a 100% occupied nanoparticle, or in trans, where the nanoparticle shell and subunit vaccine are produced separately and coupled post hoc, as in the case of the SpyCatcher-SpyTag system [
134]. Each of these platforms is being explored for a potential HCV vaccine candidate [
106,
135,
136]. The first studies, by Yan et al. [
136] and by He et al. [
106], used the E2 ectodomain and a modified E2 core ectodomain respectively as proof-of-principle antigens to be appended to nanoparticles. Given the importance of the E1E2 complex, and in particular the AR4/AR5 antigenic region in viral clearance [
41,
137], a nanoparticle presenting native E1E2 should be a high priority for HCV vaccine development. One nanoparticle study has been conducted with E1 and E2 [
135], but this used a permuted E2-E1 version of the antigen which does not retain the native AR4/AR5 antigenic domain.
While the above platforms address the question of size in the context of a vaccine against HCV, the question that still remains is how to accommodate HCV diversity in the context of a particulate adjuvant, a VLP, or a nanoparticle. As described in the previous section, this problem can be overcome by the use of consensus sequences or mosaic vaccines composed of multiple antigens encompassing different genotypes or phenotypes contained within a single vaccine (
Figure 5). These approaches have been used for vaccine trials against HIV and SARS-CoV-2 [
113,
138,
139,
140,
141,
142,
143,
144,
145,
146] and, importantly, are compatible with the adjuvant and nanoparticle platforms, but it is unclear if such approaches are compatible with VLPs. An additional approach is to use cocktails of different genotypic or phenotypic representatives mixed together after preparation and validation. This is the most straightforward approach, but requires a brute-force regimen of multiple separate preparations and validations for each different member of the cocktail. Which one of the above approaches is most likely to yield an effective HCV vaccine is currently an open question that will need to be evaluated experimentally.
6. Conclusions
Since its discovery in 1989, HCV has been a particularly vexing pathogen for vaccine development, in large part due to its high sequence diversity. However, significant advances have provided avenues to potentially overcome this sequence diversity. First, despite significant sequence variability among the different genotypes and subtypes, highly conserved regions have been defined and characterized as antigenic regions [
42,
84,
147,
148,
149,
150,
151,
152,
153,
154,
155] that give rise to bnAbs and can serve as potential targets for rational vaccine design. Second, after significant struggles, structural studies were successful, first for modified truncated versions of the E2 ectodomain [
14,
15], and subsequently for a more complete E2 ectodomain [
16] in complex with neutralizing and non-neutralizing antibodies. More recently, the structure of membrane-extracted E1E2 in complex with multiple bnAbs was determined by cryo-EM [
23], providing the first look at the antigenic domain AR4, which is bound by the bnAb AR4A and correlates with viral clearance [
41,
137]. In addition, our group developed a soluble, secreted form of the E1E2 complex, with the idea that an E1E2 antigen liberated from the membrane would prove more amenable to vaccine design efforts. The structure of the secreted E1E2 complex [
1] shows that it preserves the native architecture of the E1E2 ectodomain outside the context of the membrane. This catalogue of structures (plus other structures likely to be determined) can be used for structure-based vaccine design for an E1E2-based HCV vaccine. Third, the development of nanoparticle platforms, both protein- and adjuvant-based, allows a multivalent presentation of E1E2 as a means to enhance its immunogenicity. Moreover, advances in these nanoparticle platforms such as plug-and-display technology [
134] allow the potential development of mosaic E1E2 vaccines. It is not known what means of incorporating sequence diversity into an E1E2 vaccine will be successful, so making available as many options as possible is critical for successful vaccine development. Additional improvements such as incorporating the PADRE peptide [
156,
157,
158,
159], which activates CD4+ T cells, into nanoparticles could potentially boost the cellular immune response to an E1E2-based vaccine, thereby further enhancing its potency. Like pieces of a puzzle, these and other breakthroughs from research on HCV and other pathogens have come together to put the field of HCV vaccine development in a position to overcome the hurdle of HCV sequence diversity. This progress has the potential to deliver an HCV vaccine that elicits the breadth of neutralization required to achieve containment and eventual eradication.