1. Introduction
Helicobacter pylori (Hp) is a bacterium that colonizes the stomach of more than 50% of the human population and has evolved to specifically grow in the harsh environment of the human gastric mucosa [
1]. Gastric colonization is possible due to its spiral shape, its multiple unipolar flagella, and the production of urease that counteracts the extreme acidity of the stomach [
2,
3].
Hp is usually acquired during early childhood, probably by intimate oral-oral contact with the mother [
4], and then co-exists for life with human gastric cells in most cases (over 80%), representing a clear example of microbiota vertically transmitted from mother to child. However, in a few cases, Hp might cause peptic ulcers and may become a risk factor for gastric cancer (GC) [
5]. Hp strains with an increased capacity to cause GC encode virulence genes such as those of the Type IV Secretion System (T4SS), adhesins, and a cytotoxin [
6,
7]. The Hp T4SS translocates the CagA protein, DNA, and heptose into the cytoplasm of gastric cells. CagA has multiple effects in the host cells, activating several pathways, and is the first bacterial protein recognized as an oncoprotein [
6,
7,
8,
9,
10]. The Hp T4SS is a complex secretion system with partial homology to proteins from other bacterial T4SS [
11,
12]. The largest protein in the Hp T4SS is CagY, a protein of about 2000 amino acids (aa) with a carboxy-terminal region homologous to VirB10; the rest of the protein has no homology to other known proteins [
9,
10].
The CagY protein is widely variable in length and sequence, and its structure is highly complex, presenting two repeated regions (RR), one known as the 5′ region (FRR) (at the amino terminus), which is an intrinsically disordered region (IDR), and the other in the middle of the protein, the unusually long Middle Repeat Region (MRR). The MRR repeats are classified into A and B modules, which are composed of three distinct motifs: delta, mu, and alpha for module A, and epsilon, lambda, and beta for module B [
13,
14] (Figure 1a,b). It has been suggested that the variation in the number and location of the repeats in the MRR may be involved in regulating the translocation of CagA through the T4SS and may also help to modulate the host immune response. CagY may also regulate gastric tissue inflammation by interacting with the Toll-like receptor 5 (TLR5) [
15,
16,
17,
18,
19,
20,
21]. The middle repeat region is susceptible to rearrangements (by modification, deletion, and insertion of modules) that may affect the structure of CagY and hence its function, including modulation of the inflammatory response in the gastric mucosa [
22] and the phosphorylation of CagA [
23].
However, the large size and complexity of CagY make its study challenging, particularly across a high number of strains. To date, a total of 3446 cagY/CagY coding genes/proteins exist in the NCBI database, but most of them are truncated or incomplete (NCBI, 2023), mainly due to problems in sequencing and assembling the repetitive regions with short-read sequencing techniques. Only the recent long-read sequencing technologies (SMRT/PacBio or Oxford Nanopore) have made it possible to obtain more accurate and reliable complete CagY sequences with which to better study their complexity and possible function.
Due to the complex characteristics of the CagY protein, its tridimensional structure is not completely elucidated yet, with only parts of the VirB10 homologous region recently reported [
24,
25], although this region represents only a fraction of the protein (approximately 20% of the sequence). Elucidating the structure of CagY and of the whole T4SS is a challenge, but it is essential to better understand the interactions and mechanisms of function of the secretion system proteins. The most recently reported structure elucidated by cryo-EM shows the core of the T4SS, composed of an Outer Membrane Complex (OMC), a Periplasmic Ring (PR), and a Stalk, where fragments of the proteins CagY (Cag 7), CagX (Cag 8), CagT (Cag 12), CagM (Cag 16) and CagD (Cag 3) have been identified. The pilus of the T4SS was found to be composed mainly of CagY and CagL [
26]. Although these previous works have helped to partially decipher the assembly of the T4SS, most of the tridimensional structure of CagY remains unknown. CagY seems to play a major role in the structure and function of the T4SS, probably spanning the inner Hp membrane, the periplasmic space, and the outer membrane facing the host cell [
11,
12]. It is then of the utmost importance to clarify the structure of the whole protein and the function of this major protein in the T4SS.
As a complementary tool to the experimental techniques to elucidate the protein structure, different bioinformatic methods can be used to model and predict the structure of proteins [
27]. For years, the most accurate method for protein structure prediction was homology modeling, where a tridimensional known structure is used as a template for modeling the structure of a close sequence homolog [
28]. Until recently, these computational methods had limitations for predicting the structure of proteins with low identity to homologs of known tridimensional structure. However, with the advent of deep learning approaches such as AlphaFold2, bioinformatic methods for structure prediction reached unprecedented accuracy [
29,
30]. Threading methods, also known as fold recognition methods, are currently used to search for potential low-similarity templates for remote homologs [
31]. In ab initio modeling, the structure of fragments is built from the amino acid sequence with the help of short tridimensional templates from known structures or molecular dynamics modeling, followed by simulation and optimization for their assembly [
32]. On the other hand, deep learning-based modeling uses neural networks to calculate the tridimensional structure from all available information in structural databases [
33]. These methods can be combined to achieve the highest quality and efficiency in protein modeling [
34]. Nevertheless, even these methods still have limitations for complex targets such as CagY. In this work, we combine different approaches to elucidate a theoretical structure for the CagY protein, which may allow us to understand how it interacts with the remaining proteins of the T4SS.
We propose a tridimensional model for most of the CagY protein based on all available data on the protein and theoretical data presented in multiple works on CagY. We used the tridimensional models available for fragments of CagY in the PDB database for homology modeling, and ab initio methods to model the segments of CagY with missing tridimensional structures. Moreover, deep learning (DL) methods were used to predict the structure of the MRR of CagY.
3. Discussion
The CagY protein is an essential structural component of the T4SS of
Hp with a key participation in its regulation [
13,
14,
20,
22,
36,
37]. The length of the protein from the
Hp 26695 reference strain is 1927 amino acids, and it has unusually long repetitive elements (close to 900 amino acids) that might be related to its immunogenicity and pathogenicity properties [
15,
20,
38].
It has been reported that the CagY protein completely spans the cell envelope of
Hp from the Outer Membrane (OM) to the Inner Membrane (IM). This arrangement agrees with bioinformatic predictions of the secondary structure of CagY, which show two transmembrane regions (
Figure 1a). Considering its length and extension through the membrane, it is suggested that CagY plays an essential role in the translocation of the CagA oncoprotein [
11,
12,
39].
CagY is an unusually long protein that seems to have evolved to perform multiple activities, each associated with a specific region of the protein. Thus, to better understand its functions, we need to thoroughly study its sequence and structure, which, because of its large size and complexity, has not been an easy task. Preliminary analysis of the CagY secondary structure showed that residues 5 to 343 correspond to a repeated region domain at the amino terminus (
Figure 1a), also known as FRR. The next region (344 to 365) corresponds to a transmembrane region, which may allow anchoring the protein to the IM, although it is unknown if this domain associates with other proteins [
13]. Positions 366 to 1458 correspond to a region that includes the Middle Repetitive Region (MRR), almost 1,000 aa long, sometimes described as a conserved region [
13,
36]. In fact, it is the most polymorphic region of CagY among different
Hp strains [
13,
36].
Secondary structure predictions of this region show essentially short alpha-helix motifs, mainly associated with the repeated modules (Figure 1a,b). CagY proteins from different
Hp strains show changes in the arrangement and number of these modules, affecting the length of CagY (although the FRR also contributes to some extent) [
13,
36].
The repeated modules have been classified as A or B (
Figure 1a,c). It is worth noting that this is the only region of the protein where the amino acid cysteine is present, and in an unusual number and distribution; e.g., in the
Hp 26695 strain, it has a total of 42 residues, two for each repeated module. It has been suggested that these residues may be involved in the formation of disulfide bonds, playing an essential role in the folding stability of CagY, probably also being needed to maintain the multimer structure, and functioning as a modulator of the T4SS activity [
13,
14].
The region from residues 1469 to 1927 has been the main focus of different structural studies with cryo-EM and cryo-ET techniques [
11,
12,
24,
39]. This is also the only region of CagY that has homology with the VirB10 proteins from other bacterial T4SS, such as
Escherichia coli and
Agrobacterium tumefaciens [
12,
13,
24]. Experimental structures for this region are reported in the RCSB-PDB database (
Table 1). However, recent studies have reported an asymmetric complex with 17 chains of CagY/CagX (6X6J) and a symmetric complex with only 14 CagY chains (6ODI), a discrepancy known as the symmetry mismatch [
11,
12]. Previous studies with cryo-ET, though, have only detected a 14-chain complex [
12,
24].
Due to the complexity of this protein, its complete structure has not been experimentally elucidated yet. Recent structural works with cryo-ET and cryo-EM have shown an impressive assembly for the T4SS complex, with several proteins coded in the Cag PAI, like CagX, CagY, CagT, CagM, and Cag3. However, most of the CagY protein structure remains unsolved. Bioinformatic sequence analysis predicts that the N-terminal region of this protein corresponds to an Intrinsically Disordered Region (IDR), consistent with a sequence composition rich in amino acids that promote disorder. This region is on the interior side of the
Hp IM, and its structure is difficult to predict because of its disordered character and possible interaction with other proteins, like the energetic protein complexes of T4SS (
Figure S6) [
26]. The proteins with which CagY interacts, according to STRING, are CagE, a VirB homolog involved in DNA transfer and required for the induction of IL-8 in gastric epithelial cells, and CagT or TrwB, an inner-membrane nucleoside-triphosphate-binding protein. The periplasmic region of CagY interacts with VirB9 and VirB10. This family also includes the conjugal transfer protein family TrbF, a family of proteins known to be involved in conjugal transfer. The TrbF protein is thought to compose part of the pilus required for transfer. This domain has a fold similar to that of the NTF2 protein and of CagX, a conjugation protein.
Starting from the structures obtained from cryo-ET and cryo-EM studies, we built an initial model by homology modeling. Despite the structural discrepancies between the models, we used the 6ODI and 6X6J structures, focusing on building a model with only 14 chains of CagY, consistent with cryo-EM studies. A segment of 32 amino acids between 6ODI and 6X6J corresponds to an “empty zone” not solved in the cryo-EM studies. We filled this space with an ab initio model for the corresponding residues. This step was relatively simple because most bioinformatic methods for secondary structure prediction, as well as the ab initio and threading modeling methods, agreed on an alpha-helix conformation.
The symmetry mismatch, previously described in cryo-EM studies, arises from detecting only 14 CagY chains in the structures closer to the OM and 17 units in structures from the PR region, without an evident connection for the 3 CagY chains in excess. The authors of the cryo-EM works acknowledged that it was not possible to explain the causes of such structural discrepancies. Possible explanations include the presence of truncated versions of CagY (and CagX) in the T4SS, conformational isomers of these proteins, or early or incomplete assemblies of the T4SS [
11,
12,
39]. However, previous cryo-ET studies have reported only 14 chains for the same complex. Therefore, we decided to deduce the CagY structure model using the symmetric 14 chains of CagY. As was demonstrated in the built model (
Figure 2c), the space occupied by the “asymmetric complex zone” was similar, with a slightly wider space between the chains.
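The geometric consequence of the 14/17 mismatch can be illustrated with a little arithmetic. The following is only a sketch under a strong simplifying assumption that is ours and not part of the cryo-EM analyses: each ring is treated as an ideal planar ring with evenly spaced subunits. The function names are hypothetical.

```python
from math import gcd

def angular_spacing(n_subunits):
    """Angle in degrees between neighboring subunits of an ideal n-fold ring."""
    return 360.0 / n_subunits

def shared_symmetry_order(n1, n2):
    """Largest rotational symmetry order common to two concentric ideal rings."""
    return gcd(n1, n2)

# 14 CagY chains near the OM versus 17 units in the periplasmic ring
spacing_14 = angular_spacing(14)
spacing_17 = angular_spacing(17)
common = shared_symmetry_order(14, 17)

print(round(spacing_14, 2), round(spacing_17, 2), common)
```

Because 14 and 17 share no common divisor other than 1, two such idealized rings would have no common rotational symmetry beyond the identity, so subunits of one ring cannot all face the other ring in an equivalent way.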
More challenging for modeling is the zone that includes the MRR, as it corresponds to the most polymorphic and complex zone of CagY. This zone is of particular interest since previous studies associate sequence variations with the modulation of the T4SS activity for CagA translocation, the immunogenicity response, and the phosphorylation of the CagA oncoprotein [
5,
15,
36].
Sequence variations include losses and gains of the repeated modules A and B and single residue substitutions. Some authors have also highlighted the high degree of sequence conservation of the modules, even at the nucleotide coding level.
Previous studies suggest that the initial T4SS assembly should start by placing CagY and CagX in the bacterial envelope [
40]. The multimer structure of CagY and CagX can be remarkably stable and independent of other proteins: the absence of Cag3 in mutant strains interferes with the incorporation of other proteins of the Cag PAI, but it does not prevent the assembly of CagY and CagX [
12,
41]. CagX is shorter than CagY and seems to interact only with the OM section of CagY, being absent in the IM region. Therefore, we consider it a reasonable assumption to build multimer models of this protein domain in the presence of the CagX protein.
Our first attempts to build structure models for this domain produced results with considerable differences in conformation. Nevertheless, there was usually good agreement on the secondary structure. The ColabFold2/AlphaFold2 DL approach has been recognized as the most reliable for predicting protein structure, reaching “experimental confidence” in some cases [
33,
42,
43]. However, the method still has limitations for challenging targets, which include complex and large proteins and those for which there are not enough templates for modeling, like CagY [
44]. When trying to model the structure for the whole sequence of CagY, ColabFold2 yielded a structure that was mainly disordered and difficult to align with the known partial structures from cryo-EM studies. Therefore, we divided the problem by identifying the previously determined domain structures and sequence motifs to build hybrid structure models. On the other hand, recent works have remarked that DL approaches produce highly confident structure models for proteins with repetitive modules [
45,
46] and disulfide bonds [
47].
We focused on the DL approaches for modeling the MRR with ColabFold2/AlphaFold2. Modeled as a monomer, this domain had a more compact structure than those obtained with ab initio methods such as I-Tasser and Robetta (
Figure S1). However, it was not possible to establish its correct orientation relative to the remaining part of the CagY structure. Modeling as a multimer was challenging because of memory limitations due to this domain’s large size and complexity. Still, it was possible to estimate models for the dimer. Despite the low confidence values for these predictions, the dimer model showed a more compact CagY structure with the chains arranged in parallel, allowing us to deduce their correct orientation relative to the remaining part of the model. A probable cause of the low confidence values is an insufficient refinement of the conformation of the individual chains in the dimer, which showed considerable differences (RMSD > 2 Å). On the other hand, stereochemical and structure analysis of the monomer and dimer models showed that all the cysteine residues were grouped in pairs at the correct distance to participate in disulfide bonds (typically less than 2.5 Å), further supporting the accuracy of the model [
48].
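The distance criterion mentioned above can be sketched in a few lines. This is not the analysis pipeline used in this work; it is a minimal illustration that assumes a standard fixed-column PDB file, uses the 2.5 Å cutoff quoted in the text, and runs on three made-up sulfur atoms. All function names and coordinates are hypothetical.

```python
import math
from itertools import combinations

def pdb_atom_line(serial, name, resname, chain, resnum, x, y, z):
    """Format one fixed-column PDB ATOM record (helper for the toy data only)."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, resname, chain, resnum, x, y, z)

def cysteine_sg_atoms(pdb_lines):
    """Return (chain, residue number, xyz) for every cysteine SG atom."""
    atoms = []
    for line in pdb_lines:
        if (line.startswith("ATOM")
                and line[12:16].strip() == "SG"
                and line[17:20] == "CYS"):
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((line[21], int(line[22:26]), xyz))
    return atoms

def candidate_disulfides(atoms, cutoff=2.5):
    """Pair SG atoms closer than `cutoff` angstroms; tag intra- vs interchain."""
    pairs = []
    for (c1, r1, p1), (c2, r2, p2) in combinations(atoms, 2):
        distance = math.dist(p1, p2)
        if distance <= cutoff:
            kind = "intrachain" if c1 == c2 else "interchain"
            pairs.append(((c1, r1), (c2, r2), round(distance, 2), kind))
    return pairs

# Toy coordinates, NOT taken from the CagY model
TOY_LINES = [
    pdb_atom_line(1, "SG", "CYS", "A", 10, 0.0, 0.0, 0.0),
    pdb_atom_line(2, "SG", "CYS", "A", 13, 2.05, 0.0, 0.0),
    pdb_atom_line(3, "SG", "CYS", "A", 40, 10.0, 0.0, 0.0),
    pdb_atom_line(4, "CA", "ALA", "A", 41, 1.0, 0.0, 0.0),
]

print(candidate_disulfides(cysteine_sg_atoms(TOY_LINES)))
```

Applied to a real model file, the same two functions would list every candidate bond and show whether it links residues within one chain or between chains.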
We refined the conformation by equilibrium molecular dynamics (EMD), building a trimer from two dimer models to optimize the conformation of the central chain and including restraints for the predicted disulfide bonds. The RMSD behavior of the trimer dynamics was consistent with the hypothesis that the neighboring chains help maintain the multimer’s conformation. Notably, the chains with only a single neighbor did not stabilize in conformation. Therefore, we consider the conformation of this central chain as the most appropriate for assembling the remaining part of the model.
A key observation was the presence of an unusually large number of Cys (58 in the Hp 26695 strain) separated at very regular intervals in the MRR, a number rarely observed in any protein. It has been suggested that Cys content is correlated with evolution, with prokaryotes having the lowest content and mammals the highest [
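As an illustration of how per-chain RMSD traces like these can be judged, the sketch below computes an RMSD between two coordinate sets and applies a simple plateau test. It assumes the frames are already superposed on the reference, and the window size and tolerance are arbitrary, hypothetical choices rather than the criteria used in this work.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized, pre-aligned sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal size")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

def has_plateaued(rmsd_series, window=3, tolerance=0.5):
    """True if the last `window` RMSD values vary by at most `tolerance` (angstroms)."""
    tail = rmsd_series[-window:]
    return len(tail) == window and max(tail) - min(tail) <= tolerance

# Toy data: every atom displaced by the vector (3, 4, 0), i.e. by 5 angstroms
reference = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
displaced = [(3.0, 4.0, 0.0), (4.0, 5.0, 1.0)]
print(rmsd(reference, displaced))

# Toy RMSD traces (angstroms) for a chain that settles and one that keeps drifting
settling_chain = [1.0, 2.0, 2.1, 2.2, 2.15]
drifting_chain = [1.0, 2.0, 3.0, 4.0, 5.0]
print(has_plateaued(settling_chain), has_plateaued(drifting_chain))
```

In practice the RMSD would be computed per chain and per frame from the trajectory, after superposition on the starting structure.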
49]. Small human proteins represent some of the richer Cys proteins, like in metallothionein domain proteins (MTs) or in granulins (GRNs) with over 20% of Cys [
50]; however, we are unaware of other large proteins with such a high number of Cys concentrated in one region of the protein, as in the MRR of CagY. Besides, Cys usually occurs in C-(X)2-C motifs, with disordered domains interspersed with Cys. Together, these observations strongly suggest a particular evolutionary history for the MRR, different from that of the rest of the CagY protein and probably different from that of most Cys-rich proteins.
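Cys content, spacing, and C-(X)2-C motifs of the kind discussed above are straightforward to tabulate from a sequence. The sketch below shows one way to do so; the sequence is a short made-up string, not a CagY fragment, and the function name is hypothetical.

```python
import re

def cysteine_summary(sequence):
    """Count Cys, list their 1-based positions and spacings, and find C-(X)2-C motifs."""
    seq = sequence.upper()
    positions = [i + 1 for i, residue in enumerate(seq) if residue == "C"]
    spacings = [b - a for a, b in zip(positions, positions[1:])]
    # A lookahead is used so that overlapping motifs are also reported
    cxxc_starts = [m.start() + 1 for m in re.finditer(r"(?=C..C)", seq)]
    return {
        "n_cys": len(positions),
        "fraction": len(positions) / len(seq) if seq else 0.0,
        "positions": positions,
        "spacings": spacings,
        "cxxc_starts": cxxc_starts,
    }

# Made-up 18-residue sequence used only to exercise the function
toy_sequence = "MKCAACDEFGHCKLCNPQ"
summary = cysteine_summary(toy_sequence)
print(summary)
```

Run on a full-length sequence, the `spacings` list makes any regular periodicity of the Cys residues immediately visible.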
Previous studies already reported the presence of Cys in the MRR of CagY, and authors suggested they may participate in the formation of disulfide bonds and the stability of the region [
13,
14]. The process involved in their formation and the functions of the disulfide bonds in prokaryotes have been less studied than in eukaryotes [
47]. Of note, these motifs are commonly found in transmembrane and extracellular proteins. In our study, the structural analysis of the trimer showed evidence of intrachain disulfide bonds but not interchain bonds. Therefore, these bonds do not seem to play an essential role in the quaternary structure but seem to be most relevant for the stability of the complex structure of CagY, particularly in the large MRR region.
It has been previously described that Hp has numerous cysteine-rich proteins and that many of them possibly form disulfide bonds [
51]. For many prokaryotic organisms, the enzymes DsbA and DsbB are the most frequently involved in disulfide bond formation [
47]. However, these enzymes are absent in Hp, and instead, the enzyme DsbK (HP0231 gene) is present and is most probably involved in this process [
52,
53].
The finding that CagY has cysteines forming disulfide bonds is essential because several studies have remarked that this covalent interaction can play an important role in the dynamics and modulation of proteins [
54,
55]. We propose a scenario where CagY modifies its structure to modulate the translocation of the CagA protein. These disulfide bonds may act as a redox switch or mediate the contraction of this protein region to modulate the T4SS function and CagA translocation. Cys-rich proteins are reported to be important in the response to oxidative stress and may therefore be essential in the response to and regulation of inflammation [
56]. It is intriguing to suggest that the redox-sensitive potential of Cys residues in the MRR region of CagY may play a role in its well-described regulation of inflammation in animal models [
53,
55,
57].
The model of the CagY structure proposed in this work can be used to understand how this protein interacts with CagA, DNA and heptose during their transit through the T4SS and the importance of the MRR repeat modules in the reported CagY functions. The tridimensional model will also help better understand its interaction with other proteins of T4SS, such as CagX, CagM, CagT, and Cag3, which are part of the core complex of the secretion system (see
Figure 6b). Our model suggests that CagY constitutes the tunnel that spans from the inner membrane of
Hp to the membrane of the gastric cell in its host, and the assembly of the T4SS components illustrates the role of CagY as the “skeleton” of the system around which all proteins assemble.
Some studies have suggested that changes in the antenna projection (AP) regions and in the MRR are associated with the translocation capacity of the CagA protein, the immunogenic response, and the phosphorylation capacity [
35].
The models obtained by DL and homology for the AP in this work allow us to propose, in relation to the recent work of Tran et al. 2023, that the shortening of the AP region may also affect the translocation capacity by altering its interaction with the OM of Hp. In contrast, the decrease in the pore channel’s diameter may obstruct CagA’s passage. It is important to emphasize that the CagA protein is considerably larger than the AP pore, so the translocation process is more complex and may involve major structural changes.
On the other hand, the models calculated by homology tend to fit the dimensions of the template (in this case, the 6odi structure), which may explain the opposite prediction of the DL methods with respect to the pore diameter. DL methods consider a multitude of factors, and it is generally accepted that they model more efficiently the interactions between the chains [
33], suggesting they can offer a more accurate description of the conformational changes of the pore. However, to obtain more precise data, it will be necessary to use other techniques, such as molecular dynamics of these complexes, preferentially including the membrane, and our model may be used as a starting point for such simulations.
The structural prediction of the MRR region may provide essential clues regarding its role in modulating the translocation and phosphorylation of CagA and the immunogenicity associated with the polymorphisms reported in this region. The presence of disulfide bonds in this region, as well as the similarity with some contractile proteins detected by I-Tasser (Table 2), suggests that the MRR has contractile properties, and given the interaction of the disordered region of CagY with the energy complex beneath the IM, its function might be similar to that of flagellar or pilin proteins [
58,
59].
Contractile properties may play a critical role in modulating the translocation capacity of CagA and other molecules. We plan to conduct MD studies to better predict conformational changes in the MRR and AP regions and their possible associations with the phenotypes produced by CagY variants present in Hp strains of patients with different gastric conditions.
We recognize that a limitation of our proposed model is that we did not include the 17/14 CagY asymmetric model previously suggested [
12] and considered only the symmetric 14 CagY subunits to build the model. However, with the available data, it is not possible to determine whether the asymmetry corresponds to incomplete versions of CagY or to important conformational topology and arrangements of the CagY multimer.
In conclusion, using a combination of informatics and analytic tools, we provide a structural model for almost the complete CagY protein, particularly the complex and highly diverse MRR region, and present evidence of the role of the unusually abundant Cys residues in stabilizing the protein. We also modeled the AP region and provided proof of the role of sequence variants in CagA translocation. Finally, we offer a more detailed model for the assembly of the T4SS focused on the central role of CagY and show the agreement of our model with cryo-ET and cryo-EM studies.
Figure 1.
Motifs, secondary structure, and repeated modules of CagY. a) The CagY sequence of strain 26695 is 1927 residues long; the structural domains frequently cited are the 5′ Repeat Region, FRR (cyan), the Transmembrane regions (TM) (green), the Middle Repeat Region (MRR) (blue), and the VirB10 homologous region (purple). The figure also identifies the segments into which the structure was divided for modeling: the Intrinsically Disordered Region (IDR), the Middle Repeat Region (MRR), the regions modeled from 6odi and 6x6j, and the ab initio modeled region. b) Secondary structure, disordered regions, and transmembrane regions of CagY predicted by the PSIPRED server. c) Sequence of repeated modules A and B from the MRR.
Figure 2.
Homology modeling of the VirB10 homologous region. a) PDB structures 6ODI and 6X6J were used as templates for the OM and periplasmic regions. b) Monomer model (VirB10 homology region) and alignment with the template structures 6odi and 6x6j and with the I-Tasser model of residues 1604 to 1677 (model calculated with Modeller10v8). c) Multimer 14-chain model of the VirB10 homology region. Chains were aligned with respect to 6odi as a template.
Figure 3.
AlphaFold2 dimer modeling of the MRR, trimer building, and Equilibrium Molecular Dynamics (EMD) for the trimer. a) Best ColabFold/AlphaFold2 dimer model for the MRR and its lDDT values. b) Alignment of two identical dimers, with chains labeled AB (blue and red) and A’B’ (gray and green), respectively, to produce an ABB’ trimer (blue, red, and gray); the A’ chain was removed from the superposed B/A’ structures. c) RMSD plot of the 10 ns molecular dynamics simulation of the trimer. Individual plots for chains A (red), B (blue), and B’ (green) are displayed, as well as the plot for the whole ABB’ trimer. Only chain B stabilized its RMSD values (gray).
Figure 4.
CagY monomer and multimer assembly. a) Summary of the whole CagY monomer assembly; the structure templates and methods employed are displayed (6odi, 6x6j, I-Tasser, ColabFold2/AlphaFold2-multimer). The monomer was constructed with ChimeraX and an alignment of the missing structure. b) Final assembly of the 14-chain multimer, using the 6odi structure as a template. The side, bottom, and top views of the assembled CagY multimer are shown. The missing region corresponding to the AP of CagY was completed by homology modeling with Swiss-Model using the 6ODI template. c) All 58 Cys residues present in the MRR participate in the formation of 29 disulfide bonds, highlighted here in yellow. A bottom view of the multimer complex suggests about 4 concentric rings of these bonds.
Figure 5.
Comparison of CagY with the cryo-EM maps of the T4SS. a) The upper panel represents the surface of the cryo-EM density maps; the lower panel represents the overlay of the T4SS cryo-EM density maps with our model of the CagY protein, showing a good fit to the cryo-EM surface. b) The upper panel represents the backbone of the cryo-EM density maps of the T4SS; the lower panel represents the overlay with our model of CagY.
Figure 6.
Overlapping of the CagY protein with other Cag proteins. a) Assembly of the CagY multimer with the 6X6S model. b) Transversal view of the components of the 6X6S model and the CagY multimer, showing how the core complex of the Hp T4SS is assembled.
Figure 7.
Antenna projections (AP) of multiple modified versions of CagY. a) Wild-type AP of CagY modeled by homology, showing an open pore with a crest shape. b) Wild-type AP of CagY modeled by AlphaFold2, showing a dent in the pore. c) CagY GS20, with the AP region modified to a glycine and serine segment 20 amino acids in length, modeled by homology, showing a pore with a raised crest. d) CagY GS20 modeled by AlphaFold2; this structure displays a deformed pore. e) CagY Xc, which has the AP region replaced with the AP region of X. citri, modeled by homology, showing a wider opening in comparison with CagY WT. f) CagY Xc modeled by AlphaFold2, showing a more closed pore. g) CagY model with deltaAP, lacking both flanking alpha-helices, modeled by homology.
Table 1.
Functional/Structural domains of CagY protein.
Region | Start | End
FRR | 1 | 331
TM | 343 | 368
MRR | 585 | 1369
TM | 1798 | 1814
AP | 1820 | 1851
VirB10 | 1653 | 1896
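The boundaries in Table 1 can be turned into region lengths and fractions of the 1927-residue sequence with a short script. This is only an illustrative sketch: the coordinates are copied from the table, the total length is the one reported for strain 26695, and the function names are hypothetical.

```python
CAGY_LENGTH = 1927  # residues in CagY of Hp 26695, as stated in the text

# (region, start, end) copied from Table 1; positions are 1-based and inclusive
DOMAINS = [
    ("FRR", 1, 331),
    ("TM", 343, 368),
    ("MRR", 585, 1369),
    ("TM", 1798, 1814),
    ("AP", 1820, 1851),
    ("VirB10", 1653, 1896),
]

def domain_lengths(domains, total_length):
    """Return (region, length, percent of the full sequence) for each row."""
    rows = []
    for name, start, end in domains:
        length = end - start + 1
        rows.append((name, length, round(100.0 * length / total_length, 1)))
    return rows

def regions_inside(domains, outer_name):
    """List the rows that lie entirely within the region called `outer_name`."""
    outer = next(d for d in domains if d[0] == outer_name)
    return [d for d in domains
            if d is not outer and d[1] >= outer[1] and d[2] <= outer[2]]

for row in domain_lengths(DOMAINS, CAGY_LENGTH):
    print(row)
print(regions_inside(DOMAINS, "VirB10"))
```

The second function makes explicit that, with these boundaries, the second TM segment and the AP fall inside the VirB10 homologous region.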