Similar protein structures can be encoded by highly divergent amino acid sequences, enabling convergent evolutionary routes to shared molecular functions. Viruses and pathogenic bacteria have explored this structural space in their evolution, producing a plethora of novel strategies for mimicking host molecular interactions [
8]. However, eukaryotic hosts may be more restricted in the structures they can evolve, as protein mimics can be antigenic and induce autoimmunity or coevolutionary responses [
5,
56]. Thus, structural mimicry by intraspecific selfish elements may select for proteins that reappropriate functions without triggering cellular or humoral immunity.
Structural mimicry also enables genetic elements in conflict to diverge from their own endogenous interaction networks, while converging on the function of their target. This strategy simultaneously enables mimicked function while preventing the disruption of existing essential networks in the conflicting genome [
57]. Examples of structural mimics have been known for decades through protein crystal structures and binding data, despite the difficulty of detecting them before the advent of accurate
ab initio protein structure prediction tools. Recent advances in protein structure and interaction prediction are enabling the discovery of many more structural mimics across both genomes and taxa [
17,
58,
59].
3.2.1. Repurposed Ancient Domains for Molecular Mimicry
Ancient, conserved molecular machinery essential to housekeeping functions provides a slow-moving target whose function can be co-opted in genetic conflict. For example, both bacteria and eukaryotes use PTMs, such as phosphorylation, to modulate the activity of their enzymes and signaling proteins. By placing their own PTMs on host proteins, selfish elements can induce large downstream impacts on host signaling pathways. In contrast to the small motifs that trigger modification of bacterial and viral effector proteins by eukaryotic enzymes, larger repeats and domains are required for exogenous elements to conjugate PTMs onto eukaryotic factors [
32].
Protein tyrosine phosphatase (PTP) domains catalyze the removal of phosphate from activated residues, are present in all domains of life, and have been leveraged in genetic conflicts. For example, PTP domains occur in
Salmonella’s SptP and
Yersinia’s YopH bacterial effector proteins. These effectors are injected into the host cytoplasm through protein secretion systems, where they dephosphorylate host proteins to restore cellular homeostasis and prevent phagocytosis, respectively [
60,
61]. Some viruses encode type III dual-function PTP domains, which can dephosphorylate activated serine and threonine residues in addition to tyrosine. Overall, the structures of these enzymes strongly resemble their eukaryotic targets [
60], which likely reflects both deep conservation of PTP function and convergence on the host structural conformation.
3.2.2. Convergent Structural Mimics among Membrane-Bound and Secreted Effector Proteins
Ligand structural mimicry enables genetic elements in conflict to bind target receptors in order to block immune activation, invoke cellular signaling, and gain access to targeted cells. As discussed above, eukaryotic extracellular matrix (ECM) proteins bind to integrins and other cell surface receptors to induce normal development and cellular maintenance. Structural mimicry of ECM binding surfaces is a common strategy leveraged by interspecific selfish elements to promote internalization, and provides an alternative mechanism to sequence-based RGD-motif mimicry. For example, the
Yersinia pseudotuberculosis bacterial Invasin protein is a membrane-embedded trimeric autotransporter adhesin that mimics the structure of eukaryotic integrin αv to bind to integrin β1 for cell surface attachment. Similarly, West Nile virus glycoprotein E mimics the fibronectin FN10 domain for integrin αvβ3 binding [
15]. A search of proteomic databases for potential binding partners of the coronavirus spike protein receptor binding motifs uncovered a new epidermal growth factor (EGF) receptor binding motif that mimics host receptors and may determine cellular tropism [
62].
Cell surface glycan modifications are commonly mimicked by cancer cell lineages, viruses, and bacterial toxins. In non-selfish
in vivo cell biology, glycan modifications play significant roles in cellular adhesion and signaling cascades that impact differentiation, metabolism, immunity, and other processes [
63]. Loss of cell surface glycans enables cancer cells to evade immune detection. Glycan co-option bestows different metastatic lineages with the ability to colonize new tissues and form tumors [
64]. Bacterial pathogens such as
Campylobacter jejuni, Neisseria species, H. pylori, and some
E. coli strains possess cell walls composed of lipooligosaccharides (LOS), lipopolysaccharides (LPS), or capsular polysaccharides (CPS) decorated with carbohydrate modifications that mimic eukaryotic lectin-binding glycans. For example, an altered glycan linkage in
Neisseria meningitidis serogroup B capsules enables recruitment of host factor H, which is required to deactivate the immune complement pathway induced by bacterial infection. In
Neisseria gonorrhoeae, LOS with broken sialic acid linkages bind to the lactosamine residues of eukaryotic asialoglycoprotein receptors, triggering clathrin-mediated cell entry. The O-antigen of
H. pylori’s LPS membrane is a mimic of Lewis blood group antigens, which enables colonization and invasiveness [
65]. In viruses, glycan mimicry often takes the form of a glycan shield that enables immune evasion. For example, HIV envelope proteins and Influenza A hemagglutinin glycoproteins add host-derived N-linked glycans near critical residues to sterically block neutralizing antibody binding [
66] (
Figure 4A). Enveloped viruses are able to mimic apoptosis cues by collecting phosphatidylserine from organelle membranes and displaying it to receptors such as T cell immunoglobulin and mucin receptor 1 (TIM1) to trigger uptake by phagocytic cells [
67].
Vesicle traffic is vital for cellular homeostasis and is commonly manipulated in interspecific conflicts. Through the regulation of membrane budding, cytoskeletal transport, and fusion with cellular organelles, it contributes to many cellular processes. SNARE (soluble N-ethylmaleimide-sensitive factor (NSF) attachment protein receptor) proteins mediate and regulate vesicle fusion with target membranes. Emerging evidence indicates that diverse pathogens, including
Chlamydia and
Legionella, use SNARE protein mimicry to prevent fusion with autophagosomes, expand the replicative niche, and control membrane traffic (
Figure 4B). SNARE proteins consist of transmembrane domains connected with linker sequences to coiled-coil motifs that enable membrane fusion. Based on the sequence and structural diversity among
Legionella LegC SNARE mimics, and their ability to complement the loss of endogenous SNAREs, the fusion mechanism appears to be highly permissive of structural variation [
68]. LegC mediates fusion with liposome vesicles containing the endosomal arginine (R)-SNARE vesicle-associated membrane protein 4 (VAMP4), providing extra membrane and cytosolic resources for bacterial niche expansion. However, LegC is an imperfect SNARE mimic. Once LegC fuses with the VAMP4 protein, it cannot be disassembled by the disassembly machinery and rendered inactive.
Many different actin- and microtubule-associated proteins are mimicked by pathogenic bacteria and viruses for movement within a eukaryotic cell [
69]. Actin nucleation proteins, such as the Wiskott-Aldrich syndrome protein (WASP) family, are essential for building and recycling the cytoskeleton and vesicles. Microtubule-associated proteins permit long-range directed transport of cellular cargos in normal cell biology. Elements in interspecific conflict have evolved mimics for both of these processes. The membrane-associated ActA protein from
Listeria monocytogenes is a structural mimic of WASP actin nucleation proteins, which enables the bacteria to polymerize actin, propelling themselves from cell to cell. Following phosphorylation by host casein kinase II (CK2), ActA interacts with the actin-related protein (ARP) 2/3 complex through a region that displays structural similarity to the C region of the VCA domain of WASP proteins to induce polymerization. While these phosphorylation sites are shared among divergent WASP proteins, the rest of the protein is unique to
L. monocytogenes, indicating that this mimic evolved by convergence [
70]. The microtubule cytoskeleton also offers opportunities for interspecific conflicts to co-opt cellular transport: vaccinia virus encodes a linker protein mimic that enables it to commandeer host kinesin heavy chain motor proteins for transport from the cytoplasm to the plasma membrane for cell exit (
Figure 4C). Similarly, HIV’s capsid protein contains a region that is a structural mimic of the microtubule-associated end-binding protein 1 (EB1), enabling it to bind to cytoplasmic linker protein 170 (CLIP-170) for dynein-mediated, minus-end-directed microtubule transport to the nucleus [
69].