1. Introduction
Rabies virus (RABV), also known as a rhabdovirus, causes rabies, a disease that is preventable (through prompt administration of post-exposure prophylaxis (PEP) to victims of bites by rabid animals [1]) but rarely curable [2]. Once symptoms start manifesting, the disease is nearly 100% fatal [3]. It was reported that RABV infection causes more than 55,000 deaths worldwide [4].
Rabies virus affects the central nervous system, causing acute infection [5]. Transmission of the virus usually happens through the bite of a rabid animal [2,3]. The enveloped virus has a rod- or bullet-like shape, and its genome is a single-stranded, negative-sense, linear, non-segmented RNA [6]. RABV belongs to the genus Lyssavirus of the Rhabdoviridae family, hence the name rhabdovirus [6,7].
The genome encodes five proteins: N (nucleoprotein), P (phosphoprotein), M (matrix protein), G (glycoprotein), and L (polymerase) [6]. The bullet-shaped virus is enclosed in a lipid envelope covered by the glycoprotein, which facilitates the attachment of the virus to host cell receptors and thus ensures viral entry. The helical ribonucleocapsid core is composed of the viral genome and the nucleoprotein [8].
Most often, exposure to RABV results from the bite or scratch of a rabid animal [2,6]. At the site of injury, the muscle cells of the new host become exposed to the saliva of the rabid animal, which contains rabies virus particles [9,10]. RABV initially replicates in the muscle cells, but its next destination is the peripheral nervous system [6,9,10]. The virus binds to receptors on the nerve endings of the peripheral nervous system near the site of infection [11,12]. From there, RABV moves along the peripheral nerves through axonal transport [11]. Then it moves to its main target, the central nervous system [2]. Once in the central nervous system of the host, RABV starts to replicate rapidly, spreading to the spinal cord and different parts of the brain and causing inflammation of the brain (encephalitis) [2].
The lifecycle of rabies virus as it enters the host cell can be divided into the following steps:
- Attachment: First, the G protein of the virus attaches to cell surface receptors [11];
- Endocytosis: The virus then enters the host cell through receptor-mediated endocytosis, as shown in Figure 2 [6,11];
- Uncoating: After entry via the endosome, the ribonucleoprotein complex of the virus is released into the cytoplasm [6,11];
- Transcription: Once in the cytoplasm, the RNA-dependent RNA polymerase transcribes the negative-sense genomic RNA into mRNA [6,11];
- Replication: A replicative intermediate is used to replicate progeny genome RNA [6,11];
- Translation: Viral mRNAs are used for the translation of the five major proteins (N, P, M, G, and L);
- Assembly: The viral components (genome and proteins) are assembled into new virions [11];
- Budding: Assembled virions bud off from the surface of host cells, acquiring their envelope from the host cell membrane [13];
- Release: Mature rabies virus is normally released from the cells through cell lysis and spreads through the central nervous system and brain to infect healthy cells [13].
During the assembly of virus progeny, some host proteins become integrated into the mature virion particles, which may help the virus to camouflage itself as host cells and escape the immune system [14]. In this article, we focus on the analysis of the intrinsic disorder of such host proteins entrapped in the virus particles. Knowing more about the intrinsic disorder of these proteins will help us understand the interaction of viruses with host cells, because intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) are highly flexible and can change their structure and function in response to different environments [15]. Therefore, protein intrinsic disorder can help viruses become more adaptable and flexible. Learning the strategies that viruses use to evade the immune system can also help us understand the pathogenesis of rabies virus in greater depth.
In this context, Yan Zhang and colleagues published a paper discussing the host proteins incorporated into RABV particles when they are released from the host cells [16]. The authors purified the viral particles to perform proteome profiling of RABV. They found that, along with the 5 main viral proteins, 49 host proteins are also integrated into viral particles, and 24 of these directly take part in viral replication, suggesting that the virus hijacks the host cellular machinery and interacts with host proteins for efficient replication [16]. An illustrative example is the integration of heat shock protein 70 (HSP70) into the mature RABV virion: decreasing the expression of HSP70 leads to a substantial reduction in the levels of viral RNAs, proteins, and virions [17]. This suggests that enveloped viruses in particular utilize host proteins to carry out their replication [16].
Rabies virus, which belongs to the Rhabdoviridae family, buds out of host cells using the host endosomal sorting complex required for transport (ESCRT) machinery [18,19]. Hijacking of the host ESCRT machinery plays a vital role in integrating host proteins into the virus particles [18,19]. Two important proteins in this respect are charged multivesicular body protein 4b (Chmp4b) and vacuolar protein sorting-associated protein 37B (Vps37b), both of which play crucial roles in the budding process during the virus life cycle [16]. Chmp4b is an essential component of the ESCRT-III complex, which is responsible for the final stages of budding [16]. Thus, this protein is involved in the final detachment of newly formed virions from the membrane of the host cell. Vps37b, on the other hand, is part of ESCRT-I, which takes part in the initial step of the viral budding process [16]. Therefore, these two proteins can serve as potential therapeutic targets.
The protocol utilized by Zhang and colleagues in this important study is outlined below [16]. The authors used nano-scale liquid chromatography-tandem mass spectrometry (nano LC-MS/MS) on purified viral particles to identify 49 virus-associated host proteins [16]. Western blotting was then used to validate the presence of these proteins in the mature viral particles [16]. They used the RABV CVS-11 strain to infect mouse Neuro-2A cells [16]. This step was crucial to obtain virus particles for further analysis [16]. In these experiments, cells were infected at 70% confluence, and the virus-containing supernatant was subjected to multiple rounds of differential centrifugation [16]. Samples containing bullet-shaped viruses were further purified by ultracentrifugation at 100,000 × g [16]. The purified viral particles were examined by transmission electron microscopy and cryo-electron microscopy to verify their purity and structural integrity [16].
At the next stage, the proteins were deglycosylated with the PNGase F enzyme to enhance the sensitivity of detection of peptides that contain glycosylation sites [16,20,21]. The proteins were separated by SDS-PAGE based on their molecular weight [16], followed by Western blotting using a monoclonal antibody against the RABV glycoprotein and a secondary antibody labeled with Alexa Fluor 680 for detection [16].
Next, proteomic analysis was conducted by nano LC-MS/MS on the peptide mixtures obtained through trypsin digestion of the deglycosylated protein samples [16]. Liquid chromatography-tandem mass spectrometry allowed for detailed analysis of the peptides, including their structure and sequence [16]. To ensure accuracy and reliability, the mass spectrometry data were then analyzed using the Andromeda search engine and MaxQuant software against specific databases, such as the UniProtKB mouse sequence database and the RABV (CVS-11) protein sequences from GenBank [16]. The authors employed intensity-based absolute quantification (iBAQ) to evaluate protein abundance and applied a false discovery rate (FDR) of 0.01% to ensure data accuracy [16]. To further increase confidence, Zhang et al. did not rely on a single experimental run. Instead, they observed proteins in three different assays [16]. The repetition helped to rule out false positives and enhanced the credibility of the results. To be identified as a high-confidence protein, each candidate had to fulfill three criteria: first, consistent identification in three independently purified virion preparations; second, an abundance exceeding the threshold of 10^6; and third, the presence of at least 2 unique peptides corresponding to the target protein [16].
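The three high-confidence criteria described above can be expressed as a simple filter. The following is only a minimal sketch: the `ProteinHit` record, its field names, and the example values are hypothetical and are not taken from the pipeline of Zhang et al.; the sketch also assumes that a single summary iBAQ value per protein is compared against the 10^6 threshold.

```python
from dataclasses import dataclass

# Thresholds taken from the three criteria described in the text.
REQUIRED_PREPARATIONS = 3      # independently purified virion preparations
IBAQ_THRESHOLD = 1e6           # abundance must exceed this value
MIN_UNIQUE_PEPTIDES = 2        # at least two unique peptides per protein


@dataclass(frozen=True)
class ProteinHit:
    """Hypothetical summary record for one candidate protein."""
    protein_id: str
    preparations_detected: int  # number of preparations in which it was identified
    ibaq: float                 # summary iBAQ abundance (assumed to be one value)
    unique_peptides: int        # number of unique peptides matched


def is_high_confidence(hit: ProteinHit) -> bool:
    """Return True only if all three criteria are met."""
    return (
        hit.preparations_detected >= REQUIRED_PREPARATIONS
        and hit.ibaq > IBAQ_THRESHOLD
        and hit.unique_peptides >= MIN_UNIQUE_PEPTIDES
    )


def filter_high_confidence(hits):
    """Keep the candidates that pass every criterion, preserving input order."""
    return [hit for hit in hits if is_high_confidence(hit)]
```

For example, a candidate seen in only two preparations, or one supported by a single unique peptide, would be dropped by `filter_high_confidence` regardless of its abundance.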
Then, to investigate the incorporation of host proteins into the virus particles, purified virus particles were treated with proteinase K, which digests loosely attached proteins and thereby helps to focus on proteins that are truly incorporated into the viral particles [16]. The treated virions were then processed to remove any cleaved peptides [16].
To prepare the cell extracts, two groups of N2a cells were prepared: one infected with CVS-11 and the other serving as a control. The supernatant of the cell extracts, which contains the proteins released from the cells, was collected after 72 hours [16]. Both the cell extracts and the virus particles (proteinase K-treated and untreated) were subjected to Western blotting. The goal was to identify the proteins in the virus particles and cell extracts using antibodies [16].
Viral proteins, such as glycoprotein G, nucleoprotein N, and matrix protein M, were detected using specific mouse polyclonal antibodies, while host cell proteins, such as Hsc70, cofilin, and Chmp4b, were probed with additional antibodies [16]. Finally, all these proteins were visualized using fluorescently labeled secondary antibodies [16]. This step was crucial to check whether host proteins were incorporated into the virus particles and whether they were incorporated firmly [16].
Moving on, Zhang et al. performed functional characterization of the 49 host proteins incorporated into the virus particles using the Gene Ontology database [16]. They aimed to gain a deep understanding of the complex interaction between host cells and RABV, and of the functional implications of these proteins in the virion particles, such as their involvement in viral processes like budding [16]. Protein-protein interaction network analysis was also carried out, and it strongly suggests that many of these host proteins are involved in viral budding, especially through the ESCRT machinery [16]. This implies that the virus might be exploiting these host proteins, mainly the ones involved in the ESCRT machinery, to exit the host cells, further assisting viral pathogenesis [16]. One important aspect was left unexplored by the authors, namely, the intrinsic disorder status of the host proteins entrapped in RABV particles.
Intrinsically disordered proteins (IDPs) are a class of biologically active proteins without unique structures [15,22,23,24,25]. In contrast to traditional ordered proteins, IDPs and intrinsically disordered regions (IDRs) lack well-defined three-dimensional structures and exist as highly dynamic conformational ensembles [15,22,23,24,25]. Intrinsic disorder is highly prevalent, and almost 70% of PDB structures have disordered regions [26,27]. IDPs are multifunctional proteins that can have multiple binding partners and are characterized by high sensitivity to subtle changes in local environmental conditions, such as pH and temperature, being capable of rapidly changing their structures in response to the external environment [15,22,23,24,25]. IDPs/IDRs have a large interface area dominated by hydrophobic-hydrophobic contacts. Unlike ordered proteins, IDPs have a weak hydrophobic core (if any), as their amino acid sequences have a low content of hydrophobic and aromatic residues and contain large numbers of charged and polar residues [15,28]. All these properties make intrinsically disordered proteins an integral part of the protein universe, with important biological functions that complement the functionality of ordered proteins. The flexibility and adaptability of IDPs make them suitable candidates to take part in diverse cellular functions, such as cell signaling, molecular recognition, and protein-protein interactions [29,30]. At the same time, the adaptable and flexible nature of IDPs also makes them important players in the pathogenesis of various diseases, such as cancer and neurodegenerative diseases [31,32,33,34,35].
In this study, to analyze the intrinsic disorder status of the host proteins entrapped in RABV, we used the data on the 47 high-confidence host proteins reported by Zhang et al. [16]. These entrapped proteins were subjected to multifactorial disorder analysis using a set of commonly used disorder predictors. Then, we conducted a more detailed bioinformatics characterization of the 5 entrapped proteins with the highest levels of predicted disorder.
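To make the two summary parameters used in the multifactorial analysis concrete, the sketch below computes the percent of predicted disordered residues (PPDR, residues scoring above 0.5) and the average disorder score (ADS) from a list of per-residue scores, as defined in the Figure 1 legend. The category cutoffs (10% and 30% for PPDR; 0.15 and 0.5 for ADS) are an assumption based on values commonly used in the disorder literature and are not stated in this text; all function names are illustrative.

```python
from statistics import fmean

DISORDER_THRESHOLD = 0.5  # residues scoring above this count as disordered

# Assumed category cutoffs (not given in the text): values below the first
# number are "mostly ordered", values at or above the second are "mostly disordered".
PPDR_CUTOFFS = (10.0, 30.0)  # percent of residues
ADS_CUTOFFS = (0.15, 0.5)    # mean per-residue score


def ppdr(scores, threshold=DISORDER_THRESHOLD):
    """Percent of predicted disordered residues in one protein."""
    if not scores:
        raise ValueError("scores must not be empty")
    return 100.0 * sum(score > threshold for score in scores) / len(scores)


def ads(scores):
    """Average disorder score over all residues of one protein."""
    return fmean(scores)


def _category(value, cutoffs):
    low, high = cutoffs
    if value < low:
        return "mostly ordered"
    if value < high:
        return "moderately disordered"
    return "mostly disordered"


def classify(scores):
    """Return the (PPDR-based, ADS-based) categories; the two may disagree."""
    return (
        _category(ppdr(scores), PPDR_CUTOFFS),
        _category(ads(scores), ADS_CUTOFFS),
    )
```

Returning the two categories separately mirrors the plot, where the background shading distinguishes proteins for which the two parameters agree from those for which they do not.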
Figure 1.
Multifactorial intrinsic disorder analysis of mouse proteins entrapped in RABV particles. A. PONDR® VSL2 score vs. PONDR® VSL2 (%) analysis. PONDR® VSL2 (%) is the percent of predicted disordered residues (PPDR), i.e., residues with disorder scores above 0.5. PONDR® VSL2 score is the average disorder score (ADS) for a protein. Color blocks indicate regions in which proteins are mostly ordered (blue and light blue), moderately disordered (pink and light pink), or mostly disordered (red). If the two parameters agree, the corresponding part of the background is dark (blue or pink), whereas light blue and light pink reflect areas in which the two parameters disagree with each other. The boundaries of the colored regions represent arbitrary and accepted cutoffs for ADS (y-axis) and the percentage of predicted disordered residues (PPDR; x-axis). B. Charge-Hydropathy and Cumulative Distribution Function (CH-CDF) analysis of entrapped host proteins. The CH-CDF plot is a two-dimensional representation that integrates both the CH plot, which correlates a protein's net charge and hydrophobicity with its structural order, and the CDF, which accumulates disorder predictions from the N-terminus to the C-terminus of a protein, offering insight into the distribution of disordered residues. The Y-axis (ΔCH) represents the protein's distance from the CH boundary, indicating the balance between charge and hydrophobicity, while the X-axis (ΔCDF) represents the deviation of a protein's disorder frequency from the CDF boundary. Proteins are then stratified into four quadrants: Quadrant 1 (bottom right) indicates proteins likely to be structured; Quadrant 2 (bottom left) includes proteins that may be in a molten globule state or lack a unique 3D structure; Quadrant 3 (top left) consists of proteins predicted to be highly disordered; Quadrant 4 (top right) captures proteins that present a mixed prediction of being disordered according to CH but ordered according to CDF.
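The quadrant assignment described in the legend of Figure 1B can be written as a short function. This is a sketch under stated assumptions: the function name is hypothetical, and points lying exactly on a boundary (ΔCH = 0 or ΔCDF = 0) are assigned to the upper and right-hand side, respectively, a convention the legend does not specify.

```python
QUADRANT_MEANING = {
    1: "likely structured (ordered by both CH and CDF)",
    2: "molten globule-like or lacking a unique 3D structure",
    3: "highly disordered (disordered by both CH and CDF)",
    4: "mixed: disordered according to CH but ordered according to CDF",
}


def ch_cdf_quadrant(delta_cdf: float, delta_ch: float) -> int:
    """Map a (ΔCDF, ΔCH) point to its CH-CDF quadrant number.

    ΔCDF is the x-coordinate and ΔCH is the y-coordinate, as in the legend.
    Boundary handling (values equal to zero) is an assumption of this sketch.
    """
    right = delta_cdf >= 0
    top = delta_ch >= 0
    if not top:
        return 1 if right else 2   # bottom right / bottom left
    return 4 if right else 3       # top right / top left
```

For instance, `QUADRANT_MEANING[ch_cdf_quadrant(-0.2, 0.1)]` describes a protein located in the top-left quadrant.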
Figure 3.
Correlation between the intrinsic disorder levels in the host proteins entrapped in RABV particles and their interactivity within the intra-set PPI (A) and predisposition to be involved in liquid-liquid phase separation, LLPS (B). Solid lines in both plots show linear fits of the reported data, whereas short-long-dashed lines represent boundaries between different disorder categories, as well as between hubs and non-hubs (A) and LLPS promoters and other proteins (B).
Figure 4.
Functional disorder analysis of mouse neuromodulin (UniProt ID: P06837). A. Per-residue disorder profile generated by RIDAO showing that a major portion of this protein has predicted disorder values above the established threshold (0.5). B. Functional disorder profile generated for neuromodulin by the D2P2 database showing the outputs of several disorder predictors, such as VLXT, VSL2b, PrDOS, IUPred, and Espritz. The colored bar highlighted by blue and green shades represents the disorder prediction, and the colored circles below the bar show the predicted PTMs. C. The FuzDrop-generated plot showing the sequence distribution of the residue-based droplet-promoting probabilities, pDP. D. The FuzDrop-generated plot of the multiplicity of binding modes showing positions of regions that can sample multiple binding modes in a manner dependent on the cellular context (sub-cellular localization, partners, posttranslational modifications) (residues 9-16 and 40-66). E. Protein-protein interaction network generated by STRING. This PPI network was generated using the minimum required interaction score of 0.4 (medium confidence) and setting the maximum number of interactors to 500. Network nodes represent individual proteins and edges represent protein-protein interactions for shared function, with line colors indicating the type of interaction: blue for curated databases, black for co-expression, and green for gene neighborhood. F. 3D structural model predicted by AlphaFold. The structure is colored according to the per-residue model confidence score, ranging from orange (very low confidence, pLDDT < 50) to blue (very high confidence, pLDDT > 90).
Figure 5.
Functional Disorder Analysis of mouse Chmp4b (UniProt ID: Q9D8B3). A. Per-residue disorder profile generated by RIDAO. B. Functional disorder profile generated by D2P2. C. Per-residue LLPS potential as estimated by FuzDrop, demonstrating the tendency of each residue to promote droplet formation. D. Multiplicity of Binding Modes plot generated by FuzDrop. E. The PPI network generated utilizing STRING by setting the maximum number of interactors to 500. F. 3D structural model generated by AlphaFold. The structure is colored according to the per-residue model confidence score, ranging from orange (very low confidence, pLDDT < 50) to blue (very high confidence, pLDDT > 90).
Figure 6.
Functional Disorder Analysis of protein DnaJ homolog subfamily B member 6 (UniProt ID: O54946). A. RIDAO-generated per-residue disorder profile. B. Disorder-based functionality evaluated by D2P2. C. Per-residue LLPS potential as estimated by FuzDrop, demonstrating the tendency of each residue to promote droplet formation. D. Multiplicity of Binding Modes plot generated by FuzDrop. E. The PPI network generated utilizing STRING by setting the maximum number of interactors to 500. F. 3D structural model generated by AlphaFold. The structure is colored according to the per-residue model confidence score (pLDDT), with fragments of structure with very low (pLDDT < 50), low (70 > pLDDT > 50), high (90 > pLDDT > 70), and very high (pLDDT > 90) confidence being shown in orange, yellow, cyan, and blue, respectively.
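The pLDDT confidence bands and colors listed in the legend of Figure 6F can be captured in a small lookup function. This is an illustrative sketch only: the function name is hypothetical, and the legend does not say to which band the exact boundary values (50, 70, and 90) belong, so their assignment here is an assumption.

```python
def plddt_band(plddt: float):
    """Return the (confidence label, color) pair for a per-residue pLDDT value.

    Bands follow the Figure 6 legend; the handling of the exact values
    50, 70, and 90 is an assumption of this sketch.
    """
    if not 0.0 <= plddt <= 100.0:
        raise ValueError("pLDDT is expected to lie between 0 and 100")
    if plddt < 50:
        return ("very low", "orange")
    if plddt < 70:
        return ("low", "yellow")
    if plddt <= 90:
        return ("high", "cyan")
    return ("very high", "blue")


def fraction_very_low(plddt_values):
    """Fraction of residues in the very low confidence band (pLDDT < 50)."""
    if not plddt_values:
        raise ValueError("plddt_values must not be empty")
    return sum(value < 50 for value in plddt_values) / len(plddt_values)
```

A helper such as `fraction_very_low` is one simple way to summarize how much of a model falls into the very low confidence band.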
Figure 7.
Functional Disorder Analysis of protein Vps37b (UniProt ID: Q8R0J7). A. Per-residue disorder profile generated by the RIDAO platform. B. Functional disorder profile generated by the D2P2 database. C. Per-residue droplet formation propensity generated by FuzDrop. D. Multiplicity of Binding Modes plot generated by FuzDrop. E. Protein-protein interaction network generated for this protein utilizing the STRING database. F. 3D structural model predicted by AlphaFold. The structure is colored according to the per-residue model confidence score, ranging from orange (pLDDT < 50) to blue (pLDDT > 90).
Figure 8.
Distribution of ELMs (short linear functional motifs) within the sequence of the mouse Vps37B protein. Refer to the additional information provided in
Supplementary Table S9.
Figure 9.
Functional Disorder Analysis of mouse protein Wasl (UniProt ID: Q91YD9). A. Multiparametric intrinsic disorder profile generated by RIDAO. B. D2P2-generated functional disorder profile. C. Residue-based LLPS propensity. D. Multiplicity of Binding Modes plot. E. Wasl-centered PPI network generated utilizing the STRING database by setting the maximum number of interactors to 500. F. 3D structural model as predicted by AlphaFold. The structure is colored according to the per-residue model confidence score, ranging from orange (very low confidence, pLDDT < 50) to blue (very high confidence, pLDDT > 90).
Figure 10.
Distribution of ELMs (short linear functional motifs) within the sequence of the mouse Wasl protein (UniProt ID: Q91YD9). For additional information see
Supplementary Table S10.
Figure 11.
Intra-set interactivity of the 11 most disordered mouse proteins entrapped in the RABV particles. Networks are constructed by STRING using medium confidence of 0.4 (A) and low confidence of 0.15 (B).
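As an illustration of how the two confidence settings in Figure 11 map onto a STRING query, the sketch below builds (but does not send) a request URL for the public STRING REST API `network` method. It assumes the commonly documented parameters `identifiers`, `species`, and `required_score` (the latter on a 0-1000 scale, so 0.4 becomes 400 and 0.15 becomes 150). The function name and the gene symbols in the usage note are illustrative; the networks in this study may well have been generated through the STRING web interface instead.

```python
from urllib.parse import urlencode

STRING_API_ROOT = "https://string-db.org/api"
MOUSE_TAXON_ID = 10090  # NCBI taxonomy identifier for Mus musculus


def string_network_url(identifiers, confidence, species=MOUSE_TAXON_ID,
                       output_format="tsv"):
    """Build a STRING 'network' request URL without contacting the server.

    `confidence` is the minimum required interaction score on the 0-1 scale
    used in the figure legends; STRING expects it on a 0-1000 scale.
    """
    if not identifiers:
        raise ValueError("at least one identifier is required")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie between 0 and 1")
    params = {
        # STRING separates multiple identifiers with a carriage return (%0D).
        "identifiers": "\r".join(identifiers),
        "species": species,
        "required_score": round(confidence * 1000),
    }
    return f"{STRING_API_ROOT}/{output_format}/network?{urlencode(params)}"
```

For example, `string_network_url(["Gap43", "Chmp4b", "Dnajb6", "Vps37b", "Wasl"], 0.4)` and the same call with `0.15` would yield the medium-confidence and low-confidence queries for the five proteins discussed in Figures 4-9; the full set of 11 proteins is not listed in this section.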
Figure 12.
Global interactivity of the 11 most disordered mouse proteins found in the RABV particles. Using k-means clustering (the algorithm, which is included in STRING, automatically assigns data points to one of the K clusters depending on their distance from the center of the clusters), this PPI network can be divided into three clusters.
Table 1.
Localization of ELMs (Eukaryotic Linear Motifs) within the Droplet Promoting Regions, Aggregation Hot-spots and MoRFs of mouse Neuromodulin (UniProt ID: P06837). For additional information, see
Supplementary Table S6.
| Region Type | Region Range | ELM ID | Position |
|---|---|---|---|
| Droplet Promoting region | 52-277 | LIG_PDZ_Class_3 | 222-227 |
| | | LIG_WD40_WDR5_VDV_2 | 219-222, 218-222, 215-222, 155-161, 154-161, 133-137, 132-137, 131-137, 96-99, 95-99, 64-66, 58-64 |
| | | MOD_GlcNHglycan | 209-212, 132-135, 127-130, 85-88, 84-88 |
| | | MOD_SUMO_rev_2 | 203-207, 200-207, 198-207, 196-201, 193-201, 192-201, 191-201, 154-159, 149-159, 122-126, 118-126 |
| | | CLV_C14_Caspase3-7 | 197-201 |
| | | DOC_USP7_MATH_1 | 207-211, 190-194, 119-123 |
| | | MOD_CK2_1 | 190-196, 142-148 |
| | | MOD_GSK3_1 | 186-193, 135-142 |
| | | MOD_PIKK_1 | 190-196 |
| | | LIG_TRAF6_MATH_1 | 184-192 |
| | | DOC_WW_Pin1_4 | 169-174, 139-144, 93-98 |
| | | MOD_ProDKin_1 | 169-175, 139-145, 93-99 |
| | | DOC_USP7_UBL2_3 | 153-157 |
| | | MOD_SUMO_for_1 | 152-155, 97-100, 25-28 |
| | | MOD_CK1_1 | 142-148, 128-134, 86-92 |
| | | LIG_BIR_III_2 | 118-122 |
| | | MOD_Plk_2-3 | 107-113 |
| | | MOD_CDK_SPK_2 | 93-98 |
| MoRF | 102-109 | MOD_Plk_2-3 | 107-113 |
| MoRF | 58-81 | LIG_WD40_WDR5_VDV_2 | 58-64, 63-66 |
| Aggregation Hotspot | 52-66 | LIG_WD40_WDR5_VDV_2 | 58-64, 63-66 |
| MoRF | 1-9 | LIG_UBA3_1 | 1-9 |
| | | LIG_FHA_1 | 6-12 |
| | | MOD_PKA_2 | 5-11 |
| | | CLV_NRD_NRD_1 | 6-8 |
| | | CLV_PCSK_KEX2_1 | 6-8 |
| | | TRG_ER_diArg_1 | 5-7 |
| | | DEG_Nend_Nbox_1 | 1-3 |
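The mapping summarized in Table 1 amounts to an interval-overlap test between a functional region and each motif position. The sketch below shows one way to perform this test; it assumes that a motif is assigned to a region whenever the two residue ranges overlap (full containment is not required, as suggested by entries such as MOD_Plk_2-3 at 107-113 within the MoRF at 102-109). The function names are hypothetical.

```python
def parse_range(text):
    """Turn a 'start-end' residue range such as '102-109' into (102, 109)."""
    start, end = (int(part) for part in text.split("-"))
    if start > end:
        raise ValueError(f"invalid residue range: {text!r}")
    return start, end


def overlaps(first, second):
    """True if two inclusive residue ranges share at least one residue."""
    return first[0] <= second[1] and second[0] <= first[1]


def motifs_overlapping(region, motifs):
    """Select the (ELM ID, position) pairs whose position overlaps the region.

    `region` is a 'start-end' string; `motifs` is an iterable of
    (elm_id, 'start-end') pairs. Overlap, not containment, is the assumed rule.
    """
    region_span = parse_range(region)
    return [
        (elm_id, position)
        for elm_id, position in motifs
        if overlaps(region_span, parse_range(position))
    ]
```

Applied to three neuromodulin motifs from Table 1, MOD_Plk_2-3 (107-113), LIG_BIR_III_2 (118-122), and MOD_CDK_SPK_2 (93-98), only the first overlaps the MoRF at residues 102-109, whereas all three fall within the droplet-promoting region at residues 52-277.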
Table 2.
Distribution of ELMs (Eukaryotic Linear Motifs) in Droplet Promoting Regions, Aggregation Hot-spots, regions with multiplicity of binding modes, and MoRFs (Molecular recognition features) of the protein Chmp4b (UniProt ID: Q9D8B3). The table summarizes the ELMs mapped onto these regions, suggesting potential functional roles of these motifs. For additional information, see Supplementary Table S7.
| Region Type | Region Range | ELM ID | Position |
|---|---|---|---|
| MoRF | 1-12 | LIG_BIR_II_1 | 1-5 |
| | | LIG_LIR_Nem_3 | 2-7 |
| | | LIG_Pex14_2 | 4-8 |
| Droplet-promoting region | 1-22 | LIG_BIR_II_1 | 1-5 |
| | | LIG_LIR_Nem_3 | 2-7 |
| | | LIG_Pex14_2 | 4-8 |
| | | DOC_WW_Pin1_4 | 18-23 |
| | | MOD_ProDKin_1 | 18-24 |
| | | LIG_FHA_2 | 19-25 |
| Region with multiplicity of binding modes | 27-32 | MOD_PKA_2 | 29-35 |
| Region with multiplicity of binding modes | 39-82 | MOD_SUMO_rev_2 | 41-47 |
| | | TRG_NLS_Bipartite_1 | 55-75 |
| | | DOC_USP7_UBL2_3 | 56-60 |
| | | CLV_PCSK_PC1ET2_1 | 62-64 |
| | | TRG_NLS_MonoExtN_4 | 70-75 |
| | | TRG_NLS_MonoCore_2 | 69-74, 70-75 |
| | | TRG_NLS_MonoExtC_3 | 69-74, 70-75 |
| Aggregation Hotspot | 54-62 | DOC_USP7_UBL2_3 | 56-60 |
| | | CLV_PCSK_PC1ET2_1 | 62-64 |
| MoRF | 108-118 | LIG_SH2_STAP1 | 111-115 |
| | | LIG_WD40_WDR5_VDV_2 | 111-115 |
| | | CLV_PCSK_SKI1_1 | 114-118 |
| MoRF | 141-200 | MOD_GSK3_1 | 143-150, 181-188 |
| | | DOC_PP1_RVXF_1 | 149-156 |
| | | LIG_Pex14_2 | 155-159 |
| | | LIG_WD40_WDR5_VDV_2 | 161-168, 162-168, 163-168, 177-183 |
| | | CLV_PCSK_SKI1_1 | 178-182 |
| | | TRG_Pf-PMV_PEXEL_1 | 178-182 |
| | | LIG_SUMO_SIM_par_1 | 179-184 |
| | | MOD_CK2_1 | 181-187 |
| | | MOD_GlcNHglycan | 182-186, 183-186 |
| | | LIG_FHA_1 | 186-192 |
| | | LIG_SH3_3 | 186-192, 189-195, 194-200, 197-203 |
| | | DOC_USP7_MATH_1 | 198-202 |
| Region with multiplicity of binding modes | 183-190 | LIG_WD40_WDR5_VDV_2 | 177-183 |
| | | TRG_Pf-PMV_PEXEL_1 | 178-182 |
| | | LIG_SUMO_SIM_par_1 | 179-184 |
| | | MOD_CK2_1 | 181-187 |
| | | MOD_GlcNHglycan | 182-186, 183-186 |
| | | LIG_FHA_1 | 186-192 |
| | | LIG_SH3_3 | 186-192, 189-195 |
| Region with multiplicity of binding modes | 197-207 | DOC_USP7_MATH_1 | 198-202 |
| | | LIG_SH3_2 | 200-205 |
| | | CLV_PCSK_SKI1_1 | 202-206 |
| | | DOC_USP7_UBL2_3 | 202-206 |
| Aggregation Hotspot | 197-207 | DOC_USP7_MATH_1 | 198-202 |
| | | LIG_SH3_2 | 200-205 |
| | | CLV_PCSK_SKI1_1 | 202-206 |
| | | DOC_USP7_UBL2_3 | 202-206 |
| Droplet-promoting region | 190-224 | LIG_FHA_1 | 186-192 |
| | | LIG_SH3_3 | 186-192, 189-195, 194-200, 197-203 |
| | | DOC_USP7_MATH_1 | 198-202 |
| | | LIG_SH3_2 | 200-205 |
| | | CLV_PCSK_SKI1_1 | 202-206 |
| | | DOC_USP7_UBL2_3 | 202-206 |
| | | LIG_SH3_4 | 202-209 |
| | | TRG_NESrev_CRM1_2 | 208-217, 209-217, 210-217, 211-217, 212-217 |
| Aggregation Hotspot | 211-217 | TRG_NESrev_CRM1_2 | 208-217, 209-217, 210-217, 211-217, 212-217 |
MoRF |
214-224 |
TRG_NESrev_CRM1_2 |
208-217 |
209-217 |
210-217 |
211-217 |
212-217 |
MOD_SUMO_rev_2 |
212-217 |
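The mapping of ELMs onto predicted regions summarized in these tables can be illustrated with a short sketch. It assumes that a motif is assigned to a region whenever their residue ranges overlap by at least one position (an assumption made for illustration; the exact criterion is not stated here). The names `Span`, `overlaps`, and `map_motifs` are hypothetical helpers, and the example values are taken from the first rows of Table 2.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """A named, 1-based, inclusive residue range (hypothetical helper type)."""
    name: str
    start: int
    end: int


def overlaps(a: Span, b: Span) -> bool:
    """Return True if two inclusive ranges share at least one residue."""
    return a.start <= b.end and b.start <= a.end


def map_motifs(regions, motifs):
    """Map each region to the names of the motifs overlapping it.

    Assumption: overlap by a single residue is enough to assign a motif.
    """
    return {
        (r.name, r.start, r.end): [m.name for m in motifs if overlaps(r, m)]
        for r in regions
    }


# Example values copied from Table 2 (Chmp4b).
regions = [
    Span("MoRF", 1, 12),
    Span("Region with multiplicity of binding modes", 27, 32),
]
motifs = [
    Span("LIG_BIR_II_1", 1, 5),
    Span("LIG_LIR_Nem_3", 2, 7),
    Span("LIG_Pex14_2", 4, 8),
    Span("DOC_WW_Pin1_4", 18, 23),
    Span("MOD_PKA_2", 29, 35),
]

mapping = map_motifs(regions, motifs)
for region, names in mapping.items():
    print(region, names)
```

Under this overlap rule, a motif such as MOD_PKA_2 (29-35) is assigned to the region 27-32 even though it extends beyond the region boundary, which is consistent with several rows in the tables.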
Table 3.
Distribution of ELMs (Eukaryotic Linear Motifs) in droplet-promoting regions, aggregation hot-spots, regions with multiplicity of binding modes, and MoRFs (molecular recognition features) of the protein DnaJ homolog subfamily B member 6 (UniProt ID: O54946). The table summarizes the ELMs mapped onto these regions, suggesting potential functional roles for these motifs. For additional information, see Supplementary Table S8.
Region Type | Region Range | ELM ID | Position
Region with multiplicity of binding modes | 14-23 | DOC_WW_Pin1_4 | 12-17
Region with multiplicity of binding modes | 14-23 | CLV_NRD_NRD_1 | 23-25
Region with multiplicity of binding modes | 39-55 | CLV_NRD_NRD_1 | 43-45
Region with multiplicity of binding modes | 39-55 | CLV_PCSK_SKI1_1 | 44-48
Droplet Promoting Region | 58-94 | LIG_LIR_Nem_3 | 63-68
Droplet Promoting Region | 58-94 | DOC_WW_Pin1_4 | 83-88
Droplet Promoting Region | 58-94 | DOC_PP4_FxxP_1 | 84-87
Region with multiplicity of binding modes | 57-69 | TRG_Pf-PMV_PEXEL_1 | 62-66
Region with multiplicity of binding modes | 57-69 | LIG_LIR_Nem_3 | 63-68
Aggregation hotspot | 58-69 | TRG_Pf-PMV_PEXEL_1 | 62-66
Aggregation hotspot | 58-69 | LIG_LIR_Nem_3 | 63-68
Droplet Promoting Region | 58-94 | TRG_Pf-PMV_PEXEL_1 | 62-66
Droplet Promoting Region | 58-94 | DOC_PP4_FxxP_1 | 84-87, 94-97
Region with multiplicity of binding modes | 83-90 | DOC_WW_Pin1_4 | 83-88
Region with multiplicity of binding modes | 83-90 | MOD_ProDKin_1 | 83-89
Region with multiplicity of binding modes | 83-90 | DOC_PP4_FxxP_1 | 84-87
Aggregation hotspot | 83-90 | DOC_WW_Pin1_4 | 83-88
Aggregation hotspot | 83-90 | MOD_ProDKin_1 | 83-89
Aggregation hotspot | 83-90 | DOC_PP4_FxxP_1 | 84-87
Region with multiplicity of binding modes | 93-131 | DOC_PP4_FxxP_1 | 84-87, 94-97
Region with multiplicity of binding modes | 93-131 | CLV_PCSK_SKI1_1 | 102-106
Aggregation hotspot | 105-114 | LIG_BRCT_BRCA1_1 | 111-115
Aggregation hotspot | 105-114 | LIG_AP2alpha_2 | 109-111
Aggregation hotspot | 119-131 | DOC_PP4_FxxP_1 | 116-119
Aggregation hotspot | 119-131 | LIG_AP2alpha_1 | 116-120, 120-124
Aggregation hotspot | 119-131 | LIG_AP2alpha_2 | 118-120
Aggregation hotspot | 119-131 | CLV_NRD_NRD_1 | 127-129
Aggregation hotspot | 119-131 | CLV_PCSK_KEX2_1 | 127-129
Droplet Promoting Region | 119-185 | DOC_PP4_FxxP_1 | 116-119
Droplet Promoting Region | 119-185 | LIG_AP2alpha_1 | 116-120, 120-124
Droplet Promoting Region | 119-185 | CLV_NRD_NRD_1 | 127-129
Droplet Promoting Region | 119-185 | CLV_PCSK_KEX2_1 | 127-129
Droplet Promoting Region | 119-185 | LIG_Arc_Nlobe_1 | 148-152, 155-159
Droplet Promoting Region | 119-185 | DOC_USP7_MATH_1 | 164-168
Droplet Promoting Region | 119-185 | LIG_BRCT_BRCA1_1 | 177-181
Aggregation hotspot | 156-185 | LIG_Arc_Nlobe_1 | 155-159
Aggregation hotspot | 156-185 | DOC_WW_Pin1_4 | 160-165
Aggregation hotspot | 156-185 | DOC_USP7_MATH_1 | 164-168
Aggregation hotspot | 156-185 | LIG_BRCT_BRCA1_1 | 177-181
Region with multiplicity of binding modes | 156-203 | DOC_USP7_MATH_1 | 164-168
Region with multiplicity of binding modes | 156-203 | LIG_BRCT_BRCA1_1 | 177-181
Region with multiplicity of binding modes | 156-203 | CLV_PCSK_SKI1_1 | 202-206
Region with multiplicity of binding modes | 156-203 | DOC_USP7_UBL2_3 | 203-207
Region with multiplicity of binding modes | 206-211 | CLV_PCSK_SKI1_1 | 202-206
Region with multiplicity of binding modes | 206-211 | DOC_USP7_UBL2_3 | 203-207
Region with multiplicity of binding modes | 206-211 | CLV_PCSK_KEX2_1 | 207-209
MoRF | 223-278 | CLV_NRD_NRD_1 | 245-247
MoRF | 223-278 | CLV_PCSK_SKI1_1 | 226-230
Region with multiplicity of binding modes | 227-237 | CLV_PCSK_SKI1_1 | 226-230
Droplet Promoting Region | 233-365 | CLV_NRD_NRD_1 | 245-247
Droplet Promoting Region | 233-365 | DEG_ODPH_VHL_1 | 253-264
Droplet Promoting Region | 233-365 | DOC_USP7_MATH_1 | 291-295, 293-297, 334-338
Droplet Promoting Region | 233-365 | DEG_SCF_FBW7_1 | 271-278, 273-278, 275-282, 277-282, 287-294
Droplet Promoting Region | 233-365 | DOC_USP7_UBL2_3 | 310-314, 341-345, 348-352, 352-356, 358-362
Region with multiplicity of binding modes | 241-250 | CLV_NRD_NRD_1 | 245-247
Region with multiplicity of binding modes | 241-250 | DOC_CKS1_1 | 248-253
Aggregation hotspot | 241-250 | CLV_NRD_NRD_1 | 245-247
Aggregation hotspot | 241-250 | DOC_CKS1_1 | 248-253
MoRF | 282-298 | DEG_SCF_FBW7_1 | 275-282, 277-282, 287-294
MoRF | 282-298 | DOC_WW_Pin1_4 | 279-284, 287-292
MoRF | 282-298 | DOC_USP7_MATH_1 | 291-295, 293-297
MoRF | 282-298 | LIG_WD40_WDR5_VDV_2 | 290-295
Region with multiplicity of binding modes | 316-323 | MOD_CK2_1 | 312-318
Region with multiplicity of binding modes | 316-323 | DOC_ANK_TNKS_1 | 323-330
Aggregation hotspot | 316-323 | MOD_CK2_1 | 312-318
Aggregation hotspot | 316-323 | DOC_ANK_TNKS_1 | 323-330
Aggregation hotspot | 345-353 | DOC_USP7_UBL2_3 | 341-345, 348-352, 352-356
Aggregation hotspot | 345-353 | TRG_NLS_Bipartite_1 | 345-361, 346-361, 347-361
Aggregation hotspot | 345-353 | CLV_NRD_NRD_1 | 345-347
Aggregation hotspot | 345-353 | CLV_PCSK_KEX2_1 | 345-347
Region with multiplicity of binding modes | 345-353 | DOC_USP7_UBL2_3 | 341-345, 348-352, 352-356
Region with multiplicity of binding modes | 345-353 | TRG_NLS_Bipartite_1 | 345-361, 346-361, 347-361
Region with multiplicity of binding modes | 345-353 | CLV_NRD_NRD_1 | 345-347
Region with multiplicity of binding modes | 345-353 | CLV_PCSK_KEX2_1 | 345-347
MoRF | 305-365 | DOC_USP7_UBL2_3 | 310-314, 341-345, 348-352, 352-356, 358-362
MoRF | 305-365 | TRG_NLS_Bipartite_1 | 345-361, 346-361, 347-361
MoRF | 305-365 | DOC_USP7_MATH_1 | 334-338
MoRF | 305-365 | CLV_NRD_NRD_1 | 345-347
MoRF | 305-365 | CLV_PCSK_KEX2_1 | 345-347