2.2. Protein-Protein Docking Simulation
The protein-protein docking simulations conducted in this study aimed to uncover the complex interactions between SOCS2 and proteins or peptides derived from earthworms of the
Lumbricus genus, focusing specifically on identifying potential therapeutic candidates for cardiovascular diseases.
Figure 2 illustrates the optimal binding poses of a standard agonist, a standard antagonist, and the top-performing
Lumbricus-derived protein. The molecular docking results provide a comprehensive overview of the binding affinity, structural parameters, and clustering characteristics of the various SOCS2 complexes formed with proteins and peptides from earthworms. Specifically, comparisons were made with standard agonist (SOCS2: EpoR peptide) and antagonist (SOCS2: N4BP1) complexes, which served as benchmarks for assessing the effectiveness of the
Lumbricus-derived compounds. A critical measure of the interaction strength between molecules is the binding affinity, quantified by the Gibbs free energy of binding (ΔG). Lower (more negative) ΔG values indicate stronger binding, suggesting more stable and favorable interactions between the proteins or peptides and SOCS2. The comparison between the top-performing proteins and peptides derived from
Lumbricus and the standard agonist (SOCS2: EpoR peptide) and antagonist (SOCS2:N4BP1) complexes based on binding affinity provides valuable insights into the potential efficacy of
Lumbricus-derived bioactive compounds in modulating SOCS2 activity compared to established proteins.
First, the standard agonist complex, SOCS2:EpoR peptide, had a ΔG of -8.80 kcal/mol and a dissociation constant (Kd) of 6.50e-7 M. This indicates a strong interaction between SOCS2 and the EpoR peptide, consistent with its role as the reference agonist for SOCS2 activity modulation. In contrast, the standard antagonist complex, SOCS2:N4BP1, exhibited a slightly weaker binding affinity, with a ΔG of -8.30 kcal/mol and a Kd of 1.50e-6 M. Despite its weaker affinity compared to the agonist, N4BP1 still displayed a notable interaction with SOCS2, consistent with its role as the reference antagonist for SOCS2 activity inhibition. Among the
Lumbricus-derived complexes, Cytochrome b, SCBP3 protein, Lumbricin, Chemoattractive glycoprotein ES20, Histone H3, and Lumbrokinase-7T1 emerged as top-performing candidates based on their low ΔG values. Cytochrome b exhibited a ΔG of -15.1 kcal/mol, well below the values of both the standard agonist (-8.80 kcal/mol) and the standard antagonist (-8.30 kcal/mol). This suggests that Cytochrome b may interact more strongly with SOCS2 than either EpoR peptide or N4BP1, potentially surpassing their efficacy in modulating SOCS2 activity. Similarly, SCBP3 protein, Lumbricin, and Chemoattractive glycoprotein ES20 displayed notable binding affinities, with ΔG values of -12.6 kcal/mol, -12.3 kcal/mol, and -12.2 kcal/mol, respectively (
Table 1 and
Figure 3c). These values compare favorably with the binding affinities of the standard agonist and antagonist, underscoring the potential of these
Lumbricus-derived compounds as effective modulators of SOCS2 activity for addressing cardiovascular diseases. Complete docking results are available in
Supplementary Data 2. This dataset includes complex identifiers, HADDOCK scores, binding affinities (ΔG), dissociation constants (Kd), cluster sizes, RMSD values, van der Waals energies, electrostatic energies, desolvation energies, restraints violation energies, buried surface areas, and Z-scores.
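The ΔG and Kd values quoted above are linked by the standard thermodynamic relation ΔG = RT ln(Kd). The following minimal sketch illustrates that conversion. The temperature of 310.15 K (37 °C) is an assumption, since this section does not state the temperature used, and the function names are hypothetical. Under that assumption the computed Kd values fall close to the reported ones, with small differences expected from rounding of ΔG.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def kd_from_dg(dg_kcal_per_mol, temp_kelvin=310.15):
    """Dissociation constant (M) from a binding free energy, using dG = RT ln(Kd).

    temp_kelvin defaults to 310.15 K, which is an assumption and is not
    taken from the text.
    """
    return math.exp(dg_kcal_per_mol / (R_KCAL * temp_kelvin))

def dg_from_kd(kd_molar, temp_kelvin=310.15):
    """Binding free energy (kcal/mol) from a dissociation constant (M)."""
    return R_KCAL * temp_kelvin * math.log(kd_molar)

# dG values as reported in the text for three of the complexes.
reported_dg = [
    ("SOCS2:EpoR peptide", -8.80),
    ("SOCS2:N4BP1", -8.30),
    ("SOCS2:Cytochrome b", -15.1),
]
for name, dg in reported_dg:
    print(f"{name}: dG = {dg} kcal/mol -> Kd ~ {kd_from_dg(dg):.2e} M")
```

Because Kd depends exponentially on ΔG, a difference of a few kcal/mol corresponds to a change of several orders of magnitude in Kd.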
Furthermore, the cluster size and RMSD values provide crucial structural insights into the protein-protein complexes. Cluster size refers to the number of distinct conformations or poses observed within the ensemble of generated protein-protein complexes [
34]. A larger cluster size indicates greater structural diversity, suggesting multiple binding modes or orientations between the proteins and peptides from
Lumbricus and SOCS2. This diversity within the complexes reflects potential variations in interaction configurations, influencing their functional properties and biological impacts [
35]. For instance, the standard agonist SOCS2:EpoR peptide complex has a cluster size of 22, indicating the presence of diverse conformations or binding modes between SOCS2 and EpoR peptide. This diversity suggests the potential for versatile interactions, which could play crucial roles in various cellular processes regulated by ubiquitination. The standard antagonist SOCS2:N4BP1 complex has a slightly smaller cluster size of 19, implying somewhat fewer distinct binding configurations than the agonist complex. This difference in cluster size may reflect the specific nature of the antagonist interaction and its regulatory role in modulating SOCS2 activity. Among the
Lumbricus-derived protein/peptide complexes, cluster sizes vary considerably. For instance, the complexes involving SCBP3 protein and Lumbricin exhibit larger cluster sizes (41 and 43, respectively) than either standard complex, suggesting structural diversity and potential functional versatility in their interactions with SOCS2. On the other hand, complexes such as SOCS2:Lysosomal membrane glycoprotein and SOCS2:Extracellular globin-4 have smaller cluster sizes (5 and 6, respectively), indicating a more limited range of binding configurations.
RMSD values, in contrast, measure the structural variation among distinct conformations within a given cluster. Lower RMSD values indicate greater structural similarity among the conformations, implying more stable and consistent binding interactions across the docked models [
36]. This stability demonstrates the resilience of the protein-protein complexes, ensuring they maintain their specific structural arrangements despite variations or disturbances in their environment [
37]. For instance, the standard agonist SOCS2:EpoR peptide complex exhibits an acceptable RMSD value of 1.7 Å, indicating minimal structural deviation among its conformations and suggesting a stable and well-defined binding mode. Similarly,
Lumbricus-derived complexes such as SOCS2:Cytochrome b (1.0 Å) and SOCS2:SCBP3 protein (1.2 Å) show low RMSD values, indicating stable binding interactions and consistent structural configurations. In contrast, complexes with higher RMSD values, such as SOCS2:Intermediate filament protein (4.3 Å), show greater structural variability among their conformations. This variability may reflect conformational flexibility or transient interactions with SOCS2, which could affect the functional significance of these complexes in cellular processes [
38].
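As a reminder of what the RMSD values above measure, the sketch below computes the root mean square deviation between two sets of matched atomic coordinates. It is a minimal illustration, not the procedure used by the docking software: it assumes the two structures are already superimposed and that atoms are listed in the same order. The function name and the toy coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) coordinates.

    Assumes the structures are already superimposed and that atom i in
    coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))

# Toy coordinates in Å (placeholders, not data from this study).
model_1 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model_2 = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(f"RMSD = {rmsd(model_1, model_2):.2f} Å")
```

Every atom in the toy example is displaced by exactly 1 Å, so the RMSD is 1 Å. In a real cluster the displacements differ from atom to atom, and the RMSD summarizes them in a single number.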
The analysis of intermolecular contacts (ICs) and non-interacting surfaces (NIS) for the protein-protein complexes provides valuable insights into the interaction dynamics between SOCS2 and various earthworm-derived proteins and peptides. The comparison between these complexes and the standard agonist (SOCS2: EpoR peptide) and antagonist (SOCS2: N4BP1) reveals significant differences in their interaction profiles, which can be linked to their potential efficacy in modulating SOCS2 activity (
Table 2). The SOCS2:EpoR peptide complex, serving as the standard agonist, exhibited 3 charged-charged, 8 charged-polar, 12 charged-apolar, 2 polar-polar, 11 polar-apolar, and 12 apolar-apolar contacts. Its NIS values for charged and apolar residues were 25.55 and 38.69, respectively. In comparison, the SOCS2:N4BP1 complex, which functions as the standard antagonist, displayed more charged-charged (5), charged-polar (11), charged-apolar (20), and polar-polar (6) contacts, the same number of polar-apolar contacts (11), and fewer apolar-apolar contacts (8). Its NIS values were 29.71 for charged and 39.49 for apolar residues.
Several proteins and peptides derived from Lumbricus showed distinct interaction profiles with SOCS2, suggesting varying efficacy in modulating its activity. The SOCS2:Cytochrome b complex, for instance, displayed high numbers of charged-apolar (25), polar-apolar (38), and apolar-apolar (33) contacts, with non-interacting surface (NIS) values of 15.13 for charged and 52.85 for apolar residues. This contact-rich interface is consistent with a potentially strong binding affinity. Cytochrome c oxidase subunit 3 showed balanced interactions, with substantial polar-apolar and apolar-apolar contacts and NIS values of 14.71 (charged) and 49.41 (apolar), suggesting strong stability and efficacy. The SCBP3 protein complex, with NIS values of 27.83 (charged) and 41.74 (apolar), also displayed robust binding capability. Lumbricin exhibited higher numbers of charged interactions, indicating strong potential as a SOCS2 modulator. Chemoattractive glycoprotein ES20, with high numbers of charged and polar interactions and NIS values of 22.98 (charged) and 42.39 (apolar), also showed promise.
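The contact categories discussed above (charged-charged, charged-polar, and so on) come from classifying each pair of contacting interface residues by residue type. The sketch below shows one way such a tally could be made. The residue grouping is an illustrative assumption and may differ from the one used by the software behind Table 2, for example in how His, Tyr, and Cys are assigned. The function names and the residue pairs are hypothetical.

```python
from collections import Counter

# Illustrative residue grouping (an assumption, not taken from the paper).
CHARGED = {"ARG", "LYS", "ASP", "GLU"}
POLAR = {"SER", "THR", "ASN", "GLN", "HIS", "TYR", "CYS"}
CLASS_ORDER = {"charged": 0, "polar": 1, "apolar": 2}

def residue_class(resname):
    """Return 'charged', 'polar', or 'apolar' for a three-letter residue name."""
    name = resname.upper()
    if name in CHARGED:
        return "charged"
    if name in POLAR:
        return "polar"
    return "apolar"

def count_contact_types(residue_pairs):
    """Tally contacts by class pair, e.g. 'charged-apolar'.

    residue_pairs: iterable of (receptor_residue, partner_residue) names for
    residue pairs already identified as being in contact.
    """
    counts = Counter()
    for res_a, res_b in residue_pairs:
        classes = sorted((residue_class(res_a), residue_class(res_b)),
                         key=CLASS_ORDER.get)
        counts["-".join(classes)] += 1
    return counts

# Hypothetical contacting pairs (placeholders, not data from this study).
pairs = [("ARG", "ASP"), ("SER", "LEU"), ("LEU", "SER"), ("VAL", "ILE")]
print(dict(count_contact_types(pairs)))
```

Sorting the two classes into a fixed order ensures that a Ser-Leu contact and a Leu-Ser contact are both counted as polar-apolar.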
The statistical analysis of the relationship between the HADDOCK score and the root mean square deviation (RMSD) yielded a Pearson correlation coefficient (r) of 0.687, which quantifies the strength and direction of the linear relationship between the two variables. A value of 0.687 signifies a moderate positive correlation: the HADDOCK score tends to decrease as the RMSD decreases. In molecular docking, a lower HADDOCK score indicates better docking quality, and a lower RMSD reflects a closer match to the reference structure. Thus, the positive correlation implies that better docking quality is associated with a more accurate docking pose. This relationship underscores the consistency between these two metrics in evaluating the quality of protein-protein docking simulations. Further analysis revealed that 46.75% of the proteins and peptides derived from
Lumbricus demonstrated favorable RMSD values, defined as RMSD values equal to or less than 2.00 Å (
Figure 3a). The favorable RMSD values indicate that many of the predicted protein-protein complexes have structural conformations closely matching experimental or reference structures, highlighting the overall success of the docking simulations in accurately predicting these arrangements. However, the analysis also revealed three outliers with very large RMSD values. These outliers suggest instances where the predicted structures significantly deviate from the experimental or reference structures. Such deviations could be due to inaccuracies in the docking algorithm, limitations in the input experimental data, or the inherent complexities of the protein-protein interactions being studied [
39].
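The two summary statistics used above, the Pearson correlation coefficient and the fraction of complexes at or below the 2.00 Å RMSD cutoff, can be sketched as follows. The arrays are placeholder values chosen only to make the example run and are not results from this study. The function names are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

def fraction_at_or_below(values, cutoff=2.0):
    """Fraction of values that are less than or equal to the cutoff."""
    return sum(v <= cutoff for v in values) / len(values)

# Placeholder data (NOT from the study): HADDOCK scores and RMSD values in Å.
haddock_scores = [-120.0, -95.5, -80.2, -60.1, -40.7]
rmsd_values = [0.9, 1.4, 1.8, 2.6, 4.1]

r = pearson_r(haddock_scores, rmsd_values)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
print(f"fraction with RMSD <= 2.00 Å: {fraction_at_or_below(rmsd_values):.2%}")
```

Squaring r gives the share of variance explained by the linear relationship. For the reported r of 0.687 this is about 0.47, so roughly half of the variation in one metric is accounted for by the other.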
The statistical analysis to examine the relationship between the HADDOCK score and binding affinity (ΔG) revealed a Pearson correlation coefficient (r) of 0.491 (
Figure 3b). This coefficient reflects the strength and direction of the linear relationship between these two variables. A value of 0.491 suggests a moderate positive correlation: as the HADDOCK score decreases (indicating better docking quality), ΔG also tends to decrease, that is, to become more negative. Because a more negative ΔG corresponds to tighter binding, improvements in docking quality are associated with stronger binding between the proteins. Conversely, higher HADDOCK scores correlate with less negative ΔG values and therefore weaker binding, indicating a discernible trend between docking quality and binding strength.
The HADDOCK scoring system captures key aspects of protein-protein interactions that influence binding affinity, including molecular surface complementarity, electrostatic interactions, and van der Waals forces. The moderate correlation with ΔG suggests that HADDOCK's scoring metrics are broadly consistent with the physical principles governing these interactions, which supports, but does not by itself establish, the reliability of its predictions. The correlation matrix depicted in
Figure 3d provides deeper insights into the relationship between binding energy (kcal/mol) and its individual energy components. Specifically, the positive correlation scores between binding affinity and both van der Waals energy (correlation score: 0.63) and desolvation energy (correlation score: 0.39) are particularly noteworthy. Van der Waals forces are crucial in stabilizing the complex by facilitating close contact between the protein surfaces, while desolvation energy reflects the energetic cost of removing water molecules from the binding interface. A positive correlation with van der Waals energy implies that as these interactions become stronger, the overall binding affinity increases, indicating a more stable protein-protein complex [
40,
41]. Similarly, the positive correlation with desolvation energy suggests that as the system pays the energetic cost to remove water molecules, the resulting interactions between the proteins are more favorable, thus increasing the binding affinity [
42]. Conversely, the correlation between binding affinity and electrostatic energy was weakly negative (-0.22), signifying a slight inverse relationship between these factors. This implies that as electrostatic energy increases, the binding affinity tends to decrease, although the small magnitude of the coefficient limits how much weight this trend can bear. Electrostatic interactions involve the attraction or repulsion between charged residues on the protein surfaces, influencing the stability of protein-protein complexes [
43]. However, interpreting this correlation requires caution due to the multifaceted nature of protein-protein interactions. While electrostatic energy plays a significant role, other variables beyond the HADDOCK score can also impact binding affinity. Factors such as specific amino acid residues at the binding interface, which may facilitate or hinder interactions, post-translational modifications that alter protein structure and function, and environmental conditions such as pH and ion concentrations, can all influence the overall stability and strength of protein-protein complexes [
44,
45].
The thorough examination of hydrogen bond interactions between SOCS2 and the highest-performing proteins sourced from the earthworm (
Lumbricus genus) provides insights into the molecular mechanisms that govern their binding interactions (
Table 3). These interactions are pivotal for stabilizing protein-protein complexes and facilitating precise recognition between the involved proteins [
46]. The interactions predominantly involve specific residues and atoms crucial for stabilizing the protein-protein complexes, focusing particularly on the ligand binding sites of SOCS2, namely Arg73, Ser75, Ser76, and Arg96. The interactions observed between SOCS2 and its standard agonist (EpoR peptide) and antagonist (N4BP1) molecules served as benchmarks for evaluating the binding efficacy of
Lumbricus-derived proteins and peptides. The EpoR peptide exhibited multiple hydrogen bonds with key residues such as Val55, Ser76, Arg96, Lys113, and others, with interaction distances ranging from 2.69 Å to 3.33 Å. Conversely, N4BP1 demonstrated interactions involving Gln32, Arg41, Tyr49, Asp74, and others, emphasizing its distinct binding profile characterized by shorter interaction distances (2.57 Å to 2.78 Å). These results underscore the specificity and strength of interactions necessary for SOCS2 modulation by both agonist and antagonist molecules. The proteins and peptides from the
Lumbricus genus, including Cytochrome b, SCBP3 protein, Lumbricin, Lumbrokinase-7T1, and others, displayed diverse binding interactions with SOCS2, highlighting their potential as novel modulators of SOCS2 activity. Cytochrome b, for instance, formed hydrogen bonds with His77 and Arg96, indicating a stable binding interface critical for functional interaction. Similarly, SCBP3 protein formed hydrogen bonds with Lys59 and Arg96, demonstrating specificity for residues essential to complex stability. Lumbricin and Lumbrokinase-7T1 revealed extensive hydrogen bonding networks with residues such as Arg41, Tyr49, Asp74, and Arg96, underscoring their potential to bind competitively to SOCS2 and influence its regulatory functions. These hydrogen bonds had interaction distances typical of moderate to strong hydrogen bonds, consistent with the robust binding affinity needed for therapeutic efficacy.
2.3. Molecular Dynamics (MD) Simulation
The MD simulation results offer a comprehensive understanding of the behavior of protein-protein complexes formed between
Lumbricus-derived proteins and SOCS2. Throughout the 100 ns simulation, SOCS2 maintained a relatively stable conformation, as evidenced by the average RMSD values ranging from 2.413 to 2.599 Å without significant spikes (
Figure 4a). This stability suggests that SOCS2 interactions with both standard agonist and antagonist, as well as
Lumbricus-derived proteins, were dynamically consistent over the simulation period. The RMSD values, which indicate the deviation of protein structures from their initial conformations, differ only modestly among the complexes. For instance, the SOCS2:EpoR peptide (standard agonist) complex displays a marginally higher average RMSD (2.423 Å) than apo-protein SOCS2 (2.417 Å), suggesting that only minor conformational adjustments accompany agonist binding. The SOCS2:N4BP1 (standard antagonist) complex shows a somewhat higher average RMSD (2.496 Å) that is still within 0.1 Å of the apo-protein value, indicating that antagonist binding does not substantially perturb the structural stability of SOCS2. Furthermore, the
Lumbricus-derived protein complexes, including Cytochrome b, SCBP3 protein, Lumbricin, Chemoattractive glycoprotein ES20, and Lumbrokinase-7T1, exhibit average RMSD values ranging from 2.467 to 2.587 Å, comparable to or slightly higher than those of the standard complexes. These differences suggest potential variations in the dynamic behavior and conformational changes induced by the binding of
Lumbricus-derived proteins. The higher RMSD values imply that these
Lumbricus-derived proteins may interact with SOCS2 in a manner that elicits different structural adjustments or conformational dynamics compared to the standard agonist and antagonist.
During MD simulations, RMSF analysis was performed to assess the flexibility of individual amino acid residues within SOCS2. The average RMSF values obtained ranged from 0.876 to 1.397 Å, reflecting moderate flexibility in various regions of the protein. This analysis provides valuable insights into the mobility and flexibility of specific residues, enhancing our understanding of their functional roles within the protein structure [
47]. Upon examining the interaction between the top-performing
Lumbricus-derived protein complex and SOCS2, it was apparent that this interaction disrupted hydrogen bonds with specific residues, especially within the regions encompassing amino acid residues Arg73 to Thr83 and Arg96 to Phe104 (
Figure 4b). These regions are notably critical as they constitute the active binding site of SOCS2 as a target receptor [
48]. The disruption of hydrogen bonds in these critical regions led to greater residue fluctuations than in the agonist complex or apo-protein SOCS2. This increase suggests enhanced mobility and flexibility of these residues when the top-performing
Lumbricus-derived protein binds to SOCS2. Notably, this pattern of increased flexibility resembles that of N4BP1, a known standard antagonist. The similarity in flexibility patterns at these specific residues indicates that the top-performing proteins from
Lumbricus have the potential to act as inhibitors, much like standard antagonists. This finding is significant as it suggests that
Lumbricus-derived proteins could modulate SOCS2 activity by acting as inhibitors, similar to known antagonists. By disrupting hydrogen bonds and increasing the flexibility of key binding site residues, these proteins may interfere with SOCS2's functional interactions, presenting promising avenues for developing therapeutic interventions targeting SOCS2-mediated cellular processes. Further investigation into the structural and functional implications of these interactions is necessary to fully understand their therapeutic potential and to develop new treatments for SOCS2-related diseases.
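The per-residue RMSF values discussed above quantify how far each residue moves around its own average position over the trajectory. The following minimal sketch makes that definition concrete. It assumes one representative atom per residue (for example Cα) and frames that have already been aligned to a common reference. The function name and toy trajectory are hypothetical and do not come from the study.

```python
import math

def rmsf_per_residue(trajectory):
    """Root mean square fluctuation of each residue around its mean position.

    trajectory: list of frames, where each frame is a list of (x, y, z)
    tuples with one entry per residue. Frames are assumed to be aligned.
    """
    n_frames = len(trajectory)
    n_residues = len(trajectory[0])
    fluctuations = []
    for i in range(n_residues):
        # Time-averaged position of residue i.
        mean_pos = [sum(frame[i][k] for frame in trajectory) / n_frames
                    for k in range(3)]
        # Mean squared displacement from that average position.
        mean_sq = sum(
            sum((frame[i][k] - mean_pos[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        fluctuations.append(math.sqrt(mean_sq))
    return fluctuations

# Toy two-frame trajectory in Å (placeholder values): the first residue is
# rigid, the second moves by 2 Å between frames.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
]
print(rmsf_per_residue(toy_trajectory))
```

Comparing such per-residue profiles between the apo-protein and each complex is what reveals region-specific changes, such as those described for residues Arg73 to Thr83 and Arg96 to Phe104.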
The radius of gyration (RoG) offered insights into the compactness or degree of expansion of protein structures during MD simulations [
49]. For the SOCS2 complexes, the average RoG values ranged from 1.674 to 2.678 Å. The RoG values for the standard SOCS2 agonist (EpoR peptide) and antagonist (N4BP1) complexes were 2.101 Å and 2.162 Å, respectively. For the
Lumbricus-derived protein complexes, RoG values ranged from 2.176 to 2.678 Å, slightly higher than the standard agonist but comparable to the antagonist. The number of hydrogen bonds formed between the two interacting proteins was indicative of the strength and specificity of their interaction. For the standard complexes, the SOCS2: EpoR peptide agonist complex formed 11 hydrogen bonds, while the SOCS2:N4BP1 antagonist complex formed 24 hydrogen bonds. Comparatively, the
Lumbricus-derived protein complexes displayed similar or slightly higher numbers of hydrogen bonds, with values ranging from 16 to 39. Notably, the SOCS2:Cytochrome b complex formed the highest number of hydrogen bonds (39), followed closely by the SOCS2: Peroxidasin complex (35) (
Table 4). The findings indicated that the interactions between SOCS2 and
Lumbricus-derived proteins featured a comparable or slightly higher number of hydrogen bonds relative to the standard interactions. This suggests robust and specific binding between these proteins, which could enhance their functional relevance. Furthermore, the RoG values indicated that the
Lumbricus-derived protein complexes displayed similar levels of compactness or dispersion compared to the standard complexes, suggesting they maintained stable structural conformations throughout the simulation period.
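For reference, the radius of gyration is the mass-weighted root mean square distance of the atoms from their center of mass, so a more compact structure gives a smaller value. The sketch below implements that definition for a single frame. The function name and coordinates are hypothetical, and equal masses are assumed when none are supplied.

```python
import math

def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration for a list of (x, y, z) coordinates.

    If masses is None, all atoms are given equal mass (an assumption made
    here for simplicity).
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total_mass = sum(masses)
    center = [sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
              for k in range(3)]
    weighted_sq = sum(
        m * sum((c[k] - center[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)

# Toy coordinates (placeholders): the same four points, compact and expanded.
compact = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (0.0, 1.0, 0.0)]
expanded = [(2.0 * x, 2.0 * y, 2.0 * z) for x, y, z in compact]
print(radius_of_gyration(compact), radius_of_gyration(expanded))
```

Doubling every coordinate doubles the radius of gyration, which is why a rising value over a trajectory is read as expansion or loosening of the structure.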
2.4. Molecular Mechanics/Poisson–Boltzmann Surface Area (MM/PBSA) Calculations
The MM/PBSA calculations provided valuable insights into the thermodynamic stability and binding affinity of the SOCS2 protein-protein complexes, offering a robust method to assess their molecular interactions [
50]. We assessed the strength of the protein-protein interactions by calculating the average binding free energy (ΔGbinding) for each complex. For the standard complexes, SOCS2 bound to EpoR peptide (standard agonist) had a mean ΔGbinding of -42.60 kcal/mol, indicating a substantial level of stability in the interaction. SOCS2 bound to N4BP1 (standard antagonist) had a mean ΔGbinding of -42.51 kcal/mol, a difference of only 0.09 kcal/mol, suggesting comparable interaction strength. The
Lumbricus-derived protein complexes, in contrast, showed similar or more negative mean ΔGbinding values than the standard complexes. SOCS2 complexed with Lumbrokinase-7T1 displayed a markedly more negative mean ΔGbinding of -69.28 kcal/mol, indicative of a robust and energetically favorable interaction. This suggests that the
Lumbricus-derived protein has a strong affinity for SOCS2, potentially surpassing the binding strength observed with the standard agonist and antagonist. Furthermore, the complexes formed between SOCS2 and Lumbricin and Chemoattractive glycoprotein ES20 exhibited mean ΔGbinding values of -59.25 kcal/mol and -55.02 kcal/mol, respectively (
Table 5). These values suggest strong binding affinities comparable to or even exceeding those observed in the standard complexes. This implies that the
Lumbricus-derived proteins possess significant potential in modulating SOCS2 activity, potentially rivaling or surpassing the efficacy of standard proteins.
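In MM/PBSA, the binding free energy is commonly estimated frame by frame as the free energy of the complex minus the free energies of the two separated partners, and then averaged over the trajectory. The sketch below shows only that final bookkeeping step. It assumes a single-trajectory setup in which all three energies are evaluated on the same frames, and it does not compute the molecular mechanics or solvation terms themselves. The function name and energy values are placeholders, not results from this study.

```python
from statistics import mean, stdev

def binding_free_energy(g_complex, g_receptor, g_partner):
    """Mean and sample standard deviation of per-frame binding free energies.

    Each argument is a list of per-frame free energies in kcal/mol. The
    per-frame estimate is G_complex - G_receptor - G_partner.
    """
    per_frame = [c - r - p for c, r, p in zip(g_complex, g_receptor, g_partner)]
    return mean(per_frame), stdev(per_frame)

# Placeholder per-frame energies in kcal/mol (NOT values from the study).
g_complex = [-5120.4, -5118.9, -5123.1, -5121.0]
g_receptor = [-3890.2, -3889.5, -3891.8, -3890.9]
g_partner = [-1180.6, -1181.3, -1179.9, -1180.4]

avg, spread = binding_free_energy(g_complex, g_receptor, g_partner)
print(f"mean dG_binding = {avg:.2f} kcal/mol (SD {spread:.2f})")
```

Reporting the spread alongside the mean helps judge whether small differences between complexes, such as the 0.09 kcal/mol gap between the two standard complexes, are distinguishable from frame-to-frame noise.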
Furthermore, a detailed examination of the binding free energy of individual amino acid residues within SOCS2 provided profound insights into the molecular mechanisms governing binding specificity and affinity. Notably, Arg96 and Ser78 were identified as crucial contributors to the observed antagonistic activity within the complexes. In interactions between SOCS2 and the top-performing protein from
Lumbricus, these residues made significantly stronger (more favorable) contributions to the binding free energy than in the standard agonist complex. The larger energetic contributions of Arg96 and Ser78 underscore their pivotal roles in mediating the observed antagonistic effects (
Figure 5). These residues are likely involved in essential interactions that govern complex stability and specificity, influencing overall functional outcomes. The heightened affinity observed in the complexes involving
Lumbricus-derived proteins highlights the importance of these interactions in modulating SOCS2 activity and emphasizes their potential as key factors in therapeutic efficacy. By elucidating the roles of individual amino acid residues in binding energetics, this analysis offers critical insights into the structural basis of protein-protein interactions. The identification of Arg96 and Ser78 as significant contributors to antagonistic activity enhances our understanding of the molecular determinants underlying the intricate interplay between SOCS2 and its interacting partners. These findings pave the way for targeted manipulation of specific residues to effectively modulate SOCS2 function, presenting promising opportunities for developing therapeutic interventions aimed at SOCS2-associated pathways.