To methodologically identify which BPs are worthy of evaluation against COVID-19 in human observational studies, all the BP molecules (synthetic and approved) obtained from the ChEMBL database were docked into the binding site of the RdRp protein (PDB ID: 6M71) using the ligand docking wizard of the glide module of Schrödinger software. The test molecules were docked using the SP mode and we obtained 1,992 molecules, signifying that all test molecules have occupied the active site pocket. Larger molecules >500 Daltons find it harder to absorb as smaller molecules are more absorbable. Hence, we used molecular weight (<500 Daltons) to filter out large molecules. The number of drugs that passed the filter was 1,398. To further reduce this number, a filter corresponding to the number of rotatable bonds was applied, as the drugs having fewer rotatable bonds are acceptable [
68]. With this filter, 628 molecules were obtained, which were further subjected to molecular docking using the extra precision (XP) mode.
All the molecules were binding to the active site with varying glide scores; hence, a cut-off docking score of -9.0 kcal/mol was chosen for down-selection as it is 2-3 times that of comparables (remdesivir -3.27 kcal/mol; favipiravir -3.44 kcal/mol; molnupiravir -4.93 kcal/mol). Concurrently we also enabled protomerization so that the uncharged BPs can be ionized at pH 7.0 ± 2.0. This is because BPs contain two phosphonate groups attached to a carbon or a nitrogen atom that can undergo ionization depending on the environment’s pH. For instance, alendronate, a bone resorption inhibitor used for treating osteoporosis exhibits multi-level ionization for the dissociation of 4-hydroxyl groups present on the phosphorous atom, resulting in a total of 4 dissociation constants at varying pH [
69]. Similarly, at a pH ranging from 5 to 9, these BPs may exist as a mixture of protonated or ionized forms. Hence, considering the ionization behaviour at this stage is crucial to evaluate their binding mechanism with the target residues. Therefore, to improve the accuracy and reliability of our docking studies further, and to allow for a comprehensive exploration of ligand’s behavior in different biological environments, protomers at pH 7.0 ± 2.0 were generated using Epik module for the top 14 compounds that met the -9.0 kcal/mol docking score cut-off. This resulted in a total of 48 protomers, using which the XP mode of molecular docking was carried out. The list of molecules obtained after the cut-off and their corresponding best protomers are shown in
Table 1.
2.2. Molecular Dynamics Results
The top ligands with the least binding energy from MM-GBSA studies were analyzed by molecular dynamics studies (
Figure 3). The clinically used drug remdesivir was also subjected to dynamics simulations for comparative analysis, and showed a stable RMSD plot for the entire duration (5.0 Å-7.0 Å from the initial 5 ns until the end). Additionally, the fluctuations in the active site region of its corresponding protein were low, resulting in a stable RMSD and RMSF plots of the protein (
Figure 3 and
Figure 4). Significant interactions were observed by the residues of the palm domain that are necessary to bind with the RNA. The residual interactions include Asp618 (63%, water-mediated), Asp623 (70%, water-mediated), and Asp760 (90%, H-bond) (
supplementary file, S1). In the case of compound CHEMBL164344, the RMSD plot has deviations till 10.0Å for the initial 18 ns, during which strong interactions were observed with the residues of the palm domain, Thr556, Asp623, and Arg624; later, these interactions gradually decreased, and the interactions with two residues of F domain, Asp452 (81%, H-bond) and Tyr455 (61%, π-π stacking) increased over the time. This has caused a decrease in the RMSD from 12.0 Å at 20th ns to 7.5 Å at 100th ns. In the case of its protomer, the ligand has stable deviations for the initial 25 ns (RMSD 4.0 Å – 6.0 Å), later increasing to 16 Å till 45 ns and then attaining equilibrium until the end with RMSD ranging between 11 Å and 13 Å. The residues that significantly participated in the interactions include Tyr455 (73% π-π stacking, 68% H-bond), Lys551 (69%, H-bond), Arg553 (92%, H-bond), Arg555 (35%, H-bond), and Lys621 (33%, π-cation). It is noted that the protomer’s RMSF is similar to that of the uncharged molecule (
Figure 4). Change of phenoxymethyl (as in CHEMBL164344) with biphenyl group (as in CHEMBL196676) results in significantly more interactions with the residues Asp452 (81%, H-bond) of F domain and Asp623 (85%, H-bond) of palm domain, as well as inconsistent water-mediated interactions with Asp760 (27%). Initially, the interactions with Arg553, Thr556, and Asp623, as observed in the docked complex, remained for 18 ns, during which no significant deviations were observed (RMSD between 1.0 Å – 3.0 Å); later, the interactions with Asp452, Lys621 along with Asp623 had changed the ligand's conformation, resulting in the increase in the RMSD ranging between 8.0 Å and 7.0 Å which retained till 70 ns, and then a gradual decrease was observed until the end of the simulation (last frame RMSD is 5.4 Å) (
Figure 3).
The quinazoline-4-amine compound (CHEMBL4291724) shows greater stability in terms of its RMSD plot as equilibrium is attained after 20 ns until the end of the simulation with RMSD ranging between 10.0 Å and 12.0 Å (
Figure 3). Most of the interactions are between the phosphonic acid groups and the residues of the palm domain, which include Asp623, Thr680, and Asp760, with a contribution of 50%, 39%, and 86%, respectively (
Supplementary File, S1). In the case of CHEMBL387132, a 2-amino isoquinoline compound, the RMSD plot is comparatively lower than other complexes, as the overall deviations for 100 ns were between 3.0 Å and 6.4 Å. Additionally, strong interactions were observed with Asp623 for the first 30 ns and then with Asp760 of the palm domain, contributing to an overall interaction of 30% and 64%, respectively. A residue from the F domain, Lys545, has a total of 33% H-bond interactions with the ligand. (
Figure 3,
Supplementary File, S1).
The 2-amino thiazole containing BP (CHEMBL4569308) showed a stable RMSD plot for initial 45 ns (between 1.0 Å and 2.0 Å), had a slight increment over the next few time frames and then attained equilibrium from 60 ns until the end with deviations between 2.5 Å and 4.0 Å. The stability is observed due to the significant involvement of thiazole ring in the interactions with Asp623 (99%, H-bond; 71% water-mediated H-bond) and Arg624 (89%, π-cation; 52% and 40% H-bond). Other residues such as Lys545 (68%, H-bond), Arg553 (76%, 82%, H-bond) and Arg555 (68%, 41%, H-bond) showed moderate interactions with the oxygen atoms of the compound. The imidazole compound (CHEMBL98211) attained stability from 20 ns until the end with RMSD between 5.5 Å and 7.0 Å. However, there were slight deviations observed between 45 ns-55 ns and 85 ns-90 ns with RMSD ranging between 4.0 Å and 7.0 Å. These deviations could be due to the intermittent water-mediated interactions with Asp623 (72%). The significantly contributing residues in the interactions include Lys545 (96%, H-bond), Arg555 (77%, 67%, H-bond) and Arg624 (73%, 49%, H-bond) (Supplementary File, S1). Other residues such as Thr556 and Lys621 contributed with 52% of water-mediated and 48% of H-bond interactions, respectively. The imidazo[1,2-α]pyridine-4-ium compound (CHEMBL608526) exhibited the most stable RMSD plot among all the studied complexes. For the initial 30 ns, its RMSD was between 3.0 Å and 4.0 Å, later there was a gradual decrease until 60 ns, and then equilibrium was attained until the end with RMSD ranging between 1.0 Å and 2.0 Å. There were several residues that significantly contributed to the interactions during the simulation which include Asp450 (91%, H-bond), Lys545 (99%, 68%, H-bond), Arg553 (100%, 99%, H-bond), Arg555 (100%, 93%, H-bond), Asp623 (102%, water-mediated H-bond) and Arg624 (101%, 84%, H-bond interactions). The other residues such as Lys551, Thr556, Ala558 and Ser682 contributed moderately with 63%, 43%, 33% and 60% interactions, respectively. The negative control cinnamaldehyde showed highly unstable RMSD plot with high deviations (up to 90.0 Å), and interactions with <10% contributions (Supplementary File, S1).
The protein RMSD plot shows a similar pattern of deviations with all the hit ligands, indicating their stability with the ligands during the MD simulation. Similarly, the RMSF plot shows that interacting residues in the active site have acceptable fluctuations (<2.0 Å) (
Figure 4). From this, we infer that 5 out of our top 7 ligands behave in a similar fashion to that of remdesivir, with the active site residues forming strong bonds, and most of the interactions by our “hit compounds” attributable to bisphosphonic acid groups. Therefore, there is a good probability that these compounds may emerge as potential RdRp inhibitors if evaluated in vitro and/or ex vivo.
2.3. Estimation of Entropic Contribution by gmx_MMPBSA
The MMPBSA analysis helps to understand entropic contribution between protein and ligand during the molecular dynamic simulations [
70]. It refers to the degree of randomness in a system and can guide to understand the entropic contributions of the ligand in the active site of the protein. Since MMPBSA analysis module is not available in Schrödinger software, the gmx_MMPBSA tool was used to determine the entropic contributions of the ligands, protein, and the protein-ligand complexes [
71]. The change in free energy (ΔG) of the complex is then calculated using the following equation:
In the above equation (1), the G
complex represents the free energy of the protein-ligand complex in water, while G
protein and G
ligand represent the free energies of the protein and ligand respectively in water. The free energy of the total system (ΔG
binding) can be obtained by adding the interaction entropy (I.E. = - TΔS) to the change in total energy (ΔE
MM) of the system [
72]. ΔE
MM is the summation of various change in energies such as van der Waals (ΔE
vdW), electrostatic columbic (ΔE
EL), electrostatic potential (ΔE
PB) and non-polar (ΔE
NP). Applying equation (3), the free energy of the system (ΔG
binding) was calculated and tabulated in
Table 4 and represented graphically in
Figure 5.
The results reveal that the protomer of CHEMBL608526 has the least value (-75.62 kcal/mol), while CHEMBL196676 (-33.41 kcal/mol), CHEMBL164344 (-26.77 kcal/mol), CHEMBL4291724 (-26.38 kcal/mol) and CHEMBL4569308 (-22.97 kcal/mol) have values lower than remdesivir (-19.91 kcal/mol). The more negative values signify that the free energy of the complex is lower than the sum of the individual free energies of the protein and the ligand, as seen from equation (2). Therefore, these five candidates, highlighted in
bold in
Table 4, are worth evaluating in vitro and/or ex vivo. The two other candidates, CHEMBL387132 (-1.16 kcal/mol) and CHEMBL98211 (0.42 kcal/mol), have values close to zero and to the negative control cinnamaldehyde (-0.99 kcal/mol), but they may also be worth evaluating for the following reasons: 1. They perform as well as the 5 highlighted compounds in molecular dynamics studies, showing several interactions with the protein’s active site that are absent in the case of negative control (c.f.
Figure 3,
supplementary file S1); 2. ΔG
binding values can vary substantially, e.g., CHEMBL164344’s uncharged state value (-26.77 kcal/mol) is twice that of its protomer (-12.65 kcal/mol), therefore it may be premature to dismiss these two compounds; and 3. CHEMBL98211 resembles the approved drug zoledronate which has been observed to benefit COVID-19 human patients as described in
Section 3.4 below. Although charged molecules are associated with electrostatic repulsions, higher degrees of freedom, and generally lower binding energies leading to more spontaneous binding, we have to exercise caution. This is because the relative contribution of the first term (enthalpy) versus the second term (entropy) in equation (3) is not straightforward in the case of proteins due to conformational effects in the latter and specific properties of a given system [
73]. More studies are required with a range of molecules wherein the in silico predictions can be experimentally validated to determine whether free energy calculations for charged or uncharged molecules are better predictive and more useful for drug selection.