1. Introduction
Since the first applications reported shortly before 1990 [
1,
2], so-called alchemical free energy simulations (FES) have become an essential tool of computational chemists in academia and industry alike [
3,
4,
5,
6]. The free energy difference determines the spontaneity of chemical or biochemical processes and, hence, makes it possible to predict, e.g., binding affinities of potential drugs [
7,
8,
9]. In addition to computationally estimating an important macroscopic, thermodynamic quantity, FES can also provide insight in the microscopic origins [
10]. One source of error is the accuracy with which interactions within and between molecules are modeled in the underlying molecular dynamics simulations. In certain applications, force-field based descriptions may be insufficient [
11]. Even if one is not interested in studying chemical reactions, the neglect of induced electronic polarization in classical force fields may lead to problems even for seemingly simple applications, such as computing relative free energies of hydration of mono- or bivalent ions [
12,
13,
14]. In such situations, hybrid quantum-mechanical/molecular mechanical (QM/MM) Hamiltonians are called for. However, even when using only semiempirical quantum chemical methods (henceforth abbreviated as SQM) for the core region, i.e., the part of the system one is particularly interested in, the computational cost can quickly become prohibitive. Further, several tricks frequently used in alchemical FES do not work with SQM/MM Hamiltonians [
15]. Early on, Warshel, Gao, and others pioneered the use of indirect cycles, also referred to as “multilevel” free energy simulations, as depicted in
Figure 1a), which has become a widely used way to carry out FES with a (S)QM/MM description of interactions [
16,
17,
18].
To illustrate the indirect approach, we consider the calculation of the aqueous solvation free energy of a solute X. As can be seen in
Figure 1a, the quantity of interest, the solvation free energy difference
at the high SQM/MM level of theory (dashed, black arrow in
Figure 1a), can also be obtained in three steps according to
In addition to computing the solvation free energy
at the force field (MM) level of theory (solid black arrow), two correction steps, accounting for the free energy difference of X in gas phase and aqueous solution between the two levels of theory (green and red arrows), are required. The development in this field during recent years shows that there is not only enormous potential but also a high interest in the FES community to use this strategy to compute free energy differences at the SQM/MM level of theory [
19,
20,
21,
22,
23,
24,
25,
26,
27].
Work by Heimdal and Ryde [
28] made clear that the challenging step of indirect cycle SQM/MM FES is the calculation of the low-to-high corrections
and
, the green and red arrows of
Figure 1a. Various approaches have been used, often requiring sophisticated decomposition of the cycle and the employment of fitting procedures to be successful [
18,
29]. In the past, we showed that the corrections
can be computed reliably and accurately by non-equilibrium work (NEW) methods [
30,
31,
32,
33,
34,
35]. In our work to date, computationally expensive protocols were used, which are unsuited for most real-world applications. Recently, we optimized the NEW protocols regarding switching length and number of switches and presented a sufficiently efficient workflow for practical applications [
36]. However, these optimized protocols were tested so far only in the gas phase, i.e., the green arrow of
Figure 1a (
).
Here, we investigate whether the protocols suggested in Ref.
36 are also suitable for aqueous solution, i.e., when the region of interest to be treated at a high level of theory, e.g., SQM, interacts with an MM environment (
, red arrow in
Figure 1a). In earlier work, we had noted that NEW switching simulations from a pure MM to a SQM/MM level of theory converged more slowly compared to gas phase transformations involving only MM to SQM and observed that the charge distribution resulting from the fixed charges of the force-field and the average charge distribution obtained at the SQM/MM level of theory were noticeably different [
31]. We speculated that the necessary reorientation of water in response to such a change in charge distribution of the solute might be responsible for the slower convergence. Using equilibrium methods to compute
, Ito and Cui reported that inserting an intermediate state with fixed charges more similar to the charge distribution at the SQM/MM target state improved convergence considerably [
37].
The solvent-reorientation dynamics of water in response to charge changes is experimentally well characterized and understood [
38,
39]. Two relaxation times, one faster than 0.5 ps, the other approximately 2 ps, can be discerned. Therefore, if the charge distribution of the region of interest is significantly different at the two levels of theory, NEW switching simulations of only 2 ps, as used in Ref.
31, may be too short to obtain converged results. With this in mind, we tested two strategies: (i) Similarly to the work by Ito and Cui, we introduce an intermediate state, which is still purely MM, but with the atomic partial charges modified to resemble the average charge distribution of the SQM description at the high level of theory. (ii) Since water reorientation in response to a change in charge distribution occurs on two timescales, the linear switching protocols we have used so far may be suboptimal. In particular, we test whether protocols in which switching is carried out initially at a faster rate, e.g., switching from the MM to a 35% SQM/MM description in just 10% of the switching length, followed by a second slower phase during the remaining 90% of the switching length, might lead to improved convergence. Obviously, both strategies can be combined. Model calculations exploring these two approaches are complemented by a series of analyses in which we investigate the effects of a solute’s charge distribution on properties such as its dipole moment, as well as on the interactions with surrounding waters.
The focus of this work is on the effect which the charge distribution of a solute described at different levels of theory has on the convergence of calculating
. To avoid unrelated complications from too flexible molecules, where conformational preferences might be different at the low and high level of theory, respectively, we mostly chose rather rigid model compounds for the test calculations. Rather than using (a subset of) the so-called “HiPen” test set [
34], we selected tautomeric pairs (cf. Methods). Most of the compounds are rather rigid without rotatable bonds. Therefore, the pure MM and the SQM/MM descriptions of the systems — the solute (tautomer) is described by SQM, all waters remain classical — differed primarily by the charge distributions of the solute at the two levels of theory.
The remainder of the manuscript is organized as follows. In Theory and Methods, we summarize the theoretical background, introduce the model systems, and provide the technical details of all simulations and analyses carried out. In Results, we first study the convergence of NEW switching simulations using the optimized protocol from our preceding study [
36] (switching lengths of 2 ps, 200 NEW switches per transformation). We then investigate how convergence
improves when employing switching lengths of 5 ps, which, however, may be too costly for many applications. Next, we present results using the two mitigation strategies outlined above. Our findings are rationalized by additional data characterizing the differences in solute properties, in particular the dipole moment, at the two levels of theory, as well as details about the solvent-reorientation dynamics of water under the simulation conditions.
3. Results
3.1. Performance of 2 and 5 ps linear switching protocols in solution
Before carrying out calculations in aqueous solution, the optimized JAR protocol from Ref. [
36] was tested for all compounds in the gas phase. All results can be found in Table S3 of SI. Except for
1-t1 (see further down),
was
kcal/mol. While
kcal/mol is slightly larger than the ideal maximum deviation of
kcal/mol, this value is still acceptable for most practical applications. Thus, the protocol recommendations of our previous study [
36] are applicable to the model compounds studied here. Based on the gas phase results, any noticeable deterioration of
, therefore, has to be caused by differences in the description of solute–solvent interactions at the two levels of theory.
In
Figure 4, the
values for all compounds obtained by three protocols/methods are compared against reference results obtained from 200 forward/backward switches of 5 ps length: 200 MM to SQM switches of 2 ps length, the recommended protocol from Ref. [
36], (JAR(2ps), red circles), 200 MM to SQM switches of 5 ps length (JAR(5ps), orange circles), and CRO results obtained from 200 forward/backward switches of 2 ps length (CRO(2ps), blue circles). The data visualized in
Figure 4 are tabulated in Table S2; even more detailed results can be found in Tables S4 and S5. For the shortest protocol, JAR(2ps), four results lie outside the
kcal/mol threshold indicated by the two dashed lines in the figure. Taking error bars into account (see Table S2), the results for
1-t1 and
8-t2 can be considered acceptable,
kcal/mole; furthermore,
(
1-t1) is comparable to the value obtained in the gas phase (see Table S3). However, for
5-t2 and
6-t2 the deviation from the reference result is larger than 1 kcal/mol and would lead to a sizable systematic error.
The use of a 5ps switching length (JAR(5ps), orange circles) certainly helps, but one poor result,
6-t2,
kcal/mol, remains. Thus, the relatively inexpensive protocol suggested in Ref. [
36] cannot be recommended for calculations in solution. Even worse, the use of more costly switching simulations of 5 ps length does not help in all cases.
Figure 4 also shows results for the CRO(2ps) protocol (blue circles). Compared to our findings in the gas phase [
36], the differences to the reference result CRO(5ps) are noticeably more pronounced. Furthermore, as can be seen in Tables S4 and S5, the statistical uncertainty of the CRO(2ps) results was consistently higher than those of the CRO(5ps) results, even though the overlap between forward and backward work distributions seemed adequate in all cases. While all CRO(2ps)
values lie within the
kcal/mol threshold, we, nevertheless, decided to choose CRO(5ps) as the reference protocol.
Three of the four compounds failing the kcal/mol threshold using the JAR(2ps) protocol, 5-t2, 6-t2, and 8-t2, are lactams. Interestingly, the corresponding tautomeric lactim states do not cause any problems. The pair 1-t1/1-t2 belongs to the keto-enol class of tautomerism. For 1-t1 kcal/mol in both the gas phase and in aqueous solution; therefore, different conformational preferences at the two levels of theory may be responsible for the poor convergence when using switching lengths of 2 ps.
3.2. Performance of hybrid charge intermediates
In
Figure 5 we summarize the performance of the indirect NEW switching protocols using a hybrid intermediate charge state in terms of
, cf. Eq. 14b. As described in Methods, three approaches to obtain average Mulliken charges were tested: averaging over gas phase configurations (MULL(gas), over configurations sampled in aqueous solution at the MM level of theory (MULL(solv)), and over configurations sampled in aqueous solution at the SQM/MM level of theory (MULL(solv*)). The results obtained with the direct JAR(2ps) protocol (red circles), already shown in
Figure 4 and henceforth referred to as just MM, are included as well.
Clearly, all three protocols employing an intermediate hybrid charge state perform much better than the direct NEW switches (MM), though not to the same degree. Using the MULL(gas) intermediate state (orange triangles), the kcal/mol threshold for 5-t2 is still missed, kcal/mol, and the result for Ala becomes worse ( kcal/mol). For MULL(solv), blue squares, de facto all results lie within the kcal/mol threshold ( kcal/mol, kcal/mol). It should be noted that taking the statistical uncertainty of the results into account, none of these deviations from the kcal/mol threshold is statistically significant. Finally, all MULL(solv*) results (green diamonds) lie well within the kcal/mol threshold.
Figure 5 shows that in most cases the use of a hybrid intermediate charge state lowers
, and typically the use of MULL(solv*) leads to the best results, followed, in this order, by MULL(solv) and MULL(gas). To quantify this, we report in
Table 1 the mean absolute deviation (MAD) for the data shown in
Figure 5, together with the spread of
for each of the protocols. Already the use of MULL(gas) lowers the MAD from 0.29 to 0.12 kcal/mol, the largest error
reduces from 1.80 to 0.49 kcal/mol. The MULL(solv) intermediate state leads to a further improvement (
, the largest error is
kcal/mol). Finally, the MAD for MULL(solv*) is 0.07 kcal/mol and the largest
is as low as 0.20 kcal/mol.
The improvements resulting from the hybrid intermediate charge states are roughly inversely proportional to the computational effort to obtain the average Mulliken-like charges. The MULL(gas) and MULL(solv) charges are obtained from configurations at the MM level of theory; the required computations at the SQM(/MM) level of theory to obtain the Mulliken charges are slightly more costly for the solvated system. By contrast, to obtain the MULL(solv*) charges, sampling needs to be carried out at the high (SQM/MM) level of theory. Given that the improvements of using MULL(solv*) rather than MULL(solv) are relatively small, this leads to an additional computational cost that, at least for the systems studied here, seems not worthwhile. In other words, the use of MULL(solv) hybrid partial charges seems to be a good compromise between correctness and computational cost. Each of the indirect NEW switching protocols requires the calculation of ; however, this is a strict force field-based free energy calculation, which can be computed fast and efficiently on GPUs using the CHARMM/OpenMM interface (cf. Methods).
3.3. Performance of modified switching protocols
Figure 6 summarizes the results obtained with stepwise linear switching protocols for a subset of the model compounds. In all cases, the baseline results, indicated as red, solid circles, are the free energy difference obtained from one-step MM → SQM switches, i.e., the MM/JAR(2ps) results already presented. For the first problematic case,
5-t2, all three stepwise linear protocols reduce
considerably. However, in the case of
6-t2, it is difficult to speak of improvements. Two protocols,
L2-1 and
L3-1 lower
, but the values are still far outside the
kcal/mol threshold. The third protocol,
L3-2, which worked extremely well for
5-t2 actually increases
for
6-t2 compared to
L1, the default linear protocol. The other three compounds included in the subset were chosen as negative controls because they performed already well with the JAR(2ps) protocol. The performance of the stepwise linear protocol is comparable, except for one poor result for
4-t2 when using
L3-1.
Turning to the combination of stepwise linear and indirect NEW switching protocols with the MULL(solv) hybrid intermediate charge state (shown as squares), most of the results lie within the kcal/mol threshold. However, since in this case, the performance of the linear protocol was excellent to begin with, any improvements are marginal. Moreover, for 4-t2 the stepwise linear protocols perform slightly poorer, and for 6-t2 obtained with the L3-1 is kcal/mol.
Based on these findings, the performance of stepwise linear switching protocols seems mixed at best. None of the protocols tested consistently improved the results. Furthermore, even when the results were improved and fell within the
kcal/mol threshold, the standard deviation of the work values,
, was always larger than for the linear switching protocol. Since
is a sensitive indicator of whether one can expect convergence of JAR calculations [
32], the utility of protocols that increase it seems doubtful.
3.4. A detailed analysis of the factors affecting convergence
3.4.1. Effects of charge distribution on solute properties
To quantify the differences between the various charge models used in this study, we calculated the root-mean-square deviation of the atomic charge differences between the MM, MULL(gas), MULL(solv) representations, using the MULL(solv*) charges as the reference value (cf. Eq. 8). Since the latter are the average of the Mulliken charges derived from simulations at the SQM/MM level of theory, they are closest to the fluctuating charge distribution at the target high level of theory. The variability of the Mulliken charges derived from the SQM(/MM) reevaluations about their mean values were always extremely low, fluctuating typically less than e about the mean; the largest value observed was 0.08 e. These low standard deviations make the use of average values meaningful in the first place. In addition to , we looked at the magnitude of the differential dipole moment (Eq. 9) and the angle between the dipole moment of a charge distribution and that of the MULL(solv*) reference charges (Eq. 10).
The individual results for each compound are listed in Table S12 and depicted in Figures S4–S6 of SI. Some overall trends can be seen from the MAD values of
,
, and
in
Table 2. As expected, the MAD(
) is largest for MM and smallest for MULL(solv), with MULL(gas) in between; the same is the case for MAD(
). This is in accord with the MAD(
) results of
Table 1 in
Performance of hybrid charge intermediates, where the use of the MULL(gas) intermediate charge state led to a noticeable improvement, and the MULL(solv) charges lowered the MAD(
) even further. The MAD(
) behaves somewhat differently; here the MULL(gas) value is lower than that of MULL(solv). As can be seen in Table S12 and Figure S6, this holds true for most compounds. Nevertheless,
of MULL(solv) is always smaller than that of the MM charges. The findings strongly suggest that the differences between charge distribution at the two levels of theory have a strong influence on the convergence of calculations of
by JAR, with the difference in magnitude of the two dipole moments being more important than the orientation of the respective dipole moment vectors.
Based on the overall convergence results (
Figure 4) we surmised that compounds containing a lactam moiety tended to converge slower than the corresponding lactim state, e.g.,
5-t2 vs.
5-t1, etc. One reason for this is that the differences between the MM and the SQM(/MM) descriptions are greater for the lactams than for the lactims. The analysis of partial charges points in this direction as well. Several differences can be discerned in the detailed charge distribution data for the individual molecules, e.g., Figure S4. For all lactim–lactam pairs (compounds
2,
3,
4,
5,
6,
8),the MM
of the lactam state is noticeably higher than that of the corresponding lactim state. Similarly, the MM
of each of the lactams (red circle) is at least 0.1e higher than for MULL(solv) (blue squares).
In
Figure 7 we visualize some points made above for the two molecules, for which obtaining a converged
was most difficult,
5-t2 (top) and
6-t2 (bottom). For each of the three charge representations, MM, MULL(gas), and MULL(solv), we display the differential atomic charges, both as labels, as well as a color gradient from blue to red, and the resulting differential dipole moment vectors (orange arrows). The magnitude of the differential dipole moment
and the angle
between the dipole moment of a charge distribution and that of the MULL(solv*) reference charge distribution are listed directly. The difference in partial charges and the MULL(solv*) charges can be large, e.g., the difference for the -C=O part of the lactam moiety is almost
e for
5-t2 (top, left in
Figure 7). One further sees that the charge differences become smaller between the MM and the two MULL charge sets, atoms colored in clear blue or red for MM, e.g., the -C=O group (left side) have just a shade of blue or red for MULL(solv) (right side). Accordingly, the length of the orange arrows, i.e.,
of the MULL charge states, is noticeably smaller than that for MM. Even more detailed plots, including the exact values of the charge difference for each atom, for
3-t2,
4-t2,
5-t2,
6-t2, and
8-t2 can be found in Figures S7–S11.
3.4.2. Effects of charge distribution on the first solvation shell
Figure 8 displays the difference in the average number of waters
within
Å of the solute. The numbers are relative to the target high-level state, i.e., the average number of waters found in the SQM/MM simulations. First, one sees that
for the MULL(solv*) charges (green diamonds), which were also derived from the SQM/MM simulations, are very close to zero, in accord with expectation. Several results for the MULL(solv) charges (blue squares) are also quite close to zero, but mostly for the lactams (particularly
2-t2,
3-t2,
4-t2,
5-t2) the difference in waters
. For the MULL(gas) charges (orange triangles), all
values are negative, i.e., on average, there are fewer waters in close contact with the solute compared to SQM/MM. This is not too surprising because the charges were derived from SCC-DFTB gas phase calculations; hence, the solute did not experience polarization from interaction with the solvent. This may also explain why on average the use of the MULL(gas) hybrid intermediate state performed worse than MULL(solv), even though significantly better than MM (cf.
Table 1).
The MM results (red circles) fall into two distinct groups, with either more or less water present than in SQM/MM. For most lactams, with the slightly surprising exception of 6-t2, is negative, whereas the lactims, i.e., have a positive . Thus, one sees again distinct differences in properties for the force field representations of lactims and lactams, respectively.
3.4.3. Water reorientation dynamics
In the previous subsections, we characterized the influence of different charge distributions (atomic partial charges) on static properties of the solute and the surrounding solvent layer in several ways. As the endpoints of the NEW switches have different charge distributions, it is of considerable interest to study the dynamics of solvent reorientation. We, therefore, computed the un-normalized Stoke shift relaxation
(Eq. 12) for several compounds and the MM and MULL(solv) charge distributions. The results for
5t-2 are shown in
Figure 9; analogous plots for
4t-2,
6t-2, and
8t-2 can be found in Figures S12–S14 in SI. All fit parameters are summarized in Table S13 of SI.
As described in Methods, the mono-exponential fit was carried out in such a way that it picks out the slow process(es) of the water reorientation. Looking at
Figure 9, one sees that the relaxation times for MM and MULL(solv) are relatively comparable (
ps), but that the prefactor
is quite different (
kcal/mol for MM and
kcal/mol for MULL(solv)). Thus, while
has for all practical reasons reached 0 after about 3 ps in the MULL(solv) case (blue curve), for MM (red curve) this is the case only after 5 ps. Since we would ideally carry out NEW switches of only 2 ps length, the obvious question is the value of
. Using the fit parameters, we find for MM ≈ 1.7 kcal/mol, and for MULL(solv) ≈ 0.5 kcal/mol. These values roughly mirror the values for
for MM/JAR(2ps) and MULL(solv), respectively. For the second problem case,
6-t2, very similar results were obtained; see Figure S13 and Table S13.
Overall, the slow decay process of always occurs with a time constant ps. Thus, the degree to which the system is still out of equilibrium regarding the SQM/MM target state, therefore, depends crucially on how big the difference is at . Thus, it becomes clear that convergence is facilitated if the solute charge distributions of the initial and final state resemble each other, explaining why better results were obtained for all the MULL intermediate charge states. Phrased differently, to obtain converged JAR results from 2 ps switching simulations, the charge distributions of initial and final state have to be very similar. Since the determining factor seems to be the initial difference in charge distribution, using stepwise linear switching paths cannot help much, explaining at least in part why we did not obtain any real improvements from the stepwise linear protocols we attempted.
3.4.4. Interplay between charge distribution and conformational preferences
The compounds used in this work were mostly rigid and specifically chosen to avoid complications from conformational degrees of freedom, exceptions being
1-t1 and the blocked amino acids
Ala and
Ser. Modifying partial charges as for the MULL hybrid intermediate states may have an effect on conformational preferences. As can be seen in
Figure 5, all three MULL hybrid intermediate states improve convergence for
1-t1. For the two blocked amino acids, MULL(gas) results in poorer convergence for
Ala, and MULL(solv) in slightly poorer convergence for
Ser. Given that
Ala and
Ser are the smallest possible peptide-like molecules with protein backbone-like
and
dihedral angles, we did some analysis on dihedral angle conformational preferences. In earlier work [
31] we showed that purely classical Hamiltonians and SQM/MM Hamiltonians resulted in different preferred conformations of
and
, as well as
in the
Ser case. In
Figure 10 and
Figure 11 maps for
Ala and
Ser are shown for MM, MULL(solv) and SQM. The differences between MM (left) and SQM (right) are clearly visible. For both blocked amino acids, MM has a single narrow minimum at
,
, whereas SQM has a broader distribution at
, and
. While the MULL
maps (middle plots) are more similar to MM than to SQM, a second minimum at
has appeared. Thus, although the effect is small, the use of hybrid charge intermediates also makes this state slightly more high-level like in terms of conformational preferences.
4. Discussion
The optimized protocol proposed in Ref. [
36] (200 switches of 2 ps length) would be efficient enough for many practical applications. However, it has only been validated in the gas phase. If water reorientation is responsible for the slower convergence in aqueous solution as observed previously [
31,
37], then, given the experimentally known slower relaxation time constant of
ps, switching lengths of 2 ps may not be long enough. This suspicion was confirmed in this work, and we found that in one case even a switching length of 5 ps was insufficient.
We explored two approaches to keep the switching length at 2ps: (i) introducing a hybrid intermediate charge state, which makes the solute charge distribution more similar to that at the SQM/MM level of theory. (ii) Since the solvent reorientation also has an ultrafast component, we used a stepwise linear switching scheme instead of the standard one, switching faster initially followed by one or two slower steps. While the use of hybrid intermediate charge states significantly improved convergence, the benefits of multi-stage switching protocols were mixed.
The partial charges of the intermediate hybrid states were obtained by averaging over the Mulliken charges obtained from 200 energy evaluations at the SQM level of theory. The computational cost of this is relatively small; the overhead, if any, comes from sampling the configurations over which one averages in the first place. Not surprisingly, the best results (lowest overall
; see
Table 1) were obtained from coordinate sets generated during SQM/MM simulations, i.e., calculations at the target high level of theory (MULL(solv*)). While we have performed such calculations to reference results with CRO, we would ideally like to avoid them in practical applications. In contrast, the cheapest approach is to use the coordinate sets saved during the MM gas phase simulations (MULL(gas)). While these results were an improvement over the direct MM→SQM/MM switches, the MAD(
) was higher than that of MULL(solv*). The results for the third method tested, MULL(solv), fell in between. Here, the configurations to be re-evaluated at the SQM/MM level of theory were generated from MM simulations in the solvated phase. Since one can reuse the stored coordinates/restart files to start the NEW switching simulations, the computational cost is practically zero. All results obtained with the MULL(solv) hybrid intermediate charge state met our rather stringent quality criteria; the improvements over the direct MM→SQM/MM switches were noticeable, and the difference in convergence over the more expensive MULL(solv*) approach was small (MAD of 0.09 instead of 0.07 kcal/mol). Thus, the MULL(solv) approach to derive an intermediate charge representation seems to be a good compromise between computational cost and convergence improvement. All methods require an additional, free energy simulation to obtain
at the low level of theory, but its computational cost is small compared to that of the NEW switching simulations needing SQM/MM Hamiltonians.
As reported in Performance of modified switching protocols, we have not seen consistent convergence improvements for all systems using the two-stage and three-stage switching protocols. This prompted a detailed analysis of how the MM and SQM/MM representations differ. Comparing the MM charges of the force field to the average Mulliken charges, one sees that there are indeed surprisingly large differences in charge distribution and hence the solute dipole moment. The differences in properties of the solute have a noticeable effect on the number of water molecules in close contact with the solute. Furthermore, by studying the time correlation function after switching from one level of theory to the other, we were able to confirm the time scales of experimental solvation spectroscopy. There clearly is a fast process with a relaxation time ps, and a slower process with ps. This slower time constant does not change much if the initial charge distributions are equal to MM or MULL(solv). If the difference between the initial and final distribution is large, then the effect of the unfinished solvent reorientation after 2 ps (the switching length we are aiming for), will also be large. If the two charge distributions are more similar, as is the case for MULL(solv), then is small enough to have little effect on the results, even though the solvent reorientation may not be complete. On the one hand, this observation explains the effectiveness of using hybrid charge intermediate states. On the other hand, it helps to rationalize our failure to obtain consistently convergent results with the multistage switching schemes. While one could certainly construct more complex switching paths, the speed of the solvent reorientation would remain the same.
Figure 1.
a) Indirect alchemical thermodynamic cycle to compute (dashed arrow) in three steps according to Eq. 1. b) Indirect alchemical thermodynamic cycle to compute (red arrow) via hybrid charge intermediates; cf. Eq. 7.
Figure 1.
a) Indirect alchemical thermodynamic cycle to compute (dashed arrow) in three steps according to Eq. 1. b) Indirect alchemical thermodynamic cycle to compute (red arrow) via hybrid charge intermediates; cf. Eq. 7.
Figure 2.
Model systems used in this work: 8 tautomer pairs, 7 taken from the SAMPL2 challenge [
47], one (
7-t1 ⇌
7-t2) from
Tautobase [
48]. In addition, we used the blocked amino acids
Ala and
Ser as model solutes.
Figure 2.
Model systems used in this work: 8 tautomer pairs, 7 taken from the SAMPL2 challenge [
47], one (
7-t1 ⇌
7-t2) from
Tautobase [
48]. In addition, we used the blocked amino acids
Ala and
Ser as model solutes.
Figure 3.
The standard linear (L1) and three modified switching protocols investigated. The total switching length was 2 ps in all four cases. The color code used for L2-1, L3-1 and L3-2 is also used in Results.
Figure 3.
The standard linear (L1) and three modified switching protocols investigated. The total switching length was 2 ps in all four cases. The color code used for L2-1, L3-1 and L3-2 is also used in Results.
Figure 4.
Comparison of using three protocols without intermediate state: JAR(2ps): 2 ps forward switches from MM to SQM, JAR(5ps): 5 ps forward switches from MM to SQM, and CRO(2ps): 2 ps forward and backward switches. In all cases, the CRO results obtained from 5 ps switches were considered as the reference values . For 5t-2 and 6t-2 the JAR(2ps) results are off-scale, which is indicated by the red arrows.
Figure 4.
Comparison of using three protocols without intermediate state: JAR(2ps): 2 ps forward switches from MM to SQM, JAR(5ps): 5 ps forward switches from MM to SQM, and CRO(2ps): 2 ps forward and backward switches. In all cases, the CRO results obtained from 5 ps switches were considered as the reference values . For 5t-2 and 6t-2 the JAR(2ps) results are off-scale, which is indicated by the red arrows.
Figure 5.
Comparison of
obtained with three hybrid intermediate charge states: MULL(gas), orange triangles; MULL(solv), blue squares; MULL(solv*), green diamonds. All results were obtained from 200 NEW switches of 2 ps length to compute
and include the correction
; cf. Eq. 14b. The JAR(2ps) results already shown in
Figure 4 (red circles) are included for comparison purposes.
Figure 5.
Comparison of
obtained with three hybrid intermediate charge states: MULL(gas), orange triangles; MULL(solv), blue squares; MULL(solv*), green diamonds. All results were obtained from 200 NEW switches of 2 ps length to compute
and include the correction
; cf. Eq. 14b. The JAR(2ps) results already shown in
Figure 4 (red circles) are included for comparison purposes.
Figure 6.
Performance of the three modified switching protocols for a subset of the compounds. Results for direct NEW switching protocols MM → SQM are shown as red circles, results for calculations using the MULL(solv) hybrid charge intermediate state as blue squares. The fill color of the circles/squares indicates the stepwise linear switching protocol used (green = L2-1, orange = L3-1 and salmon = L3-2). For comparison purposes, the MM/JAR(2ps) (red circles) and MULL(solv) results (blue squares), both using the default switching protocol L1, are included as well.
Figure 6.
Performance of the three modified switching protocols for a subset of the compounds. Results for direct NEW switching protocols MM → SQM are shown as red circles, results for calculations using the MULL(solv) hybrid charge intermediate state as blue squares. The fill color of the circles/squares indicates the stepwise linear switching protocol used (green = L2-1, orange = L3-1 and salmon = L3-2). For comparison purposes, the MM/JAR(2ps) (red circles) and MULL(solv) results (blue squares), both using the default switching protocol L1, are included as well.
Figure 7.
Graphical representation of differences in partial atomic charges compared to MULL(solv*) for 5-t2 (top) and 6-t2 (bottom). Charge differences are indicated as a color gradient from blue () to red (). The differential dipole moment is displayed as an orange arrow. The magnitude and the angle between the dipole moment of the respective charge distribution with that of the MULL(solv*) reference distribution are given below each structure.
Figure 7.
Graphical representation of differences in partial atomic charges compared to MULL(solv*) for 5-t2 (top) and 6-t2 (bottom). Charge differences are indicated as a color gradient from blue () to red (). The differential dipole moment is displayed as an orange arrow. The magnitude and the angle between the dipole moment of the respective charge distribution with that of the MULL(solv*) reference distribution are given below each structure.
Figure 8.
Difference in average number of waters closer than 3 Å compared to SQM/MM. The average number of waters in the SQM/MM simulations was used as the reference value.
Figure 8.
Difference in average number of waters closer than 3 Å compared to SQM/MM. The average number of waters in the SQM/MM simulations was used as the reference value.
Figure 9.
Water reorientation dynamics of TIP3P solvent for 5-t2 when switching from MM and MULL(solv) to SQM/MM. Raw data are shown in gray; fit for MM in red, fit for MULL(solv) in blue. The fit parameters ( in kcal/mol, in fs) are listed in the inset; cf. Eq. 13.
Figure 9.
Water reorientation dynamics of TIP3P solvent for 5-t2 when switching from MM and MULL(solv) to SQM/MM. Raw data are shown in gray; fit for MM in red, fit for MULL(solv) in blue. The fit parameters ( in kcal/mol, in fs) are listed in the inset; cf. Eq. 13.
Figure 10.
Ramachandran plots of pseudo-dipeptide/blocked alanine.
Figure 10.
Ramachandran plots of pseudo-dipeptide/blocked alanine.
Figure 11.
Ramachandran plots of pseudo-dipeptide/blocked Serine.
Figure 11.
Ramachandran plots of pseudo-dipeptide/blocked Serine.
Table 1.
MAD (mean absolute deviation) for the results shown in
Figure 5, as well as the spread of
(respective smallest and largest absolute deviation).
Table 1.
MAD (mean absolute deviation) for the results shown in
Figure 5, as well as the spread of
(respective smallest and largest absolute deviation).
|
|
Spread MAD [kcal/mol] |
Pathway |
MAD [kcal/mol] |
Min |
Max |
MM |
0.29 |
0.01 |
1.80 |
MULL(gas) |
0.12 |
0.01 |
0.49 |
MULL(solv) |
0.09 |
0.01 |
0.28 |
MULL(solv*) |
0.07 |
0.01 |
0.20 |
Table 2.
MAD (Mean Absolute Deviation) over all compounds for , the magnitude of the differential dipole moment , and the angle between the dipole moment vector of the respective charge distribution and that of the MULL(solv*) charge distribution.
Table 2.
MAD (Mean Absolute Deviation) over all compounds for , the magnitude of the differential dipole moment , and the angle between the dipole moment vector of the respective charge distribution and that of the MULL(solv*) charge distribution.
Pathway |
MAD
|
MAD [D] |
MAD ] |
MM |
0.14 |
3.38 |
29.8 |
MULL(gas) |
0.06 |
2.44 |
7.6 |
MULL(solv) |
0.04 |
1.81 |
16.0 |