1. Introduction
Hydrogen bonding is one of the most important forces governing molecular integrity in the physical and biological worlds. It describes the sharing of a hydrogen that is covalently bound to an electronegative Donor atom, with another electronegative Acceptor atom that has a lone pair of electrons: D−H···A.
In the biological world, the acceptor group is often composed of nitrogen or oxygen bound to a carbon. Upon H-bonding, the C−O or C−N bonds are weakened due to the pulling exerted on the electronegative atom by the hydrogen. In the current study, we identified the opposite effect, wherein the acceptor of an H-bond is surprisingly strengthened upon H-bonding. Consequently, by looking into the governing factors of H-bond formation, we aimed to examine the roots of this unexpected and intriguing observation.
Theoretical and experimental findings describe the formation of hydrogen bonds due to several energy components: electrostatic attraction, electron delocalization, charge transfer, dispersive interactions, cooperative effects, steric effects, and secondary interactions [
1].
H-bonds in proteins exhibit variations in these energy components, tailored to fulfill the specific physiological role of the molecule [
2]. The nature of H-bonds in proteins is further influenced by the molecular makeup of the donor and acceptor, the protein backbone, and the side chain [
3,
4].
One can classify protein and peptide H-bonds as canonical or noncanonical. Canonical or conventional H-bonds are formed by regular and predictable H-bonding interactions, like the
i to
H-bonds formed between the C=O and N−H groups in -helices [
5]. Noncanonical H-bonds exhibit a significantly larger variation of donor and acceptor groups and a multiplicity of partners. For example, an over-coordinated system entails several donors bonded to a single acceptor, while multifurcation involves a single donor with multiple acceptors. Experimental and computational analyses have shown that some of these noncanonical bonds are highly important for protein structure [
6,
7].
An important class of noncanonical H-bonds in proteins are those formed between the a C hydrogen and a carbonyl oxygen [
8,
9,
10,
11,
12,
13,
14,
15]. X-ray and neutron crystallography have confirmed the identity of C−H···O=C hydrogen bonds within the protein environment in the presence of a solvent, other proteins, and ligands [
16,
17], suggesting an essential role in catalytic activity [
18]. These important hydrogen bonding schemes necessitate detailed analyses to provide a quantitative understanding of their role in macromolecules.
The C−H···O=C hydrogen bond in a transmembrane helical dimerization interface, was first identified in glycophorin A [
19]. The interface has a GxxxG motif, one of the most prevalent oligomerization factors in transmembrane helices [
20,
21]. In the GxxxG motif of glycophorin A, the C−Hs of glycines 79 and 83 form H-bonds with the carbonyls of isoleucine 76 and valine 80 in the opposing helix, respectively. Such bonds form the basis of the GxxxG dimerization motif [
13].
Previous attempts to measure the energetics of the C−H···O=C H-bonds resulted in varying estimates. Using a mutagenesis cycle analysis, Bowie and coworkers found that such bonds are not stabilizing [
22], while an empirical FTIR-based approach indicated a bond enthalpy of 0.88 kcal/mol [
23].
Considering the prevalence of the C−H···O=C interaction and its contested contribution to protein stability, we decided to examine the energetics of C−H···O=C hydrogen bonds using a combination of experimental and computational methods. Experimentally, we employed vibrational spectroscopy, an exquisitely sensitive tool to measure H-bond strength. Computationally, we use Molecular Dynamic Simulations and DFT calculations that can yield detailed energetics and vibrational shifts that can be verified with experiments. Specifically, we targeted the frequency change of the C=O acceptor upon H-bond formation, made possible by isotopic labelling to resolve the specific carbonyl group [
24,
25]. This combined approach enabled us to both measure the strength of this interaction and explain the source of its curious spectroscopic behavior.
2. Results and Discussion
Glycophorin A is the first membrane protein to have its sequence determined [
26] and is the archetypical GxxxG containing transmembrane domain dimer [
13,
19,
27]. The GxxxG motif enables close positing of the two helices with minimal steric hindrance due to the small size of the glycine hydrogen side-chain that H-bonds to a carbonyl residue in the other helix.
Figure 1 depicts the bond between the C−H of Gly83 and the carbonyl acceptor of Val80. Note that the acceptor is also involved in a canonical H-bond with the backbone amide donor hydrogen Val84. Hence, Val80’s carbonyl group participates in an over-coordinated H-bonding interaction.
We spectroscopically isolated the C−H···O=C bond in Glycophorin A by editing of Val80’s carbonyl group with
13C and
18O. Due to the fact that the amide I mode is composed mainly of the C=O stretch [
28], such labeling baseline resolves the isotope-edited mode from the unlabelled amide groups by shifting it more than 60 cm
−1, enabling detailed site-specific analysis of the labeled peak separately from the unlabeled peak. [
25]. Accordingly, as shown in
Figure 2, the
13C=
18O amide I mode of Val80 within glycophorin A is shifted to 1599.4 cm
−1, which is 62 cm
−1 from the main amide-I envelope of the protein. The entire spectra of both peptides, in which all other vibrational modes arising from the lipid and protein can be observed, are presented in
Figure S1.
The peptide encompasses the transmembrane domain of wild-type glycophorin A, a strongly dimeric species [
27,
29,
30] in which the C−H···O=C hydrogen bond exists [
13,
19]. Therefore, to determine the impact of the Gly83–Val80 inter-helical H-bond, we measured the spectrum obtained from a peptide containing the Gly79Leu monomerizing mutation which separates the helices and so the bond does not exist [
27].
In the monomeric protein that does not contain the noncanonical H-bond, the isotope-labeled amide I mode resonates at 1596.8 cm
−1, which is lower than the H-bonded dimeric species by 1.7 cm
−1 (
Figure 2). These results appear surprising since H-bond formation is expected to reduce the vibrational energy of the acceptor carbonyl and thus shift the isotope-edited carbonyl peak to lower frequencies when it is involved in a hydrogen bond.
We note that previous studies report a blue shifting of an H-bond donor. For example, using computational tools, Schlegel and colleagues have elucidated the electrostatic origin of this shift [
31]. However, to the best of our knowledge, we are reporting for the first time the blue shift of the stretching frequency of an H-bond
acceptor C=O in proteins.
To understand the root of the surprising frequency shift, we employed DFT-based computational tools that detail the characteristics of such H-bonds [
32]. The size of a transmembrane helix system is beyond the current capabilities of detailed quantum calculations. Therefore, two mimetic systems that capture the specific H-bonding interactions were analyzed and compared (see
supplementary Figure S4):
- Monomer (single H-bond):
The canonical -helical H-bond between the carbonyl of valine 80 and the amide H of glycine 84 was mimicked by two N-methylacetamides.
- Dimer (two H-bonds):
The inter-helical H-bonding system contained the canonical H-bond described above, with an additional non-canonical inter-helical hydrogen bond. The C−H of glycine 83 from the opposing helix is bonded to the same carbonyl of valine 80 that is also involved in the canonical hydrogen bond. This system was mimicked by two N-methylacetamides forming the canonical H-bond to which an acetylglycinemethylamide is hydrogen bonded.
To ensure that the calculations replicate the experimental system accurately, we sought to superimpose the coordinates of the atoms on the corresponding groups of the transmembrane protein. However, only the structure of the wild-type, dimeric glycophorin A, has been solved experimentally [
19,
33]. Moreover, the structures were elucidated in micelles or bicelles but not in a lipid bilayer, the native environment of the protein and the one in which the FTIR spectra were obtained. Therefore, we used molecular dynamics (MD) simulations in hydrated 1,2-dimyristoyl-sn-glycero-3-phosphocholine bilayers to determine the atom positions in the wild-type, dimeric glycophorin A and a monomeric glycophorin A peptide that contains the G79I mutations. In both instances, the experimentally determined structure in bicelles was used as a starting point [
33]. The results of the MD simulation can be seen in supporting information,
Figure S2, depicting structures of both monomer and dimer species and
Figure S3 showing the root mean square deviation for each species.
We extended the utility of the MD simulations beyond creating mimetic systems for density functional theory (DFT) calculations, applying them also to geometric analyses of bond parameters. The geometry of a bond, and in particular its distance and angle influences its energetics and, consequently, the vibrational frequency. Previous research, has highlighted the variability of H-bond distances along helical peptides, discussing a trend of bond shortening at helix midpoints due to the cooperative effect of H-bonding along the length of the helix [
34]. Similarly, Tan
et al. have shown how the strength of an H-bond is dependent on the donor-acceptor angle, introducing the concept of an `antecedent angle’, which varies across protein secondary structures [
35].
We calculated hydrogen bond distances and angles within our simulated systems. Specifically, we found the average O···N distance between Val80’s C=O and Val84’s N−H along the same helix to be Å in the dimeric species of glycophorin A, compared to Å in the monomeric variant. On the other hand, the C−O···N bond angle was in the dimer and in the monomer.
Hydrogen bonding interactions between donor and acceptor moieties can be both dipolar and electronic. An electronic interaction is possible when the donor N-H anti-bonding orbital overlaps the C=O oxygen nonbonding orbital. Electronic interactions are optimal when the C−O···N angle is near 120 [
36], facilitating maximal overlap of these orbitals. But in protein systems, there are typically geometric constraints that restrict such bond angles to around 155 [
35,
37].
In the the dimeric model, with the additional inter-helical H-bond, a 3 wider C−O···N bond angle is observed relative to the monomeric model. This finding points to a reduced C=O to N−H canonical hydrogen bond strength. On the other hand, the canonical H-bond distance upon dimer formation was shortened by 0.09 Å , potentially increasing orbital overlap and thus increasing the canonical C=O···H−N interaction strength. Given the competing influences observed, MD simulations alone do not provide a definitive conclusion on the dominant effect on if the canonical hydrogen bonding strength should increase or decrease in the dimer system versus the monomer system. Such ambiguity is perhaps not unexpected due to the simplicity of the point charge model of the MD force field). Therefore, we proceeded to conduct quantum mechanical calculations for each system in which the C=O oscillator strength and other parameters can be estimated with greater accuracy, and the hydrogen bond strength directly inferred.
We calculated the frequencies of the amide I stretching modes (C=0) of both the monomeric and dimeric systems. We chose several points of time from the MD simulation trajectory to make model systems for DFT calculations and expressed the results as a time average form. (see Experimental section) The average Val80 C=0 value for the dimeric glycophorin A is cm−1, whereas for the monomeric species (with the G79I mutation) the frequency is cm−1. Notably, there is a 3.7 cm−1 increase in the C=O stretching upon dimer formation which corresponds with the FTIR experimental C=0 shift upon dimer formation. Hence, the surprising C=O bond strengthening is obtained both experimentally and computationally. The observed shift is independent of isotopic substitution. The 12C=16O variant of the construct produced a shift of 4.1 cm−1 comparable to the 13C=18O analog that yielded an average shift of 4.2 cm−1, arising from the stretching frequencies cm−1 for dimer and cm−1 for monomer.
In a typical H-bond, the acceptor (C=O in this instance) transfers electrons to the donor anti-bonding orbital, reducing the double bond character of the C=O, reflected as a decrease in
C=0. The H-bond of a C=O···H−D undergoes a hyperconjugative C+−O−H···D− type interaction (analogous to charge transfer) that in turn lowers the double bond character of C=O. [
38]
To explore the electronic basis of the observed acceptor vibrational mode strengthening of the dimer system (upon over-coordination with the additional inter-helical hydrogen bond), we employed Natural Bonding Orbital (NBO) calculations to analyze the estimated electron distributions and atomic orbitals in a localized way. Local hybrid orbitals can be extracted using the NBO approach, which can correlate with classical qualitative bonding theories [
39]. Alabugin
et al. have shown how imperfect H-bonds can be analyzed by the NBO approach by showing the hyperconjugation effects [
38].
According to valence bond theory, a C=O bond should be formed by the overlap of two
sp2-orbitals from the C and O atoms to form a -bond, and two
p-orbitals from the C and O atoms to form a -bond. Yet in reality, this picture changes depending on the electronegativity of each participating atom, as explained by Bent’s Rule [
40]. However, for an ideal situation of a double bond, a -bond (like in the Val84 amide C=O) should project an
sp2 hybrid orbital to the other atom (O in this instance) present in the covalent bond, making the bond
%
s and
%
p. A deviation from these percentages will weaken the bond. In contrast, the -bond should be formed by two unhybridized 100%
p-orbitals with no
s-character.
Upon taking the hyperconjugative effect into account, the C=O bond should lose some of its double bond character due to the presence of a C+−O−H species and the localized orbitals will be far from an ideal sp2 overlap that we would expect in the bond.
We ran NBO calculations on the energy-optimized DFT structures collected from MD-simulation trajectories and extracted the values of the
s-character of the C and O atomic orbitals of the C=O bonding interactions. The results of the calculations, shown in
Figure 3, indicate that both the monomeric and dimeric species experience a deviation in the
s-character of the and -bonds from the ideal,
% and 100%
s and
p-characters, respectively.
The dimeric form has a higher
s-character in the -bond and a higher
p-character in the -bond compared to the monomeric form. The average
s-character of the atomic orbitals contributing to bond of C=O are shown in
Figure 3, upper panels. It can be seen that the C and O hybrid orbitals posses a slight but definite higher
s-character than the corresponding monomeric species. The monomer has a 29.5% and 37.3%
s-character in its C and O hybrid orbitals, respectively. On the other hand, the dimeric species has a 30% and 38.8% of
s-character in the respective orbitals. Therefore, upon dimerization and formation of the additional C−H···O bond, the C−O bond becomes stronger due to its higher
s-character which can be a major contributing factor for the increased
C=0 stretching frequency. This kind of
s-character enhancement has been previously encountered and has been accounted as the main factor for bond strengthening of the H-donor [
38].
On the other hand, the -bonding C and O atomic orbitals in the dimer are 98.4% and 97.8% of
p, respectively, whereas for the monomer they are 97.6% and 96.4% of
p, respectively (
Figure 3). So the C=O
dimer has more
p-character in the sidewise overlapping orbitals than in the C=O
monomer, suggesting reduced hyperconjugation, which should theoretically enhance the oscillator strength of the C=O bond, supporting the observed frequency increase [
38]. Finally, we note that indistinguishable results were obtained upon using a dispersion correction (D3_BJ) [
41] method with a larger basis set (cc-pVTZ) [
42].
These computational results suggest that the C=O bond strength increases upon dimerization, and the formation of the second H-bonding C−H group, due to the increased s-character of the bond and a simultaneous increase in the p-character of the bond which reduces hyperconjugation. Altogether, this leaves the C=O bond more double bonded in nature as observed in the later dispersion-corrected calculation.
We followed by analyzing how H-bonding impacts the charges of Val80’s carbonyl group. When the C=O···H−C interaction is absent, either in the monomeric system or in a separated dimeric assembly (
Figure S6), a polar distribution is obtained: C
+0.45=O
-0.71 or C
+0.60=O
-0.65, respectively. In contrast, when the C=O···H−C is present (dimeric assembly) a markedly different charge distribution is obtained: C
-0.21=O
+0.09.
Lowering the C=O bond’s polarity should increase its covalent character. This factor, together with reduced hyperconjugation, potentially contributes to a stronger C=O bond.
It is worth mentioning that less hyperconjugation and a reduction in bond polarity do not imply that the C−H···O H-bond is energetically unfavorable. H-bonding involves both electronic and polar counterparts. Electronic interactions involve the transfer of electrons and the formation of charge-transfer species, whereas dipolar interactions are solely electrostatic in nature. An analysis of the energy terms associated with the canonical and noncanonical H-bonds in both monomer and dimer species reveals this complexity.
There is a small destabilization in the canonical H-bond of the dimer compared to the monomer, with the dimer having a canonical H bond energy value of
kcal/mol compared to that of the monomer at
kcal/mol. Hence, both bonds impact one another and are consequently not orthogonal in nature [
43]. On the other hand, the noncanonical C−H···O H-bond in the dimer has a bond energy of
kcal/mol which indicates an overall stabilization of the dimer species due to the incorporation of two H-bonds. In other words, the additional H-bond from the glycine C−H strengthens the interaction between the two helices. Finally, the value of this additional noncanonical C−H···O H-bond is similar to previous measurements [
23].