1. Introduction
In addition to the double-stranded B-form, genomic DNA can fold into various sequence-dependent non-canonical structures. Among them, G-quadruplexes (G4s) are the most studied and biologically significant.
Endogenous G4s are formed via intramolecular interactions of G-rich nucleic acids whose sequence contains at least four tandem G-tracts of consecutive guanosines separated by quasi-random nucleotide residues [1, 2]. G4 structures are stabilized by two or three stacked G-tetrads, planar arrangements of four guanines from different G-tracts connected via Hoogsteen base pairing. Nucleotide sequences between G-tracts can form G4 loops that play an important role in determining quadruplex topology and stability. The G4 core is further stabilized by the coordination of guanine O6 with monovalent cations, predominantly K+, in the central cavity [3, 4].
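The sequence requirement described above (at least four G-tracts separated by loop nucleotides) is often screened for with a regular expression. The following is a minimal sketch in Python, assuming a commonly used convention of runs of three or more guanines separated by loops of 1 to 7 nucleotides. The loop range, the `G4_MOTIF` pattern and the `has_g4_motif` helper are illustrative assumptions, not definitions used in this work. The test sequence is the 27-nt c-MYC promoter sequence (Pu27) introduced later in this section.

```python
import re

# Assumed convention (not from this paper): four runs of >= 3 guanines
# separated by three loops of 1-7 nucleotides each.
G4_MOTIF = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def has_g4_motif(seq: str) -> bool:
    """Return True if the sequence contains a putative G4 motif."""
    return G4_MOTIF.search(seq.upper()) is not None


# Pu27 sequence of the c-MYC promoter, as given in the text (5' -> 3').
PU27 = "TGGGGAGGGTGGGGAGGGTGGGGAAGG"

if __name__ == "__main__":
    print(has_g4_motif(PU27))            # G-rich promoter motif
    print(has_g4_motif("ACGTACGTACGT"))  # sequence without G-tracts
```

A match only indicates that a sequence could fold into a G4; whether it does, and with which topology, has to be established experimentally.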
In vitro, G4 structures have been shown to adopt a wide range of folds that differ in the orientation of G-tracts (parallel, antiparallel, or mixed (3+1) topologies), the type and length of loops, the number of G-tetrads, and local structural parameters; some G4s are highly dynamic, while others adopt only one conformation. More than 700,000 G4 motifs have been identified in the human genome [5]. Bioinformatics analysis has shown that G4 motifs are frequently clustered in the promoter regions of many oncogenes and genes involved in growth control [6, 7, 8], replication origins, untranslated exon regions, telomeric DNA and micro(mini)satellite repeats [9, 10, 11]. G-rich promoter sequences are always present along with their C-rich complements, and G4 formation competes with the maintenance of the Watson–Crick duplex.
Currently, G4s are considered novel regulatory elements that are involved in key genome functions such as transcription, telomere maintenance, DNA replication and repair [12, 13]. In addition, numerous connections of G4 structures to cancer biology have been proposed. In particular, G4s in oncogene promoters perform regulatory functions, acting as transcriptional silencer elements that can be targeted with potential anticancer drugs [13, 14, 15, 16].
Much less studied is the role of promoter G4s in another key biological process: methylation of cytosine residues at CpG sites. These basic epigenetic DNA modifications are involved in the regulation of gene expression, maintenance of genomic stability, aging, etc. In mammals, de novo DNA methylation, i.e., the establishment of the methylation pattern (specific alternation of methylated and unmethylated CpG sites), is performed by DNA methyltransferase (MTase) Dnmt3a [7, 8]. The maintenance of the methylation pattern during DNA replication is carried out by the Dnmt1 MTase [17]. Despite the primary role of Dnmt1 in maintaining methylation, cooperation between Dnmt3a and Dnmt1 has been observed in some cases, particularly in embryonic stem cells. The promoter regions of actively transcribed genes containing CpG islands are usually unmethylated. In contrast, the promoters of oncogenes, unlike tumor suppressor genes, are hypermethylated. Methylated cytosines are potentially mutagenic due to their spontaneous deamination, leading to C>T point substitutions. Therefore, the presence of 5-methyl-2’-deoxycytidine residues in oncogene promoters can affect the DNA repair pathways and lead to somatic “driver mutations.” Thus, the level of methylation in regulatory regions of the mammalian genome is directly related to the progression of several types of cancers [18]. Evidence of the participation of the promoter G4s in transcriptional regulation, as well as the dependence of the transcription level on the methylation status of CpG islands in oncogene promoters, makes it relevant to analyze the G4 impact on the DNA methylation level. However, there are only a few studies on the involvement of G4 structures in the functioning of the DNA methylation machinery [12, 19]. Nevertheless, studies based on whole-genome sequencing data identified a correlation between CpG methylation levels and G4 formation, namely, hypomethylation of the genome in regions enriched in G4 motifs [20, 21]. G4 formation in the promoters of imprinted genes has also been demonstrated [22]. Using surface plasmon resonance spectroscopy, the authors reported high binding affinity of recombinant human MTases to quadruplex structures formed in synthetic oligonucleotides containing G4 motifs of the corresponding promoter regions. The binding affinity of the enzyme to G4 has been shown to be comparable to that of other cellular proteins that specifically interact with G4s. Recently, the interplay between G4 formation and DNA methylation levels has been confirmed [23]. Using sequence analysis, it was shown that the majority of G4 motifs found in the human genome are localized in regions of unmethylated CpG islands. The authors also discovered co-localization of G4 motifs and binding sites of the maintenance MTase DNMT1 and proposed a mechanism for protecting CpG islands from methylation by G4 structures that effectively bind and inhibit DNMT1. However, the G4 effect was characterized indirectly by inhibiting methylation of a standard substrate with G4-forming oligonucleotides. The G4 impact on de novo methylation by Dnmt3a has not been studied at all.
Given the crosstalk between DNA methylation status and G4 formation in oncogene promoters, the main goal of our work was to evaluate the effect of this non-canonical DNA structure on Dnmt3a function. In our study, we selected the G4 motif of the c-MYC promoter. This gene product is a major oncogenic driver in cancer involved in the regulation of cellular proliferation, differentiation, and apoptosis [24]. Aberrant expression of the c-MYC oncogene (usually c-MYC overexpression) leads to malignant transformation of cells. The 27-nt G-rich region of the c-MYC promoter (Pu27) located in the nuclease hypersensitive element (NHE) III1 has the ability to fold into intramolecular parallel G4 structures [6, 16, 25]. Five tandem G-tracts in its sequence, 5’-TGGGGAGGGTGGGGAGGGTGGGGAAGG-3’, lead to the potential formation of multiple G4s [8]; the resulting G4 structures differ in the positions of the four G-tracts involved in their formation and, consequently, in their conformational features and thermodynamic stability. Nevertheless, a parallel arrangement of G-tracts characteristic of promoter G4 structures is realized in this mixture of G4s, which act as a transcriptional repressor.
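The combinatorial origin of this G4 mixture can be made concrete with a short sketch. The Python example below, given only as an illustration, locates the G-tracts of Pu27 and enumerates the ways of choosing four of the five tracts. The helper name `g_tracts` and the minimum tract length of three guanines are assumptions made here. The count ignores register shifts within the four-guanine tracts and says nothing about which folds are actually populated in solution.

```python
import re
from itertools import combinations

# Pu27 sequence of the c-MYC promoter, as given in the text (5' -> 3').
PU27 = "TGGGGAGGGTGGGGAGGGTGGGGAAGG"


def g_tracts(seq: str, min_len: int = 3):
    """Return (start, tract) pairs for runs of at least min_len guanines.

    min_len = 3 is an assumption of this sketch, so the 3'-terminal GG
    of Pu27 is not counted as a G-tract.
    """
    pattern = r"G{%d,}" % min_len
    return [(m.start(), m.group()) for m in re.finditer(pattern, seq)]


tracts = g_tracts(PU27)

# Each G4 uses four G-tracts; enumerate all choices of four out of five.
tract_choices = list(combinations(range(len(tracts)), 4))

if __name__ == "__main__":
    for start, tract in tracts:
        print(f"tract at position {start}: {tract}")
    print(f"{len(tract_choices)} ways to pick four of {len(tracts)} tracts")
```

With five tracts there are five such choices, which is one simple way to see why several distinct G4 isomers are possible for this sequence.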
In our work, we aimed to investigate how parallel G4s formed by a sequence derived from the c-MYC oncogene promoter region affect the activity of the murine Dnmt3a catalytic domain (Dnmt3a-CD) in a double-stranded context. Using this DNA model, we evaluated the role of the distance between the G4 structure and the analyzed CpG site on its methylation. Furthermore, we examined the influence of isolated c-MYC G4 on the Dnmt3a-CD function.
3. Discussion
Both DNA methylation and G4 structures play multiple roles in cancer biology, DNA replication, and gene regulation. Therefore, the experimental evidence for crosstalk between c-MYC G4s and the functioning of the eukaryotic Dnmt3a presented in our in vitro study is of great importance. To address this issue, we used the catalytic domain of Dnmt3a and designed DNA models containing c-MYC G4: a single-stranded 27-nt oligonucleotide 27G4 mimicking the G4 motif (Pu27) in the c-MYC promoter, and a DNA duplex (c-MYC_35Uf/62Df_G4) bearing an extrahelical Pu27 insert in the center of one of its strands (Figure 1).
Using CD and UV spectroscopy, we revealed that 27G4 forms a stable c-MYC G4 structure with a parallel topology under Dnmt3a-CD operating conditions. The parallel topology and high thermal stability of the G4 structure were maintained when the 27G4 sequence was embedded into a double-stranded c-MYC promoter sequence lacking the fragment complementary to 27G4 (c-MYC_35Uf/62Df_G4) (Figure 3B). This approach, which stabilizes the G4 structure in a duplex context, was developed to prevent its conformational transition to the energetically more favorable B-DNA [27] (Figure 3A). The native c-MYC promoter fragment (c-MYC_58Uf/58Df) adopts a B-form DNA structure under Dnmt3a-CD operating conditions.
We further established that Dnmt3a-CD forms a stable specific complex with 27G4f. Comparison of the Dnmt3a-CD binding affinity to this oligonucleotide and to a known 30-bp DNA substrate 30Uf/30Df revealed a marked preference for the parallel G4 structure formed by 27G4f. This result is in line with the high binding affinity of full-length Dnmt3a and other proteins to G4 structures [22, 23, 37]. 27G4 was shown to promote significant inhibition of Dnmt3a-CD activity, in contrast to the control oligonucleotide of random sequence, 30X (Figure 5B); these data confirm the binding specificity of Dnmt3a-CD to the quadruplex structure formed in 27G4. The mutant oligonucleotide 27G4-mut was shown to be a weak inhibitor of Dnmt3a-CD activity due to the low G4 stability (Figure 2B). Similar effects of specific binding and inhibition by the c-MYC oligonucleotide were found for the human maintenance MTase DNMT1 [23]. The IC50 value for 27G4 (160 ± 10 nM) is comparable to the IC50 values of known Dnmt3a-CD inhibitors, such as olivomycin A, which binds to the DNA minor groove [38], and the DNA-intercalating curaxin CBL0137 [36].
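To illustrate what an IC50 of 160 nM implies for residual enzyme activity, the sketch below evaluates a simple Hill-type inhibition curve in Python. The model form, the Hill coefficient of 1 and the function name `fractional_activity` are assumptions made for illustration; this passage does not state which model was used to fit the inhibition data.

```python
def fractional_activity(
    inhibitor_nM: float, ic50_nM: float = 160.0, hill: float = 1.0
) -> float:
    """Remaining enzyme activity (1.0 = uninhibited) under an assumed
    Hill-type model: activity = 1 / (1 + ([I] / IC50) ** hill).

    The default IC50 of 160 nM is the value reported for 27G4 in the text;
    hill = 1.0 is an assumption of this sketch.
    """
    if inhibitor_nM < 0:
        raise ValueError("inhibitor concentration must be non-negative")
    return 1.0 / (1.0 + (inhibitor_nM / ic50_nM) ** hill)


if __name__ == "__main__":
    for conc in (0.0, 16.0, 160.0, 1600.0):
        print(f"[27G4] = {conc:7.1f} nM -> activity {fractional_activity(conc):.3f}")
```

By construction, the activity is halved when the inhibitor concentration equals the IC50; under this assumed model a tenfold excess over the IC50 leaves about 9% of the activity.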
Inhibition of Dnmt3a-CD induced by 27G4 suggests hypomethylation of the c-MYC oncogene promoter due to c-MYC G4 binding to the enzyme. However, single-stranded oligonucleotides folded into G4 structures are not informative when studying the role of G4 in regions flanking CpG sites in the methylation process. Much more suitable are the combined duplex-quadruplex models, containing a c-MYC G4 structure stabilized in the DNA duplex context (e.g., c-MYC_35Uf/62Df_G4). In addition, the created model made it possible to monitor the methylation of specific CpG sites. It was previously shown that G4 formation in long dsDNA is promoted by conditions of molecular crowding created by PEG [39]. However, the G4 stabilization observed under these conditions occurs only during the process of in vitro transcription or heat denaturation/renaturation and is sequence-dependent.
In the DNA duplex c-MYC_35Uf/62Df_G4, the CpG site, which is the methylation target, is located at a distance of 6 bp from G4 (Table 1). The degree of c-MYC_35Uf/62Df_G4 methylation decreased approximately 7-fold compared to the standard perfect DNA duplex c-MYC_58Uf/58Df (Figure 7). This is the first direct evidence that parallel G4 down-regulates Dnmt3a activity.
To answer the question of whether the G4 effect on the methylation level persists when the CpG site is moved away from the quadruplex structure, we compared the degrees of methylation of 32Uf/51Df_G4 and 76Uf/95Df_G4, containing an embedded extrahelical (GGGT)4 motif, which differed in the distance between G4 and the analyzed CpG site in flanking duplexes (Figure 7). Moving the analyzed CpG site 28 bp away from the G4 structure (76Uf/95Df_G4) significantly reduces the hypomethylation effect of G4. DNA duplexes 32Uf/51Df_G4 and c-MYC_35Uf/62Df_G4, with parallel G4 insertions within the duplex structure located at approximately the same short distance from the CpG site, showed an equally strong reduction in methylation compared to DNA substrates lacking G4.
Thus, one would expect DNA hypomethylation near the c-MYC G4 structure and maintenance of elevated methylation levels in oncogene promoter regions distal to G4. Previous studies examined the impact of the relative position of quadruplex structures and CpG sites in gene promoters on the methylation status of CpG islands based on whole-genome bisulfite sequencing data [23, 40]. It was noted that the hypomethylating effect of G4 was inversely proportional to the distance between G4 and CpG, but this analysis did not take into account the activity of DNA methyltransferases [40].
The molecular-level interactions between G4 structures in the c-MYC promoter region and Dnmt3a-CD remain unclear. High-resolution structural information on the nature of G4 interactions with various cell proteins and their selectivity in recognizing certain G4 structures is extremely limited [5]. Structural studies of Dnmt3a complexes with G4 DNA have not been performed. Two main mechanisms by which various proteins interact and function with intracellular G4s have been discussed [13, 41]: (i) G4-unfolding proteins can unfold the G4 structure after binding to it, acting as helicases, and (ii) G4-recruited proteins can bind to G4 structures in specific functional regions of the DNA. We hypothesize that G4 structures may also serve as an obstacle to proper formation of the DNA-protein interface. Obviously, the hypomethylation effect that we discovered in vitro for Dnmt3a-CD is primarily due to the effective binding of MTase to the non-canonical structure in c-MYC_35Uf/62Df_G4. However, it is also necessary to take into account the ability of G4 to sterically interfere with Dnmt3a-mediated methylation of nearby CpG sites, preventing the formation of the correct DNA-protein complex. These assumptions are supported by the clear dependence of methylation efficiency on the distance between G4 and CpG (Figure 7) and by comparative experiments with the prokaryotic MTase M.SssI.
Despite the substantial similarity of the primary structures of M.SssI and Dnmt3a-CD, namely the presence of ten conserved amino acid motifs responsible for the catalytic reaction [7, 34], profound differences were revealed in their interaction with G4. In the case of M.SssI, the G4 impact on the methylation of duplex-quadruplex models was negligible (Figure 7). Furthermore, M.SssI exhibits much weaker binding to 27G4f (Figure 4B) or the TT(GGGT)4TT oligonucleotide [35] compared to Dnmt3a-CD. The key difference between these two enzymes may be the variable structure and composition of their complexes with the DNA substrate [30, 42]: firstly, the oligomerization of Dnmt3a-CD, leading to the inability to properly bind to the double-stranded region of our model c-MYC_35Uf/62Df_G4, in contrast to the standard binding of monomeric M.SssI; secondly, the specific structure of the DNA-binding cavity in Dnmt3a formed by four monomers of the enzyme. Overall, the ability to tetramerize is a key characteristic that determines the affinity of Dnmt3a-CD for parallel G4 structures, resulting in a DNA-binding surface that appears to be more specific for G-quadruplex structures compared to monomeric M.SssI.
5. Conclusions
The c-MYC oncogene, overexpressed in the majority of solid tumors, is well known as a gene with a G4-forming sequence in its promoter region.
In this study, we demonstrated strong specific binding of Dnmt3a-CD to the oligonucleotide 27G4 folded into a parallel c-MYC G4 structure. The c-MYC G4-enzyme complex inhibits Dnmt3a-CD activity by preventing standard DNA substrate binding to the enzyme. Using a specially designed DNA construct that mimics the G/C-rich promoter region of the c-MYC oncogene and stabilizes c-MYC G4 in a duplex context, we found that the presence of c-MYC G4 reduces the methylation activity of Dnmt3a-CD. The hypomethylation effect depends on the distance between G4 and the methylated CpG site. It can be assumed that the G4 effect on the Dnmt3a-CD function is determined by the sequestration of the enzyme on this non-canonical structure and the disruption of the enzyme’s proper binding to the double-stranded region of the c-MYC promoter.
In vivo, G4 formation in the c-MYC oncogene promoter may be one of the reasons for DNA hypomethylation resulting in overexpression of c-MYC protein and cancer progression. Overall, G4 formation in gene promoters partially explains their hypomethylation compared to the rest of the genome [23, 40].
Our findings and hypotheses contribute to the understanding of the relationships between the ability of promoter G4s to bind MTases, DNA methylation activity, and functional consequences. Future research should provide clear evidence of the in vivo effect of G4 structures on the methylation machinery. It is known that genomic G4 structures may interfere with the fidelity of DNA replication and repair. Their involvement in the regulation of DNA methylation revealed in this study suggests a complex network of G4-protein interactions that govern the main biological processes associated with genome stability and oncogenesis.