1. Introduction
The epigenetic machinery comprises genes that encode proteins involved in regulating chromatin organization, histone modifications, DNA methylation, non-coding RNA and RNA methylation [1, 2]. Many of these genes are directly or indirectly linked to epigenetic regulation of gene expression. The components of the epigenetic machinery can establish and maintain the epigenetic programming of a cell or they can dynamically alter it, thereby affecting both the identity and function of cells. Either way, the proteins of the epigenetic machinery act collectively and are highly interdependent. Cancer genome sequencing has increased our understanding of epigenetic dysregulation as a feature of tumor development. Independent of genomic aberrations, prostate tumors display DNA methylation patterns that differ from normal tissue. For example, the promoter region of the tumor suppressor GSTP1 is typically hypermethylated in prostate cancer (PCa), resulting in the loss of its expression [3, 4]. Androgen stimulation in PCa has the capacity to recruit histone modifiers, triggering changes in the chromatin states of PCa cells [5, 6]. Ultimately, these epigenetic changes confer a more active or inactive chromatin state and dysregulate gene expression, thereby causing downregulation or silencing of tumor suppressor genes or a loss of regulation of genes that promote carcinogenesis. Although aberrant DNA methylation and disordered chromatin organization have long been recognized as features of cancer, the exact mechanisms driving epigenetic dysregulation are only beginning to be understood.
A number of studies have shown that driver gene mutations in several cancer types, including PCa, are enriched for epigenetic machinery genes. While most individual genes are mutated infrequently, epigenetic machinery genes, as a class, are some of the most frequently mutated in PCa [7, 8]. However, a number of epigenetic modulators have been shown to carry frequent and recurrent mutations. In PCa, variants of this nature have been identified in mediators of DNA methylation (e.g., TET2, MBD1), histone acetylation (e.g., KAT6B, ARID4A), histone methylation (e.g., KMT2C, SETD2), as well as in chromatin remodelers (e.g., ARID1A, SMARCA1) [7, 8, 9, 10, 11]. The putative driver mutations are often truncating or missense [8], giving rise to non-functional proteins or proteins with altered functionality. Besides small somatic variants, prostate tumors are prone to acquire more complex variation, including structural variations (SVs) and copy number (CN) aberrations [12]. Of relevance to the epigenetic machinery are double-strand breaks and the alteration of CpG methylation, local histone methylation and chromatin structure at DNA repair sites, with consequent altered gene expression [13, 14, 15]. Zhang et al. (2019) [16] showed overall SV burden to be associated with global hypomethylation and increased expression of methyltransferase genes across cancer types. Epigenetic dysregulation by imbalanced genomic rearrangements has also been demonstrated in tumors, with the consequences of such rearrangements often being local, in that CN changes of a gene will alter the DNA methylation or gene expression of genomic regions close by [17].
What remains to be considered is the potential contribution of somatic alterations within the epigenetic machinery and their relevance to PCa health disparities. Notably, genetic ancestry, specifically African ancestry, is a significant risk factor for aggressive PCa. Within the United States, African American men are 1.7 times more likely to be diagnosed with, and over twice as likely to die from, PCa than European ancestral American men, reaching 3.1-fold for men younger than 65 years at diagnosis [18]. Globally, mortality rates are 2.7-fold greater for men from Sub-Saharan Africa [19]. While both genetic (common and rare variants) [20, 21] and non-genetic (socioeconomic and cultural) contributing factors have been proposed [22], studies focused within populations from Sub-Saharan Africa have been scarce [23, 24]. Conversely, ancestral differences in epigenetic aberrations have been observed for PCa, including genome-wide aberrant methylation patterns [25, 26, 27, 28]. Gene-specific examples include hypermethylation of CD44 in prostate tumors derived from African-American compared to European ancestral American men, which was positively correlated with tumor grade [29], and hypermethylation of RARB, which was significantly associated with a higher risk of PCa in African-American over European ancestral American men [30]. While genomic aberrations in epigenetic machinery components have been studied previously for PCa [7, 8, 11], their contribution to ancestral differences associated with health outcomes is yet to be investigated, specifically within the context of Sub-Saharan Africa. Using a unique resource of prostate tumor genomic data derived from 113 men of Southern African and 53 men of European Australian ancestry [31], we set out to decipher whether genomic aberrations in epigenetic machinery components could, at least in part, explain the ancestral disparity observed for PCa.
4. Discussion
Overall, compared with European derived tumors, African derived prostate tumors presented with a higher burden of variants and potentially damaging variants across epigenetic machinery genes. Although our findings were in line with our previous work, demonstrating a whole-genome African-elevated TMB [50], the epigenetic burden within Africans was not significantly higher than that of Europeans. When considering all epigenetic machinery genes, the African derived tumors demonstrated a higher overall mutational frequency than the European derived tumors, although the difference was not significant. In contrast to a recent genome-wide study which found ~20% of prostate tumors harbored driver mutations across 12 epigenetic machinery genes [8], here we report frequencies of 52.3% for African and 50.0% for European derived tumors, which may be explained by the broader inclusion of epigenetic regulators in our study (656 versus only 12 genes). Irrespective of patient ancestry, we found KMT2C to be the most frequently mutated PCa epigenetic regulator gene, concurring with previous studies reporting frequencies of ~5%-8% [7, 8]. The type 2 histone lysine methyltransferase KMT2C is one component of the chromatin remodeling machinery responsible for DNA promoter and enhancer regulation, ultimately promoting active chromatin conformations. Mutations in these components are strongly linked to numerous cancer types, consistent with their roles as tumor suppressors [51]. Irrespective of potentially damaging (Table 1) and recurrent (Table 3) driver gene classification, as was observed for the whole genome, African derived tumors showed a longer tail of African-specific epigenetic gene candidates. Besides KMT2C, the only recurrent driver genes to be shared between the ancestries are the well-known tumor suppressor genes KDM6A and TP53. In contrast to a lack of European-specific recurrent drivers, 35 African-specific recurrent driver genes were observed. The latter included putative loss-of-function PCa mutations previously reported for ARID1A, ATRX, CHD1, CHD3, HDAC4, KMT2A, KMT2D, SETD2 and SMARCA1 [7], with BRMS1, CARM1, EHMT2, GLI3, HDAC1, KDM6B, PRDM16, RBBP5 and REST having known roles in PCa [52, 53, 54, 55, 56, 57, 58, 59, 60]. ARID1B, ARID5B, BRWD1, EP300, HDAC3, NCOR2, PSIP1, SMARCA4, STAG2 and XPO1, although reported by PCAWG [43], are new to PCa, leaving CHD7, DPF3, ELP2, GATAD2B, NUP35, SETD1B and TADA2B as novel candidate drivers.
Taking a closer look at the epigenetic processes, in contrast to our whole genome data and EPGs 1, 2, 4 and 5 with an African ancestry-elevated burden, we consistently showed EPG3 alterations to be similar between the ancestries. Overall, this group of DNA methylation gene regulators appears to be highly conserved, as previously reported [61], with no recurrent drivers (Table 3) and potentially damaging variants in only two genes, DNMT3B and TDG (Table 1). DNMT3B is a DNA methyltransferase (DNMT) enzyme responsible for establishing and maintaining methylation at satellite sequences and gene bodies [62, 63]. DNMT polymorphisms have been associated with PCa progression through promoter methylation that downregulates tumor suppressor genes [64], and DNMT3B expression is elevated in aggressive versus non-aggressive PCa cell lines [65]. Similarly, a damaging variant in DNMT1 was observed in an African sample. TDG, or Thymine DNA Glycosylase, plays a key role in active DNA demethylation. TDG acts as a tumor suppressor, and several polymorphisms in TDG are associated with increased risk for cancer, although this gene has also been found to act as an oncogene, promoting tumorigenesis [66, 67, 68]. It remains to be determined whether the TDG damaging variants identified in our study have gain-of-function or loss-of-function properties. Ultimately, aberrant DNA methylation is a hallmark of cancer progression, and dysregulation of the DNA methylation machinery may lead to a reprogramming of the epigenomic landscape in cancer.
Using hierarchical consensus clustering for all somatic mutational types (small variants, SVs and CNAs), we describe two epigenetic PCa taxonomies (ECS and EcnCS), which independently showed significant agreement with our previously reported GMSs [31]. Showing extensive overlap amongst Europeans, both ECS3 and GMS-C tumors predicted poorer clinical outcome over ECS1 and GMS-A tumors, respectively, demonstrating the bias of each GMS to an ECS. As such, our identified ECSs validate the whole genome-derived GMSs and can, to a degree, distinguish those global subtypes based on just a subset of the genome, indicating a significant role for epigenetic mechanisms in PCa development. While numbers for recorded PCa-associated death are arguably small (7/50 Australians), it is notable that all three patients with Australian European derived ECS2 tumors succumbed to PCa, i.e., 42.9% of PCa deaths were associated with ECS2 tumors. Furthermore, as ECS2 is otherwise characterized by African predominance, specifically ISUP group grading > 3 PCa (78% of African derived ECS2 tumors versus 73% of all African derived tumors), our data suggest that ECS2 is a predictor of poor outcome.
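Agreement between two subtype assignments for the same tumors, such as ECS and GMS labels, can be quantified in several ways. The following is a minimal sketch that assumes an adjusted Rand index; the text above does not state which agreement statistic was used, and the label vectors are hypothetical toy values rather than study data.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same samples."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same samples")
    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: nothing to disagree on
    # Count sample pairs that fall together in each contingency cell
    # and in each cluster of either labeling.
    cells = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / total_pairs
    maximum = (sum_a + sum_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. one cluster in both labelings
    return (sum_cells - expected) / (maximum - expected)


# Hypothetical toy labels for six tumors (illustration only, not study data).
ecs = ["ECS1", "ECS1", "ECS2", "ECS2", "ECS3", "ECS3"]
gms = ["GMS-A", "GMS-A", "GMS-B", "GMS-B", "GMS-C", "GMS-C"]
print(adjusted_rand_index(ecs, gms))
```

The index is 1 when the two partitions coincide regardless of cluster names, and close to 0 (or negative) when agreement is no better than chance, which is why it suits comparisons between taxonomies with different naming schemes.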
More aligned with the whole genome-derived GMSs [31], the EcnCSs showed ancestral distinction including an African-specific subtype (EcnCS2), a European-predominant subtype (EcnCS3) and a shared subtype (EcnCS1). Notably, EcnCS2, defined by significant CN gain, further defines ECS2 and incorporates almost all of the African-specific GMS-B tumors (95.2%, 20/21). EcnCS3 further distinguished ECS3 and the poor outcome-associated GMS-C, as a singular cluster defined by epigenetic gene CN loss. Of the GMS-C tumors, EcnCS3 presented with a higher predominance of ISUP group grading 5 PCa (81.8%) over EcnCS1 (60.0%), with EcnCS3 predicting poorer outcome for BCR than EcnCS1, indicating a more aggressive presentation for EcnCS3-GMS-C tumors. These findings suggest that epigenetic CNAs alone have the potential to predict patient outcomes in our study. The relationship between CNAs and DNA methylation in cancer has been examined previously [17], although not at length. However, it is generally understood that a gene’s CNAs affect DNA methylation of genomic regions nearby. These two processes may be negatively associated (i.e., DNA methylation decreases with copy number gain and vice versa), in which case the effect is localized to CpG islands; or they may be positively associated, in which case the open sea (genomic region beyond 4 kb from a CpG island border) is affected. Either way, it has been suggested that genome-wide DNA methylation changes in response to CNA events are likely initiated and maintained by some “generic” machinery. This is supported by the finding of Sun et al. (2018) [17] that CNA events and their association with altered DNA methylation are similar across cancer types. Another observation common to several cancer types is ancestral differences in DNA methylation patterns. This has been observed in PCa, in which African-American tumors display a higher prevalence of DNA hypermethylation at disease-related loci compared to European-American tumors [26]. Therefore, each of the epigenetic (copy number) cancer subtypes, with its distinct CNA events, likely gives rise to distinct aberrant DNA methylation patterns. Whether those DNA methylation patterns would cluster in agreement with the CN patterns is yet to be determined. Inclusion of patient-matched DNA methylation data could help to resolve this, bearing in mind that aberrant DNA methylation does not only arise in response to CN gain/loss events.
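The two CN-methylation relationships described above can be illustrated with a short sketch. It classifies a site by its distance from the nearest CpG island border and reports the sign of the correlation between copy number and methylation. Only the 4 kb open-sea boundary comes from the text; the 2 kb shore/shelf split, the function names and the toy values are assumptions made for illustration.

```python
from math import sqrt


def cpg_context(distance_bp):
    """Classify a site by distance (bp) to the nearest CpG island border.

    A distance of 0 means the site lies inside an island. The 4 kb open-sea
    boundary follows the text; the 2 kb shore/shelf split is an assumption
    borrowed from common methylation array annotation.
    """
    if distance_bp < 0:
        raise ValueError("distance must be non-negative")
    if distance_bp == 0:
        return "island"
    if distance_bp <= 2000:
        return "shore"
    if distance_bp <= 4000:
        return "shelf"
    return "open_sea"


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def association(copy_number, methylation):
    """Return the sign of the CN-methylation correlation as a label."""
    r = pearson(copy_number, methylation)
    if r < 0:
        return "negative"
    if r > 0:
        return "positive"
    return "none"


# Hypothetical toy values for one locus across four tumors (not study data):
# methylation falls as copy number rises, i.e. a negative association.
copy_number = [1, 2, 3, 4]
methylation = [0.8, 0.6, 0.5, 0.2]
print(cpg_context(0), association(copy_number, methylation))
```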
As a function of hierarchical clustering, feature selection identified the top five genes for each variant type for potential driver gene classification. In rank order based on posterior probability, the top five features for small somatic variant data were RAI1, SETD1B, SRCAP, ARID1A and MED26; for SV data, SMYD4, GATAD2B, PPARG, MEF2D and SMARCAD1; and for CNAs, HMGA2, SMYD5, SUMO3, SP110 and RAG2. Of the top five selected features for SV and CNA data, and of the 30 small somatic mutation-identified drivers (Table S14), the African-specific genes that appear new to PCa and are absent from PCAWG include SP110, GATAD2B, RAI1, MED26, BRD1, POLR1B, VPS72, ELP5, UBTF, MXD1, DR1 and ELL. Formerly considered to be a transcriptional regulator of circadian clock components in neuronal tissue, RAI1 was recently found to act as a tumor suppressor in esophageal cancer; prior to this finding, the functional role of RAI1 in tumors was unknown [69]. SETD1B is an essential component of a histone methyltransferase complex and is believed to have essential, even housekeeping, functions within cells [70]. Although it has no clear role in malignancy, it has reportedly been mutated in gastric and colorectal cancers [71]. MED26, a member of the Mediator complex (MED) gene family, has been implicated in several cancer types, but not PCa [72, 73]. Additionally, a number of the epigenetic regulators specific to African tumors have been identified as potential therapeutic targets. Chromatin remodeler CHD7, a somatic driver candidate in colorectal cancer (CRC), promotes CRC cell growth by binding target gene promoters, encouraging an open chromatin conformation and subsequent transcription, whereas CHD7 knockdown inhibits CRC cell growth [74]. Similarly, POLR1B knockdown induces lung cancer cell apoptosis [75], VPS72 knockdown inhibits the proliferation, invasion and migration of hepatocellular carcinoma (HCC) cells [76], and UBTF silencing suppresses melanoma cell proliferation [77]. DPF3, a chromatin remodeling cofactor, is significantly downregulated in breast cancer tissue; as its downregulation promotes the proliferation of breast cancer cells, it has been suggested as a novel therapeutic target for breast cancer therapy [78]. SETD1B expression in HCC correlated positively with tumor size, clinical stage and liver cirrhosis, whereas decreased SETD1B expression was associated with increased patient survival times, identifying this histone methyltransferase as a potential therapeutic target in HCC [79]. While several of the African-specific drivers show clinical relevance, the remaining genes are not well-studied as therapeutic targets in cancer [80, 81, 82, 83].
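The rank-ordered selection described above (the top five features per variant type, ordered by posterior probability) amounts to a sort followed by a cut-off. A minimal sketch, assuming the posterior probabilities are available as a name-to-score mapping; the feature names and scores below are hypothetical placeholders, not values from this study.

```python
def top_features(posterior, k=5):
    """Return the k feature names with the highest posterior probability.

    Ties keep their original insertion order because Python's sort is stable.
    """
    ranked = sorted(posterior.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical placeholder scores for one variant type (illustration only).
posterior_small_variants = {
    "gene_a": 0.35,
    "gene_b": 0.92,
    "gene_c": 0.10,
    "gene_d": 0.78,
    "gene_e": 0.64,
    "gene_f": 0.51,
    "gene_g": 0.05,
}
print(top_features(posterior_small_variants))
```

Running the same function once per variant type (small variants, SVs, CNAs) would yield three separate top-five lists, mirroring how the lists are reported in the text.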
Ultimately, alterations to genes encoding epigenetic machinery components are increasingly recognized in many cancer types, including PCa. In this study, we have summarized, per EPG, the top genes in African and European derived tumors that may be instrumental in epigenetic dysregulation and the subsequent development and/or progression of PCa (Figure 4). These genes were selected on the basis of potentially damaging variants, as per functional impact prediction and/or recurrence, as well as putative driver gene status, as defined by feature selection during hierarchical clustering. Among the putative drivers identified, ARID1A, CHD4, HCFC1, STAG2, SMARCA4 and NCOR1 are known cancer driver genes [43]. Notably, there is extensive ancestral overlap amongst the top genes in all the EPGs. The assignment of numerous epigenetic machinery genes to more than one EPG is owed to the multifunctional nature of these genes. For example, CHD4, or Chromodomain Helicase DNA Binding Protein 4, is the main component of the nucleosome remodeling and deacetylase (NuRD) complex that plays an important role in epigenetic transcriptional repression. CHD4/NuRD also regulates RNA synthesis [84]. As such, the multifunctionality of CHD4 warrants its inclusion in EPGs 1, 2, 4 and 5. Rather than the top genes being epigenetic machinery components exclusive to a single EPG, this broad overlap is reminiscent of the previously discussed CN-DNA methylation aberration events common to many cancer types arising from some “generic machinery”. Indeed, epigenetic regulators are well-conserved and individually mutated infrequently. However, where epigenetic regulation is disrupted as a class, this is perhaps achieved through the genomic alteration of a common core group of multifunctional epigenetic regulators, promoting tumorigenesis. Many of our top-identified genes have well-established roles across cancer types, further supporting the representation of these altered genes as a “generic machinery” promoting cancer. Yegnasubramanian describes alterations in epigenetic reprogramming to be almost universal in human cancers [7]. However, it is clear that African derived tumors present with many more (ancestry-specific) possible cancer drivers than do European derived tumors, highlighting the diversity by which epigenetic dysregulation and consequent tumorigenesis may arise in Africans.