1. Introduction
Although there have been genomic profiling analyses of primary prostate cancers by The Cancer Genome Atlas [
1] and other genomic studies of metastatic castration-resistant prostate cancers (mCRPC) [
2,
3,
4,
5,
6], analyses of primary tumors and metastases from the same patients are relatively understudied. Most studies have used only primary tumors [
1], only metastatic tumors [
4,
6], or metastases and primary tumors from different patients [
3], but only a very few have analyzed metastases and primary tumors from the same patients [
7]. However, this study used targeted DNA sequencing and low-pass Whole Genome Sequencing for copy number alterations, but did not perform whole exome sequencing (WES) or RNAseq analyses [
7]. No studies have performed integrative WES and RNAseq using multiple sites from the same patients. Additionally, few studies have analyzed lymph node metastases (LNM) [
8], since most studies have focused on bone or visceral metastases [
9]. Analysis and understanding of LNM is critical because LNM represents one of the first steps in the systemwide metastatic process, and patients with positive lymph nodes at radical prostatectomy experience poor cancer specific survival rates [
10]. Furthermore, analyses of multiple independent tumor foci from the same patients together with corresponding LNM are lacking. Moreover, while studies have examined African-American (AA) men at the RNA [
11] or DNA levels [
12,
13], integrative genomic profiling of RNA and DNA has not been performed in highly diverse cohorts with high proportions of AA men. In this study, we address this gap in our knowledge of prostate cancer biology through integrative RNAseq and WES approaches to identify potential key driver mutations and important genomic expression changes involved in prostate cancer metastasis.
Prostate cancer is characterized by intratumoral heterogeneity on multiple levels [
14]. Primary prostate cancer is known to be multifocal, which underlies the rationale for the Gleason scoring system [
15,
16,
17,
18]. Previous studies have found that over 70% of patients have multifocal disease representing multiple tumor grades [
16]. Furthermore, different primary tumor foci can be composed of genetically distinct clones, suggesting independent carcinogenesis events within the prostate gland [
19,
20,
21,
22,
23]. Intratumoral heterogeneity has been identified even within single primary tumor foci in the prostate [
24], suggesting parallel evolution of independent clones with branching and divergent paths.
Intratumoral heterogeneity is a poorly understood phenomenon that is critically important for understanding tumor progression and the development of drug resistance, as different sub-clones can respond differently to microenvironmental changes and selection pressure from therapies. Recent studies have indicated that independent foci and biopsies can have markedly different performance in commercially available biomarker panels [
25]. Thus, understanding intratumoral heterogeneity is essential for biomarker development and validation, prognosis, and therapeutic decision-making for precision medicine.
Here we have performed WES of 137 samples and RNAseq analysis of 165 samples from 43 patients enrolled in a clinical trial [
26] that included both radical prostatectomy and extensive dissection of all pelvic lymph nodes (LNs). We present evidence that while there are some cases of clonally independent primary tumor foci, 87% of primary tumor foci are descended from a common ancestor, and that while metastases are heterogeneous and contain multiple subclones, they are genetically related to 87% of primary tumor foci within the same patients. We demonstrate that genes related to oxidative phosphorylation are transcriptionally upregulated in LNM, and that these genes are also upregulated in African-American patients relative to White patients. We further show that mutations in
TP53, FLT4, EYA1, CSMD3, and
PCDH15 are enriched in prostate cancer metastases. Moreover, we validate our findings in a meta-analysis of publicly available datasets from twelve different studies representing 3929 primary tumors and 2721 metastases. These findings have implications for our understanding of the molecular events required for prostate metastasis.
3. Results
To investigate the molecular changes, mutations, and signaling pathways that characterize the initial steps of prostate cancer metastasis to the lymph nodes, we analyzed 19 discrete LNM, 97 primary tumor foci, 39 benign lymph nodes, and 10 normal adjacent prostate tissue samples from 43 patients enrolled in a clinical trial (NCT01808222) of extended lymphadenectomy and metastasis excision guided by [18F]-fluciclovine PET imaging for high risk, newly diagnosed prostate cancer conducted at our institution [
26] (
Supplementary Table S1). These patients fell within high or very high-risk groups (T3a, Gleason score 8-10, or PSA greater than 20 ng/ml). Of these patients, 16 had LN metastases at least 4 mm in diameter, and 20 patients had no metastases. For metastatic patients, we performed RNA and Whole Exome Sequencing (WES) on three primary tumor foci, one benign LN, and all LN metastases greater than 4mm in diameter (see Methods). For non-metastatic patients, we sequenced two primary tumor foci and one benign LN. In total, RNAseq was performed on 165 tissue samples from 43 patients (
Supplementary Tables S2 & S3), and WES was performed on 137 samples from 35 patients (
Supplementary Table S4). A summary of all sequenced samples is provided in
Table 1.
3.1. RNA-seq Analysis Identifies Oxidative Phosphorylation Is Associated with LNM
To understand the signaling pathways essential for the first steps of metastasis, we performed RNAseq analysis. A median of 56 million paired-end reads were obtained per sample, and 44,289 transcripts were detected in at least 5% of samples. Samples were classified by several criteria including tissue site of origin (prostate vs lymph node), malignant vs. benign, primary tumor vs. metastasis, and deriving from a patient with metastases vs. a patient without metastases (met vs. non-met).
In the first analysis, we identified a set of 381 genes whose expression that trended consistently upward, significantly increasing from normal prostate tissue to primary tumor foci and from primary tumor foci to LNM, as well as a set of 323 genes that trended consistently downward from normal tissue to primary tumor foci to LNM (Figure 1A and Supplementary Table S7). The upregulated gene set included oncogenes such as PIK3CB, FGFRL1, and NCOA2; prostate cancer related genes such as SPON2, PCAT4, and SCHLAP1; and cell cycle genes such as MKI67, MCM4, KIF4A, CENPF, and FLNB. Downregulated genes included tumor suppressors such as PTEN and TGFBR3; adhesion and junction proteins such as SDC3, ITGA9, ITGA7, RSU1, and TGFBI; and apoptotic genes such as TNFRSF10B (DR5/TRAILR2) and RAPGEF3 (EPAC1). These consistent changes in gene expression across sample types suggests that they play key roles in metastasis to the lymph nodes.
Next, we compared 39 benign lymph nodes (BLN) to 19 LNM and identified 9396 transcripts differentially expressed (p-adj < 0.01) between these two sets of samples (Supplementary Table S5). As expected, the most differentially expressed genes included NKX3-1, KLK3 (PSA), KLK2, KLK4, TMPRSS2, AR, AMACR, FOLH1 (PSMA) and PCA3. Also of interest were several genes associated with prostate cancer metastases including FOXA1, EZH2, SCHLAP1, CDH1, SOX4, SOX9, HOXB13, and TGFBR3. Gene set enrichment analysis (GSEA) identified several pathways significantly associated with LNM including the Hallmark 50 pathways ANDROGEN_RESPONSE, MYC_TARGETS_V1, OXIDATIVE_PHOSPHORYLATION, MTORC1_SIGNALING, and MYC_TARGETS_V2 which all had an FDR < 2.2E-16 (Table 2 and Figure 1B). Altered expression of the oxidative phosphorylation pathway was also observed to be differentially expressed in AA men and in metastatic clones (see below).
Comparing primary tumor foci to LNM, 6203 transcripts were differentially expressed (p-adj < 0.01) (
Supplementary Table S6). Primary tumor foci were enriched relative to LNM in gene sets associated with Epithelial to Mesenchymal Transition (EMT), estrogen signaling, TGF beta signaling, Notch signaling, and the response to hypoxia. LNM were enriched in cell cycle progression pathways and immune function pathways such as interferon (IFN) gamma and IFN alpha responses (
Supplementary Figure S1).
Table 2.
GSEA Analysis of BLN vs. LNM.
Table 2.
GSEA Analysis of BLN vs. LNM.
Gene Set |
Size |
Leading Edge Number |
NES |
FDR |
HALLMARK_ANDROGEN_RESPONSE |
101 |
47 |
-2.3952 |
<2.2e-16 |
HALLMARK_MYC_TARGETS_V1 |
200 |
132 |
-2.2692 |
<2.2e-16 |
HALLMARK_OXIDATIVE_PHOSPHORYLATION |
199 |
117 |
-2.1852 |
<2.2e-16 |
HALLMARK_MTORC1_SIGNALING |
200 |
120 |
-2.0443 |
<2.2e-16 |
HALLMARK_MYC_TARGETS_V2 |
58 |
35 |
-1.9282 |
<2.2e-16 |
HALLMARK_FATTY_ACID_METABOLISM |
157 |
70 |
-1.8465 |
0.000112 |
HALLMARK_GLYCOLYSIS |
199 |
93 |
-1.8362 |
0.000096004 |
HALLMARK_UNFOLDED_PROTEIN_RESPONSE |
113 |
55 |
-1.8126 |
0.00033601 |
HALLMARK_PEROXISOME |
104 |
54 |
-1.806 |
0.00037335 |
HALLMARK_E2F_TARGETS |
200 |
103 |
-1.7584 |
0.00039202 |
HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION |
200 |
73 |
1.6463 |
0.0029856 |
HALLMARK_MYOGENESIS |
197 |
69 |
1.7325 |
0.002488 |
HALLMARK_KRAS_SIGNALING_UP |
198 |
79 |
1.7459 |
0.0023325 |
HALLMARK_TNFA_SIGNALING_VIA_NFKB |
200 |
88 |
1.8804 |
0.00053315 |
HALLMARK_INFLAMMATORY_RESPONSE |
199 |
86 |
1.9837 |
<2.2e-16 |
HALLMARK_INTERFERON_ALPHA_RESPONSE |
97 |
56 |
2.0873 |
<2.2e-16 |
HALLMARK_COMPLEMENT |
200 |
101 |
2.269 |
<2.2e-16 |
HALLMARK_IL6_JAK_STAT3_SIGNALING |
83 |
52 |
2.3259 |
<2.2e-16 |
HALLMARK_ALLOGRAFT_REJECTION |
199 |
116 |
2.4589 |
<2.2e-16 |
HALLMARK_INTERFERON_GAMMA_RESPONSE |
200 |
123 |
2.608 |
<2.2e-16 |
Figure 1.
RNAseq analysis of 165 prostate cancer patient samples identifies oxidative phosphorylation as differentially expressed between AA and White patients. A) Heat map of 704 genes that trend consistently upward or downward from normal to primary tumor foci to LNM. B) GSEA plot of the oxidative phosphorylation gene set from comparison of LNM to benign lymph nodes. C) GSEA plot of the oxidative phosphorylation gene set from analysis of African-American vs. White patients with Gleason score of 7. D) OXPHOS score increases in from normal prostate to primary tumor to LNM. E) FGFR3 is highly correlated with OXPHOS gene IDH1 F) FGFR3 is expressed at higher levels in AA PCa than EA PCa in our cohort. G) FGFR3 mRNA is significantly higher in patients with >50% African ancestry in the TCGA PRAD dataset. H) Ecotyper analysis of 131 samples assigned to one of ten unique carcinoma ecotypes (CE).
Figure 1.
RNAseq analysis of 165 prostate cancer patient samples identifies oxidative phosphorylation as differentially expressed between AA and White patients. A) Heat map of 704 genes that trend consistently upward or downward from normal to primary tumor foci to LNM. B) GSEA plot of the oxidative phosphorylation gene set from comparison of LNM to benign lymph nodes. C) GSEA plot of the oxidative phosphorylation gene set from analysis of African-American vs. White patients with Gleason score of 7. D) OXPHOS score increases in from normal prostate to primary tumor to LNM. E) FGFR3 is highly correlated with OXPHOS gene IDH1 F) FGFR3 is expressed at higher levels in AA PCa than EA PCa in our cohort. G) FGFR3 mRNA is significantly higher in patients with >50% African ancestry in the TCGA PRAD dataset. H) Ecotyper analysis of 131 samples assigned to one of ten unique carcinoma ecotypes (CE).
3.2. Oxidative Phosphorylation, Immune Signaling, and Hedgehog Pathways Are Differentially Expressed in AA Patients
The racial diversity of our cohort enabled us to identify differences in gene expression between (self-identified) African-American and White patients. Of the 165 samples analyzed by RNAseq, 30 primary tumor foci samples came from 14 African-American patients and 53 primary tumor foci samples came from 23 White patients. There were only three LNM from three African-American patients and 14 LNM from 12 White patients. To optimize sample power, we chose to focus on primary tumor foci to examine differences in gene expression based on race. We determined that the average Gleason score from the radical prostatectomies was higher in the White patients than in the African-American patients (8.61 vs 7.93, p=0.03), and therefore, when computing differences in gene expression between primary tumor foci from African-American and White patients, we performed subset analyses on patients with Gleason score 7 (16 samples from 7 African-American patients, 7 samples from 4 White patients) and Gleason score 9 (12 samples from 6 African-American patients, 43 samples from 18 White patients). We performed GSEA analysis on these differences and determined that for Gleason score 7 patients, primary tumor foci from African-Americans were enriched in gene sets relating to IFN alpha response, oxidative phosphorylation, and adipocyte development, among others (
Figure 1C). For Gleason score 9 patients, primary tumor foci from African-Americans were enriched in gene sets relating to Hedgehog signaling, muscle differentiation, and EMT. For both sets of analyses, primary tumor foci from African-Americans were enriched in gene sets associated with UV response, early estrogen response, adherens and tight junctions, and mitotic spindle assembly (
Supplementary Tables S8-9). Differential expression of DNA damage response, hormonal response, adhesion, and proliferation pathways in AA men are consistent with the typically worse outcomes observed in these patients [
35].
Three of the most significantly upregulated OXPHOS genes included IDH1, IDH2, and FGFR3. These data are intriguing in light of a recent study on the role of the FGFR3-TACC3 gene fusion driver mutation that has been observed in 3% of human glioblastoma cases [
36]. This study found that the FGFR3-TACC3 fusion protein drove increased mitochondria, ATP production, and oxidative phosphorylation [
36]. To better quantify the OXPHOS pathway activation in our RNAseq datasets, we constructed an OXPHOS score based on the z-score normalized gene expression values of the 105 genes in the leading edge of the OXPHOS gene set analyses. We computed the OXPHOS score for all samples and compared the expression in normal, primary tumor, and LNM samples and found increased OXPHOS scores as tumors progressed (
Figure 1D). Interestingly, FGFR3 was highly correlated with OXPHOS gene expression, including expression of IDH1 (
Figure 1E, p < 2.2e-16) and IDH2 (p = 3.7e-14), which is critical for and modulates the assembly of the entire OXPHOS system [
37]. We noted that FGFR3 was expressed at higher levels in AA samples than in EA samples in our cohort (
Figure 1F). To investigate whether this observation was confirmed in an independent cohort, we also analyzed the association of FGFR3 mRNA expression with %African Ancestry data from The Cancer Genome Atlas (TCGA) Prostate Adenocarcinoma (PRAD) dataset [
1] using data available on cBioPortal [
38,
39] and determined that FGFR3 was one of the genes most strongly associated with %African (%AFR) ancestry (Spearman rho = 0.2, p = 1.479e-5). We also determined that FGFR3 expression was higher in patients with >50% AFR ancestry compared to patients with < 50% AFR ancestry (
Figure 1G, adj.p = 0.027).
3.3. Ecotyper Analysis Identifies Ecotypes with Activated Oxidative Phosphorylation and Androgen Signaling Are Associated with LNM
The Ecotyper tool [
40] is a machine learning framework for analysis of gene expression patterns from bulk tumor RNA to identify cellular subtypes and the state of the tumor microenvironment. We applied Ecotyper analysis to our RNAseq expression data using the ten carcinoma ecotypes (CE) that were generated using 5,946 tumors from 16 TCGA studies [
40]. The Ecotyper tool was able to assign a dominant carcinoma ecotype to 131 of the 165 RNAseq samples (
Figure 1H and Table S2). In general, the CE’s that were assigned to each sample were consistent with the tissues from which they were derived. For example, samples from the lymph node were primarily assigned to CE10, which is associated with IL-2, IL-6, and allograft rejection, and sometimes CE9, which is associated with IFN-alpha and IFN-gamma signaling. Additionally, normal prostate tissue samples were most often assigned to CE6, which is associated with normal tissues. LNM were also assigned to CE5, CE7, and CE8. CE5 is associated with spermatogenesis, MYC signaling, and DNA repair pathways, while CE7 is associated with Androgen, PI3K, MTORC, and OXPHOS pathways, and CE8 is associated with spermatogenesis and CCL16. Many of the primary tumor foci were associated with CE1 and CE2, which are associated with EMT, TGFB, Notch, and Hedgehog signaling in CE1 and hypoxia and proliferation in CE2. CE7 and CE8 were enriched in patients who had metastases. For CE7 samples, 73% of samples were from patients with metastases, while for CE8, 67% of samples were from patients with metastases. Association of CE7 with metastatic samples is consistent with previous studies demonstrating activation of androgen and PI3K/MTOR pathways in mCRPC [
3,
9], and our identification here of the oxidative phosphorylation pathway association with LNM.
3.4. WES Analysis Determines SPOP, FLT4, and PCDH15 Mutations Are Enriched in LNM
To identify mutations associated with LNM, we performed WES analysis. A total of 137 tissue samples from 35 patients were subjected to WES analysis at a median exome depth of 149 reads. GATK analysis pipelines including BWA alignment and Mutect2 variant calling using a panel of 34 normal samples were used to generate variant calls as previously described[
1] (see Methods). We identified 1927 significant somatic mutations in 91 patient samples (
Supplementary Table S10). The top mutated genes based on the number of samples across each site are shown (
Figure 2A). The top ten mutated genes included SPOP, EYA1, NCOR2, CACNA1E, SOGA1, CSMD3, TP53, PCDH15, LRRC4C, and FOXP1 (
Figure 2B). We observed four different missense mutations in SPOP from five patients, all of which were located in the MATH domain, which is involved in receptor binding and oligomerization (
Figure 2C). Two patients (PCM021 and PCM033) had the same SPOP-F133V mutation. Furthermore, we identified specific mutations that were enriched in metastatic samples using maftools and determined that genes preferentially mutated in metastases included SOGA1, LRRC4C, TP53, COL5A1, PCDHA13, and SLC16A14 (
Figure 2D). Mutations were most frequent in RTK/RAS, Wnt, PI3K, Hippo, and Notch pathways (
Figure 2E).
Figure 2.
WES analysis of 137 prostate cancer patient samples identifies recurrent mutations associated with metastasis. A) Graphical depiction of the top ten most frequently mutated genes in this study. Each column represents an individual tissue sample. Percentages on the right indicate the percentage of samples with mutations in that gene. Tumor mutational burden (TMB) for each sample is plotted along the top. B) Lollipop plot of mutations detected in the SPOP gene. C) Forest plot of mutations preferentially detected in patients with metastases. D) Signaling pathways most highly impacted by the detected mutations.
Figure 2.
WES analysis of 137 prostate cancer patient samples identifies recurrent mutations associated with metastasis. A) Graphical depiction of the top ten most frequently mutated genes in this study. Each column represents an individual tissue sample. Percentages on the right indicate the percentage of samples with mutations in that gene. Tumor mutational burden (TMB) for each sample is plotted along the top. B) Lollipop plot of mutations detected in the SPOP gene. C) Forest plot of mutations preferentially detected in patients with metastases. D) Signaling pathways most highly impacted by the detected mutations.
To examine tumor heterogeneity, we performed clustering based on mutation allele frequencies. Mutational allele frequency clustering identified some cases of clonally independent primary tumor foci. While some patients (e.g. PCM034 and PCM003) had primary tumor foci that appear to be clonally independent from the metastases and other primary tumor foci (
Figure 3A), most primary tumor foci (e.g. PCM037 and PCM050) were clonally related to the metastases and derived from a common ancestral clone (
Figure 3A). We also identified mutations that were significantly different in allele frequency between LNM and all primary tumor foci from the same patients. We found that mutations in FLT4, LRRC4C, PCDH15, FAM47B, PIK3CD, MAP3K15, PIK3R6, and SPOP, among others, were significantly enriched in LNM relative to primary tumor foci, suggesting natural selection for these somatic mutations in metastatic clones (
Figure 3B). Of those genes, FLT4, SPOP, and PCDH15 were mutated in multiple patients. Association of FLT4 and PCDH15 with prostate cancer metastases has not been previously reported.
Figure 3.
Analysis of metastatic and non-metastatic clones identifies oxidative phosphorylation as associated with metastasis. A) Heatmaps of all identified mutations clustered based on allele frequencies. LNM are clonally related to most primary tumor foci. Some patients (PCM034 and PCM003) contained primary tumor foci that were not clonally related to others, although most patients did not. B) Heatmaps of genes with significantly increased allele frequencies in LNM relative to primary tumor foci within the same patients indicating potential clonal selection. C) Venn diagram of sets of significantly differentially expressed genes comparing normal prostate to metastatic primary tumor foci, normal prostate to non-metastatic primary tumor foci, and metastatic primary tumor foci to non-metastatic primary tumor foci. D) Volcano plot of over-representation analysis of the 810 genes in the center of the Venn Diagram in panel C. Oxidative phosphorylation was significantly overrepresented in these genes.
Figure 3.
Analysis of metastatic and non-metastatic clones identifies oxidative phosphorylation as associated with metastasis. A) Heatmaps of all identified mutations clustered based on allele frequencies. LNM are clonally related to most primary tumor foci. Some patients (PCM034 and PCM003) contained primary tumor foci that were not clonally related to others, although most patients did not. B) Heatmaps of genes with significantly increased allele frequencies in LNM relative to primary tumor foci within the same patients indicating potential clonal selection. C) Venn diagram of sets of significantly differentially expressed genes comparing normal prostate to metastatic primary tumor foci, normal prostate to non-metastatic primary tumor foci, and metastatic primary tumor foci to non-metastatic primary tumor foci. D) Volcano plot of over-representation analysis of the 810 genes in the center of the Venn Diagram in panel C. Oxidative phosphorylation was significantly overrepresented in these genes.
3.5. Integrative DNA/RNA Analysis Determines the Oxidative Phosphorylation Pathway, FLT4 Mutations, and PCDH15 Mutations Are Associated with Metastasis
To better understand molecular changes associated with metastasis, we performed integrated WES and RNAseq analyses. Based on shared genetic alterations of primary tumor foci to LNM, we categorized samples as either metastatic clones or non-metastatic clones (Supplementary Table S2) by examining the number of shared mutations and estimating if they were genetically related as being descended from a common ancestor containing the same somatic mutations. We performed DESeq2 analysis comparing these two groups to control for the fact that primary tumor foci and LNM are derived from different tissue microenvironments and different organ sites. We compared primary tumor foci from metastatic clones (primary tumor foci.met) to primary tumor foci from non-metastatic clones (primary tumor foci.nonmet) and both sets of primary tumor foci to normal prostate tissue (NP). Primary tumor foci.met samples were categorized based on mutation allele frequency as compared to LNM to determine if they were clonally related. Primary tumor foci.nonmet samples were either derived from patients without metastases or from patients with metastases that were not clonally related to corresponding LNM. The number of differentially expressed genes in each comparison are shown in the Venn Diagram in Figure 3C. Over-representation analysis of the 810 genes at the intersection of the three comparisons (i.e. genes different between all three groups of normal prostate and non-metastatic primary tumor and metastatic primary tumor samples, Supplementary Table S11) determined that they are enriched in genes involved in oxidative phosphorylation (Figure 3D). These genes included multiple ATP synthase genes (ATP5MF, ATP6V0B, ATP6V1G1), multiple cytochrome C oxidase and reductase genes (COX17, COX6A1, COX6B1, UQCRH, UQCRQ), and multiple NADH:ubiquinone oxidoreductase subunits (NDUFA3, NDUFB7, NDUFS6, and NDUFS8).
To further uncover gene expression differences associated with metastatic clones, we compared the metastatic primary tumor clones to the set of normal prostate and non-metastatic primary tumor clones. This analysis identified 1962 significantly different genes (p.adj < 0.05). GSEA analysis of these genes identified six gene sets associated with metabolism, including oxidative phosphorylation (hsa00190, FDR < 2.2e-16), seven gene sets associated DNA repair, twelve gene sets associated with cell cycle, seven gene sets associated with immune responses, and three gene sets associated with protein trafficking (Supplementary Table S12).
In addition, we analyzed associations of mutant alleles to Ecotyper carcinoma ecotypes. We restricted our analysis to mutations that were observed in at least two different patients and computed adjusted p-values using Bonferroni correction. Mutations in FLT4 were significantly associated with assignment of samples to CE7 (adj-p = 1.1E-4) and CE8 (adj-p = 1.5E-3), while mutations in PCDH15 were also significantly associated with the CE7 (adj-p = 7.8E-7) and CE8 (adj-p = 1.7E-3) ecotypes that included many LNM samples.
3.6. Tumor Heterogeneity Analysis Identifies Subclones Suggesting Gain of chr8 Prior to chr8p Loss
To examine subclones within each sample, we performed HATCHet analysis. The HATCHet software tool can compute tumor heterogeneity [
34], copy number aberrations (CNAs), and whole-genome duplications (WGDs) to investigate tumor evolution and metastatic seeding patterns. We applied HATCHet analysis to our WES dataset and observed that most samples had subclonal populations (
Figure 4). Patient PCM034 which had three LNM was of particular interest and had three subclones. Clone 1 was enriched in all three LNM and was characterized by gain of chromosomal arm 8q (containing c-MYC) and loss of chromosomal arm 8p (containing NKX3.1). The other subclones exhibited gain of chr8 but no loss of 8p, suggesting that gain of chr8 may have preceded loss of 8p during tumor evolution in this patient.
Figure 4.
HATCHet analysis of tumor heterogeneity for patient PCM034 identifies subclones with differential gains in chr8. Gains and losses for each chromosome are indicated as red or blue. Abundance of subclones is indicated with lighter colors showing less abundance and darker colors showing greater abundance. Four subclones were identified, with clone 1 containing loss of chr8p and gain of chr8q most prevalent in the LNM samples. Primary tumor foci were enriched in the normal subclone.
Figure 4.
HATCHet analysis of tumor heterogeneity for patient PCM034 identifies subclones with differential gains in chr8. Gains and losses for each chromosome are indicated as red or blue. Abundance of subclones is indicated with lighter colors showing less abundance and darker colors showing greater abundance. Four subclones were identified, with clone 1 containing loss of chr8p and gain of chr8q most prevalent in the LNM samples. Primary tumor foci were enriched in the normal subclone.
3.7. Validation in External Datasets Demonstrates EYA1 and CSMD3 Mutations Are Associated with Poor Survival
To evaluate the generalizability of our findings, we analyzed the mutations that we identified that appeared to be more frequent in LNM than in primary tumor foci or were frequently mutated in our cohort. We examined data from twelve published studies [
2,
3,
6,
41,
42,
43,
44,
45,
46,
47,
48,
49] that are publicly available in cBioPortal [
38,
39] comprising 8902 samples. Of those 8902 samples, 6650 samples (
Supplementary Table S13) were definitively annotated as either primary tumors (n = 3929) or metastases (n =2721). We found that mutations were enriched in metastases over primary tumors for TP53 (q = 2E-64), EYA1 (q = 2E-9), CSMD3 (q = 6.1E-6), MAP3K15 (q = 3.3-5), PIK3CD (q = 2.2E-4), SLC16A14 (q = 1.4E-3), FLT4 (q = 1.6E-3), PCDH15 (q = 2.1E-3), CACNA1E (q = 1.5E-2), and NCOR2 (q = 1.9E-2) by Fisher exact test adjusted for false discovery (
Figure 5A). We found SPOP mutations were enriched in primary tumors (q = 3.1E-3), and surprisingly, LRRC4C were also enriched in primary tumors (q = 3.1E-5). The complete list of mutations from this meta-analysis enriched in metastases or primary tumors includes well established oncogenes and tumor suppressors such as AR, TP53, PTEN, RB1, APC, MYC, and CTNNB1 and is provided in
Supplementary Table S14. Mutational comparisons for EYA1 and CSMD3 are shown in
Figure 5B,C. To examine the clinical significance of these mutations, we compared the disease-specific survival (492 patients) in the TCGA PanCancer dataset [
46], disease-free survival (334 patients), and progression-free survival (494 patients) for samples with and without EYA1 (
Figure 5D-F) or CSMD3 (
Figure 5G-I) mutations (including SNV, CNV, and structural variations) and found that mutations in both these genes were significantly associated with poor outcome.
Figure 5.
Validation of mutations using external datasets analyzed in cBioPortal demonstrates EYA1 and CSMD3 mutations are associated with patient outcome. A) Enrichment of mutations in metastases (n = 2721) or primary tumors (n = 3929). Asterisk indicates significant q-value based on Fisher exact tests. B) Lollipop plot of EYA1 mutations in metastases (top) compared to primary tumors (bottom). C) Lollipop plot of CSMD3 mutations in metastases (top) compared to primary tumors (bottom). D) Kaplan-Meier plot of disease-specific survival comparing EYA1-mutant cases (n = 34) to EYA1 wild-type cases (n = 458). E) Kaplan-Meier plot of disease-free survival comparing EYA1-mutant TCGA cases (n = 18) to EYA1 wild-type TCGA cases (n = 316). F) Kaplan-Meier plot of progression-free survival comparing EYA1-mutant TCGA cases (n = 34) to EYA1 wild-type TCGA cases (n = 460). G) Kaplan-Meier plot of disease-specific survival comparing CSMD3-mutant TCGA cases (n = 63) to CSMD3 wild-type TCGA cases (n = 429). H) Kaplan-Meier plot of disease-free survival comparing CSMD3-mutant TCGA cases (n = 39) to CSMD3 wild-type TCGA cases (n = 295). I) Kaplan-Meier plot of progression-free survival comparing CSMD3-mutant TCGA cases (n = 63) to CSMD3 wild-type TCGA cases (n = 431).
Figure 5.
Validation of mutations using external datasets analyzed in cBioPortal demonstrates EYA1 and CSMD3 mutations are associated with patient outcome. A) Enrichment of mutations in metastases (n = 2721) or primary tumors (n = 3929). Asterisk indicates significant q-value based on Fisher exact tests. B) Lollipop plot of EYA1 mutations in metastases (top) compared to primary tumors (bottom). C) Lollipop plot of CSMD3 mutations in metastases (top) compared to primary tumors (bottom). D) Kaplan-Meier plot of disease-specific survival comparing EYA1-mutant cases (n = 34) to EYA1 wild-type cases (n = 458). E) Kaplan-Meier plot of disease-free survival comparing EYA1-mutant TCGA cases (n = 18) to EYA1 wild-type TCGA cases (n = 316). F) Kaplan-Meier plot of progression-free survival comparing EYA1-mutant TCGA cases (n = 34) to EYA1 wild-type TCGA cases (n = 460). G) Kaplan-Meier plot of disease-specific survival comparing CSMD3-mutant TCGA cases (n = 63) to CSMD3 wild-type TCGA cases (n = 429). H) Kaplan-Meier plot of disease-free survival comparing CSMD3-mutant TCGA cases (n = 39) to CSMD3 wild-type TCGA cases (n = 295). I) Kaplan-Meier plot of progression-free survival comparing CSMD3-mutant TCGA cases (n = 63) to CSMD3 wild-type TCGA cases (n = 431).
4. Discussion
Mortality from prostate cancer is due to metastases, and thus, understanding the mechanisms of metastasis is essential for improving patient outcomes. The heterogeneity of metastases is poorly understood, and there are multiple opposing models of prostate cancer metastasis. Some data using copy number analyses support a monoclonal model of metastasis, in which most metastatic lesions derive from a single clone despite the multiclonal nature of the primary tumor [
50]. Additional studies have shown limited genomic diversity in multiple metastases from the same men [
51]. However, conflicting studies support polyclonal seeding of metastases based on whole genome sequencing of 51 tumors from 10 patients [
52]. Some of the differences between these studies may be due to varying degrees of detail and granularity in the methods employed to molecularly characterize prostate cancer metastases. In addition, it is not clear if the genomic variability observed in metastases in some studies are due to mutations that occur at the metastatic site or if polyclonal populations from different primary foci seeded those metastases. Most studies that have analyzed intratumoral heterogeneity have examined a limited number of patients and used only the index lesion of the primary tumor.
Here, we aimed to address this gap in our understanding of heterogeneity in prostate cancer metastasis and to leverage unique resources of many patients from a clinical trial that includes both radical prostatectomy and extensive dissection of all pelvic LNs. Since metastasis to surrounding LNs is one of the first steps in metastatic spread, understanding intratumoral heterogeneity in pelvic LNs provides unique insights into the initial mechanisms and heterogeneity of prostate cancer metastasis. Additionally, analysis of multiple primary foci greatly increases the richness of our dataset and enables an integrated functional and clinical genomics approach to reveal genes driving aggressive metastatic prostate cancer.
Our data suggest that LNM contain multiple subclones that are already present in multiple related primary tumor foci. However, some subclones are more abundant in the LNM than in primary tumor foci, suggesting natural selection for these subclones and enrichment for specific mutations. We also show here that in addition to well-established oncogenes such as AR and TP53, mutations in lesser studied genes such as EYA1, CSMD3, FLT4, NCOR2, and PCDH15 are enriched in prostate cancer metastases relative to primary tumors. Additionally, mutations in FLT4 and PCDH15 are associated with the CE7 and CE8 gene expression ecotypes associated with Androgen, PI3K, MTORC, and OXPHOS pathways. Finally, using publicly available datasets, we have shown that mutations in EYA1 and CSMD3 are not only more frequent in metastases, but also are associated with worse disease-specific survival, progression-free survival, and disease-free survival.
The most frequent somatic mutation that we observed occurred in the Speckle-type POZ Protein (
SPOP) gene, which is essential for ubiquitination and subsequent proteasomal degradation [
53,
54]. Functional SPOP is necessary for the degradation of AR, and mutated SPOP fails to ubiquitinate AR, allowing for an increase of AR protein [
53,
54]. Consistent with previous findings, we observed recurrent mutations in the MATH binding domain of
SPOP, which mediates protein-protein interactions with AR. Recent studies have also identified important interactions between
SPOP and c-JUN [
55], in which mutated SPOP can bind to and stabilize c-JUN, potentially leading to accelerated cell proliferation. Nevertheless, SPOP mutations are more frequent in primary tumors than metastases, and patients with SPOP mutations have improved outcomes [
2].
Our findings are consistent with previous studies, and expand upon them further. While we did not observe driver mutations in KIF4A and WDR62 that have been observed in metastatic prostate cancer [
56], we did observe mutations in KIF5C, KIF7, KIF14, KIF18A, KIF20B, and KIF22, as well as in WDR17, WDR78, and WDR87, suggesting that other members of these families may also serve as cancer drivers. Consistent with earlier studies, we also observed mutations in FOXA1 [
2,
3,
5], NCOR2 [
9], and APC [
9].
The nuclear receptor corepressor 2 (
NCOR2) gene encodes a nuclear co-repressor (NCOR2) that mediates gene silencing. NCOR2 interacts with nuclear receptors, such as AR, to promote gene repression. Recently, it has been shown that reduced NCOR2 expression accelerates failure of androgen deprivation therapy in prostate cancer patients [
57]. Mutations in
NCOR2 could interfere with the ability of NCOR2 to regulate and maintain the epigenome and result in worse patient outcomes.
EYA transcriptional coactivator and phosphatase 1 (
EYA1) encodes a transcription factor on chr8 that interacts with the SIX1 transcription factor and has intrinsic phosphatase activity important for SIX1 activity during development [
58]. Mutations in
EYA1 may cause dysregulation and act as a tumor promoter with SIX1 via activation of STAT3 signaling in thyroid carcinomas [
59].
CUB and Sushi multiple domains 3 (
CSDM3) has been shown to be one of the ten most recurrently mutated genes in prostate cancer [
60] and SNPs in
CSMD3 are associated with risk of prostate cancer in Hispanic men [
61]. CSMD3 is also located on chr8 and
CSMD3 was identified as the second most frequently somatically mutated gene (next to TP53) in non-small cell lung cancer [
62].
CSMD3 mutation is highly correlated with increased tumor mutational burden and poor clinical prognosis in ovarian cancer [
63].
FMS-related receptor tyrosine kinase 4 (
FLT4, also known as VEGFR-3) is a receptor tyrosine kinase that binds to VEGF-C and VEGF-D. Mutations in
FLT4 have been associated with sentinel lymph node metastases in prostate cancer [
64], and FLT4 plays a critical role in prostate lymphangiogenesis [
65]. Our observation that FLT4 mutations are associated with carcinoma ecotypes is intriguing and future studies will examine a potential causative role for these mutations.
Protocadherin-related 15 (
PCDH15) is a member of the cadherin superfamily that mediates cell-cell adhesion. Somatic mutation of PCDH15 has been associated with metastasis in ocular adnexal sebaceous carcinoma [
66]. PCDH15 knockdown significantly increases ERK phosphorylation and activation, increasing proliferation of oligodendrocyte progenitor cells [
67]. Our observation that PCDH15 mutations are associated with specific carcinoma ecotypes is an avenue of future investigation.
Through our RNAseq analyses, we determined that genes that were differentially expressed between normal prostate, metastatic clones in the prostate, and non-metastatic clones in the prostate were enriched for oxidative phosphorylation. Interestingly, lower grade African-American samples with Gleason score 7 were also enriched for this gene set. Both low and high-grade African-American samples had increased expression of genes associated with adherens and tight junctions and mitotic spindle assembly, which could be related to the generally more aggressive phenotypes and worse outcomes associated with African-American patients [
68,
69].
There are a number of limitations to this study. First, while this study includes tissues from only 43 patients, of which only 16 had metastases, it does provide RNAseq analysis of 165 samples and WES analysis of 137 samples, with at least three and up to seven samples for each patient. Second, the samples used were macrodissected from FFPE blocks and thus were analyzed by bulk sequencing. Finally, while many of our observations were either consistent with previous literature or validated in large external datasets, functional validation of the observed mutations and gene expression changes are needed to better understand the mechanisms of metastasis and poor outcome for prostate cancer patients.
Author Contributions
Concept and Design: CSM, DMS, MGS, and AOO. Data Acquisition: CSM, MA, CLW, IOL, and OAAO. Data Analysis: CSM, ERK, DP, and BGB. Statistical Analysis: CSM, ERK, BGB, DP, and YH. Drafting of manuscript: CSM, CLW, and ERK. Critical revision of the manuscript: CSM, ERK, MA, IOL, BGB, DP, YH, DMS, MGS, and AOO. Obtained funding: CSM and MGS.