1. Introduction
Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease, first named by the French neurologist Jean-Martin Charcot in 1869 to describe muscular atrophy (amyotrophic) and the scarring and hardening of tissue within the lateral spinal cord (lateral sclerosis) [
1,
2]. ALS is the most common form of motor neuron disease (MND) and is characterized by the progressive deterioration of both upper and lower motor neurons within the brain and spinal cord [
2,
3]. Its incidence is 1.75 – 3 per 100 000 people per year, rising to 4 – 8 per 100 000 people per year in the highest ALS risk age group (45 – 75 years old) [
4]. ALS is categorised into two groups: familial ALS, where at least one family member of the affected individual has ALS, accounting for up to 10% of ALS cases, and sporadic ALS (sALS), where the affected individual has no prior family history, accounting for 90 – 95% of cases [
5]. The clinical features of typical ALS patients consist of muscle spasticity, atrophy, muscle wasting, weakness and death due to respiratory failure, with an average survival after symptom onset of 3 – 5 years [
1,
4,
6].
Neurodegenerative diseases are complex disorders, involving both environmental and genetic factor interactions. Therefore, attempts to elucidate disease mechanisms and pathogenic genetic variants are essential. Previous research has identified more than 30 genes associated with ALS, highlighting four genes
SOD1, TARDBP (
TDP-43)
, C9orf72 and FUS for harbouring pathogenic mutations which cause the greatest number of ALS cases [
7,
8,
9,
10]. Although these four genes have been identified as major ALS-associated genes, a 2017 meta-analysis study demonstrated that within European and Asian populations, these genes only contribute to 47.7% and 5.2% of familial and sporadic cases, respectively [
11]. Following the identification of these pathogenic mutations, several pathological mechanisms have been implicated in ALS pathogenesis including oxidative stress, mitochondrial dysfunction, impaired axonal transport, inflammation, toxic protein aggregation and RNA metabolism and toxicity [
7]. Although the identified genetic variants explain only a small fraction of sporadic aetiology, twin studies have highlighted the importance of genetic risk factors within sporadic ALS, estimating a heritability of 61% [
12]. Therefore, as the exact cause of ALS is still undetermined, a better understanding of disease pathogenesis and identification of genetic biomarkers is essential. To further investigate the missing heritability of ALS, Theunissen et al. proposed structural variations (SVs) as an area of potential significance [
13]. SVs are classified as insertions, inversions, deletions and microsatellites, usually of repetitive structure, that are predominantly present within non-coding DNA regions and contribute towards genomic variation [
13,
14]. Furthermore, SVs have demonstrated the ability to modulate gene expression and have already been implicated in neurodegenerative diseases, such as the
C9orf72 repeat expansion in ALS and frontotemporal dementia (FTD) [
15,
16]. Hence, as 99% of the human genome is non-coding DNA, continued research into these regions is crucial to provide new insights into disease pathogenesis and to identify new potential targets for therapeutics.
Repetitive DNA is a major contributor to structural genomic variation. One form of repetitive DNA is a group of endogenous transposable elements (TEs), which can exist in both a static form and a mobile form. TEs are categorised into two classes known as DNA transposons and retrotransposons, whereby the latter class possesses the ability to propagate throughout the genome via a ‘copy-and-paste’ mechanism, involving an RNA intermediate [
17]. This results in the insertion of a new retrotransposon copy at a new locus within the host genome. Although DNA transposons are capable of mobilisation via a ‘cut-and-paste’ mechanism, they are not currently active within the human genome [
8,
18]. Originally dismissed as “junk” DNA, TEs are known to drive genetic diversity not only by contributing to regulation and evolution of the genome but also by contributing to genetic instability and disease progression [
18,
19]. Previous research by Prudencio et al. highlighted the potential implication of retrotransposons in ALS through the analysis of repetitive element expression using RNA sequencing data from healthy controls as well as
C9orf72 expansion positive carriers and sporadic ALS patients (
C9orf72-negative) [
20]. This research revealed that repetitive element expression, including retrotransposons, was significantly increased in ALS patients with the C9orf72 expansion in comparison to C9orf72-negative patients and healthy controls, thus suggesting retrotransposon involvement in ALS [
20]. However, until recently these elements have been largely overlooked in relation to neurodegenerative diseases, even though TEs constitute around 45% of the human genome [
21].
Retrotransposons are further subdivided into two groups dependent on the presence of long terminal repeats (LTRs), known as LTR and non-LTR retrotransposons [
18]. SINE-VNTR-Alu (SVA) elements are members of the non-LTR retrotransposon family and are typically 0.7 – 4 kb in length [
22]. SVAs are classified by evolutionary age based on their SINE-R region into subfamilies A-F, whereby subfamily SVA-F is the youngest in evolutionary history [
23]. Full-length SVA elements contain a 5’ CT element, Alu-like region, GC-rich VNTR (variable number tandem repeat), SINE (short interspersed nuclear element)-R domain and a 3’ poly-A tail [
23] (
Figure 1b). SVAs are major contributors to genetic diversity through a variety of different mechanisms, including acting as transcriptional regulators by providing alternative splice sites, polyadenylation signals and promoters whilst harbouring sites for transcription factor (TF) binding, thus modulating gene expression [
24]. SVAs are not only polymorphic in structure but ongoing mobilisation has resulted in SVAs being polymorphic for presence or absence within the genome and thus are termed retrotransposon insertion polymorphisms (RIPs) [
19]. This adds additional layers of complexity to gene expression dynamics, indicating that SVAs could be associated with predisposition to disease [
25].
Previous work within our group has shown that seven SVAs polymorphic for their presence or absence were significantly associated with Parkinson’s disease (PD) progression and differential gene expression in PD patients using the Parkinson’s Progression Markers Initiative (PPMI) cohort [
22,
26]. This involved an SVA named SVA_67 which is located at the MAPT locus, a locus which has been implicated in neurodegenerative disease risk including PD, FTD and Alzheimer’s disease [
27,
28,
29]. By using a clustered regularly interspaced short palindromic repeats (CRISPR) cell line knock-out model for SVA_67, we demonstrated that this SVA was significantly associated with differential gene expression of three genes at the MAPT locus, associated with neurodegeneration [
30].
In this study, we analysed SVAs that are present in the human reference genome and were identified as RIPs in the cohort analysed, herein termed reference SVA RIPs, to investigate differential gene expression patterns associated with ALS. For this analysis, we used whole genome sequencing (WGS) and transcriptomic data obtained from the New York Genome Center (NYGC) ALS Consortium to elucidate the role of SVA insertion polymorphisms on gene expression in central nervous system (CNS) tissues in healthy controls and ALS patients (
Figure 1a). Our analysis demonstrated that reference SVA RIPs regulate single and multiple gene targets genome-wide within CNS tissues and display tissue-specific gene modulation by comparing spinal cord, cerebellum, motor cortex and frontal cortex tissue analyses. Furthermore, we discovered that SVA RIPs influence the expression of genes at loci (
HLA and
MAPT) previously associated with ALS, highlighting that SVA regulation of genes at these loci could be a potential mechanism involved in ALS pathology.
3. Discussion
In this study, we evaluated the capacity of reference SVAs that are polymorphic for their presence in the human genome to modulate gene expression within CNS tissues of ALS patients and healthy controls. Analysis of the NYGC ALS consortium dataset demonstrated that SVA RIPs significantly regulate gene expression genome-wide and in a tissue-specific manner. This study continues to expand on our previous findings, demonstrating the capability of SVAs to differentially regulate gene expression [
24,
33]. We have previously illustrated the capacity of SVAs to influence gene expression within Parkinson’s disease, highlighting the role of SVA_67 to modulate genes at the
MAPT locus within the PPMI cohort and a CRISPR deletion model [
22,
26,
30]. This analysis further validates our previous research, illustrating the functional capacity of SVA_67 within neurodegenerative disease and potentially expanding the importance of SVA_67 and the
MAPT locus to ALS (
Figure 5 and
Figure 6). Therefore, this study not only emphasizes the correlation between SVA presence or absence and differential gene expression but also the involvement of SVAs within disease pathology.
Using WGS data from the ALS consortium for matrix eQTL analysis, we identified that an individual SVA RIP can modify the expression of multiple target genes and that multiple SVAs can regulate the expression of a single target. Of these SVA RIPs, a greater proportion of
trans regulatory effects were displayed in comparison to
cis. Similarly, eQTL studies by both Wang et al. (2017) and Koks et al. (2021) analysing RNA-seq data from the 1000 Genomes Project and the PPMI cohort, respectively, identified the impact of TEs on gene expression. In line with our analysis, these studies highlighted that several TEs simultaneously modulate gene expression of a single gene, that an individual TE can regulate the expression of multiple genes, and that a greater number of TEs analysed were in the
trans position [
33,
34]. A potential mechanism for
trans-acting eQTLs could be through the binding of CCCTC-binding factor (CTCF) to SVAs. Through cooperation with the cohesin protein complex, CTCF plays a key role in three-dimensional chromatin regulation and chromatin looping, bringing promoters and regulatory elements within close proximity to activate or repress gene expression [
35,
36]. An additional mechanism could be through indirect transcription factor (TF) mediated associations. This suggests that SVAs influencing the expression of a TF could indirectly regulate one or multiple TF gene targets, ultimately activating or repressing TF target gene expression [
34].
Upon the examination of beta values from the eQTL analysis, we determined that within the combined analysis of CNS tissues SVA_55 demonstrated the greatest effect on myelin basic protein (
MBP) gene upregulation (
Figure 4). Following further analysis of the individual tissues, we identified that the top hits for both gene upregulation and downregulation were SVAs influencing
MBP gene expression (
Table 2 and
Table 3). However,
MBP expression was only significantly modulated within spinal cord tissue. MBP is a key protein involved in the myelination process, whereby myelin sheaths are formed around CNS axons by oligodendrocytes [
37,
38]. As oligodendrocyte loss and myelin dysfunction have recently been emphasized in neurodegenerative diseases, including ALS, it is essential to investigate this relationship between SVAs and
MBP differential expression [
39,
40]. Lorente Pons et al. demonstrated the potential significance of MBP in ALS through post-mortem analysis of both sporadic ALS and C9orf72-related ALS cases, identifying a significant reduction in MBP protein abundance when normalised to proteolipid protein (PLP) in the spinal cord corticospinal tracts in ALS cases in comparison to controls [
41]. As
MBP mRNA is transported to the myelin compartment by the RNA transport granule and PLP is transported as a protein, this suggests that the reduction in MBP could be due to impaired mRNA transport [
41]. Our data suggest that certain SVAs (SVA_90 and SVA_5) act to downregulate
MBP gene expression; SVAs could therefore play a role in this mechanism, leading to the reduction of MBP protein levels observed in ALS patients compared to controls.
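The genotype-stratified comparisons reported for MBP (Figure 4c) rely on FDR-adjusted p-values and fold changes between genotype groups. The sketch below illustrates only these two steps. It assumes the Benjamini-Hochberg procedure (the captions state "FDR adjusted" without naming the method) and a fold change computed as a ratio of group means (the source does not state whether means or medians were used). The function names and all input values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def fold_change(group_a, group_b):
    """Ratio of mean expression in group_a to mean expression in group_b."""
    return (sum(group_a) / len(group_a)) / (sum(group_b) / len(group_b))


# Hypothetical raw p-values for three pairwise genotype comparisons.
raw_p = [0.01, 0.04, 0.03]
print(benjamini_hochberg(raw_p))  # approximately [0.03, 0.04, 0.04]

# Hypothetical expression values for two genotype groups.
print(fold_change([20.0, 40.0], [2.0, 4.0]))  # 10.0
```

With very small groups, such as the two PP carriers of SVA_55 in Figure 4c, a fold change based on means is sensitive to single samples, so it is best read alongside the adjusted p-value.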
Our analysis also demonstrated that SVAs can activate (SVA_37) and repress (SVA_87 and SVA_93)
PLP1, the gene encoding PLP.
PLP1 has previously been implicated in Pelizaeus-Merzbacher disease (PMD), an X-linked neurodegenerative disease whereby mutations within this gene inhibit CNS myelination [
42,
43]. In addition, we demonstrate that the myelin-associated oligodendrocyte basic protein (
MOBP) gene, the locus of which has been highlighted for ALS risk, is again potentially regulated by multiple SVAs (SVA_5, 15, 37, 55, 84, 85, 87, 91 and 93) only within the spinal cord [
31]. Hence, SVA regulation could be a potential mechanism involved in ALS risk at the
MOBP locus. Furthermore, our tissue specific analysis displayed that multiple mitochondrial genes (
MT-ND1,2,3,4,5, MT-CYB,
MT-CO2,3 and pseudogene
MTCO1P12) were largely activated and repressed within all four tissues analysed. SVA_30 and SVA_70 displayed the greatest regulatory effects on mitochondrial genes modulating five and four gene targets, respectively. As mitochondrial dysfunction is known to be implicated in other neurodegenerative diseases including PD and Alzheimer’s disease, as well as ALS, SVA regulation resulting in differential expression could be an underlying mechanism involved in disease pathology [
44,
45].
Upon analysis of CNS tissues from both ALS individuals and healthy controls, seven of the top ten
cis-acting reference SVA RIPs imposed the greatest effects on the upregulation of genes at the human leukocyte antigen (
HLA) locus (
Figure 6a). Four SVAs are responsible for the effects on
HLA gene expression, whereby SVA_24 influences the expression of one gene (
HLA-A), SVA_25 two genes (
HLA-B and
HLA-C), SVA_27 two genes (
HLA-DRB1 and
HLA-DRB5) and SVA_88 one gene (
HLA-DQB1). HLA, also referred to as the major histocompatibility complex (MHC), acts to regulate both the innate and adaptive immunity involved in the human immune response [
46]. Since the involvement of the immune response in neurological disease has been recognised, the
HLA locus has been highlighted as a region of importance in numerous neurodegenerative diseases, including ALS [
46]. Various studies have investigated the significance and mechanism of HLA in ALS, demonstrating increased frequencies of
HLA-A,
HLA-B and
HLA-C alleles in ALS cases compared to controls [
47,
48,
49,
50]. A recent large-scale GWAS conducted by Van Rheenen et al. has identified the
HLA region as a locus significantly associated with ALS, further highlighting the importance of this locus [
31]. Previous studies within our group have highlighted the capability of SVAs to modulate
HLA gene expression; analysis of whole genome sequencing and transcriptomic data obtained from the whole blood of individuals within the PPMI cohort discovered that SVA_24, SVA_25 and SVA_27 modulate the expression of
HLA-A,
HLA-B and
HLA-C, and
HLA-DRB1 and
HLA-DRB5, respectively [
33,
51]. This suggests that the modulation of
HLA genes by SVAs could be a common mechanism within neurodegenerative disease.
In conclusion, we show that SVAs have a significant impact on the expression of individual or numerous genes, including those previously associated with neurodegenerative diseases such as ALS. Ultimately, the ability of SVAs to act as a regulatory domain could highlight the importance of TEs in the missing heritability of neurodegenerative disease. However, due to limitations such as low sample numbers for some SVA genotypes and CNS tissue types (occipital cortex, temporal cortex, and hippocampus), further research investigating the involvement of TEs in the pathogenesis of neurodegenerative disease, specifically ALS, is crucial. In addition, due to low sample numbers in the ALS and control groups, individual group analysis was not possible. Therefore, future experiments investigating the influence of SVAs in ALS patient and control groups individually are required. Furthermore, although this analysis continues to demonstrate the potential role of SVAs in neurodegenerative disease, further experiments such as CRISPR-based models are essential to validate SVA-specific influences. For example, we have previously shown that SVA_67 deletion in a CRISPR model resulted in a significant increase in
MAPT and
LRRC37A gene expression [
30].
Figure 1.
General study overview, structure of full-length SVA element and both cis- and trans-acting mechanisms. (a) This study incorporated whole genome sequencing and transcriptomic data from the New York Genome Center ALS Consortium cohort to investigate the ability of SVA retrotransposon insertion polymorphisms (RIPs) to act as expression quantitative trait loci (eQTL) within central nervous system (CNS) tissues. (b) Schematic of SVA structure, displaying a full-length SVA element consisting of a 5’ CT rich hexamer repeat, Alu-like region, variable number tandem repeat (VNTR), SINE (short interspersed nuclear element)-R domain and a 3’ poly-A tail. (c) The mechanism by which SVAs implement a cis or trans regulatory effect. Cis-regulatory effects are defined as effects of elements (SVA RIPs) which modulate the expression of genes less than 1 Mb away from the element site, whilst trans-regulatory effects are defined as effects of elements which modulate the expression of genes greater than 1 Mb away from the element site.
Figure 2.
Composition of tissues used in this study. For this study, CNS tissue data from healthy controls (CO) and ALS patients combined (n=1903) composed of spinal cord (n=710), motor cortex (n=440), frontal cortex (n=335), cerebellum (n=240), occipital cortex (n=57), temporal cortex (n=58) and hippocampus (n=63) were included for our analysis.
Figure 3.
Overview of the number of genomic loci affected by SVA polymorphism following matrix eQTL analysis of all CNS tissues. (a) Pie chart representing the composition of all significant differentially regulated genetic loci (n=14830), displaying the number of cis-regulatory (n=167) and trans-regulatory (n=14663) effects exhibited by SVAs within all CNS tissues. (b) Bar chart displaying reference SVAs and the number of genome wide gene targets. Each of the 92 analysed SVAs had a significant impact on multiple targets, with the lowest number of targets for one SVA being 6 (FDR p<0.05). Only SVAs affecting more than 200 targets are displayed (n=26). For this analysis data from ALS individuals and healthy controls were combined.
Figure 4.
Reference SVA RIP elements with the greatest effect size from matrix eQTL analysis of all CNS tissues. (a/b) Clustered bar chart showing the top ten reference SVA RIPs across all CNS tissues with the greatest effect size on gene upregulation (positive beta values) (a) and gene downregulation (negative beta values) (b) from eQTL analysis. SVA_55 demonstrated the greatest increase in activation of the MBP gene with a beta coefficient of 272847, whilst SVA_70 demonstrated the greatest repressive effect on the MT-ND1 gene with a beta coefficient of -36441. (c) Boxplot of SVA_55 indicating MBP gene expression stratified by SVA_55 genotype. Genotypes PP (n=2), PA (n=20) and AA (n=1731). Significant differences in MBP gene expression were observed between the PP and PA group (p=0.0218) and PP and AA group (p=0.0218). Subjects with the PP genotype displayed a 9.5-fold and 19.2-fold increase in MBP gene expression in comparison to subjects with AA and PA genotypes respectively. (d) Boxplot of SVA_70 displaying MT-ND1 gene expression, stratified by SVA_70 genotype. Genotypes PP (n=1581), PA (n=159) and AA (n=13). A significant repression in MT-ND1 gene expression of 55% was observed between the PP and PA subject group (p=0.0098). No statistical significance was obtained for differences between the PP and PA subject groups. For both boxplots the significance of gene expression changes between groups was determined using the Wilcoxon pairwise comparison with FDR adjusted p-values (FDR p<0.05). The PP, PA, and AA groups represent two copies of the SVA present, one copy of the SVA present and the complete absence of the SVA, respectively. * p<0.05, ** p<0.01.
Figure 5.
Boxplots of two of the most significant SVA_67 interactions, with MAPK8IP1P2 and LRRC37A, obtained from matrix eQTL analysis. Both significant effects are cis-regulatory effects. Datapoints from both ALS individuals and healthy controls across all CNS tissues were combined and the significance of gene expression changes between groups was determined using the Wilcoxon pairwise comparison with FDR adjusted p-values (FDR<0.05). (a) Boxplot showing the association of SVA_67 genotype with MAPK8IP1P2 gene expression. Significant differences were observed between all groups for MAPK8IP1P2 expression, namely PP and PA (p=7.33E-237), PP and AA (p=4.26E-55) and PA and AA (p=1.33E-07). (b) Boxplot showing the association of SVA_67 genotype with LRRC37A gene expression. Significant differences were observed between all groups for LRRC37A gene expression, namely PP and PA (p=1.95E-204), PP and AA (p=1.06E-39) and PA and AA (p=2.13E-12). A fold change in expression of 262 and 6 for MAPK8IP1P2 and LRRC37A, respectively, was observed for individuals with the AA genotype in comparison to the PP genotype. PP (n=1183), PA (n=500) and AA (n=63). *** p<0.001.
Figure 6.
Top ten cis-acting reference SVA elements with the greatest effects on gene upregulation and downregulation in CNS tissues. (a) Clustered bar chart displaying ten reference SVA RIPs with the greatest cis-regulatory effects on gene activation (most positive beta values) from matrix eQTL analysis (FDR p<0.05). SVA_67 is responsible for the two greatest increases in activation, of the genes LRRC37A4P and MAPT, displaying beta coefficients of 1384 and 1101, respectively. SVA_24 (HLA-A), SVA_25 (HLA-C and HLA-B), SVA_27 (HLA-DRB1, HLA-DRB5 and HLA-DQB1) and SVA_88 (HLA-DPA1) were shown to be responsible for the large increases in activation of a series of HLA genes. (b) Clustered bar chart displaying ten reference SVA RIPs with the greatest cis-regulatory effects on gene downregulation (most negative beta values) from matrix eQTL analysis (FDR p<0.05). The two greatest repressive effects were on the FCGBP gene, regulated by both SVA_73 and SVA_72, with beta coefficients of -2843 and -668, respectively. SVA_67 was responsible for four of these effects, showing a downregulating effect on the genes KANSL1, LRRC37A, LRRC37A2 and MAPK8IP1P2. This analysis combined datapoints from ALS individuals and healthy controls.
Table 1.
Top ten most significant reference SVA RIP effects from matrix eQTL analysis of all CNS tissues. This analysis included the combination of both cis and trans effects as well as datapoints from both ALS individuals and healthy controls.
SVA | Beta value | False Discovery Rate (FDR) | Target gene | cis/trans effect
SVA_67 | -131.2 | 1.93E-303 | MAPK8IP1P2 | cis
SVA_67 | -5.2 | 1.93E-303 | ENSG00000285668.1 | cis
SVA_67 | -315.1 | 5.12E-299 | LRRC37A | cis
SVA_87 | -126.2 | 3.02E-224 | MTND4P24 | trans
SVA_93 | -126.2 | 4.22E-224 | MTND4P24 | trans
SVA_84 | -63.1 | 4.22E-224 | MTND4P24 | trans
SVA_58 | 4.1 | 3.89E-211 | LLPH-DT | cis
SVA_24 | 12.3 | 5.06E-201 | HLA-K | cis
SVA_15 | 59.6 | 3.13E-189 | MTND4P24 | trans
SVA_33 | 48.4 | 8.84E-188 | ZFAND2A-DT | cis
Table 2.
Top 40 significant reference SVA RIPs with the greatest effects on gene upregulation (most positive beta values) from tissue specific matrix eQTL analysis. This analysis included the combination of both cis and trans effects as well as datapoints from both ALS individuals and healthy controls.
SVA | Gene ID | FDR p-value | Beta value | Gene | Chr | Cis/trans | Tissue
SVA_55 | ENSG00000197971.16 | 5.99E-10 | 638804.662 | MBP | 18 | trans | Spinal Cord
SVA_15 | ENSG00000197971.16 | 3.59E-07 | 620026.724 | MBP | 18 | trans | Spinal Cord
SVA_37 | ENSG00000197971.16 | 8.76E-04 | 585791.502 | MBP | 18 | trans | Spinal Cord
SVA_85 | ENSG00000197971.16 | 2.08E-03 | 560691.911 | MBP | 18 | trans | Spinal Cord
SVA_37 | ENSG00000123560.14 | 3.72E-12 | 168441.474 | PLP1 | X | trans | Spinal Cord
SVA_55 | ENSG00000198888.2 | 1.47E-05 | 153947.476 | MT-ND1 | MT | trans | Spinal Cord
SVA_55 | ENSG00000203930.12 | 5.97E-04 | 148616.177 | LINC00632 | X | trans | Motor Cortex
SVA_15 | ENSG00000198888.2 | 1.62E-03 | 142961.051 | MT-ND1 | MT | trans | Spinal Cord
SVA_55 | ENSG00000203930.12 | 1.25E-16 | 115977.491 | LINC00632 | X | trans | Spinal Cord
SVA_15 | ENSG00000203930.12 | 1.86E-08 | 96910.7666 | LINC00632 | X | trans | Spinal Cord
SVA_55 | ENSG00000180354.16 | 1.34E-22 | 90266.4311 | MTURN | 7 | trans | Spinal Cord
SVA_15 | ENSG00000123560.14 | 3.43E-03 | 80380.5623 | PLP1 | X | trans | Spinal Cord
SVA_85 | ENSG00000203930.12 | 4.76E-03 | 78457.5282 | LINC00632 | X | trans | Spinal Cord
SVA_15 | ENSG00000180354.16 | 1.86E-12 | 77839.9088 | MTURN | 7 | trans | Spinal Cord
SVA_55 | ENSG00000198712.1 | 3.04E-03 | 76318.8326 | MT-CO2 | MT | trans | Spinal Cord
SVA_85 | ENSG00000180354.16 | 1.35E-05 | 67900.9382 | MTURN | 7 | trans | Spinal Cord
SVA_85 | ENSG00000237973.1 | 4.10E-33 | 66702.2584 | MTCO1P12 | 1 | trans | Motor Cortex
SVA_37 | ENSG00000168314.18 | 3.50E-09 | 65555.4973 | MOBP | 3 | trans | Spinal Cord
SVA_85 | ENSG00000237973.1 | 1.01E-32 | 53997.5748 | MTCO1P12 | 1 | trans | Frontal Cortex
SVA_15 | ENSG00000168314.18 | 5.02E-09 | 53362.8451 | MOBP | 3 | trans | Spinal Cord
SVA_55 | ENSG00000237973.1 | 3.61E-22 | 52048.0183 | MTCO1P12 | 1 | trans | Motor Cortex
SVA_55 | ENSG00000168314.18 | 1.38E-09 | 49216.1425 | MOBP | 3 | trans | Spinal Cord
SVA_15 | ENSG00000237973.1 | 2.83E-26 | 48449.3666 | MTCO1P12 | 1 | trans | Motor Cortex
SVA_85 | ENSG00000168314.18 | 3.76E-04 | 47350.2242 | MOBP | 3 | trans | Spinal Cord
SVA_85 | ENSG00000237973.1 | 8.82E-42 | 39521.0532 | MTCO1P12 | 1 | trans | Spinal Cord
SVA_55 | ENSG00000064787.13 | 1.62E-30 | 36504.4209 | BCAS1 | 20 | trans | Spinal Cord
SVA_37 | ENSG00000091513.16 | 1.65E-06 | 35838.5557 | TF | 3 | trans | Spinal Cord
SVA_55 | ENSG00000237973.1 | 1.49E-59 | 33728.2747 | MTCO1P12 | 1 | trans | Spinal Cord
SVA_4 | ENSG00000237973.1 | 3.59E-17 | 32837.5825 | MTCO1P12 | 1 | trans | Frontal Cortex
SVA_37 | ENSG00000237973.1 | 1.52E-23 | 32783.3654 | MTCO1P12 | 1 | trans | Motor Cortex
SVA_37 | ENSG00000237973.1 | 1.72E-31 | 32728.512 | MTCO1P12 | 1 | trans | Frontal Cortex
SVA_15 | ENSG00000237973.1 | 8.40E-41 | 32193.3558 | MTCO1P12 | 1 | trans | Spinal Cord
SVA_37 | ENSG00000099194.6 | 2.82E-04 | 31333.7253 | SCD | 10 | trans | Spinal Cord
SVA_85 | ENSG00000064787.13 | 3.78E-10 | 30888.7058 | BCAS1 | 20 | trans | Spinal Cord
SVA_15 | ENSG00000237973.1 | 3.46E-14 | 30602.307 | MTCO1P12 | 1 | trans | Frontal Cortex
SVA_15 | ENSG00000064787.13 | 1.97E-15 | 30356.5865 | BCAS1 | 20 | trans | Spinal Cord
SVA_37 | ENSG00000237973.1 | 1.28E-06 | 30025.949 | MTCO1P12 | 1 | trans | Cerebellum
SVA_15 | ENSG00000237973.1 | 5.29E-13 | 29472.617 | MTCO1P12 | 1 | trans | Cerebellum
SVA_55 | ENSG00000237973.1 | 2.00E-08 | 27693.2393 | MTCO1P12 | 1 | trans | Frontal Cortex
SVA_4 | ENSG00000237973.1 | 2.56E-09 | 27128.1241 | MTCO1P12 | 1 | trans | Motor Cortex
Table 3.
Top 40 significant reference SVA RIPs with the greatest effects on gene downregulation (most negative beta values) from the tissue-specific matrix eQTL analysis. The analysis combined cis and trans effects and included data points from both ALS individuals and healthy controls.
SVA | Gene ID | FDR p-value | Beta value | Gene | Chr | Cis/trans | Tissue
SVA_90 | ENSG00000197971.16 | 3.28E-03 | -1337611.1 | MBP | 18 | trans | Spinal Cord
SVA_5 | ENSG00000197971.16 | 3.06E-03 | -351064.11 | MBP | 18 | trans | Spinal Cord
SVA_87 | ENSG00000123560.14 | 1.98E-07 | -193423.77 | PLP1 | X | trans | Spinal Cord
SVA_93 | ENSG00000123560.14 | 2.05E-07 | -193266.83 | PLP1 | X | trans | Spinal Cord
SVA_30 | ENSG00000198888.2 | 8.47E-03 | -143159.48 | MT-ND1 | MT | trans | Cerebellum
SVA_70 | ENSG00000198886.2 | 1.029E-02 | -133464.74 | MT-ND4 | MT | trans | Cerebellum
SVA_30 | ENSG00000198763.3 | 7.85E-04 | -131182.45 | MT-ND2 | MT | trans | Motor Cortex
SVA_30 | ENSG00000198938.2 | 6.86E-03 | -111339.82 | MT-CO3 | MT | trans | Cerebellum
SVA_30 | ENSG00000198727.2 | 5.66E-03 | -104147.35 | MT-CYB | MT | trans | Cerebellum
SVA_84 | ENSG00000123560.14 | 2.11E-07 | -96556.036 | PLP1 | X | trans | Spinal Cord
SVA_30 | ENSG00000198786.2 | 7.25E-03 | -94294.133 | MT-ND5 | MT | trans | Motor Cortex
SVA_70 | ENSG00000198763.3 | 1.29E-02 | -82929.066 | MT-ND2 | MT | trans | Cerebellum
SVA_90 | ENSG00000259001.3 | 6.14E-03 | -73740.939 | ENSG00000259001 | 14 | trans | Spinal Cord
SVA_91 | ENSG00000203930.12 | 2.02E-03 | -71147.071 | LINC00632 | X | trans | Spinal Cord
SVA_87 | ENSG00000168314.18 | 3.60E-04 | -67162.942 | MOBP | 3 | trans | Spinal Cord
SVA_93 | ENSG00000168314.18 | 3.67E-04 | -67112.229 | MOBP | 3 | trans | Spinal Cord
SVA_91 | ENSG00000180354.16 | 2.99E-05 | -57438.97 | MTURN | 7 | trans | Spinal Cord
SVA_90 | ENSG00000168309.18 | 4.26E-03 | -48145.904 | FAM107A | 3 | trans | Spinal Cord
SVA_5 | ENSG00000180354.16 | 1.55E-05 | -43141.656 | MTURN | 7 | trans | Spinal Cord
SVA_70 | ENSG00000198712.1 | 6.03E-03 | -43049.326 | MT-CO2 | MT | trans | Cerebellum
SVA_90 | ENSG00000168309.18 | 2.45E-03 | -40637.548 | FAM107A | 3 | trans | Motor Cortex
SVA_91 | ENSG00000168314.18 | 5.48E-04 | -40457.332 | MOBP | 3 | trans | Spinal Cord
SVA_90 | ENSG00000177575.13 | 2.99E-19 | -36552.585 | CD163 | 12 | trans | Spinal Cord
SVA_84 | ENSG00000168314.18 | 3.83E-04 | -33494.891 | MOBP | 3 | trans | Spinal Cord
SVA_90 | ENSG00000087086.15 | 1.64E-08 | -32820.222 | FTL | 19 | trans | Motor Cortex
SVA_5 | ENSG00000168314.18 | 1.72E-04 | -31165.253 | MOBP | 3 | trans | Spinal Cord
SVA_16 | ENSG00000123560.14 | 8.97E-07 | -31144.963 | PLP1 | X | trans | Spinal Cord
SVA_91 | ENSG00000237973.1 | 6.16E-30 | -29834.037 | MTCO1P12 | 1 | trans | Spinal Cord
SVA_87 | ENSG00000173786.17 | 5.88E-04 | -28556.333 | CNP | 17 | trans | Spinal Cord
SVA_93 | ENSG00000173786.17 | 5.98E-04 | -28536.883 | CNP | 17 | trans | Spinal Cord
SVA_90 | ENSG00000137285.11 | 1.48E-40 | -27173.64 | TUBB2B | 6 | trans | Motor Cortex
SVA_5 | ENSG00000237973.1 | 3.92E-10 | -24995.474 | MTCO1P12 | 1 | trans | Motor Cortex
SVA_5 | ENSG00000237973.1 | 8.77E-13 | -24871.705 | MTCO1P12 | 1 | trans | Frontal Cortex
SVA_91 | ENSG00000064787.13 | 5.47E-07 | -22960.148 | BCAS1 | 20 | trans | Spinal Cord
SVA_93 | ENSG00000198840.2 | 3.12E-03 | -22049.981 | MT-ND3 | MT | trans | Spinal Cord
SVA_87 | ENSG00000198840.2 | 3.13E-03 | -22046.442 | MT-ND3 | MT | trans | Spinal Cord
SVA_90 | ENSG00000164733.22 | 1.39E-04 | -21828.547 | CTSB | 8 | trans | Spinal Cord
SVA_90 | ENSG00000079215.15 | 1.69E-05 | -21020.491 | SLC1A3 | 5 | trans | Motor Cortex
SVA_5 | ENSG00000237973.1 | 8.40E-24 | -19907.219 | MTCO1P12 | 1 | trans | Spinal Cord
SVA_87 | ENSG00000136541.15 | 1.45E-10 | -19551.323 | ERMN | 2 | trans | Spinal Cord
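The selection described in the captions of Tables 2 and 3 (significant associations ranked by the most positive or most negative beta value, with cis and trans effects pooled) can be illustrated with a short sketch. This is not the authors' pipeline: the `EqtlHit` record, the `top_hits` helper and the 0.05 FDR cut-off are assumptions made purely for illustration, and the sample rows are copied from the two tables.

```python
from typing import NamedTuple


class EqtlHit(NamedTuple):
    """One SVA-gene association (hypothetical layout mirroring the table columns)."""
    sva: str
    gene: str
    fdr: float
    beta: float
    mode: str    # "cis" or "trans"
    tissue: str


def top_hits(hits, n=40, direction="up", fdr_cutoff=0.05):
    """Return up to n significant hits ranked by effect size.

    direction="up" ranks by most positive beta, direction="down" by most
    negative beta. The fdr_cutoff default of 0.05 is an assumed threshold.
    """
    significant = [h for h in hits if h.fdr < fdr_cutoff]
    if direction == "up":
        ranked = sorted((h for h in significant if h.beta > 0),
                        key=lambda h: h.beta, reverse=True)
    elif direction == "down":
        ranked = sorted((h for h in significant if h.beta < 0),
                        key=lambda h: h.beta)
    else:
        raise ValueError("direction must be 'up' or 'down'")
    return ranked[:n]


# A handful of rows taken from Tables 2 and 3; cis and trans hits share one pool.
hits = [
    EqtlHit("SVA_37", "PLP1", 3.72e-12, 168441.474, "trans", "Spinal Cord"),
    EqtlHit("SVA_55", "MBP", 5.99e-10, 638804.662, "trans", "Spinal Cord"),
    EqtlHit("SVA_87", "PLP1", 1.98e-07, -193423.77, "trans", "Spinal Cord"),
    EqtlHit("SVA_90", "MBP", 3.28e-03, -1337611.1, "trans", "Spinal Cord"),
    EqtlHit("SVA_90", "TUBB2B", 1.48e-40, -27173.64, "trans", "Motor Cortex"),
]

up = top_hits(hits, n=2, direction="up")
down = top_hits(hits, n=2, direction="down")

for hit in up + down:
    print(f"{hit.sva}\t{hit.gene}\t{hit.fdr:.2E}\t{hit.beta}\t{hit.tissue}")
```

Ranking on beta alone, as here, orders hits by effect size rather than by significance, which is why rows with modest FDR p-values (for example SVA_90 and MBP) can sit above rows with far smaller p-values.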