1. Introduction
Pulmonary arterial hypertension (PAH) is a devastating and complex disease in the worldwide. According to the World Health Organization (WHO) classification system, a resting mean pulmonary artery pressure (mPAP) of PAH is more than 25 mm Hg, pulmonary capillary wedge pressure (PCWP) of PAH is less than 15 mm Hg, and a pulmonary vascular resistance (PVR) of PAH is more than three Wood units.[
1,
2,
3] PAH affects the pulmonary vasculature by remodeling of the pulmonary vasculature primarily, which can eventually cause vascular dysfunction and structural remodeling of the small pulmonary arteries, then ultimately leading to right ventricular (RV) failure.[
4,
5] Median survival is only 5–7 years for PAH patients.[
6] The pathophysiology of PAH is complex, for this condition can be influenced by multiple processes including DNA damage, genetic modifying factors, environmental factors, microRNAs (miRs), oxidative stress, sex hormones and altered cell metabolism.[
7,
8,
9,
10] Even though great efforts have been made in this field during the past few years, the underlying mechanism of PAH is not entirely clear.
Weighted gene co-expression network analysis (WGCNA), which is a widely used technique to detect modules and hub genes that are based on comparison between two groups and complex network analysis.[
11] WGCNA method is based on some Network Medicine hypothesis.[
12] To gain further insight into the mechanisms and factors leading to the development of PAH, we performed WGCNA to identify co-expression modules and hub genes for PAH using transcriptome microarray data. One microarray dataset (GSE113439) was obtained from the Gene Expression Omnibus (GEO) database for WGCNA analysis. The results of WGCNA found that 11 modules were identified and the MEgreenyellow module showed a significantly positive correlation with PAH (r = 0.93, P = 1e−06). In this module, ten hub genes were identified as potential biomarkers for PAH, including CTNNB1, NIPBL, ROCK2, ROCK1, SCAF11, JAK1, BIRC6, KIF5B, PPP1R12A and XRN1. These hub genes might be potential targets for clinical therapy against PAH.
3. Results
3.1. Identification Key Gene Modules by WGCNA
In this study, we screened gene expression profiles from the GEO dataset for WGCNA analysis. The GSE113439 dataset was obtained from the GEO website. GSE113439, which was established on the platform of the Human Gene 1.0 set array (Affymetrix Inc., Santa Clara, CA, USA), was submitted by Mura. There were 26 samples in this data set, including 15 PAH samples and 11 normal controls. In our study, we comprehensively analyzed the expression data of GSE113439. The limma package was used for data normalization and quality control. Then, the Pearson’s correlation coefficient was used to cluster the samples in GSE113439. 45 was set as the cut-off height value. The samples that did not meet this criterion were removed. To construct a scale-free network, power of β = 17 (scale-free R
2=0.9) was selected as the soft-thresholding power. We then constructed the gene co-expression network by WGCNA package in R software, and 11 modules were detected and constructed (
Figure 1). The sizes ranges of the 11 modules were from 75 to 551 genes. The correlations between each module and clinical trait (PAH or normal) were also calculated to determine significant associations (
Figure 2). The results revealed that the greenyellow module was the most significantly correlated with PAH.
3.2. Functional GO and KEGG Pathway Enrichment Analyses of the Key Module
To study the function of the greenyellow module, we conducted a GO analysis of all genes in this module. A total of 13 significantly enriched biological process (BP) were observed in the current study, such as regulation of establishment of endothelial barrier,regulation of stress fiber assembly, ubiquitin-dependent protein catabolic process,protein phosphorylation,resolution of recombination intermediates,regulation of gene expression,negative regulation of bicellular tight junction assembly,regulation of cell adhesion,negative regulation of myosin-light-chain-phosphatase activity,actin cytoskeleton reorganization,regulation of actin cytoskeleton organization,cellular response to DNA damage stimulus,positive regulation of DNA-templated transcription, initiation (GO IDs: 1903140,0051492,0006511,0006468,0071139,0010468,1903347,0030155,0035509,0031532,0032956,0006974,2000144, respectively). A total of 8 significantly enriched cellular component (CC) were observed in the current study, such as nucleoplasm, nucleus, membrane, intracellular, cytoplasm, cell cortex, centrosome, transcriptional repressor complex (GO IDs: 0005654, 0005634, 0016020, 0005622, 0005737, 0005938, 0005813, 0017053, respectively). A total of 8 significantly enriched molecular function (MF) were observed in the current study, such as protein binding, ATP binding, cysteine-type endopeptidase activity, thiol-dependent ubiquitin-specific protease activity, nucleic acid binding, SMAD binding, poly(A) RNA binding, DNA binding (GO IDs: 0005515, 0005524, 0004197, 0004843, :0003676, 0046332, 0044822, 0003677, respectively).
We also conducted a KEGG analysis of all genes in the greenyellow module.
Figure 3 shows that the most significantly enriched KEGG pathway was mitophagy (pathway ID: 04137), which is directly associated with PAH. Moreover, the present study detected the others significantly enriched KEGG pathways, including signaling pathways regulating pluripotency of stem cells, necroptosis, salmonella infection, cGMP-PKG signaling pathway, leukocyte transendothelial migration, focal adhesion, proteoglycans in cancer, platelet activation, notch signaling pathway, cAMP signaling pathway and regulation of actin cytoskeleton (
Figure 3).
3.3. PPI Network Construction
To define the protein interactions, STRING was used to create PPI network. With the cytoHubba plugin, the top 10 hub genes were identified as CTNNB1, NIPBL, rho-associated coiled-coil–containing protein kinase (ROCK)2, ROCK1, SCAF11, JAK1, BIRC6, KIF5B, PPP1R12A and XRN1(
Figure 4). KEGG analysis revealed that the mainly enriched pathways for the 10 hub genes were proteoglycans in cancer, focal adhesion and vascular smooth muscle contraction.
4. Discussion
PAH is a complex and devastating disorder. Histopathological examination of PAH reveals that proliferation of pulmonary artery endothelial cells (PAECs), fibroblasts and pulmonary artery smooth muscle cells (PASMCs) might occlude pulmonary arterioles, then eventually leads to right heart hypertrophy and heart failure.[
18,
19] The exact pathophysiology and mechanisms for PAH are largely unknown. Therefore, it is of great important to identify the underlying cellular and molecular mechanism of PAH, and might be the potential targets for clinical treatment of PAH.
In this study, we screened key gene modules and hub genes from the GSE113439 dataset through WGCNA analysis, and 11 modules were detected and constructed. The greenyellow module was highly correlated with PAH. We then conducted GO and KEGG analysis of all genes in the greenyellow module. The most significantly enriched KEGG pathway was mitophagy, which is directly associated with PAH. Mitophagy is a process referred to mitochondria degradation via autophagy. A number of studies have demonstrated that enhanced autophagic pathway promotes the development of PAH through several mechanisms, such as the removal of misfolded proteins and the preservation of mitochondrial homeostasis.[
20]-[
22] The present study also detected the others significantly hub pathways, including signaling pathways regulating pluripotency of stem cells, necroptosis, salmonella infection, cGMP-PKG signaling pathway, leukocyte transendothelial migration, focal adhesion, proteoglycans in cancer, platelet activation, notch signaling pathway, cAMP signaling pathway and regulation of actin cytoskeleton. Recently, Hashimoto et al.[
3] demonstrated an unexpected function for bone marrow-derived hematopoietic stem cells (HSCs) in the pathogenesis of PAH, which is consist with our findings. In a recent animal study, necroptosis pathway was found associated with the pathogenesis of PAH through the RNA-seq data set and bioinformatics approach.[
23] In this study, we also identify the necroptosis pathway was significantly enriched in the greenyellow module. Previous studies have shown that the platelets of PAH patients are activated.[
24,
25] Similarly, in the current study, we have also found that platelet activation pathway was significantly enriched. These finding suggests that the hub pathways may be involved in modulation of PAH and have a therapeutic potential.
STRING was used to create PPI network of the genes in the greenyellow module to define the protein interactions. With the cytoHubba plugin, the top 10 hub genes were screened as CTNNB1, NIPBL, ROCK2, ROCK1, SCAF11, JAK1, BIRC6, KIF5B, PPP1R12A and XRN1. Previous studies have analyzed the same public microarray dataset (GSE113439), and have indicated different results. Luo et al. found that SMC2, CDK1, SMC4, CENPE, and KIF23 might be closely associated with the pathogenesis of PAH.[
26] Li et al. identified nine hub genes related to PAH, including the PLK4 and SMC2 genes.[
27] In another study, Ma et al. suggested that SMC4, TOP2A, SMC2, ANLN, KIF11, KIF23, SMC3, ARHGAP11A, RAD50 and SMC6 may involve in the pathogenesis of PAH.[
28] In a recent study, Yao et al. revealed several hub genes associated with PAH, including HSP90AA1, ANGPT2, HSPD1, HSPH1, TTN, SPP1, SMC4, EEA1, and DKC1.[
29] Another recent study found that 5 hub genes (CCL5, CXCL12, VCAM1, CXCR1, and SPP1) might play an important role in the development of PAH.[
30] All of the above studies compared the differentially expressed genes (DEGs) between control samples and PAH samples. The methodology is different from our study. According to the tutorials of WGCNA (
https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/ ), the input data should be the unfiltered expression matrix. Different approaches often give different results. To our knowledge, this is the first time using WGCNA methods to comprehensively reveal the biological processes of PAH in patients.
Among the top 10 hub genes in this study, CTNNB1, also known as catenin beta 1 or β-catenin, has an important role in adherens junctions. Yu et al.[
31] have found that the mRNA and protein levels of β-catenin were remarkably elevated in the lung tissues of monocrotaline (MCT)-induced pulmonary arterial hypertension (PAH) rats group than control group. A case report showed that the variation of CTNNB1 is associated with abnormal lung growth and pulmonary hypertension.[
32] NIPBL is a multifunctional protein and functions as a loading factor for cohesion.[
33] Publications have reported that loss of NIPBL promotes autophagy through impairs the DNA damage response and decrease lung cancer cells proliferation, migration and invasion.[
34] Previous reports have shown that ROCK1 and ROCK2 were involved in autophagy and fibrosis of the heart.[
35] Reduced expression of either ROCK1 or ROCK2 might be a protective factor of pulmonary fibrosis.[
36] These findings suggest that identification of the hub genes will allow greater understanding of the mechanisms and prognosis of PAH, as well as promote discovery of novel diagnostic and therapeutic direction.
In this study, there are some limitations. Firstly, the sample size was still relatively small. Second, although this study was based on WGCNA approach, which is a useful systems biology method, it was still limited to a certain computer analysis. Finally, there is still lack of experimental verification.
In the future, large amounts of data will be needed to further verify the hub genes and co-expression networks relationships of PAH, and the results need to be verified by experiments in vivo and in vitro.