1. Introduction
Alzheimer's disease (AD) is a highly prevalent neurodegenerative disease and the leading cause of dementia. It typically begins with memory deterioration and is characterized by a progressive decline in cognitive function [1]. With population aging and longer lifespans, the incidence of the disease continues to rise. Approximately 50 million people worldwide have AD [2], and this number is expected to increase rapidly in the coming decades. Currently, there is no curative treatment for AD, and the best available strategy is early diagnosis combined with delaying disease progression [3]. Therefore, effective biomarkers for AD risk prediction are urgently needed.
Diagnosis of AD draws on a variety of methods, including clinical presentation, cognitive tests, brain imaging, cerebrospinal fluid analysis, and blood testing. Clinical presentation involves observing the patient's symptoms, including cognitive and memory impairment as well as behavioral and emotional changes. Cognitive tests, such as the Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MoCA), are used to evaluate a patient's cognitive ability. Brain imaging techniques, such as positron emission tomography (PET) and magnetic resonance imaging (MRI), can reveal structural and functional changes in the brain, e.g., brain atrophy and accumulation of beta-amyloid plaques. However, these diagnostic methods are time-consuming, costly, and subjective, as they depend on the clinician's experience [4]. Notably, the US National Institute on Aging and the Alzheimer's Association proposed using biomarkers as a purely biological definition of AD [5]. For example, cerebrospinal fluid (CSF) examination can detect the accumulation of β-amyloid protein plaques and other biomarkers associated with AD, i.e., Aβ42, T-tau, and P-tau [6]. Although the CSF test is effective for AD, its highly invasive nature remains challenging for AD patients, especially elderly ones. More importantly, the establishment of reliable biomarkers based on the core CSF biomarkers, i.e., Aβ and tau, remains under debate because of conflicting results and theories [7]. It is therefore urgent to identify novel biomarkers for the early diagnosis of AD, as well as potential therapeutic targets. Recently, accumulating evidence has indicated that detecting fluid biomarkers in blood is a practical diagnostic approach for AD [8,9]. Blood testing detects specific proteins or other biomarkers in blood and can thus be used for early prediction of a patient's risk of developing AD. Compared with other methods for AD diagnosis, this approach is convenient, fast, and non-invasive.
Although the exact cause of AD is still not fully understood, many studies have suggested that metabolic abnormalities are associated with its development [10]. There has been growing interest in the role of metabolic dysfunction, particularly in lipid, glucose, and energy metabolism, in the development and progression of AD [11,12,13]. Abnormalities in lipid metabolism in AD span several aspects, including high cholesterol, high triglycerides, and low-density lipoprotein [14]. These abnormalities can lead to atherosclerosis and cardiovascular disease [15], greatly increasing the risk of developing AD. Some studies have further demonstrated that high cholesterol may promote the formation of β-amyloid protein plaques, one of the typical features of AD. β-amyloid protein plaques can damage neurons in the brain, leading to cognitive impairment, memory loss, and neuronal death [16]. The apolipoprotein E (ApoE) gene, which is involved in synaptic repair and the maintenance of neuronal structure, has been identified as a major risk marker of AD [17]. Many studies have also indicated that glucose and energy metabolism are significantly associated with AD, for example through alterations in the tricarboxylic acid (TCA) cycle, oxidative phosphorylation deficits, and pentose phosphate pathway impairment [18]. Glucose is an important energy substrate for the brain, and neurons require a large amount of energy to sustain normal activity [18]. However, decreased glucose and energy metabolism has been observed in AD patients [19]. In addition, oxygen and glucose metabolic rates are significantly changed in AD because of alterations in the glycolytic pathway and the TCA cycle [20]. Taken together, metabolic abnormalities are closely associated with the onset and progression of AD, and identifying novel metabolism-related biomarkers is a workable strategy for AD diagnosis.
In the present study, we hypothesized that molecular metabolic abnormalities in AD might be reflected in the expression of metabolic genes in peripheral blood, and that characterizing these aberrantly expressed metabolic genes in blood may yield a promising non-invasive biomarker for the diagnosis of AD, particularly at an early stage. First, we characterized the differences in peripheral blood gene expression between AD and non-AD patients based on high-throughput RNA sequencing data, along with the biological processes and pathways involved. Subsequently, inspired by the study of Lixin Cheng et al. [21], we proposed a novel approach to quantify the difference between a pair of metabolic pathways within each individual sample (including AD and non-AD patients). The main merit of this approach is that it avoids the batch effects arising from different datasets. This analysis identified several metabolic pathway pairwise (MPP) signatures associated with AD. Furthermore, unsupervised clustering analysis based on the MPP signature matrix classified all AD patients into two subgroups, which exhibit distinct patterns of immunity and metabolism. Finally, we used multiple machine learning methods to screen out key MPP signatures correlated with AD and to establish a metabolic pathway pairwise scoring system (MPPSS) for the diagnosis of AD (Figure 1). The model achieved a high AUC not only in the test data but also in the independent validation datasets. In conclusion, we developed reliable and sensitive biomarkers for early AD diagnosis and intervention, which hold significant potential for deepening the understanding of the disease mechanisms and influencing factors of AD and may be of practical clinical use.
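To make the within-sample pairwise idea concrete, the following minimal Python sketch scores each pathway inside a single sample and then compares every pair of pathways. The scoring rule (mean within-sample rank of member genes), the binary comparison, and the gene and pathway names are assumptions chosen for illustration and are not necessarily the exact procedure used in this study.

```python
from itertools import combinations


def rank_genes(expression):
    """Map each gene to its within-sample rank (1 = lowest expression)."""
    ordered = sorted(expression, key=lambda g: (expression[g], g))
    return {gene: i + 1 for i, gene in enumerate(ordered)}


def pathway_scores(expression, pathways):
    """Score each pathway as the mean within-sample rank of its measured genes."""
    ranks = rank_genes(expression)
    scores = {}
    for name, genes in pathways.items():
        present = [ranks[g] for g in genes if g in ranks]
        if present:  # skip pathways with no measured genes
            scores[name] = sum(present) / len(present)
    return scores


def pairwise_signatures(expression, pathways):
    """For each pathway pair (A, B), emit 1 if A outscores B in this sample, else 0."""
    scores = pathway_scores(expression, pathways)
    return {
        (a, b): int(scores[a] > scores[b])
        for a, b in combinations(sorted(scores), 2)
    }


# Hypothetical toy sample and pathway definitions (illustrative names only).
sample = {"G1": 5.0, "G2": 1.0, "G3": 3.0, "G4": 8.0}
pathways = {"P_lipid": ["G1", "G4"], "P_glucose": ["G2", "G3"]}
print(pairwise_signatures(sample, pathways))
```

Because only the relative order of genes within one sample is used, any monotonic rescaling of that sample (for example, a dataset-specific shift or scale) leaves the signature unchanged, which is the intuition behind the robustness to batch effects described above.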
4. Discussion
AD is an incurable neurodegenerative disorder associated with aging, and its underlying mechanisms are not yet fully understood [40]. Early diagnosis and delay of disease progression are regarded as the best treatment strategy for AD. In this study, we aimed to develop a diagnostic scoring system (MPPSS) for AD patients based on blood gene expression data. Compared with traditional diagnostic methods that require invasive procedures such as lumbar puncture or intracranial injection to collect samples, blood-based biomarker diagnosis for AD offers non-invasiveness, safety, ease of use, low cost, and high accuracy.
Recent studies suggest that metabolic pathways, including lipid, glucose, and energy metabolism, may play a role in the development of AD [18,41]. Therefore, we established MPP signatures to characterize the interplay between pairs of metabolic pathways. Based on the MPP signatures, we identified two subsets (S1 and S2) of AD patients via NMF clustering. In the S1, S2, and non-AD groups, the down-regulated genes were mostly related to immunity, neurogenesis, and signal transduction, while the up-regulated genes were mostly related to mitochondrial respiration and RNA splicing. Furthermore, we conducted immune infiltration analysis for the three groups and found that the S2 group had lower immune cell proportions, which might suggest a strong correlation between AD progression and immunity. Finally, we constructed the MPPSS for AD diagnosis. Compared with a single-marker diagnostic model, an MPP signature-based diagnostic model is better able to characterize the interactions among metabolic pathways during AD onset and development. The MPPSS holds considerable potential for assisting doctors in diagnosing elderly patients. It also suggests that MPP signatures may be used as diagnostic biomarkers in the clinic.
Overall, these findings suggest that metabolic pathways may provide potential diagnostic biomarkers for AD, particularly through blood-based analysis. Moreover, the involvement of cytochromes P450 in lipid homeostasis and detoxification processes further supports the role of metabolism in AD development [42]. Many studies have shown that hepatic cytochromes P450 are involved in the maintenance of lipid homeostasis, including cholesterol, vitamin D, oxysterol, and bile acid metabolism [43,44], and in the detoxification of endogenous compounds such as bile acids [45]. This correlation provides evidence in support of our findings. The core metabolic network includes Metabolism of xenobiotics by cytochrome P450 (hsa00980) and Drug metabolism-cytochrome P450 (hsa00982), both of which are involved in cytochrome P450-associated metabolic mechanisms. Metabolism of xenobiotics by cytochrome P450 appeared as an important core metabolic pathway in both the AD vs. non-AD comparison (Figure 2b) and the S1 vs. S2 comparison (Figure 3g), whereas Drug metabolism-cytochrome P450 appeared only in the S1 vs. S2 comparison (Figure 3g).
The activity of cytochrome P450 proteins is also regulated by the lipid environment [46]. Lipid levels may have an important impact on the onset and development of AD [47,48]. In our study, the differential enrichment of lipid metabolism pathways such as Steroid biosynthesis, Sphingolipid metabolism, and Glycerolipid metabolism (Figure 2e) supports this view.
AD is believed to be caused by oxidative stress arising from reactive oxygen species (ROS), which occurs prior to the formation of Aβ plaques and neurofibrillary tangles [49]. The core metabolic pathway Biosynthesis of unsaturated fatty acids (hsa01040) identified in the present study has been shown to be associated with ROS production [50]. The metabolism of unsaturated fatty acids has also been reported to be considerably disrupted in the brains of individuals with different levels of Alzheimer's pathology [51]. In addition, Cysteine and methionine metabolism (hsa00270) plays an essential role in ROS-related processes: these amino acids can be oxidized, and the pathway has been implicated in caloric restriction and aging [52]. These results are shown in Figure 2b.
Among these metabolic pathways, oxidative phosphorylation (Figure 2e) plays a crucial role in brain cell energy metabolism [53] and has been shown to be involved in the pathogenesis of AD [54]. Other pathways, including pyruvate metabolism [55], porphyrin metabolism [56] (Figure 2e), and fatty acid biosynthesis [43] (Figure 3e), have also been implicated in AD. Dysregulation of these pathways may lead to disrupted cellular energy metabolism, oxidative stress, and cell death, which may contribute to the occurrence and development of AD [43,55,56,57].
Analyzing the proportions of different immune cells in whole blood can provide a better understanding of the pathogenesis of AD. For example, inflammation may be an important trigger for AD, and certain immune cells such as macrophages and T cells are associated with inflammation. Comparison of T cells and macrophages among the three groups showed that AD patients in S2 had low accumulation of these cells. Activated memory T cells and resting CD4 memory T cells were significantly lower in S2 patients, while naïve CD4 T cell infiltration was significantly higher than in S1 patients. Memory T cells are a subset of T cells that become activated more effectively when they encounter foreign antigens, while CD4 T cells help coordinate immune responses by releasing cytokines and other signaling molecules [58], implying that patients in S2 exhibit lower immunity. The level of gamma delta T cell infiltration differed significantly between the S2 group and the other groups, with the S2 group exhibiting the lowest level. Gamma delta T cells form a small percentage of lymphocytes in healthy individuals, whereas their number increases in persons with immunological disorders. We also found that patients in S2 had the highest proportion of regulatory T cells (Treg), which is a hallmark of immunological suppression. These findings suggest that patients in S2 might have a more severe progression of AD.
Additionally, there were significant differences in the enrichment of mast cells among the three groups. Specifically, activated mast cells were significantly more abundant in the S2 group, whereas resting mast cells showed the opposite trend. Derived from the myeloid lineage, mast cells are a category of immune cells that reside in connective tissues throughout the body [59]. Fibrillar Aβ peptides are known to play a significant role in the development of AD [60], and some studies have suggested that their accumulation can trigger mast cells and elicit exocytosis and phagocytosis [61,62], which supports our finding that patients in S2 exhibit a higher proportion of activated mast cells. It should be noted that our results were based on the analysis of blood samples, indicating that the impact of AD on mast cells can be reflected in whole blood.
In our study, we used multiple machine learning approaches to establish and test predictive models, with the aim of selecting the optimal model for AD diagnosis. Specifically, this strategy uses various feature selection algorithms (such as LASSO and random forest) to select features and evaluates the predictive capability of the resulting models via the AUC. This strategy can reduce the bias that may exist in any single feature selection algorithm, thereby improving the robustness and sensitivity of the predictive model.
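The model comparison step described above can be sketched in a few lines of Python. The example below computes the AUC directly from its rank-based definition (the probability that a randomly chosen AD sample receives a higher score than a randomly chosen non-AD sample) and picks the candidate with the highest value. The candidate names and scores are hypothetical placeholders, and the sketch does not reproduce the LASSO or random forest fitting itself, only the AUC-based selection among already scored candidates.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute the AUC")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def select_best_model(candidate_scores, labels):
    """Return the candidate name with the highest AUC, plus all AUC values."""
    aucs = {name: auc_score(labels, s) for name, s in candidate_scores.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs


# Hypothetical held-out labels (1 = AD, 0 = non-AD) and model outputs.
labels = [1, 1, 0, 0]
candidates = {
    "lasso_features": [0.9, 0.4, 0.5, 0.1],
    "random_forest_features": [0.9, 0.8, 0.2, 0.1],
}
best, aucs = select_best_model(candidates, labels)
print(best, aucs)
```

In practice the AUC should be computed on data not used for feature selection or model fitting, as with the test and independent validation datasets in this study; otherwise the comparison favors models that overfit.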
However, several limitations should be noted. First, the lack of some key clinical information for AD patients, e.g., survival time and survival status, limits our ability to fully analyze the clinical features of the S1 and S2 groups. We expect to collect more clinical data from AD patients in future work. Second, although the MPPSS exhibits decent predictive performance in both the testing data and the independent validation data (including a blood dataset and brain datasets), large-scale verification via prospective studies with large sample sizes is still lacking. After rigorous evaluation and validation, the MPPSS might become a valuable clinical tool to aid doctors in accurately diagnosing AD, especially in elderly patients. Additionally, the limited availability of blood samples prevented us from conducting more robust external validation specifically for blood-based analysis. Nonetheless, we included samples from brain tissues for validation, which further supported the generalizability of our model. Finally, the functional roles of the MPP signatures we identified require further molecular experiments, which would facilitate a better understanding of their biological significance in AD.
In summary, we conducted a comparative analysis of blood gene expression data between AD and non-AD groups. Characterization of the DEGs and pathways associated with AD revealed a potential correlation between metabolism and the onset and progression of AD. Based on blood transcriptome data, we constructed a new type of metabolic marker, referred to as MPP signatures. Subsequently, we revealed molecular subtypes of AD based on NMF clustering and examined the differences between the AD subsets. Network analysis was applied to the differential MPP signatures to detect the core metabolic network of AD. Finally, we established the MPPSS for AD diagnosis, which exhibited good performance on the training, test, and validation datasets. Our study provides insights into the association between AD and metabolism, and the MPPSS has important implications for AD diagnosis and treatment.