1. Introduction
Next-generation cancer research is increasingly moving towards the full integration of big data, machine learning (ML) approaches (including deep learning, DL), and computational systems biology methods, with the latter concentrating on constructing, curating, interpreting, and validating various multimodal biological network models [1]. One of the primary challenges in ongoing and future computational cancer and oncology research is the appropriate selection and integration of the many complementary yet overlapping high-dimensional multiscale analysis and modeling methods that are often loosely grouped under the umbrella of "AI". A practitioner, whether a cancer researcher, a clinician, or a physician-scientist, is often overwhelmed by the sheer repertoire of AI/ML/network-centered analysis and modeling methods at their disposal; moreover, this repertoire is growing daily. While it presents an enormous opportunity, this methodological cornucopia is also a challenge, requiring a clear understanding of the scope, applicability, and limitations of the computational algorithms and tools. The challenge is exacerbated by frequently equivocal terminology, which reflects parallel research progress in computer science and AI, multivariable statistics, and graph theory and network science.
One of the most interesting, and promising, recent developments in DL has been the advent of graph neural networks (GNNs). Although combining graph structures with DL was codified as early as 2005-2009 [2,3,4], GNNs did not attract broad attention in the bioinformatics, computational biology, and computational chemistry communities until 2019-2020 (following the general explosion of DL, and DL applications in life sciences). A recent (October 2023) MEDLINE/PubMed search query ("graph neural network" OR "graph neural networks") AND ("oncology" OR "cancer") generated 151 results (4 in 2020, 26 in 2021, 59 in 2022, 67 in 2023), suggesting an emerging trend. Dissecting this trend is the principal goal of this review.
Application of GNNs in cancer research and oncology holds an immediate appeal because GNNs are intuitively understood as a synthesis of graph structures (naturally representing, for example, multiscale biological networks, molecular structures, or knowledge graphs) and powerful DL approaches. However, there is a certain amount of confusion about the relationship between GNNs and other network-centered methods that are more "conventional" in the life sciences context, such as co-expression networks, gene regulatory networks, network enrichment analysis, Bayesian networks, and Markov networks. This confusion leads to the often-asked question: how are GNNs different from the other network methods, and should they supersede the latter in a prototypical cancer researcher’s computational systems biology toolkit? Concurrently, another question arises: what added value can GNNs bring to a cancer researcher, compared to the other leading-edge DL techniques that can accommodate non-homogeneous, structured data? These two inquiries provided the original impetus for the present review.
In this review, we aim to specifically address the following questions:
1. What are the emerging trends in the application of GNN methodology to cancer and oncology research? Are there any fields and sub-fields in which the GNNs are poised to predominate?
2. Consequently, should the cancer and oncology researchers consider GNNs in addition to, or instead of, more established DL approaches? And if yes, under which scenarios and circumstances? What are the added benefits, if any?
3. Likewise, should the cancer and oncology researcher community reevaluate more established non-DL network modeling approaches and consider augmenting, or replacing them with GNNs?
The structure of this review is as follows: first, we introduce the GNN methodology fundamentals, and compare them to graphical models. Then, we survey the recent trends in GNN applications in cancer research and oncology and highlight several fields in which the GNN approach appears to be the most efficacious. Finally, we compare and contrast GNNs with non-graph DL and non-DL network-centered methods, and conclude by identifying promising future trends and research directions.
It should be emphasized that this review is intentionally focused in its scope, namely on the practical applications of GNNs in the context of cancer and oncology research. As such, this review is aimed at the practitioners asking a very specific question: should they incorporate the novel GNN methodology in their cancer and oncology research pipelines? To gain a broad and complementary perspective on AI/DL in cancer and oncology research beyond the scope of this communication, we refer the reader to the recent reviews on explainable AI in oncology [5], AI in lung cancer [6], interpretable DL in oncology [7], DL in imaging/cancer diagnosis [8], GNNs in imaging/histopathology [9,10,11], GNNs in bioinformatics [12], AI in cancer multi-omics [1], DL in drug response prediction in cancer cell lines [13], and DL in biological networks [14].
2. GNN Fundamentals
A graph, or network, is a data structure with high expressive power that consists of nodes and edges (reflecting the relationships between nodes). In life sciences, such networks can be very high-dimensional (-omics data), or very multimodal (from molecular data to clinical data to communities and social networks), or both. Merging graph representation with DL can be achieved by adapting DL’s inputs and outputs to non-Euclidean data, wherein various graph features (nodes, edges, sub-networks or whole graphs) are transformed into low-dimensional vectors in the process of graph embedding. However, contextual topological information might be lost in encoding/embedding; a more "generalist" GNN approach iteratively updates node states in the graph via message passing between the nodes, in a manner similar to DL but with the local topology (i.e., the set of neighboring nodes) taken into account. A variety of GNN models have been proposed, with some of the more prominent ones being spectral-based and spatial-based GCNs (Graph Convolutional Networks) [
15,
16], Graph RNNs (Graph Recurrent Neural Networks) [
17], GATs (Graph ATtention networks) [
18], and GAEs (Graph AutoEncoders) [
19]. We refer the reader to the excellent recent reviews on GNNs [
12,
20,
21] for technical details and classification of different GNN approaches and implementations; here, we will only note that, similar to non-DL network modeling methods, graph topology can be pre-set (e.g., representing a molecular structure, a spatially resolved image, or expert knowledge in the domain), or can be learned from data, via model selection. Likewise, the learning tasks/outputs of GNNs are similar to those in the non-DL network analyses: node-level (value of a node of interest), graph-level (property of the entire graph), and edge-level (edge detection) predictions, with the latter generalizing to the aforementioned learning of the (sparse) graph topologies from data. In summary, GNNs promise to combine the high expressivity and inherent interpretability of graph structures (and their natural congruence with many life science research and clinical data types) with the predictive/learning power of DL.
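To make the message-passing idea concrete, the following is a minimal NumPy sketch of two rounds of GCN-style neighborhood aggregation, followed by a mean readout for a graph-level output. It is an illustration under stated assumptions rather than the implementation used in any of the cited works: the toy path graph, the random (untrained) weight matrices, and the names `gcn_layer`, `w1`, `w2` and `graph_embedding` are hypothetical, and symmetric normalization with self-loops is only one common choice among several.

```python
import numpy as np


def gcn_layer(adj, features, weights):
    """One round of message passing: normalized neighbor averaging, linear map, ReLU."""
    n = adj.shape[0]
    adj_hat = adj + np.eye(n)            # self-loops, so a node keeps its own state
    deg = adj_hat.sum(axis=1)            # degree of each node, self-loop included
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalization: entry (i, j) is scaled by 1 / sqrt(deg_i * deg_j)
    norm_adj = adj_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ features @ weights, 0.0)


# Toy undirected path graph 0 - 1 - 2 - 3 (hypothetical example data)
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
features = np.eye(4)                     # one-hot node features

rng = np.random.default_rng(0)
w1 = rng.normal(size=(4, 8))             # untrained weights, for illustration only
w2 = rng.normal(size=(8, 2))

h1 = gcn_layer(adj, features, w1)        # node states after one hop
h2 = gcn_layer(adj, h1, w2)              # node states after two hops (node-level output)
graph_embedding = h2.mean(axis=0)        # mean readout (graph-level output)
```

After one layer, the state of a node depends only on itself and its direct neighbors; stacking a second layer extends this to two hops, which is how the local topology enters the computation.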
3. GNNs and Graphical Models
GNNs are superficially similar to graphical models, in that both perform learning over graph structures. Bayesian networks (BNs), or probabilistic directed acyclic graphs (DAGs) learned from the data, are arguably the most popular graphical models in life science applications. BNs can incorporate both data-driven learning and existing knowledge, and allow for probabilistic reasoning and propagation over the DAGs. A major feature of BNs is that they filter out superficial (transitive, non-direct) dependencies, thus arriving at sparse DAGs suggesting directional causalities [
22,
23,
24]. A question is often asked: what are the principal differences between BNs and GNNs, especially from the life sciences application perspective? Here, we compare the underlying fundamentals of a GNN (specifically, a GCN) and a BN.
In many cases, a graph is simply an abstraction defined over another model that can be written algebraically. This is the case for graphical models, such as BNs, where a probabilistic model is the basis for the graphical representation. Additional constraints of directionality and acyclicity are imposed on the graph representation by the underlying probabilistic model (hence, a DAG), although, generally, not in a unique way. Besides the usual pairwise interactions, BNs are capable of modeling probabilistic dependencies of very high order and almost arbitrary depth. This is one of the reasons that BNs are often perceived to stand in correspondence with causal structures. While causal inference is certainly possible with BNs in some circumstances, the notion of causation is usually much narrower than that of probabilistic dependency. BNs are well-equipped for probabilistic reasoning in contexts with a high degree of uncertainty, where little a priori information about the nature of the interaction in question is available. Although BNs do not presuppose the temporal ordering required for causal inference, they can readily accept causal constraints.
While both rely on graph representations, GNNs are quite different from BNs. Typically, a GNN relies on information diffusion techniques, e.g., graph convolution in the case of a GCN (Figure 1), to accomplish a graph-relevant predictive task such as the classification of nodes. In the simplest configuration, a feedforward GCN, for example, maps an aspect of a graph to a numerical scale of an appropriate dimension. A GCN with backpropagation (Figure 2) can approximate the mapping between certain aspects of a graph and its class assignment from examples. From this perspective, a GNN is one of many generic approximation methods that establish a relationship between a graphical model and its implications.
A BN, on the other hand, is a dependency model, or, more precisely, a way to specify the probabilistic model for essential dependencies between various observables. A properly defined BN contains all the information necessary to reconstruct the associated joint probability distribution and, therefore, makes node-wise prediction a matter of probabilistic inference. Estimation of BN structure and parameters from observations constitutes an inverse problem that can be approached in a variety of ways. Once a BN model is obtained the information contained therein can be interpreted directly, without the aid of additional methodological devices, and utilized for probabilistic inference as well as for construction of classifiers, predictors, and other tools for a particular knowledge or problem domain.
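As a small illustration of how a fully specified BN determines the joint distribution and reduces prediction to probabilistic inference, consider the sketch below. It assumes a hypothetical three-node chain (mutation -> expression -> response) with binary variables; the conditional probability tables are invented for illustration and carry no biological meaning, and inference is done by brute-force enumeration, which is practical only for very small networks.

```python
from itertools import product

# Hypothetical conditional probability tables for the chain M -> E -> R
p_m = {0: 0.7, 1: 0.3}                                     # P(M)
p_e_given_m = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}   # P(E | M), indexed [m][e]
p_r_given_e = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.4, 1: 0.6}}   # P(R | E), indexed [e][r]


def joint(m, e, r):
    """Joint probability from the DAG factorization P(M) P(E|M) P(R|E)."""
    return p_m[m] * p_e_given_m[m][e] * p_r_given_e[e][r]


def posterior_m_given_r(r_obs):
    """P(M | R = r_obs), obtained by summing out the unobserved node E."""
    unnorm = {m: sum(joint(m, e, r_obs) for e in (0, 1)) for m in (0, 1)}
    z = sum(unnorm.values())
    return {m: value / z for m, value in unnorm.items()}


# The factorized joint sums to one over all eight configurations
total = sum(joint(m, e, r) for m, e, r in product((0, 1), repeat=3))

# Observing R = 1 raises the probability of M = 1 above its prior of 0.3
post = posterior_m_given_r(1)
```

Every quantity in the calculation is a readable entry of the model, which is the sense in which the information in a BN can be interpreted directly.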
GNNs and BNs serve largely complementary purposes with little (but not insignificant) instrumental overlap. Under some assumptions, a BN can be aided in its specification by a GNN [25] in a way similar to classical parameter estimation methods [26]. But once a BN is completely specified, it is a more efficient stand-alone tool for any kind of inference task over the problem domain, including prediction and classification. More importantly, it makes the accumulated problem domain knowledge explicit and directly interpretable, which enables the design of highly efficient problem-specific methods. In this, a multi-scale BN stands in contrast to the largely "black box" nature of a DL model, even one containing GNN components.
An illuminating analogy that makes this difference clear is the approximation of a signal or an image via expansion into basis functions, as opposed to straight interpolation. Here the basis may have a domain-specific meaning, e.g., the trigonometric basis in a Fourier series representation. A more generic spline interpolation may, in some sense, do equally well in the approximation, but it leaves out of the picture the explicit interplay of parameters that occurs in the frequency domain, along with the possibility of spectral manipulation.
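The analogy can be sketched numerically. The example below is a hedged illustration with a synthetic two-frequency signal (the frequencies 3 and 7 and the amplitudes are arbitrary choices), and it uses piecewise-linear interpolation via `np.interp` as a simple stand-in for spline interpolation. The Fourier coefficients expose the components directly and can be edited, whereas the interpolant reproduces values without exposing any such parameters.

```python
import numpy as np

n = 64
t = np.arange(n) / n                       # one period, n uniform samples
signal = 2.0 * np.sin(2 * np.pi * 3 * t) + 0.5 * np.cos(2 * np.pi * 7 * t)

# Basis expansion: the coefficients are interpretable parameters of the model
coeffs = np.fft.rfft(signal)
amplitudes = 2.0 * np.abs(coeffs) / n      # amplitude of each frequency bin
dominant = sorted(np.argsort(amplitudes)[-2:].tolist())   # the two strongest bins

# Spectral manipulation: remove the frequency-7 component and reconstruct
filtered_coeffs = coeffs.copy()
filtered_coeffs[7] = 0.0
filtered = np.fft.irfft(filtered_coeffs, n=n)

# Generic interpolation: approximates well between samples, but offers
# no handle on the individual frequency components
t_mid = t[:-1] + 0.5 / n
interp_mid = np.interp(t_mid, t, signal)
true_mid = 2.0 * np.sin(2 * np.pi * 3 * t_mid) + 0.5 * np.cos(2 * np.pi * 7 * t_mid)
max_interp_error = np.max(np.abs(interp_mid - true_mid))
```

In this reading, the BN plays the role of the basis expansion, with parameters that carry domain meaning, and the GNN plays the role of the generic interpolant.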
In summary, the application of BNs accentuates inference over the problem domain, knowledge representation, and construction of narratives and hypotheses. GNNs, on the other hand, are well-equipped to deal with generic approximation tasks where the way this approximation can be achieved and how informative it must be are not the primary concerns.
We will discuss the practical considerations behind the choice between GNNs and graphical models below in Section 5.
4. GNN Applications in Cancer Research and Oncology
In surveying the field, two major themes emerge: interpretability and multimodality. The graph structure representation underpinning GNNs is inherently interpretable, in contrast to the ex post facto explainability in DL (aka explainable AI, or XAI) [27], and can naturally combine different modalities/data types within a single analysis framework. In addition, and on different abstraction levels, graph representation is a natural fit with the molecular structures and the image data types. Owing to these three advantages of GNNs, namely (i) inherent interpretability, or intelligibility (providing a potential pathway to causal discovery); (ii) the ability to combine different modalities/data types/scales; and (iii) natural representation of molecular structures and images, the recent and ongoing (2019-2023) GNN-centered work in cancer and oncology research has concentrated predominantly in the following six major (partially overlapping) areas of activity:
1. Using multimodal data (including imaging, histopathology and digital pathology) for cancer diagnosis, prognosis, survival and therapy response prediction.
2. Cancer classification, subtyping and grading.
3. Granular spatial approaches (including transcriptomics and proteomics).
4. Cancer drug selection, repurposing, and profiling; prediction of cancer drug interactions and combinations, response and resistance.
5. Synthetic lethality prediction.
6. Prediction of ncRNA (miRNA, piRNA, lncRNA) and circRNA - cancer associations.
It should be emphasized that, although intrinsic interpretability is a significant pragmatic consideration, the actual prediction/classification performance of the GNN-based approaches often proves superior to that of conventional DL as well. This can be explained by the higher congruence of the graph structure representations with the mechanistic/causal structure of the domain, which makes the encoding/embedding of the inputs less prone to the information loss that occurs due to data type conversions and the loss of contextual information. In addition, just as with DL in general, GNNs tend to perform better than "classic" ML on large datasets.
4.1. Using Multimodal Data (Including Imaging, Histopathology and Digital Pathology) for Cancer Diagnosis, Prognosis, Survival and Therapy Response Prediction
Early work in this area focused on using GCNs [
28] and GATs [
29] to predict cancer phenotypes [
29] and survival [
28] from multimodal genetic, genomic and clinical data, such as available in The Cancer Genome Atlas (TCGA). These approaches showed incrementally but significantly superior performance on prediction tasks compared to the conventional ML and DL methods. Gao et al. [
30] and Kim [
31] extended the basic framework to model inter-patient groupings, "patient similarity networks", likewise achieving performance improvements in survival prediction on different cancer datasets. Liang et al. [32] incorporated topological features of the pathway representation of transcriptomic data into cancer survival prediction models for four cancers, taking advantage of the natural pathway-graph structure mapping. Again, prediction performance was superior to that of conventional ML/DL, with the added value of delineating the most predictive pathways.
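One generic way to build a patient similarity network is to connect each patient to the k most similar patients in feature space. The sketch below is an assumption-laden illustration, not the construction used in the studies cited above: the feature matrix is invented, and cosine similarity, the value of `k`, and the function name `patient_similarity_graph` are hypothetical choices.

```python
import numpy as np


def patient_similarity_graph(profiles, k=2):
    """Symmetric k-nearest-neighbor adjacency matrix based on cosine similarity."""
    unit = profiles / np.linalg.norm(profiles, axis=1, keepdims=True)
    sim = unit @ unit.T                           # pairwise cosine similarity
    np.fill_diagonal(sim, -np.inf)                # a patient is not its own neighbor
    n = sim.shape[0]
    adj = np.zeros((n, n), dtype=int)
    neighbors = np.argsort(-sim, axis=1)[:, :k]   # indices of the k most similar patients
    for i in range(n):
        adj[i, neighbors[i]] = 1
    return np.maximum(adj, adj.T)                 # keep an edge if either side selected it


# Hypothetical profiles: six patients, five features, two visible groups
profiles = np.array([
    [1.0, 0.9, 0.1, 0.0, 0.1],
    [0.9, 1.0, 0.0, 0.1, 0.0],
    [1.0, 1.0, 0.1, 0.1, 0.0],
    [0.1, 0.0, 0.1, 1.0, 0.9],
    [0.0, 0.1, 0.0, 0.9, 1.0],
    [0.1, 0.1, 0.1, 1.0, 1.0],
])
adj = patient_similarity_graph(profiles, k=2)
```

The resulting adjacency matrix, together with the per-patient feature vectors, is the kind of input a GNN would then use for node-level (per-patient) prediction.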
Subsequent work gradually incorporated imaging, histopathology and digital pathology data — modalities that are particularly amenable to the graph structure representations. Lian et al. [
33] used GCN with CT imaging data to predict lung cancer survival, achieving superior generalization prediction accuracy. Lee et al. [
34] used GAT with digital pathology data (whole slide images, WSIs) to dissect features of the heterogeneous tumor microenvironment and predict the prognosis for four different types of cancer; importantly, the resulting models were interpretable at the contextual features level, underscoring the conceptual advantages of GNNs over the typical "black box" DL predictors. Lian et al. [
35] combined imaging data with clinical modalities in a transformer-GNN model to achieve superior risk and survival prediction performance for the early stage non-small cell lung carcinoma. Wang et al. [
36] integrated multiplexed immunohistochemistry images into GNN models, thus enabling precise (binary and ternary classes) survival prediction in gastric cancer, with high multivariate prediction accuracy. Combining histopathology with computed topological features in a GNN model led to a significant improvement (0.962 AUC) in the accuracy of differential diagnosis of pancreatic ductal adenocarcinoma, a notoriously lethal human cancer [
37].
Ding et al. [
38] integrated CT data and clinical factors in a GAT model to achieve lymph node metastasis prediction superior (0.872 AUC) to that of single-modality approaches. Likewise, Hu et al. [
39] developed a GNN forest model for highly accurate lymph node metastasis prediction that combined CT imaging, clinical features, and expert knowledge; an interesting aspect of this study was the medical experts’ involvement in the intermediate analysis stage (construction of the imaging-clinical super-graph). The WSI data-based GNN model for the abnormal (non-neoplastic and neoplastic) endoscopic large bowel biopsy diagnosis developed by Graham et al. [
40] also included an iterative interaction between a human expert (pathologist) and purely data-driven decision-making; to paraphrase a common witticism, the future might lie not in AI replacing human experts, but rather in human experts augmented by AI outperforming those without.
Recently, more complex, specialized GNN architectures have been proposed in the context of cancer prognosis/survival prediction. Fu et al. [
41] developed a two-module GNN model combining clinical features with the highly multiplexed imaging data that improved survival prediction on public breast cancer datasets. Zhu et al. [
42] incorporated geometric features into sparse DL architectures, thus devising "geometric" GNNs that demonstrated high survival prediction accuracy on eleven different cancer types based on multi-omic data. Zhang et al. [
43] proposed a complex feature generation / GNN architecture to improve cancer prognosis prediction by combining multi-omic data and molecular interactions in biological networks. Li et al. [
44] developed a convolutional neural network (CNN)-GNN architecture for multimodal diagnosis of lung adenocarcinoma that used fused feature vectors to localize information transmission patterns, thus improving explainability. Notably, these four studies demonstrate how a more complex, customized GNN/DL architecture can outperform "out-of-the-box" GNN solutions, signifying an emerging trend and suggesting that GNN applications in cancer and oncology research have reached maturity. Another sign of this growing maturity is an increasing emphasis on inferring causality, which naturally dovetails with the GNN paradigm: for example, Li et al. [
45] set out to disentangle causative and non-causative tumor features in the context of GNNs using CT imaging data for early diagnosis of pancreatic cancer. Yet another direction for GNN refinement is the training mode: Azher et al. [
46] compared different pretraining strategies for multimodal (methylation, expression, histopathology) GNN-based cancer prognostication, and concluded that appropriate pretraining strategies might be more important than innovations in model architectures for highly accurate prediction.
Prediction of cancer therapy response is another task that is well-suited for the multimodal GNN application. Wang et al. [
47] utilized a CNN-GNN model to predict response to neoadjuvant therapy in rectal cancer using digital pathology data (WSIs), achieving high generalization prediction accuracy. Integrating multiple prior knowledge networks (gene-gene interaction graphs) in a GNN model enabled Zhao et al. [
48] to attain superior prediction accuracy (up to 0.85 AUC) for immunotherapy (immune checkpoint inhibitor) response across different cancer types; this study showcases the GNN’s ability to seamlessly incorporate prior knowledge (which is often hard-coded in a graph structure form).
In summary, the application of GNNs to cancer diagnosis, prognosis, survival and therapy response prediction is now a mature field. The emphasis is shifting from the straightforward implementations to various refinements of GNN architectures (and multi-stack DL architectures containing GNN modules) and training regimes, specific to the cancer-related predictive features and modalities. Two additional emerging trends are (i) inferring causality, and (ii) an iterative human expert - AI predictor dialog, with both drawing on the inherent interpretability of the GNN representation.
4.2. Cancer Classification, Subtyping and Grading
Methodologically, these applications overlap with Section 4.1 above, and have evolved in parallel. Early work [
49] laid out the foundations for the typical analysis pipeline: use a GCN in conjunction with high-resolution (revealing a micro-architecture) histology images to construct large cell-level graphs incorporating multi-level features for grading of colorectal cancer. Likewise, Lu et al. [
50] combined high-resolution digital pathology (WSI) data with a customized GNN architecture to predict HER2 status in breast cancer, thus moving from the "patch" to the "entire WSI" level. Pati et al. [
51] developed a multi-scale, hierarchical "cell-to-tissue" GNN for histopathological image classification, and comprehensively surveyed early (2019-2021) work on graph structures (including GNN approaches) in digital pathology. In a similar vein, Wang et al. [
52] added another hierarchical level — "cell communities" and their topological features — to the GNN analysis framework; emphasis on the topological data analysis led to a higher performance on pathology image classification and disease grading tasks with multiple cancer types. Going one step further, Abbas et al. [
53] developed a multi-cell type and multi-level graph aggregation architecture that takes into account both local and global cell-cell interactions, and outperforms both CNNs and GNNs on cancer grading of digital pathology images.
Zhang et al. [
54] used a different modality, distance-based features extracted from limited CT samples, to develop a GNN predictor for pancreatic cystic neoplasm classification; the dataflow followed a by-now established scheme: use a CNN to generate features and a GNN to complete the classification. Similarly, Ravinder et al. [
55] combined CNN and GNN to improve brain tumor type classification using MRI images; whereas Ma et al. [
56] proposed a dual GCN-GAT architecture for MRI brain tumor segmentation. Yin et al. [
57] used yet another modality, multi-omics, to demonstrate a superior breast and stomach cancer subtyping accuracy when integrating -omics in a GCN-based predictor. Likewise, Kesimoglu and Bozdag [
58] used multi-omics data combined with other raw features for GCN-based prediction of breast cancer subtypes; interestingly, this approach improved on the "ground truth" depending on a single modality/datatype, thus providing additional evidence in support of the superiority of the multimodal analyses.
Fittingly, the latest work in this area combines information from multiple multimodal diagnostic disciplines in a single analysis scheme, taking advantage of GNN model representation flexibility and inter-domain transfer learning. For example, Furtney et al. [
59] utilized radiographic images, genomics data, and other modalities to classify breast cancer subtypes via personalized breast cancer patient graphs.
In summary, we observe two trends: multi-level digital pathology data analysis, and a broad, multi-modal, approach to classification (that would ideally incorporate multi-level digital pathology, multi-omics, and other features). While the former appears to be sufficiently mature, the latter is an emerging and promising trend; both significantly benefit from the ability of GNNs (often in cooperation with other DL modules) to combine different data types/modalities in a unified framework.
4.3. Granular Spatial Approaches (Including Transcriptomics and Proteomics)
Here, we are primarily concerned with the spatial single-cell analysis, and spatial heterogeneity, in tumor microenvironments. Early work in this area utilized "generalist" GNN-based approaches to spatially resolved gene expression analysis [
60]; for example, Solorzano et al. [
61] used GNNs for cell niche characterization in the glioma tissue. Subsequently, more complex, dedicated, GNN models were developed to be applied in the cancer/oncology context. Zeng et al. [
62] proposed a CNN-transformer-GNN architecture to capture spatially resolved RNA-seq expression from histology images, demonstrating high prediction accuracy for both gene expression and spatial region identification in cancer vs. normal datasets. Chang et al. [
63] used graph autoencoder/GNN for spatially resolved transcriptomics in glioblastoma tissues, robustly classifying different regions. Qiu et al. [
64] combined a variety of prognostic biomarkers (including molecular types) to model "intratumor GNN" that captures spatial heterogeneity on different levels; this model’s prognostic performance proved superior on a retrospective breast cancer dataset. Likewise, Ding et al. [
65] interwove spatial profiles at different levels (WSI data, protein expression profiles, mutational profiles) to construct "spatially aware" multi-level GNN models from the TCGA colon and rectum cancer data using a customized five-module GNN architecture; these models demonstrated high accuracy in cross-level prediction of molecular profiles on TCGA datasets. Wu et al. [
66] used multiplex immunofluorescence imaging to show that a GNN leveraging spatial protein profiles adequately models tumor microenvironment via local subgraphs; such subgraphs were found to be predictive for patient outcomes.
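A spatial cell graph of the kind discussed here can be sketched by linking cells that lie within a fixed distance of one another and then summarizing each cell's local subgraph. The example below is illustrative only: the coordinates, cell-type labels, and radius are invented, and a distance threshold is just one plausible construction (k-nearest-neighbor or Delaunay graphs are common alternatives); it is not taken from the cited studies.

```python
import numpy as np


def radius_graph(coords, radius):
    """Adjacency matrix linking cells whose Euclidean distance is at most `radius`."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    adj = (dist <= radius).astype(int)
    np.fill_diagonal(adj, 0)                     # no self-edges
    return adj


def neighborhood_composition(adj, cell_types, n_types):
    """Fraction of each cell type among the direct neighbors of every cell."""
    one_hot = np.eye(n_types)[cell_types]
    counts = adj @ one_hot                       # neighbor counts per cell type
    totals = counts.sum(axis=1, keepdims=True)
    # Cells without neighbors keep an all-zero composition vector
    return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)


# Hypothetical tissue section: five cells, two cell types (0 and 1)
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
cell_types = np.array([0, 1, 1, 0, 0])

adj = radius_graph(coords, radius=1.5)
composition = neighborhood_composition(adj, cell_types, n_types=2)
```

The composition vectors are a hand-crafted analogue of what one message-passing step aggregates; a trained GNN would learn which neighborhood patterns are predictive.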
In summary, utilizing GNNs to "build the bridge" from cell-level spatial heterogeneity in tumor microenvironments to spatial region identification to cancer patient-level prediction tasks is a novel but highly promising research direction. We expect GNNs to play a crucial instrumental role in this area, as they are a seamless fit with multi-level spatial representations.
4.4. Cancer Drug Selection, Repurposing, and Profiling; Prediction of Cancer Drug Interactions and Combinations, Response and Resistance
This broad area is especially amenable to GNN application, due to the graph structures being the naturally commensurate representations for the chemical structures, drug-drug networks, and other multimodal networks incorporating diverse drug-relevant information. It is, therefore, not surprising that some of the earliest work in cancer/oncology GNNs was focused on graph models for drug and drug interaction representations. Cui et al. [
67] adapted a generalist GCN to the task of drug repurposing against breast cancer, merging drug-drug networks with the drug-exposure gene expression data; the resulting models outperformed both "classic" ML and standard DL approaches. In a reverse scenario, Gonzales et al. [
68] used a GCN model to predict anticancer molecules within food ("superfoods") based on the similarity of their graph (human interactome) representations to those of FDA-approved anticancer drugs, with the resulting models demonstrating both high prediction accuracy and interpretability. In parallel, Gao et al. [
69] utilized multi-level (from atomic to molecule) drug structure graph representations to select candidate breast cancer drugs, thus underscoring the two-pronged (molecular structure and drug-relevant networks) utility of the GNN approaches.
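At the molecular-structure level, a drug can be encoded as a graph whose nodes are atoms and whose edges are bonds. The following is a minimal hand-built sketch, assuming a three-letter atom vocabulary and the heavy-atom skeleton of ethanol as a toy molecule; the helper `molecule_to_graph` is hypothetical, and in practice a cheminformatics toolkit would typically derive atoms and bonds from a structure file or SMILES string.

```python
import numpy as np

ATOM_TYPES = ["C", "N", "O"]      # illustrative vocabulary, far from complete


def molecule_to_graph(atoms, bonds):
    """Return (one-hot atom features, symmetric adjacency matrix) for a molecule."""
    n = len(atoms)
    features = np.zeros((n, len(ATOM_TYPES)))
    for i, symbol in enumerate(atoms):
        features[i, ATOM_TYPES.index(symbol)] = 1.0
    adj = np.zeros((n, n))
    for i, j in bonds:
        adj[i, j] = adj[j, i] = 1.0          # bonds are undirected
    return features, adj


# Heavy-atom skeleton of ethanol: C - C - O (hydrogens omitted)
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]
features, adj = molecule_to_graph(atoms, bonds)

# One unnormalized aggregation step: each atom sums the features of its bonded neighbors
neighbor_sum = adj @ features
```

The same two arrays, node features and adjacency, describe drug-drug or protein-protein networks equally well, which is why a single GNN framework can accommodate both levels.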
Another prominent activity, complementing Section 4.1 above (therapy response prediction), is the prediction of a patient’s response to anticancer drug therapy, or a cancer cell line response to a drug. Zuo et al. [
70] combined molecular structure graphs and gene features (expression, mutation) in a GNN-CNN model that showed superior performance on the benchmark Genomics of Drug Sensitivity in Cancer (GDSC) and Cancer Cell Line Encyclopedia (CCLE) datasets. Zhu et al. [
71] added a different modality, protein-protein-interaction (PPI) networks (from the STRING database), and combined PPI and molecular graphs in a two-encoder GNN architecture for anticancer drug response prediction, likewise demonstrating performance advantages over the baseline non-GNN methods across different cancer cell line datasets. Liu et al. [
72] added multi-omics cancer cell line profiles to the GNN model, achieving performance improvements as well. Narrowing the focus to a specific group of drugs, Pu et al. [
73] integrated genomics, biological networks, inhibitor profiling, and gene-disease associations in a unified GNN model to predict response to kinase inhibitors across various cancer tissues/cell lines. Similarly, Singha et al. [
74] integrated multiple heterogeneous data in a GAT model for evaluating kinase inhibitors across different cancer cell lines. Emphasizing the interpretability of GNNs, Shin et al. [
75] incorporated expert/domain knowledge (on biological pathways) in the multiple subgraphs-transformer model for anticancer drug response prediction, demonstrating improved performance on the GDSC datasets. Wang et al. [
76] focused on the interpretability as well, applying a pruning mechanism to their multimodal drug response prediction GNN-based model. In general, it appears that adding further heterogeneous information types to the drug response prediction GNN models increases their generalization performance; it is fortunate that GNNs are especially well-suited to such expansion.
An interesting variation on the theme was proposed by Peng et al. [
77], wherein feature representations of drugs and cell lines are directly integrated in a heterogeneous network (instead of a bipartite graph); this model performed especially well on the GDSC and CCLE datasets. In parallel, Liu et al. [
78] proposed a novel GNN architecture constructed around the multi-view graphs, with each input data type (various omics, PPI) contributing a separate "view" to the multimodal drug response prediction. Automated optimization of the GNN architectures in the cancer drug response prediction context is the latest trend in this research area, pointing to its relative maturity; recently, Oloulade et al. [
79] developed a framework for automated GNN hyperparameter/architecture optimization specifically tailored to each particular drug sensitivity dataset that consistently outperformed baseline methods from the first optimization epoch.
Moving on from the single drugs to drug combinations, Wang et al. [
80] used a GAT model to predict drug-drug synergy on cancer cells from feature embeddings of drug molecular structure and gene expression; the model showed both high performance (+16 percent predictive precision over non-GNN methods on an independent AstraZeneca dataset) and interpretability. Notably, the latter led to useful insights into the chemical substructure of anticancer drugs, yet again illustrating the added value of GNNs in contrast to the "black box" methods. Bao et al. [
81] also emphasized GNN’s interpretability aiding in identifying molecular substructures contributing to drug synergy; an interesting additional aspect of this work was accounting for asymmetries in drug input, thus increasing predictive performance. Dong et al. [
82] took this approach one step further, explicitly concentrating on identifying the mechanisms of synergy by dissecting the most salient molecular substructures revealed in their GAT model. Conversely, Ren et al. [
83] constructed a GNN-based "biomedical knowledge graph" model with NLP input to predict drug-drug interactions; the model showed high performance on cancer-related tasks.
In summary, the ability of GNNs to combine both molecular structure-level and network-level data in interpretable models bodes well for significant further progress in this domain. Two particularly promising research directions are (i) automated optimization of GNN architecture customized for particular scenarios, and (ii) identification of molecular substructures most salient for anticancer drug synergism.
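To make the link between GAT models and substructure-level interpretability more concrete, the following is a minimal sketch of a single-head graph attention aggregation in the style of Veličković et al. [18]. It is not the implementation used in any of the studies cited above: the three-atom toy graph, the weight matrix W, the attention vector a, and all function names are hypothetical, the weights are fixed by hand rather than learned, and the nonlinearity usually applied after aggregation is omitted for brevity.

```python
import math

def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gat_layer(features, neighbors, W, a):
    """One single-head graph attention aggregation.

    features : dict node -> input feature vector
    neighbors: dict node -> list of neighbouring nodes (a self-loop is added here)
    W        : shared linear transform, out_dim x in_dim
    a        : attention vector of length 2 * out_dim
    Returns (new_features, attention), where attention[i][j] is alpha_ij.
    """
    z = {i: matvec(W, h) for i, h in features.items()}
    new_features, attention = {}, {}
    for i in features:
        nbrs = [i] + [j for j in neighbors[i] if j != i]
        # Unnormalised scores e_ij = LeakyReLU(a^T [z_i || z_j])
        scores = [
            leaky_relu(sum(ak * v for ak, v in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        # Softmax over the neighbourhood (shifted by the max for stability)
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        attention[i] = dict(zip(nbrs, alphas))
        # Attention-weighted sum of the transformed neighbour features
        new_features[i] = [
            sum(al * z[j][d] for al, j in zip(alphas, nbrs))
            for d in range(len(z[i]))
        ]
    return new_features, attention

# Hypothetical three-atom fragment: C1 - O - C2
features = {"C1": [1.0, 0.0], "O": [0.0, 1.0], "C2": [1.0, 1.0]}
neighbors = {"C1": ["O"], "O": ["C1", "C2"], "C2": ["O"]}
W = [[1.0, 0.0], [0.0, 1.0]]   # identity transform, for readability only
a = [0.5, -0.5, 1.0, 0.25]     # made-up attention parameters

out, att = gat_layer(features, neighbors, W, a)
for node, weights in att.items():
    print(node, {k: round(v, 3) for k, v in weights.items()})
```

In a trained model, the coefficients in `att` are what one would inspect to see which neighbouring atoms or substructures the model weights most heavily; attention weights are only one, imperfect, handle on interpretability.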
4.5. Synthetic Lethality Prediction
Synthetic lethality (SL) is a situation where defects in two genes together impair cell viability, but a defect in either gene alone does not; if one gene of the pair is defective specifically in cancer cells, then targeting the other gene will lead to cancer cell death while sparing non-cancerous (normal) cells. Thus, in silico SL prediction has emerged as one of the most effective methods for anticancer drug identification. Cai et al. [84], Wang et al. [85], and Lai et al. [86] pioneered multimodal GCN application to SL prediction and demonstrated superior performance on human SL datasets compared to in silico SL prediction methods that do not use graph representations. Liu et al. [87] added features extracted from multi-omics data to the GNN framework, thus expanding gene representation for SL prediction; notably, this work exploited the interpretability of the graph representation to explain the SL mechanism. Likewise, Zhu et al. [88] focused predominantly on the interpretability of the gene-related knowledge graph (without losing predictive performance). Most recently, Fan et al. [89] developed a more complex, multi-view GCN architecture, incorporating five biological modalities in a high-performance SL predictor.
In summary, SL prediction with gene graph representation is a relatively young but highly promising research area. We expect future research to concentrate on (i) refinement of the GNN architectures beyond "vanilla" GCNs, (ii) dissection of the SL mechanisms, enabled by the interpretability of GNNs, and (iii) integration of additional modalities in gene graph representations.
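SL prediction on a gene graph can be framed as link prediction: genes are nodes, known SL pairs are edges, and a candidate pair is scored from the learned gene embeddings. The sketch below illustrates this framing with two "vanilla" GCN propagation steps followed by a dot-product decoder. It is an assumption-laden toy rather than the method of any study cited above: the four-gene graph, the feature and weight matrices, the choice of decoder, and all function names are hypothetical, and the weights are fixed rather than trained on known SL pairs.

```python
import math

def normalized_adjacency(edges, n):
    """Return D^{-1/2} (A + I) D^{-1/2} as a dense n x n list of lists."""
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1.0
    deg = [sum(row) for row in A]
    return [[A[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(X, Y):
    """Dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def gcn_layer(A_hat, H, W, relu=True):
    """One graph convolution: A_hat @ H @ W, optionally followed by ReLU."""
    out = matmul(matmul(A_hat, H), W)
    if relu:
        out = [[max(0.0, v) for v in row] for row in out]
    return out

def sl_score(Z, i, j):
    """Score a candidate gene pair as the sigmoid of the embedding dot product."""
    dot = sum(x * y for x, y in zip(Z[i], Z[j]))
    return 1.0 / (1.0 + math.exp(-dot))

# Hypothetical toy data: four genes, two known SL pairs, made-up features/weights
known_sl_pairs = [(0, 1), (1, 2)]
A_hat = normalized_adjacency(known_sl_pairs, n=4)
H0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]   # 4 genes x 2 features
W1 = [[0.8, -0.3, 0.1], [0.2, 0.6, -0.4]]                # 2 x 3
W2 = [[0.5, -0.2], [-0.1, 0.7], [0.3, 0.3]]              # 3 x 2

# Two propagation steps; the last layer is left linear so scores can fall below 0.5
Z = gcn_layer(A_hat, gcn_layer(A_hat, H0, W1), W2, relu=False)
candidate_score = sl_score(Z, 0, 2)
print(round(candidate_score, 3))
```

Additional modalities (for example, PPI or co-expression edges) would enter such a model as extra adjacency matrices or extra node features, which is where the multimodal and multi-view architectures discussed above depart from this single-graph sketch.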
4.6. Prediction of ncRNA (miRNA, piRNA, lncRNA) and circRNA—Cancer Associations
Prediction of ncRNA-disease associations is a robust and well-established computational biology research field. GNNs can efficiently represent the interplay between ncRNA similarity networks and disease similarity networks; this potential was recognized early in the emergence of GNNs [90,91,92,93]. Subsequent and recent work in the context of cancer has used GNN models for miRNA-cancer association prediction [94,95,96,97,98,99,100], piRNA-cancer association prediction [101], lncRNA-cancer association prediction [102,103,104,105], and circRNA-cancer association prediction [106]. An interesting recent development is the use of multimodal GNNs to predict associations not with disease but with anticancer drug resistance; for example, Liu et al. [107] incorporated disease-related information into a multimodal GNN predictor of circRNA-drug resistance associations, whereas Gao and Shang [108] used a GAT model to identify lncRNA-drug resistance associations.
In summary, applying GNNs to dissect ncRNA-cancer associations is a mature field. We see future research progress as largely incremental, with further architecture refinements and extensions in the multivariate directions (e.g., identifying ncRNA-cancer associations together with ncRNA-anticancer drug resistance associations, identifying ncRNA-multi-disease associations).
4.7. Other Research Directions, Activities, and Modalities
A variety of innovative and promising GNN-based cancer and oncology research falls outside the above six categories (Sections 4.1-4.6). Some of the earliest work at the cancer-GNN junction aimed at the prediction of cancer driver genes with GCNs [109]; this was followed by a comprehensive study by Song et al. [110], who developed a robust multimodal (36 features plus PPI) GAT-centered framework for the identification of driver genes across different cancers. Yang et al. [111] focused on the narrower problem of identifying a small number of genes for a cancer-specific tumor mutational burden estimation panel, essential for estimating the potential effectiveness of immune checkpoint inhibitor therapy. On the subject of immunotherapy, Wu et al. [112] developed a multimodal GAT-centered platform for neoantigen immunogenicity prediction; combined with a comprehensive database of experimentally validated neoantigens, it provides a bridge to the clinical application of neoantigen-based cancer immunotherapy.
Chen et al. [113] used a GCN-SVM (support vector machine) architecture to combine disease similarity networks with metabolite similarity networks in order to identify ovarian cancer-related metabolites. Fradkin et al. [114] developed a GAT model for molecule carcinogenicity prediction demonstrating high generalization prediction accuracy. These two studies once again demonstrate the multi-level representation scope of GNN models, from the ontology networks down to molecular structures.
Several recent studies applied GNN representation and learning to radiotherapy optimization and planning. Kafaei et al. [115] developed a GNN/reinforcement learning model for simultaneous beam orientation and trajectory optimization of CyberKnife, achieving shorter treatment times without compromising the efficacy of radiotherapy. Shao et al. [116] used a GNN representation (from a single onboard X-ray projection) of a liver surface model that accurately translated, via real-time biomechanical modeling, to liver tumor localization, thus optimizing image-guided radiotherapy. Subsequently, Shao et al. [117] incorporated surface imaging into the above framework. A clinical decision support system for response-adaptive radiotherapy developed by Niraula et al. [118] used GNNs to model inter-predictive-feature relationships and avoid nonphysical treatment response, demonstrating performance improvements in clinical decision-making.
In summary, there are still many hitherto unexplored (or explored to a limited degree, such as in the case of radiotherapy planning) areas of application of GNNs to cancer. Broadly speaking, if the input data/information can be naturally represented in a graph structure form, and if the dataset size/dimensionality suggests DL, the investigators should consider GNNs. Even if only one data type or modality fits better with a graph representation, adding a GNN module to a complex DL architecture might improve both overall performance and interpretability. Alternatively, oftentimes features generated from the non-graph modalities can best be integrated in a graph form. Higher interpretability and multi-level or multi-modal representation are the crucial added value that GNNs contribute to the analysis pipeline.
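As an illustration of how features from a non-graph modality can be put into graph form, one common (though by no means the only) option is a k-nearest-neighbour similarity graph, for instance a patient network built from tabular omics or clinical features. The sketch below assumes Euclidean distance on already standardized features; the patient identifiers, feature values, and the function name `knn_graph` are hypothetical.

```python
import math

def knn_graph(features, k=2):
    """Build an undirected k-nearest-neighbour graph.

    features: dict sample_id -> list of floats (assumed already standardised)
    Returns a set of frozenset({u, v}) edges.
    """
    ids = list(features)
    edges = set()
    for u in ids:
        # Rank all other samples by Euclidean distance to u
        others = sorted(
            (v for v in ids if v != u),
            key=lambda v: math.dist(features[u], features[v]),
        )
        # Connect u to its k closest samples; edges are stored undirected
        for v in others[:k]:
            edges.add(frozenset((u, v)))
    return edges

# Hypothetical patients described by two standardised features
patients = {
    "P1": [0.0, 0.0],
    "P2": [0.0, 1.0],
    "P3": [5.0, 5.0],
    "P4": [5.0, 6.0],
}
patient_edges = knn_graph(patients, k=1)
print(sorted(sorted(e) for e in patient_edges))
```

The resulting edge set can then serve as the adjacency structure of a GNN module, with the original feature vectors kept as node attributes; the choice of k and of the distance measure is a modelling decision that would need to be tuned for a real dataset.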
5. Discussion and Conclusions
5.1. Pragmatic Considerations for GNN Deployment
The question of whether to use GNNs (as opposed, or in addition, to "vanilla" deep learning) in the predictive analysis and modeling of big data in cancer and oncology research largely comes down to the datatypes and modalities. If one or more of the latter are more naturally represented in a graph structure form, then GNNs are indicated. Such data may include chemical structures, gene co-expression networks, PPI networks, drug-drug networks, spatially resolved imaging data, digital pathology data, patient networks in various clinical and epidemiological contexts, knowledge graphs, and multimodal biological networks in general. The actual modus operandi might be a GNN used for feature extraction followed by a DL predictor, or diffusion of information over a multimodal graph, or incorporation of a knowledge graph in the DL architecture; numerous increasingly sophisticated multi-cluster GNN-containing DL architectures are currently being developed to address diverse cancer and oncology research problems in a customized fashion.
There are three major advantages to GNNs, two of them largely self-evident: intrinsic capability for multimodality (handling different datatypes in the same analysis framework) and interpretability (graph structures are more intuitive than layers and weights). The third advantage, higher predictive performance, is less immediately obvious, but has been amply demonstrated across the different tasks (Sections 4.1-4.7), and can probably be attributed, at least partially, to less contextual information loss in the GNN/DL pipelines and easier harmonization of the different datatypes. It is important to remember that, although higher interpretability and more natural data structure representations are always desirable in and of themselves, the primary goal remains higher predictive accuracy; it is gratifying to observe that GNN-centered architectures are at least as high-performing as more established baseline and state-of-the-art non-graph DL models.
The choice of GNNs vs. graphical models is less straightforward. Here, the two primary considerations should be the main activity (prediction vs. model selection/dissection, respectively) and data dimensionality. GNNs, and DL in general, achieve high predictive performance on large datasets, but their mechanistic and causal interpretability is still limited (even in the case of GNNs) in comparison with probabilistic graphical models. A large part of this is the ability of probabilistic graphical models, such as BNs, to propagate probabilistic inference and to model perturbations in silico. On a fundamental level, this reflects the principal difference in connectivity representation: belief propagation in probabilistic graphical models vs. message passing in GNNs. GNNs are more efficient learners when the graph structures (topologies) are largely preset, such as when the networks (chemical structures, gene co-expression networks, PPI networks, drug-drug networks, hard-coded knowledge graphs, etc.) are imported from other analyses. Of course, GNNs can also be used for data-driven model (topology) selection, via edge-level tasks, just as graphical models can be used for node prediction and graph-level tasks, but these are not the primary motivations behind their respective applications.
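What "propagating probabilistic inference" and "modeling perturbations in silico" mean for a probabilistic graphical model can be shown on a very small example. The sketch below uses a hypothetical three-node Bayesian network (Mutation -> Pathway -> Response) with made-up conditional probability tables, and answers queries by brute-force enumeration rather than by belief propagation proper, which is adequate only at this toy scale. All variable names and numbers are assumptions introduced for illustration.

```python
from itertools import product

# Hypothetical conditional probability tables for the binary chain
# Mutation -> Pathway -> Response (all numbers are made up).
P_MUT = {1: 0.3, 0: 0.7}
P_PATH_GIVEN_MUT = {1: {1: 0.9, 0: 0.1}, 0: {1: 0.2, 0: 0.8}}    # [mut][path]
P_RESP_GIVEN_PATH = {1: {1: 0.25, 0: 0.75}, 0: {1: 0.6, 0: 0.4}}  # [path][resp]

def joint(mut, path, resp, do_path=None):
    """Joint probability; do_path clamps Pathway and cuts its incoming edge."""
    if do_path is None:
        p_path = P_PATH_GIVEN_MUT[mut][path]
    else:
        p_path = 1.0 if path == do_path else 0.0
    return P_MUT[mut] * p_path * P_RESP_GIVEN_PATH[path][resp]

def query(target, evidence=None, do_path=None):
    """P(target | evidence) by enumeration over all eight assignments."""
    evidence = evidence or {}
    num = den = 0.0
    for mut, path, resp in product((0, 1), repeat=3):
        world = {"mut": mut, "path": path, "resp": resp}
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = joint(mut, path, resp, do_path)
        den += p
        if all(world[k] == v for k, v in target.items()):
            num += p
    return num / den

baseline = query({"resp": 1})                     # observational P(Response = 1)
perturbed = query({"resp": 1}, do_path=0)         # in silico pathway knock-down
posterior = query({"mut": 1}, evidence={"resp": 1})  # inference against the arrows
print(round(baseline, 4), round(perturbed, 4), round(posterior, 4))
```

The three queries correspond to forward prediction, an in silico intervention, and diagnostic reasoning from effect back to cause; a GNN trained for prediction typically provides the first directly, while the other two require an explicit probabilistic or causal model.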
To give a broad recommendation, if the features are well-defined, the datasets are not gigantic, and the primary activity is mechanistic model selection with subsequent dissection/interpretation, graphical models might be a more natural choice. However, if the investigators are more interested in high predictive accuracy, some of the topologies are known or hard-coded (at least initially), and the data is big and the features diffuse, a GNN/DL approach appears to be superior (and faster). That being said, the latest work in the field suggests a trend towards bridging the gap between graphical and causal models on the one hand and GNNs on the other; for example, Li et al. [45] used GNNs to infer causative tumor features from CT data. More broadly, Vu and Thai [119] and Hua et al. [120] elaborated on the probabilistic explainability of GNNs and the potential synergies between GNNs and probabilistic graphical models, with the ultimate goal being "probabilistic graphical models-enhanced GNNs" or, conversely, "GNN-enhanced probabilistic graphical models".
5.2. Challenges and Future Directions
We see two major interrelated challenges to the broader acceptance and deployment of the GNN methodology in cancer and oncology research settings. The first is the sheer novelty of the technique(s): it is unclear whether the potential performance benefits over "traditional" big data DL make it worthwhile to explore new and more complex architectures. To address this concern, in this review we have shown that GNNs tend to outperform non-graph DL across the board when the datatypes/modalities are amenable to graph representation, with the added benefit of interpretability. However, this brings us to the second, more daunting, challenge: the absence of independent and comprehensive cross-benchmarking studies for many, if not most, of the cancer and oncology-related data analysis activities enumerated in Sections 4.1-4.7. Such studies are customary in the more mature fields of computational biology and medicine, ranging from phylogenetic analysis methods to gene regulatory network inference to tumor imaging segmentation, to name just a few. Conducting such studies in this domain will go a long way toward the wider acceptance of GNNs. Our intuition is that GNNs will indeed prove superior overall, but this remains to be convincingly demonstrated to a broad audience. There is a wealth of appropriate well-established benchmark datasets and "ground truth" knowledge in the domains covered in Sections 4.1-4.7, and so we are optimistic that comprehensive independent cross-benchmarking studies are forthcoming. They are sorely needed.
That being said, in surveying the field we have identified at least six sufficiently mature research directions (Sections 4.1-4.6). In our opinion, the most promising future methodological research directions for the next few years will be (i) development of "boutique" GNN-containing DL architectures specifically tailored to various combinations of modalities and predictive tasks, (ii) automated optimization of said architectures and training regimes, (iii) direct incorporation of human expertise into prediction and decision pipelines, (iv) incorporation of additional modalities, on many levels, into multiscale graphs and models, and (v) extension to multivariate predictions. As far as the actual cancer and oncology research tasks are concerned, we expect strong and growing research efforts in the areas of (i) cancer classification and subtyping using digital pathology augmented by other modalities, (ii) dissection of spatial heterogeneity in tumor microenvironments with an eye towards patient-level predictions, (iii) identification of molecular substructures most salient for anticancer drug synergism and synthetic lethality prediction, (iv) real-time radiotherapy planning, and (v) multimodal prediction of immunotherapy response.
5.3. Conclusions
GNNs appear to be superior to non-graph DL in many cancer and oncology research settings, particularly when the data is at least partially structured and multimodal, and when interpretability is desired. We anticipate that the future availability of independent and comprehensive cross-benchmarking studies will stimulate broader acceptance of the GNN methodology in the field. From a different perspective, GNNs largely complement probabilistic graphical models, and we expect increasing synergy between these two groups of models in the future. Cancer and oncology researchers and physician-scientists should consider GNNs as their principal secondary data analysis and predictive modeling tool if the data is big and multimodal, and one or more of the datatypes/modalities can be naturally represented as graph structures.
Author Contributions
Conceptualization, G.G. and A.S.R.; writing—original draft preparation, G.G. and A.S.R. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by NIH NLM grant number R01LM013138. G.G. was funded by Dr. Susumu Ohno Distinguished Investigator Fellowship. A.S.R. was funded by Dr. Susumu Ohno Chair in Theoretical Biology.
Acknowledgments
The authors are grateful to Sergio Branciamore, Russell C. Rockne, Peter P. Lee, Colton Ladbury and Nagarajan Vaidehi for stimulating discussions and useful comments regarding graph structure representations in various biomedical research domains.
Conflicts of Interest
The authors declare no conflict of interest. The funders had no role in the writing of the manuscript; in its conclusions; or in the decision to publish.
References
- Park, Y.; Heider, D.; Hauschild, A.C. Integrative Analysis of Next-Generation Sequencing for Next-Generation Cancer Research toward Artificial Intelligence. Cancers (Basel) 2021, 13.
- Gori, M.; Monfardini, G.; Scarselli, F. A new model for learning in graph domains. Proceedings of the 2005 IEEE International Joint Conference on Neural Networks, 2005, Vol. 2, pp. 729–734.
- Scarselli, F.; Gori, M.; Tsoi, A.C.; Hagenbuchner, M.; Monfardini, G. The graph neural network model. IEEE Trans Neural Netw 2009, 20, 61–80.
- Micheli, A. Neural network for graphs: a contextual constructive approach. IEEE Trans Neural Netw 2009, 20, 498–511.
- Ladbury, C.; Zarinshenas, R.; Semwal, H.; Tam, A.; Vaidehi, N.; Rodin, A.S.; Liu, A.; Glaser, S.; Salgia, R.; Amini, A. Utilization of model-agnostic explainable artificial intelligence frameworks in oncology: a narrative review. Transl Cancer Res 2022, 11, 3853–3868.
- Ladbury, C.; Amini, A.; Govindarajan, A.; Mambetsariev, I.; Raz, D.J.; Massarelli, E.; Williams, T.; Rodin, A.; Salgia, R. Integration of artificial intelligence in lung cancer: Rise of the machine. Cell Rep Med 2023, 4, 100933.
- Wysocka, M.; Wysocki, O.; Zufferey, M.; Landers, D.; Freitas, A. A systematic review of biologically-informed deep learning models for cancer: fundamental trends for encoding and interpreting oncology data. BMC Bioinformatics 2023, 24, 198.
- Jiang, X.; Hu, Z.; Wang, S.; Zhang, Y. Deep Learning for Medical Image-Based Cancer Diagnosis. Cancers (Basel) 2023, 15.
- Meng, X.; Zou, T. Clinical applications of graph neural networks in computational histopathology: A review. Comput Biol Med 2023, 164, 107201.
- Levy, J.; Haudenschild, C.; Barwick, C.; Christensen, B.; Vaickus, L. Topological Feature Extraction and Visualization of Whole Slide Images using Graph Neural Networks. Pac Symp Biocomput 2021, 26, 285–296.
- He, Y.; Zhao, H.; Wong, S.T.C. Deep learning powers cancer diagnosis in digital pathology. Comput Med Imaging Graph 2021, 88, 101820.
- Zhang, X.M.; Liang, L.; Liu, L.; Tang, M.J. Graph Neural Networks and Their Current Applications in Bioinformatics. Front Genet 2021, 12, 690049.
- Chen, Y.; Zhang, L. How much can deep learning improve prediction of the responses to drugs in cancer cell lines? Brief Bioinform 2022, 23.
- Jin, S.; Zeng, X.; Xia, F.; Huang, W.; Liu, X. Application of deep learning methods in biological networks. Brief Bioinform 2021, 22, 1902–1917.
- Bruna, J.; Zaremba, W.; Szlam, A.; LeCun, Y. Spectral Networks and Locally Connected Networks on Graphs. 2014; arXiv:cs.LG/1312.6203.
- Atwood, J.; Towsley, D. Diffusion-Convolutional Neural Networks. 2016; arXiv:cs.LG/1511.02136.
- Zhang, Z.; Cui, P.; Zhu, W. Deep Learning on Graphs: A Survey. 2020; arXiv:cs.LG/1812.04202.
- Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Liò, P.; Bengio, Y. Graph Attention Networks. 2018; arXiv:stat.ML/1710.10903.
- Tian, F.; Gao, B.; Cui, Q.; Chen, E.; Liu, T.Y. Learning Deep Representations for Graph Clustering. Proceedings of the AAAI Conference on Artificial Intelligence 2014, 28.
- Zhou, J.; Cui, G.; Hu, S.; Zhang, Z.; Yang, C.; Liu, Z.; Wang, L.; Li, C.; Sun, M. Graph Neural Networks: A Review of Methods and Applications. 2021; arXiv:cs.LG/1812.08434.
- Ju, W.; Fang, Z.; Gu, Y.; Liu, Z.; Long, Q.; Qiao, Z.; Qin, Y.; Shen, J.; Sun, F.; Xiao, Z.; Yang, J.; Yuan, J.; Zhao, Y.; Luo, X.; Zhang, M. A Comprehensive Survey on Deep Graph Representation Learning. 2023; arXiv:cs.LG/2304.05055.
- Pearl, J. Probabilistic reasoning in intelligent systems: networks of plausible inference; Morgan Kaufmann, 1988.
- Pearl, J. Causality: Models, Reasoning, and Inference; Cambridge University Press, 2000.
- Gogoshin, G.; Boerwinkle, E.; Rodin, A.S. New Algorithm and Software (BNOmics) for Inferring and Visualizing Bayesian Networks from Heterogeneous Big Biological and Genetic Data. J Comput Biol 2017, 24, 340–356.
- Yu, Y.; Chen, J.; Gao, T.; Yu, M. DAG-GNN: DAG Structure Learning with Graph Neural Networks. 2019; arXiv:cs.LG/1904.10098.
- Zheng, X.; Aragam, B.; Ravikumar, P.; Xing, E.P. DAGs with NO TEARS: Continuous Optimization for Structure Learning. 2018; arXiv:stat.ML/1803.01422.
- Rudin, C. Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead. 2019; arXiv:stat.ML/1811.10154.
- Wang, C.; Guo, J.; Zhao, N.; Liu, Y.; Liu, X.; Liu, G.; Guo, M. A Cancer Survival Prediction Method Based on Graph Convolutional Network. IEEE Trans Nanobioscience 2020, 19, 117–126.
- Qiu, L.; Li, H.; Wang, M.; Wang, X. Gated Graph Attention Network for Cancer Prediction. Sensors (Basel) 2021, 21.
- Gao, J.; Lyu, T.; Xiong, F.; Wang, J.; Ke, W.; Li, Z. Predicting the Survival of Cancer Patients With Multimodal Graph Neural Network. IEEE/ACM Trans Comput Biol Bioinform 2022, 19, 699–709.
- Kim, S.Y. GNN-surv: Discrete-Time Survival Prediction Using Graph Neural Networks. Bioengineering (Basel) 2023, 10.
- Liang, B.; Gong, H.; Lu, L.; Xu, J. Risk stratification and pathway analysis based on graph neural network and interpretable algorithm. BMC Bioinformatics 2022, 23, 394.
- Lian, J.; Long, Y.; Huang, F.; Ng, K.S.; Lee, F.M.Y.; Lam, D.C.L.; Fang, B.X.L.; Dou, Q.; Vardhanabhuti, V. Imaging-Based Deep Graph Neural Networks for Survival Analysis in Early Stage Lung Cancer Using CT: A Multicenter Study. Front Oncol 2022, 12, 868186.
- Lee, Y.; Park, J.H.; Oh, S.; Shin, K.; Sun, J.; Jung, M.; Lee, C.; Kim, H.; Chung, J.H.; Moon, K.C.; Kwon, S. Derivation of prognostic contextual histopathological features from whole-slide images of tumours via graph deep learning. Nat Biomed Eng 2022.
- Lian, J.; Deng, J.; Hui, E.S.; Koohi-Moghadam, M.; She, Y.; Chen, C.; Vardhanabhuti, V. Early stage NSCLS patients’ prognostic prediction with multi-information using transformer and graph neural network model. Elife 2022, 11.
- Wang, Y.; Wang, Y.G.; Hu, C.; Li, M.; Fan, Y.; Otter, N.; Sam, I.; Gou, H.; Hu, Y.; Kwok, T.; Zalcberg, J.; Boussioutas, A.; Daly, R.J.; Montúfar, G.; Liò, P.; Xu, D.; Webb, G.I.; Song, J. Cell graph neural networks enable the precise prediction of patient survival in gastric cancer. NPJ Precis Oncol 2022, 6, 45.
- Li, B.; Nelson, M.S.; Savari, O.; Loeffler, A.G.; Eliceiri, K.W. Differentiation of pancreatic ductal adenocarcinoma and chronic pancreatitis using graph neural networks on histopathology and collagen fiber features. J Pathol Inform 2022, 13, 100158.
- Ding, M.; Cui, H.; Li, B.; Zou, B.; Fan, B.; Ma, L.; Wang, Z.; Li, W.; Yu, J.; Wang, L. Integrating Preoperative Computed Tomography and Clinical Factors for Lymph Node Metastasis Prediction in Esophageal Squamous Cell Carcinoma by Feature-Wise Attentional Graph Neural Network. Int J Radiat Oncol Biol Phys 2023, 116, 676–689.
- Hu, D.; Li, S.; Wu, N.; Lu, X. A Multi-modal Heterogeneous Graph Forest to Predict Lymph Node Metastasis of Non-small Cell Lung Cancer. IEEE J Biomed Health Inform 2023, PP.
- Graham, S.; Minhas, F.; Bilal, M.; Ali, M.; Tsang, Y.W.; Eastwood, M.; Wahab, N.; Jahanifar, M.; Hero, E.; Dodd, K.; Sahota, H.; Wu, S.; Lu, W.; Azam, A.; Benes, K.; Nimir, M.; Hewitt, K.; Bhalerao, A.; Robinson, A.; Eldaly, H.; Raza, S.E.A.; Gopalakrishnan, K.; Snead, D.; Rajpoot, N. Screening of normal endoscopic large bowel biopsies with interpretable graph learning: a retrospective study. Gut 2023, 72, 1709–1721.
- Fu, X.; Patrick, E.; Yang, J.Y.H.; Feng, D.D.; Kim, J. Deep multimodal graph-based network for survival prediction from highly multiplexed images and patient variables. Comput Biol Med 2023, 154, 106576.
- Zhu, J.; Oh, J.H.; Simhal, A.K.; Elkin, R.; Norton, L.; Deasy, J.O.; Tannenbaum, A. Geometric graph neural networks on multi-omics data to predict cancer survival outcomes. Comput Biol Med 2023, 163, 107117.
- Zhang, Y.; Xiong, S.; Wang, Z.; Liu, Y.; Luo, H.; Li, B.; Zou, Q. Local augmented graph neural network for multi-omics cancer prognosis prediction and analysis. Methods 2023, 213, 1–9.
- Li, R.; Zhou, L.; Wang, Y.; Shan, F.; Chen, X.; Liu, L. A graph neural network model for the diagnosis of lung adenocarcinoma based on multimodal features and an edge-generation network. Quant Imaging Med Surg 2023, 13, 5333–5348.
- Li, X.; Guo, R.; Lu, J.; Chen, T.; Qian, X. Causality-Driven Graph Neural Network for Early Diagnosis of Pancreatic Cancer in Non-Contrast Computerized Tomography. IEEE Trans Med Imaging 2023, 42, 1656–1667.
- Azher, Z.L.; Suvarna, A.; Chen, J.Q.; Zhang, Z.; Christensen, B.C.; Salas, L.A.; Vaickus, L.J.; Levy, J.J. Assessment of emerging pretraining strategies in interpretable multimodal deep learning for cancer prognostication. BioData Min 2023, 16, 23.
- Wang, A.; Ding, R.; Zhang, J.; Zhang, B.; Huang, X.; Zhou, H. Machine Learning of Histomorphological Features Predict Response to Neoadjuvant Therapy in Locally Advanced Rectal Cancer. J Gastrointest Surg 2023, 27, 162–165.
- Zhao, L.; Qi, X.; Chen, Y.; Qiao, Y.; Bu, D.; Wu, Y.; Luo, Y.; Wang, S.; Zhang, R.; Zhao, Y. Biological knowledge graph-guided investigation of immune therapy response in cancer with graph neural network. Brief Bioinform 2023, 24.
- Zhou, Y.; Graham, S.; Koohbanani, N.A.; Shaban, M.; Heng, P.A.; Rajpoot, N. CGC-Net: Cell Graph Convolutional Network for Grading of Colorectal Cancer Histology Images. 2019; arXiv:eess.IV/1909.01068.
- Lu, W.; Toss, M.; Dawood, M.; Rakha, E.; Rajpoot, N.; Minhas, F. SlideGraph+: Whole slide image level graphs to predict HER2 status in breast cancer. Med Image Anal 2022, 80, 102486.
- Pati, P.; Jaume, G.; Foncubierta-Rodríguez, A.; Feroce, F.; Anniciello, A.M.; Scognamiglio, G.; Brancati, N.; Fiche, M.; Dubruc, E.; Riccio, D.; Di Bonito, M.; De Pietro, G.; Botti, G.; Thiran, J.P.; Frucci, M.; Goksel, O.; Gabrani, M. Hierarchical graph representations in digital pathology. Med Image Anal 2022, 75, 102264.
- Wang, H.; Huang, G.; Zhao, Z.; Cheng, L.; Juncker-Jensen, A.; Nagy, M.L.; Lu, X.; Zhang, X.; Chen, D.Z. CCF-GNN: A Unified Model Aggregating Appearance, Microenvironment, and Topology for Pathology Image Classification. IEEE Trans Med Imaging 2023, PP.
- Abbas, S.F.; Vuong, T.T.L.; Kim, K.; Song, B.; Kwak, J.T. Multi-cell type and multi-level graph aggregation network for cancer grading in pathology images. Med Image Anal 2023, 90, 102936.
- Zhang, J.; Mao, Y.; Li, J.; Li, Y.; Luo, J. A metric learning-based method using graph neural network for pancreatic cystic neoplasm classification from CTs. Med Phys 2022, 49, 5523–5536.
- Ravinder, M.; Saluja, G.; Allabun, S.; Alqahtani, M.S.; Abbas, M.; Othman, M.; Soufiene, B.O. Enhanced brain tumor classification using graph convolutional neural network architecture. Sci Rep 2023, 13, 14938.
- Ma, Q.; Zhou, S.; Li, C.; Liu, F.; Liu, Y.; Hou, M.; Zhang, Y. DGRUnit: Dual graph reasoning unit for brain tumor segmentation. Comput Biol Med 2022, 149, 106079.
- Yin, C.; Cao, Y.; Sun, P.; Zhang, H.; Li, Z.; Xu, Y.; Sun, H. Molecular Subtyping of Cancer Based on Robust Graph Neural Network and Multi-Omics Data Integration. Front Genet 2022, 13, 884028.
- Kesimoglu, Z.N.; Bozdag, S. SUPREME: multiomics data integration using graph convolutional networks. NAR Genom Bioinform 2023, 5, lqad063.
- Furtney, I.; Bradley, R.; Kabuka, M.R. Patient Graph Deep Learning to Predict Breast Cancer Molecular Subtype. IEEE/ACM Trans Comput Biol Bioinform 2023, 20, 3117–3127.
- Partel, G.; Wählby, C. Spage2vec: Unsupervised representation of localized spatial gene expression signatures. FEBS J 2021, 288, 1859–1870.
- Solorzano, L.; Wik, L.; Olsson Bontell, T.; Wang, Y.; Klemm, A.H.; Öfverstedt, J.; Jakola, A.S.; Östman, A.; Wählby, C. Machine learning for cell classification and neighborhood analysis in glioma tissue. Cytometry A 2021, 99, 1176–1186.
- Zeng, Y.; Wei, Z.; Yu, W.; Yin, R.; Yuan, Y.; Li, B.; Tang, Z.; Lu, Y.; Yang, Y. Spatial transcriptomics prediction from histology jointly through Transformer and graph neural networks. Brief Bioinform 2022, 23.
- Chang, Y.; He, F.; Wang, J.; Chen, S.; Li, J.; Liu, J.; Yu, Y.; Su, L.; Ma, A.; Allen, C.; Lin, Y.; Sun, S.; Liu, B.; Javier Otero, J.; Chung, D.; Fu, H.; Li, Z.; Xu, D.; Ma, Q. Define and visualize pathological architectures of human tissues from spatially resolved transcriptomics using deep learning. Comput Struct Biotechnol J 2022, 20, 4600–4617.
- Qiu, L.; Kang, D.; Wang, C.; Guo, W.; Fu, F.; Wu, Q.; Xi, G.; He, J.; Zheng, L.; Zhang, Q.; Liao, X.; Li, L.; Chen, J.; Tu, H. Intratumor graph neural network recovers hidden prognostic value of multi-biomarker spatial heterogeneity. Nat Commun 2022, 13, 4250.
- Ding, K.; Zhou, M.; Wang, H.; Zhang, S.; Metaxas, D.N. Spatially aware graph neural networks and cross-level molecular profile prediction in colon cancer histopathology: a retrospective multi-cohort study. Lancet Digit Health 2022, 4, e787–e795.
- Wu, Z.; Trevino, A.E.; Wu, E.; Swanson, K.; Kim, H.J.; D’Angio, H.B.; Preska, R.; Charville, G.W.; Dalerba, P.D.; Egloff, A.M.; Uppaluri, R.; Duvvuri, U.; Mayer, A.T.; Zou, J. Graph deep learning for the characterization of tumour microenvironments from spatial protein profiles in tissue specimens. Nat Biomed Eng 2022, 6, 1435–1448.
- Cui, C.; Ding, X.; Wang, D.; Chen, L.; Xiao, F.; Xu, T.; Zheng, M.; Luo, X.; Jiang, H.; Chen, K. Drug repurposing against breast cancer by integrating drug-exposure expression profiles and drug-drug links based on graph neural network. Bioinformatics 2021, 37, 2930–2937.
- Gonzalez, G.; Gong, S.; Laponogov, I.; Bronstein, M.; Veselkov, K. Predicting anticancer hyperfoods with graph convolutional networks. Hum Genomics 2021, 15, 33.
- Gao, Y.; Chen, S.; Tong, J.; Fu, X. Topology-enhanced molecular graph representation for anti-breast cancer drug selection. BMC Bioinformatics 2022, 23, 382.
- Zuo, Z.; Wang, P.; Chen, X.; Tian, L.; Ge, H.; Qian, D. SWnet: a deep learning model for drug response prediction from cancer genomic signatures and compound chemical structures. BMC Bioinformatics 2021, 22, 434.
- Zhu, Y.; Ouyang, Z.; Chen, W.; Feng, R.; Chen, D.Z.; Cao, J.; Wu, J. TGSA: protein-protein association-based twin graph neural networks for drug response prediction with similarity augmentation. Bioinformatics 2022, 38, 461–468.
- Liu, X.; Song, C.; Huang, F.; Fu, H.; Xiao, W.; Zhang, W. GraphCDR: a graph neural network method with contrastive learning for cancer drug response prediction. Brief Bioinform 2022, 23.
- Pu, L.; Singha, M.; Ramanujam, J.; Brylinski, M. CancerOmicsNet: a multi-omics network-based approach to anti-cancer drug profiling. Oncotarget 2022, 13, 695–706.
- Singha, M.; Pu, L.; Stanfield, B.A.; Uche, I.K.; Rider, P.J.F.; Kousoulas, K.G.; Ramanujam, J.; Brylinski, M. Artificial intelligence to guide precision anticancer therapy with multitargeted kinase inhibitors. BMC Cancer 2022, 22, 1211.
- Shin, J.; Piao, Y.; Bang, D.; Kim, S.; Jo, K. DRPreter: Interpretable Anticancer Drug Response Prediction Using Knowledge-Guided Graph Neural Networks and Transformer. Int J Mol Sci 2022, 23.
- Wang, Z.; Zhou, Y.; Zhang, Y.; Mo, Y.K.; Wang, Y. XMR: an explainable multimodal neural network for drug response prediction. Front Bioinform 2023, 3, 1164482.
- Peng, W.; Liu, H.; Dai, W.; Yu, N.; Wang, J. Predicting cancer drug response using parallel heterogeneous graph convolutional networks with neighborhood interactions. Bioinformatics 2022, 38, 4546–4553.
- Liu, Y.; Tong, S.; Chen, Y. HMM-GDAN: Hybrid multi-view and multi-scale graph duplex-attention networks for drug response prediction in cancer. Neural Netw 2023, 167, 213–222.
- Oloulade, B.M.; Gao, J.; Chen, J.; Al-Sabri, R.; Wu, Z. Cancer drug response prediction with surrogate modeling-based graph neural architecture search. Bioinformatics 2023, 39. [Google Scholar] [CrossRef] [PubMed]
- Wang, J.; Liu, X.; Shen, S.; Deng, L.; Liu, H. DeepDDS: deep graph neural network with attention mechanism to predict synergistic drug combinations. Brief Bioinform 2022, 23. [Google Scholar] [CrossRef] [PubMed]
- Bao, X.; Sun, J.; Yi, M.; Qiu, J.; Chen, X.; Shuai, S.C.; Zhao, Q. MPFFPSDC: A multi-pooling feature fusion model for predicting synergistic drug combinations. Methods 2023, 217, 1–9. [Google Scholar] [CrossRef]
- Dong, Z.; Zhang, H.; Chen, Y.; Payne, P.R.O.; Li, F. Interpreting the Mechanism of Synergism for Drug Combinations Using Attention-Based Hierarchical Graph Pooling. Cancers (Basel) 2023, 15. [Google Scholar] [CrossRef]
- Ren, Z.H.; You, Z.H.; Yu, C.Q.; Li, L.P.; Guan, Y.J.; Guo, L.X.; Pan, J. A biomedical knowledge graph-based method for drug-drug interactions prediction through combining local and global features with deep neural networks. Brief Bioinform 2022, 23. [Google Scholar] [CrossRef]
- Cai, R.; Chen, X.; Fang, Y.; Wu, M.; Hao, Y. Dual-dropout graph convolutional network for predicting synthetic lethality in human cancers. Bioinformatics 2020, 36, 4458–4465. [Google Scholar] [CrossRef] [PubMed]
- Wang, S.; Xu, F.; Li, Y.; Wang, J.; Zhang, K.; Liu, Y.; Wu, M.; Zheng, J. KG4SL: knowledge graph neural network for synthetic lethality prediction in human cancers. Bioinformatics 2021, 37, i418–i425. [Google Scholar] [CrossRef] [PubMed]
- Lai, M.; Chen, G.; Yang, H.; Yang, J.; Jiang, Z.; Wu, M.; Zheng, J. Predicting Synthetic Lethality in Human Cancers via Multi-Graph Ensemble Neural Network. Annu Int Conf IEEE Eng Med Biol Soc 2021, 2021, 1731–1734. [Google Scholar] [PubMed]
- Liu, X.; Yu, J.; Tao, S.; Yang, B.; Wang, S.; Wang, L.; Bai, F.; Zheng, J. PiLSL: pairwise interaction learning-based graph neural network for synthetic lethality prediction in human cancers. Bioinformatics 2022, 38, ii106–ii112. [Google Scholar] [CrossRef] [PubMed]
- Zhu, Y.; Zhou, Y.; Liu, Y.; Wang, X.; Li, J. SLGNN: synthetic lethality prediction in human cancers based on factor-aware knowledge graph neural network. Bioinformatics 2023, 39. [Google Scholar] [CrossRef]
- Fan, K.; Tang, S.; Gökbağ, B.; Cheng, L.; Li, L. Multi-view graph convolutional network for cancer cell-specific synthetic lethality prediction. Front Genet 2022, 13, 1103092. [Google Scholar] [CrossRef]
- Li, C.; Liu, H.; Hu, Q.; Que, J.; Yao, J. A Novel Computational Model for Predicting microRNA-Disease Associations Based on Heterogeneous Graph Convolutional Networks. Cells 2019, 8. [Google Scholar] [CrossRef]
- Li, J.; Zhang, S.; Liu, T.; Ning, C.; Zhang, Z.; Zhou, W. Neural inductive matrix completion with graph convolutional networks for miRNA-disease association prediction. Bioinformatics 2020, 36, 2538–2546. [Google Scholar] [CrossRef]
- Xuan, P.; Pan, S.; Zhang, T.; Liu, Y.; Sun, H. Graph Convolutional Network and Convolutional Neural Network Based Method for Predicting lncRNA-Disease Associations. Cells 2019, 8. [Google Scholar] [CrossRef]
- Li, J.; Li, Z.; Nie, R.; You, Z.; Bao, W. FCGCNMDA: predicting miRNA-disease associations by applying fully connected graph convolutional networks. Mol Genet Genomics 2020, 295, 1197–1209. [Google Scholar] [CrossRef]
- Wang, J.; Li, J.; Yue, K.; Wang, L.; Ma, Y.; Li, Q. NMCMDA: neural multicategory MiRNA-disease association prediction. Brief Bioinform 2021, 22. [Google Scholar] [CrossRef] [PubMed]
- Li, Z.; Li, J.; Nie, R.; You, Z.H.; Bao, W. A graph auto-encoder model for miRNA-disease associations prediction. Brief Bioinform 2021, 22. [Google Scholar] [CrossRef] [PubMed]
- Ma, M.; Na, S.; Zhang, X.; Chen, C.; Xu, J. SFGAE: a self-feature-based graph autoencoder model for miRNA-disease associations prediction. Brief Bioinform 2022, 23. [Google Scholar] [CrossRef] [PubMed]
- Li, M.; Fan, Y.; Zhang, Y.; Lv, Z. Using Sequence Similarity Based on CKSNP Features and a Graph Neural Network Model to Identify miRNA-Disease Associations. Genes (Basel) 2022, 13. [Google Scholar] [CrossRef]
- Huang, C.; Cen, K.; Zhang, Y.; Liu, B.; Wang, Y.; Li, J. MEAHNE: miRNA-Disease Association Prediction Based on Semantic Information in a Heterogeneous Network. Life (Basel) 2022, 12. [Google Scholar] [CrossRef]
- Yu, L.; Ju, B.; Ren, S. HLGNN-MDA: Heuristic Learning Based on Graph Neural Networks for miRNA-Disease Association Prediction. Int J Mol Sci 2022, 23. [Google Scholar] [CrossRef] [PubMed]
- Hu, H.; Zhao, H.; Zhong, T.; Dong, X.; Wang, L.; Han, P.; Li, Z. Adaptive deep propagation graph neural network for predicting miRNA-disease associations. Brief Funct Genomics 2023. [Google Scholar] [CrossRef]
- Zheng, K.; Zhang, X.L.; Wang, L.; You, Z.H.; Zhan, Z.H.; Li, H.Y. Line graph attention networks for predicting disease-associated Piwi-interacting RNAs. Brief Bioinform 2022, 23. [Google Scholar] [CrossRef]
- Xuan, P.; Zhan, L.; Cui, H.; Zhang, T.; Nakaguchi, T.; Zhang, W. Graph Triple-Attention Network for Disease-Related LncRNA Prediction. IEEE J Biomed Health Inform 2022, 26, 2839–2849. [Google Scholar] [CrossRef]
- Wang, L.; Zhong, C. gGATLDA: lncRNA-disease association prediction based on graph-level graph attention network. BMC Bioinformatics 2022, 23, 11. [Google Scholar] [CrossRef]
- Xuan, P.; Wang, S.; Cui, H.; Zhao, Y.; Zhang, T.; Wu, P. Learning global dependencies and multi-semantics within heterogeneous graph for predicting disease-related lncRNAs. Brief Bioinform 2022, 23. [Google Scholar] [CrossRef] [PubMed]
- Xuan, P.; Bai, H.; Cui, H.; Zhang, X.; Nakaguchi, T.; Zhang, T. Specific topology and topological connection sensitivity enhanced graph learning for lncRNA-disease association prediction. Comput Biol Med 2023, 164, 107265. [Google Scholar] [CrossRef] [PubMed]
- Guo, Y.; Yi, M. THGNCDA: circRNA-disease association prediction based on triple heterogeneous graph network. Brief Funct Genomics 2023. [Google Scholar] [CrossRef]
- Liu, Z.; Dai, Q.; Yu, X.; Duan, X.; Wang, C. Predicting circRNA-drug resistance associations based on a multimodal graph representation learning framework. IEEE J Biomed Health Inform 2023, PP. [Google Scholar] [CrossRef] [PubMed]
- Gao, M.; Shang, X. Identification of associations between lncRNA and drug resistance based on deep learning and attention mechanism. Front Microbiol 2023, 14, 1147778. [Google Scholar] [CrossRef]
- Schulte-Sasse, R.; Budach, S.; Hnisz, D.; Marsico, A. Graph Convolutional Networks Improve the Prediction of Cancer Driver Genes. Artificial Neural Networks and Machine Learning – ICANN 2019: Workshop and Special Sessions: 28th International Conference on Artificial Neural Networks, Munich, Germany, September 17–19, 2019, Proceedings; Springer-Verlag: Berlin, Heidelberg, 2019; pp. 658–668. [Google Scholar] [CrossRef]
- Song, H.; Yin, C.; Li, Z.; Feng, K.; Cao, Y.; Gu, Y.; Sun, H. Identification of Cancer Driver Genes by Integrating Multiomics Data with Graph Neural Networks. Metabolites 2023, 13. [Google Scholar] [CrossRef] [PubMed]
- Yang, W.; Qiang, Y.; Wu, W.; Xin, J. Graph-ETMB: A graph neural network-based model for tumour mutation burden estimation. Comput Biol Chem 2023, 105, 107900. [Google Scholar] [CrossRef] [PubMed]
- Wu, T.; Chen, J.; Diao, K.; Wang, G.; Wang, J.; Yao, H.; Liu, X.S. Neodb: a comprehensive neoantigen database and discovery platform for cancer immunotherapy. Database (Oxford) 2023, 2023. [Google Scholar] [CrossRef] [PubMed]
- Chen, J.; Chen, Y.; Sun, K.; Wang, Y.; He, H.; Sun, L.; Ha, S.; Li, X.; Ou, Y.; Zhang, X.; Bi, Y. Prediction of Ovarian Cancer-Related Metabolites Based on Graph Neural Network. Front Cell Dev Biol 2021, 9, 753221. [Google Scholar] [CrossRef]
- Fradkin, P.; Young, A.; Atanackovic, L.; Frey, B.; Lee, L.J.; Wang, B. A graph neural network approach for molecule carcinogenicity prediction. Bioinformatics 2022, 38, i84–i91. [Google Scholar] [CrossRef]
- Kafaei, P.; Cappart, Q.; Renaud, M.A.; Chapados, N.; Rousseau, L.M. Graph neural networks and deep reinforcement learning for simultaneous beam orientation and trajectory optimization of Cyberknife. Phys Med Biol 2021, 66. [Google Scholar] [CrossRef] [PubMed]
- Shao, H.C.; Wang, J.; Bai, T.; Chun, J.; Park, J.C.; Jiang, S.; Zhang, Y. Real-time liver tumor localization via a single x-ray projection using deep graph neural network-assisted biomechanical modeling. Phys Med Biol 2022, 67. [Google Scholar] [CrossRef] [PubMed]
- Shao, H.C.; Li, Y.; Wang, J.; Jiang, S.; Zhang, Y. Real-time liver tumor localization via combined surface imaging and a single x-ray projection. Phys Med Biol 2023, 68. [Google Scholar] [CrossRef] [PubMed]
- Niraula, D.; Sun, W.; Jin, J.; Dinov, I.D.; Cuneo, K.; Jamaluddin, J.; Matuszak, M.M.; Luo, Y.; Lawrence, T.S.; Jolly, S.; Ten Haken, R.K.; El Naqa, I. A clinical decision support system for AI-assisted decision-making in response-adaptive radiotherapy (ARCliDS). Sci Rep 2023, 13, 5279. [Google Scholar] [CrossRef] [PubMed]
- Vu, M.N.; Thai, M.T. PGM-Explainer: Probabilistic Graphical Model Explanations for Graph Neural Networks. 2020; arXiv:cs.LG/2010.05788. [Google Scholar]
- Hua, C.; Luan, S.; Zhang, Q.; Fu, J. Graph Neural Networks Intersect Probabilistic Graphical Models: A Survey. 2023; arXiv:cs.AI/2206.06089.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).