2.1. Simulated Annealing
Simulated Annealing (SA) is one of the most well-known single-solution-based metaheuristics, inspired by the annealing process in metallurgy, in which a metal is heated and then cooled in a controlled manner to lower its energy state and reach a more stable structure. Analogously, SA explores the solution space by initially accepting worse solutions with a relatively high probability, which helps it escape local optima. As the algorithm proceeds, the acceptance probability decreases, focusing the search on the region around the current solution. In machine learning applications such as GNN training, SA can be used to optimize parameters or network structures, helping the model avoid local minima [1].
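The acceptance rule can be made concrete with a short sketch. The following minimal Python implementation uses the standard Metropolis criterion with a geometric cooling schedule; `cost` and `neighbor` are placeholders for a problem-specific objective and move operator, so this is an illustrative skeleton rather than a definitive implementation.

```python
import math
import random

def simulated_annealing(initial, cost, neighbor,
                        t_start=1.0, t_end=1e-3, alpha=0.95, steps_per_t=100):
    """Generic SA loop: accept worse moves with probability exp(-delta / T)."""
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        for _ in range(steps_per_t):
            candidate = neighbor(current)
            delta = cost(candidate) - current_cost
            # Improvements are always accepted; worse moves with prob. exp(-delta / T).
            if delta <= 0 or random.random() < math.exp(-delta / t):
                current, current_cost = candidate, current_cost + delta
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= alpha  # geometric cooling: the temperature shrinks each outer iteration
    return best, best_cost
```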
SA is highly adaptable to graph-based problems, especially when the solution space has a combinatorial structure. For instance, in network routing problems, such as the shortest path problem or network flow optimization, SA can explore different routing configurations to minimize the cost or time associated with network traversal. It is particularly useful in graph partitioning, where the goal is to divide a graph into smaller, balanced subgraphs while minimizing the number of edges cut between them. SA explores the space of possible partitions efficiently because it accepts suboptimal partitions early on, which helps it avoid getting stuck in local minima; a sketch of the corresponding objective and move operator is given below.
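To illustrate, the two hypothetical functions below define an edge-cut objective and a single-node flip move for a two-way partition; plugged into the `simulated_annealing` skeleton above, they form a bare-bones SA partitioner (a real one would also add a penalty term for imbalance between the two blocks).

```python
import random

def cut_size(adj, side):
    """Edge-cut objective: count edges whose endpoints lie in different blocks.

    adj maps each node to an iterable of neighbors; side maps node -> 0 or 1.
    Nodes are assumed comparable (e.g., integers) so each edge is counted once.
    """
    return sum(1 for u in adj for v in adj[u] if u < v and side[u] != side[v])

def flip_move(side):
    """Neighbor operator: move one randomly chosen node to the other block."""
    new_side = dict(side)
    node = random.choice(list(new_side))
    new_side[node] = 1 - new_side[node]
    return new_side

# Usage sketch (adj and an initial assignment side0 are assumed given):
# best, best_cut = simulated_annealing(side0, lambda s: cut_size(adj, s), flip_move)
```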
SA can also be employed in community detection within social networks. This problem involves identifying subgroups of nodes that are more densely connected internally than to the rest of the network. SA iteratively improves the partitioning, making it well-suited for handling the non-convex nature of the solution space often seen in social network structures.
2.2. Tabu Search
Tabu Search (TS) is another powerful single-solution-based metaheuristic, distinguished by its use of a memory structure known as a "tabu list"; it was first introduced in [2]. The tabu list stores recently visited solutions, preventing the algorithm from revisiting these points in the search space. This encourages exploration of new areas and helps avoid cycling between solutions. In the context of GNNs, TS can be employed to optimize hyperparameters or the model architecture, enhancing the ability to escape local optima in high-dimensional search spaces where complex dependencies exist between nodes and edges in a graph. This memory feature makes TS particularly effective on problems whose search space contains many local minima, as it pushes the exploration beyond them.
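The core loop is simple enough to sketch. The version below keeps whole solutions in a bounded-length tabu list and always moves to the best non-tabu neighbor; production implementations usually store move attributes instead and add an aspiration criterion, so treat this as a minimal skeleton.

```python
from collections import deque

def tabu_search(initial, cost, neighbors, tabu_tenure=10, max_iters=1000):
    """Basic TS: move to the best non-tabu neighbor at every step."""
    current = best = initial
    best_cost = cost(initial)
    tabu = deque(maxlen=tabu_tenure)  # oldest entries fall out automatically
    for _ in range(max_iters):
        candidates = [s for s in neighbors(current) if s not in tabu]
        if not candidates:
            break  # every neighbor is tabu: stop (or apply an aspiration criterion)
        current = min(candidates, key=cost)  # best admissible move, even if worse
        tabu.append(current)
        current_cost = cost(current)
        if current_cost < best_cost:
            best, best_cost = current, current_cost
    return best, best_cost
```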
TS is particularly effective for graph optimization problems where cycles or revisits to previously explored configurations can hinder the search. In vehicle routing problems (VRP), TS can optimize delivery routes by maintaining a memory of previous solutions (the tabu list) to avoid revisiting inefficient routing configurations. This is especially useful in dynamic routing scenarios where traffic conditions or demands change frequently, as TS can explore alternative routes without retracing the same paths.
In communication networks, where the goal is to maximize throughput or minimize latency, TS can optimize network configurations by preventing the exploration of suboptimal configurations that have been recently evaluated. Similarly, in graph coloring, where the aim is to color the nodes of a graph with as few colors as possible such that no two adjacent nodes share the same color, TS ensures efficient exploration of coloring patterns without getting stuck in cycles of repeated configurations; a sketch of the objective and move generator is given below.
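As an illustration, the following hypothetical helpers define a conflict-counting objective and a move generator for k-coloring that recolors one conflicting node at a time; they plug directly into the `tabu_search` skeleton above (practical solvers make the pair (node, color) tabu rather than full colorings).

```python
def conflicts(adj, coloring):
    """Objective: number of edges whose endpoints share a color."""
    return sum(1 for u in adj for v in adj[u] if u < v and coloring[u] == coloring[v])

def recoloring_neighbors(adj, k):
    """Moves for k-coloring: give one conflicting node a different color."""
    def neighbors(coloring):
        for u in adj:
            if any(coloring[v] == coloring[u] for v in adj[u]):  # u is in conflict
                for c in range(k):
                    if c != coloring[u]:
                        new = dict(coloring)
                        new[u] = c
                        yield new
    return neighbors

# Usage sketch: best, n_conflicts = tabu_search(
#     initial_coloring, lambda c: conflicts(adj, c), recoloring_neighbors(adj, k))
```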
2.3. Variable Neighborhood Search
Variable Neighborhood Search (VNS) differs from both SA and TS by systematically changing the neighborhood structure during the search process. The core idea is that different neighborhood structures may reveal new, better solutions that are not accessible when using a single neighborhood definition. VNS alternates between exploring a local neighborhood of the current solution and jumping to a different neighborhood when stuck in a local optimum. This adaptive mechanism is well-suited for optimizing GNN-based models, as it can dynamically adjust the search process depending on the complexity of the graph’s structure. For example, different neighborhoods can represent various graph topologies or node/edge features, allowing VNS to efficiently explore the solution space.
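The basic scheme is easy to state in code. In the sketch below, `shake(x, k)` draws a random solution from the k-th neighborhood of x and `local_search` is any descent procedure; both are problem-specific placeholders.

```python
def vns(initial, cost, shake, local_search, k_max=5, max_iters=100):
    """Basic VNS: shake in neighborhood k, improve locally, reset k on success."""
    x, x_cost = initial, cost(initial)
    for _ in range(max_iters):
        k = 1
        while k <= k_max:
            x_shaken = shake(x, k)            # random solution in the k-th neighborhood of x
            x_local = local_search(x_shaken)  # descend to a nearby local optimum
            if cost(x_local) < x_cost:
                x, x_cost = x_local, cost(x_local)
                k = 1   # improvement: restart from the closest neighborhood
            else:
                k += 1  # no improvement: switch to a wider neighborhood
    return x, x_cost
```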
VNS is particularly well-suited to graph problems whose complexity requires exploring multiple neighborhood structures. In supply chain networks, for instance, VNS can be applied to optimize the location of facilities or distribution centers, switching between neighborhood structures that represent local and global supply chain configurations. By dynamically changing the neighborhood structure, VNS can escape local optima and find better configurations for minimizing transportation costs or improving service levels.
VNS is also highly applicable in network reliability problems. These problems often involve determining the most robust network configuration, given the possibility of node or edge failures. VNS systematically explores different subgraphs, trying to maximize network connectivity and robustness under varying failure scenarios. The flexibility of VNS in switching between neighborhoods allows it to adapt to the changing complexity of real-world network conditions.
In maximum clique problems, where the goal is to find the largest fully connected subgraph, VNS efficiently explores different subgraph configurations by switching between neighborhoods defined by clique sizes or edge densities. This makes it suitable for applications in social networks, biological networks, and other graph structures where cliques are a central feature.
2.4. Graph-Based Machine Learning
Graph-based machine learning has gained significant attention due to its ability to model complex, relational data that is naturally represented as graphs. Unlike traditional data structures such as arrays or matrices, graphs can capture both the entities (nodes) and their relationships (edges) in a structured manner. This unique ability makes graph-based approaches well-suited for a wide range of applications, including social networks, biological networks, recommendation systems, and communication networks.
Traditional machine learning models were initially adapted to graph data through feature extraction techniques. Early methods relied on graph kernels, which measure the similarity between graphs by mapping them into a high-dimensional feature space. Notable examples such as the Random Walk Kernel [3] and the Weisfeiler-Lehman Kernel [4] were developed to compute graph similarities. These methods were instrumental in making traditional machine learning models compatible with graph data, but they suffered from high computational costs and limited scalability on large graph structures.
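The label-refinement step at the heart of the Weisfeiler-Lehman kernel is compact enough to sketch. In the illustrative snippet below, each node's label is repeatedly replaced by a compressed signature of its own label and the sorted multiset of its neighbors' labels; the kernel then compares the label histograms of two graphs (for that comparison the compression table must be shared across graphs, which this sketch handles by passing it in).

```python
def wl_refine(adj, labels, table, iterations=2):
    """Weisfeiler-Lehman refinement: relabel each node by its neighborhood signature.

    adj: node -> iterable of neighbors; labels: node -> int;
    table: shared dict compressing signatures to fresh integer labels.
    """
    for _ in range(iterations):
        new_labels = {}
        for u in adj:
            signature = (labels[u], tuple(sorted(labels[v] for v in adj[u])))
            if signature not in table:
                table[signature] = len(table)  # assign the next unused integer label
            new_labels[u] = table[signature]
        labels = new_labels
    return labels
```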
Another early line of work built on spectral graph theory, which laid the foundation for learning on graphs by leveraging the graph Laplacian matrix. Techniques such as spectral clustering [1] and manifold learning methods like Laplacian Eigenmaps [1] and Diffusion Maps [1] introduced ways to perform dimensionality reduction or clustering on graph-structured data. These spectral methods, however, also faced scalability issues due to the computational cost of eigenvalue decompositions, particularly on very large graphs.
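A compact sketch of spectral clustering makes the bottleneck visible: the node embedding comes from the eigenvectors of the Laplacian, and the full eigendecomposition is exactly the step that dominates the cost on large graphs. The snippet assumes a dense numpy adjacency matrix and uses SciPy's k-means for the final assignment.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def spectral_clustering(A, k):
    """Cluster nodes using the k smallest eigenvectors of the Laplacian L = D - A."""
    D = np.diag(A.sum(axis=1))
    L = D - A                              # unnormalized graph Laplacian
    _, eigvecs = np.linalg.eigh(L)         # full decomposition: the scalability bottleneck
    embedding = eigvecs[:, :k]             # each row embeds one node in R^k
    _, assignments = kmeans2(embedding, k, minit='++')
    return assignments
```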
The development of Graph Neural Networks (GNNs) marked a significant shift in graph-based machine learning, as these models learn directly from the graph structure without requiring manual feature extraction. Early GNNs, such as the model proposed in [5], extended the idea of recursive neural networks to graphs. However, it was with the introduction of Graph Convolutional Networks (GCNs) by [6] that GNNs gained widespread adoption. GCNs generalized the concept of convolution from grid-based data (such as images) to arbitrary graphs, allowing node embeddings to be computed from the local graph structure.
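The layer-wise propagation rule of a GCN is short enough to write out: one layer computes H' = σ(D̂^(-1/2) Â D̂^(-1/2) H W), where Â = A + I is the adjacency matrix with self-loops and D̂ its degree matrix. The numpy sketch below implements exactly this rule for dense inputs.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W).

    A: (n, n) adjacency matrix; H: (n, d_in) node features; W: (d_in, d_out) weights.
    """
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))  # D^-1/2 as a vector
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)         # aggregate neighbors, transform, ReLU
```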
Subsequent works introduced more sophisticated models that improved the expressive power of GNNs. For example, Graph Attention Networks (GATs) [7] introduced an attention mechanism to focus on the most relevant neighboring nodes, improving performance in cases where node importance varies across the graph. GraphSAGE [8] was another pivotal work that allowed for inductive learning on large graphs by sampling and aggregating information from a node's neighbors. This enabled the model to generalize to unseen nodes and graphs, overcoming the limitations of transductive GCNs.
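The sample-and-aggregate idea can be sketched in a few lines. The function below implements a mean-aggregator step in the spirit of GraphSAGE: for each node it samples a fixed number of neighbors, averages their features, and combines the result with the node's own features through two weight matrices (an equivalent reformulation of the concatenate-then-transform step). Names such as `neighbors_of` and `sample_size` are illustrative choices, not any library's API.

```python
import random
import numpy as np

def sage_mean_layer(features, neighbors_of, W_self, W_neigh, sample_size=5):
    """GraphSAGE-style step: sample neighbors, mean-aggregate, combine with self."""
    out = {}
    for u, h_u in features.items():
        nbrs = list(neighbors_of[u])
        if len(nbrs) > sample_size:
            nbrs = random.sample(nbrs, sample_size)  # fixed-size neighbor sample
        h_neigh = (np.mean([features[v] for v in nbrs], axis=0)
                   if nbrs else np.zeros_like(h_u))
        out[u] = np.maximum(W_self @ h_u + W_neigh @ h_neigh, 0.0)  # ReLU
    return out
```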
Beyond GCNs, several extensions to graph neural networks have emerged, targeting specific challenges in graph-based machine learning. Relational Graph Convolutional Networks (R-GCNs) [5] extended GCNs to handle multi-relational graphs, in which edges can have multiple types, making them particularly useful for knowledge graphs and heterogeneous networks. Dynamic GNNs, such as Temporal Graph Networks (TGNs) [9], address the challenge of learning from dynamic or evolving graphs, which is crucial in applications like social networks and financial markets.
The combination of GNNs with reinforcement learning has also opened new research directions. Deep Q-Networks (DQN) and policy gradient methods have been integrated with GNNs to solve complex decision-making problems in graph-structured environments, such as traffic management and robot navigation.
Despite the impressive progress, several challenges remain in graph-based machine learning. One key issue is the scalability of GNNs on large, real-world graphs, such as social networks with millions or billions of nodes and edges. Techniques like mini-batching and sampling (e.g., GraphSAGE) have been proposed, but further work is needed to make GNNs more scalable without compromising accuracy.
Another challenge is the expressiveness of GNNs. Recent works have highlighted that traditional GNNs, such as GCNs, may struggle to capture certain graph properties, such as cycles or cliques, leading to the development of more powerful models like higher-order GNNs and message passing neural networks. Understanding the theoretical limitations of GNNs and designing architectures that can overcome them is a key area of ongoing research.
Furthermore, the interpretability of GNNs remains a significant concern, especially in domains like healthcare and finance, where understanding model predictions is critical. Techniques for explaining GNN decisions, such as GraphGrad-CAM or graph explainability frameworks, are being actively explored.