3.1. Problem Definition
In this paper, our goal is to reduce the time and resource costs of parking prediction models by jointly reducing the dimensionality of both the structure and the feature data of urban parking graphs, while preserving the accuracy of parking prediction tasks. The urban parking graph can be represented as $G = (V, E, A)$, where $V$ denotes the set of parking nodes, $E$ denotes the set of edges, and $A \in \mathbb{R}^{N \times N}$ denotes the directed adjacency matrix.
The parking prediction problem [12] can be interpreted as follows: given a parking graph $G$ and historical parking data $(X_{t-T+1}, \dots, X_t)$ for $T$ time slices before time $t$, learn a function $F$ to predict the future parking data $(X_{t+1}, \dots, X_{t+T'})$ for $T'$ time slices after time $t$, as shown in Equation 1.
where the parking data $X \in \mathbb{R}^{T \times N \times F}$ represents the parking flow information observed at $T$ time slices for all parking lots. Each $X_i \in \mathbb{R}^{N \times F}$ corresponds to a time slice and includes $N$ parking lot nodes, each with $F$-dimensional features, including the longitude, latitude, openness to the public, charging situation, and real-time occupancy rate of the parking lot.
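As a concrete illustration of these shapes, the following sketch builds a toy historical tensor and a placeholder prediction function. The sizes, the random data, and the naive "repeat the last slice" baseline are illustrative assumptions, not the paper's dataset or model.

```python
import numpy as np

# Hypothetical sizes: T historical time slices, N parking lots, F features.
T, N, F = 12, 50, 5

# Historical parking data: one (N, F) feature matrix per time slice.
# The F features stand in for longitude, latitude, openness to the public,
# charging situation, and real-time occupancy rate.
X_hist = np.random.rand(T, N, F)

def predict(X_hist, T_out=3):
    """Placeholder for the learned function F: it simply repeats the last
    observed slice T_out times (a naive persistence baseline)."""
    last = X_hist[-1]
    return np.stack([last] * T_out)

X_future = predict(X_hist, T_out=3)
print(X_future.shape)  # (3, 50, 5)
```

Any real model replaces `predict` with a learned function; the input/output tensor shapes stay as shown.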
3.2. ParkingRank Graph Attention
Graph coarsening, the current mainstream graph dimensionality-reduction technique, aims to overcome the enormous computational obstacles encountered when processing, extracting from, and analyzing large-scale graph data. Typically, the input to graph coarsening is the graph adjacency matrix [21]. When dealing with complex structures such as urban parking graphs, however, a traditional adjacency matrix built only from topological relationships can hardly capture the complexity and diversity of the real parking landscape. In practice, drivers do not consider only the proximity of parking lots in the region when making parking decisions; they also weigh the distances to regional parking lots, real-time space occupancy rates, openness to the public, charging situation, and other factors. Therefore, before coarsening the urban parking graph, we need a high-quality parking graph that truly reflects parking-behavior preferences and adapts to dynamic changes in parking demand.
GAT [45] is an advanced spatial-based graph neural network methodology whose primary advantage lies in redefining the aggregation of node features. Unlike GCNs, which typically assign equal weights to all neighboring nodes during information aggregation, GAT introduces an attention mechanism that allows the model to dynamically allocate varying weights based on the importance of each neighboring node. This strategy enables GAT to capture spatial correlations within road networks more effectively and to selectively emphasize the nodes that are crucial for the current task. In urban parking prediction, this property is particularly significant: it enables the network to flexibly capture important relationships between parking nodes and to focus on those nodes that are critical for the current prediction task. We believe this not only helps describe the correlations between different parking lots more accurately, but also aids in constructing an efficient parking network that reflects the relevance of parking spaces. The overall framework of our proposed PRGAT is shown in Figure 1.
Given the complexity of the parking data X, directly inputting the high-dimensional data into GAT would lead to considerable computational redundancy, thereby increasing the model's time cost. To address this, we adopt a dimensionality-reduction strategy that extracts and abstracts certain features of the parking lots, which effectively reduces the computational burden and improves the model's efficiency.
The ParkingRank model [25] is an algorithm for quantitatively assessing the service capability of different parking lots. By integrating complex parking information such as the total number of parking spaces, the degree of openness to the public, and the price of parking, it helps drivers understand the actual situation of a parking lot more comprehensively, so that they can make informed choices during parking decision-making. Specifically, the algorithm focuses on three aspects to describe the real-time service capacity of parking lots, as shown in Equation 2:
Parking lot service range: This aspect considers which types of vehicles are allowed to park in the parking lot. For example, parking lots at shopping centers may be open to all vehicles, while those in residential areas may only serve residents. Therefore, parking lots with a broader service scope generally have stronger service capabilities.
Total number of parking spaces: The more internal parking spaces a parking lot has, the stronger its service capacity usually is.
Price of parking: Higher parking prices may reduce the number of vehicles able to afford parking fees. Thus, expensive prices may lower the service capacity of the parking lot.
where $PR_i$ represents the service capacity of each parking lot; $x_i$ denotes the service range, which measures the degree of openness of the parking lot, from public parking lots ($x_i = 1$) to private parking lots ($x_i = 0$); $y_i$ and $z_i$ represent the total number of parking spaces and the price of parking, respectively; and $\|y\|_1$ and $\|z\|_1$ are the $\ell_1$-norms of the $y$- and $z$-column vectors.
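Since the exact form of Equation 2 is not reproduced here, the following sketch only encodes the qualitative behavior described above: capacity grows with openness and with the $\ell_1$-normalized number of spaces, and shrinks with the $\ell_1$-normalized price. The multiplicative combination and the toy numbers are assumptions for illustration.

```python
import numpy as np

# Hypothetical inputs for 4 parking lots.
x = np.array([1.0, 0.5, 1.0, 0.0])   # service range: 1 = public, 0 = private
y = np.array([200, 80, 500, 120])    # total number of parking spaces
z = np.array([4.0, 2.0, 6.0, 3.0])   # price of parking

# One plausible combination consistent with the qualitative description:
# capacity increases with openness and (L1-normalized) spaces, and
# decreases with (L1-normalized) price. The paper's Equation 2 may differ.
PR = x * (y / np.linalg.norm(y, 1)) / (z / np.linalg.norm(z, 1))
print(np.round(PR, 3))
```

Under this form, the fully private lot (lot 4) gets zero capacity and the large public lot (lot 3) the highest, matching the three qualitative rules above.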
We believe that quantifying the service capability of a parking lot is essentially a process of stripping away the redundant parts of the parking lot features, and that the low-dimensional evaluation results can serve as new abstract features to replace the original high-dimensional parking data.
Moreover, while GATs typically utilize Multi-Layer Perceptrons (MLPs) or cosine similarity to ascertain the degree of association between nodes, such conventional methods may fall short in terms of interpretability, particularly when applied to entities like parking lots with distinct attributes; they often fail to offer clear insight into the dynamics of parking lot interactions. To mitigate the black-box issue encountered by GATs in analyzing parking graphs, we integrate the parking spatio-temporal transfer matrix [2] into the computation of attention coefficients within GATs. This matrix accounts for the spatio-temporal evolution of parking lot features, encompassing real-time occupancy rates, service capabilities, and spatial connections, thereby enabling a dynamic representation of parking cruising behavior and translating the raw data into actionable insights. The formulation of this transfer matrix $P$ is presented in Equation 3.
where each element $P_{ij}$ of $P$ represents the probability of a vehicle heading to another parking lot $j$ when it finds that parking lot $i$ is full. $O_i$ represents the real-time occupancy rate of parking lot $i$, which also indicates the probability of a vehicle choosing to stay in parking lot $i$ given the parking situation. $1/d_{ij}$ is the reciprocal of the normalized distance $d_{ij}$ between parking lot $i$ and parking lot $j$, and $PR_i$ is the normalized service capacity of parking lot $i$.
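The following sketch builds one plausible transfer matrix from these three ingredients. How Equation 3 actually combines occupancy, distance, and service capacity is not fully specified here, so the stay/leave split and the attraction weight `PR / d` below are explicit assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4
O = np.array([0.9, 0.4, 0.7, 0.2])       # real-time occupancy rates
PR = np.array([0.4, 0.1, 0.3, 0.2])      # normalized service capacities
d = rng.uniform(0.5, 2.0, size=(N, N))   # normalized pairwise distances
d = (d + d.T) / 2
np.fill_diagonal(d, np.inf)              # exclude the self-distance term

# Assumed construction: a vehicle leaves lot i with probability O_i
# (a fuller lot is more likely to turn it away) and heads to lot j with
# weight PR_j / d_ij; otherwise it stays. Equation 3 may differ.
w = PR[None, :] / d                      # attraction of lot j seen from lot i
w = w / w.sum(axis=1, keepdims=True)     # normalize over j != i
P = O[:, None] * w
np.fill_diagonal(P, 1.0 - O)
print(np.round(P.sum(axis=1), 6))        # each row sums to 1
```

Whatever the exact weighting, each row of a transfer matrix must form a probability distribution, which the final normalization guarantees.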
Therefore, in this paper, the input to PRGAT consists of the original adjacency matrix $A$ based on Euclidean distance and the new parking data $X' \in \mathbb{R}^{T \times N \times F'}$, which is composed of the service capacity, the real-time occupancy rate, and the latitude and longitude coordinates. The output is the updated parking data $X'' \in \mathbb{R}^{T \times N \times F''}$. Here, $F'$ represents the new feature dimensions after quantification and extraction, while $F''$ denotes the new feature dimensions relearned by PRGAT.
The algorithm initially applies a linear transformation to each parking lot node's features $h_i$ using a learnable weight matrix $W$ to enhance the node's expressive capability. Subsequently, it employs an attention mechanism $a$ on the set of nodes to compute the attention coefficient $e_{ij}$ between node $i$ and node $j$. This procedure is encapsulated by Formula 4, wherein the attention mechanism may be any function that measures the correlation between two objects, such as cosine similarity or a Multilayer Perceptron (MLP), and $\|$ represents the vector concatenation operation.
To capture local topological information and enhance computational efficiency, a masking mechanism and a normalization operation are introduced. The attention coefficients $e_{ij}$ are confined to the first-order neighborhood of each node, so that each node attends only to its directly connected neighbors and disregards all other nodes in the graph, ultimately yielding the attention matrix $E$. See Formula 5 for details, where LeakyReLU denotes the activation function.
Moreover, the masking mechanism ingeniously reflects the factors influencing the parking decision-making process. Drivers, when choosing a parking lot, tend to compare a specific parking lot with its adjacent ones, rather than conducting pairwise comparisons among all parking lots. Such a masking mechanism not only enhances the model’s sensitivity to local associations but also aligns more closely with the behavioral patterns in the actual parking decision-making process.
We input both the parking spatio-temporal transfer matrix $P$ and the aforementioned attention matrix $E$ into the softmax activation function simultaneously to derive the ParkingRank attention matrix $\alpha$, as indicated in Equation 6.
This ParkingRank attention matrix not only reflects the dynamic characteristics of the parking graph, capturing the flow trajectories and behavioral patterns of vehicles in urban parking scenarios, but also overcomes the shortcomings of insufficient interpretability of the parking lot relevance matrix computed by traditional graph attention methods.
After obtaining the normalized ParkingRank attention coefficients $\alpha_{ij}$, GAT conducts a weighted aggregation of the features of each node $i$ with those of its neighbors $j \in \mathcal{N}_i$, thereby producing the final output for each node. This procedure is depicted in Formula 7, where $\sigma$ denotes the sigmoid activation function.
The PRGAT we designed is fundamentally a pre-trained model, whose primary purpose is to serve as a front-end graph construction module within the overall framework for urban parking prediction. Its loss function is described in Formula 8, and the pseudocode is provided in Algorithm 1.
Algorithm 1: ParkingRank Graph Attention (PRGAT)
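To make the attention pipeline of Formulas 4-7 concrete, here is a minimal single-head NumPy sketch. The sizes, the random initialization, and the way the transfer matrix enters the softmax (added in log-space to the masked scores) are illustrative assumptions, not the paper's exact Equation 6.

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

rng = np.random.default_rng(1)
N, F_in, F_out = 5, 3, 4
X = rng.random((N, F_in))                  # quantified parking features
A = rng.random((N, N)) < 0.6               # first-order neighborhood mask
np.fill_diagonal(A, True)
P = rng.random((N, N)); P /= P.sum(1, keepdims=True)  # transfer matrix

W = rng.standard_normal((F_in, F_out))     # learnable weight matrix
a = rng.standard_normal(2 * F_out)         # attention vector (single head)

H = X @ W                                  # linear transformation (Formula 4)
# e_ij = LeakyReLU(a^T [Wh_i || Wh_j]), computed for all pairs at once.
E = leaky_relu((H @ a[:F_out])[:, None] + (H @ a[F_out:])[None, :])
# Mask to first-order neighbors, inject the transfer matrix, then softmax.
scores = np.where(A, E + np.log(P + 1e-9), -np.inf)
expd = np.exp(scores - scores.max(axis=1, keepdims=True))
alpha_mat = expd / expd.sum(axis=1, keepdims=True)
out = 1 / (1 + np.exp(-(alpha_mat @ H)))   # sigmoid-weighted aggregation
print(out.shape)
```

The masking step is what restricts comparisons to adjacent lots, mirroring the decision behavior described above: non-neighbor entries receive zero attention weight.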
3.3. Parking Graph Coarsening
Define the coarsened parking graph $G' = (V', E', A')$ as the result of coarsening the original adjacency matrix $A$; it is a smaller weighted graph. $V'$ denotes a set of disjoint hypernodes that covers all nodes in the original graph $V$, where each hypernode is formed by aggregating some of the nodes of the original graph, i.e., $|V'| < |V|$. $A' \in \mathbb{R}^{N' \times N'}$ is the adjacency matrix of the coarsened graph.
This paper employs the SGC algorithm [22], which utilizes the spectral distance (SD) to demonstrate that the coarsened network retains spatial features similar to those of the original network, as illustrated in the equation below:
where the vectors $\lambda$ and $\lambda'$ represent the eigenvalues of the Laplacian matrices of the original graph $G$ and the coarsened graph $G'$, respectively. The spectral distance $SD$ is considered sufficiently small if it is less than $\epsilon$ (where $\epsilon$ is a very small number). Only when this condition is met do the eigenvalues and eigenvectors of the two graphs exhibit similarity in the spectral domain. The calculated spectral distance can then substantiate that the coarsened graph largely preserves the attributes of the original graph during the coarsening process; thus, the spatial structure of the coarsened network remains similar to that of the original network. The key steps of the SGC algorithm are introduced below.
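A spectral distance of this kind can be computed directly from the two Laplacian spectra. In the sketch below, the normalized Laplacian, the toy graphs, and the choice to compare the $N'$ smallest eigenvalues of each graph are assumptions; SGC's exact alignment of the two spectra may differ.

```python
import numpy as np

def normalized_laplacian(A):
    """L = I - D^{-1/2} A D^{-1/2} for a symmetric adjacency matrix."""
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    return np.eye(len(A)) - (A * d_inv_sqrt[:, None]) * d_inv_sqrt[None, :]

def spectral_distance(A, A_coarse):
    """Sum of absolute differences between the k smallest Laplacian
    eigenvalues of the two graphs, with k the coarsened graph's size."""
    lam = np.sort(np.linalg.eigvalsh(normalized_laplacian(A)))
    lam_c = np.sort(np.linalg.eigvalsh(normalized_laplacian(A_coarse)))
    k = len(lam_c)
    return np.abs(lam[:k] - lam_c).sum()

# Toy example: a 4-node path graph and a 2-node coarsened version.
A = np.array([[0,1,0,0],[1,0,1,0],[0,1,0,1],[0,0,1,0]], float)
A_c = np.array([[0,1],[1,0]], float)
sd = spectral_distance(A, A_c)
print(round(sd, 4))
```

Comparing a graph with itself yields a distance of zero, and a small `sd` indicates that the coarsened graph preserves the original spectral structure.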
The algorithm initially takes as input the ParkingRank attention matrix $\alpha$ and the coarsening ratio $c$. It computes the Laplacian matrix $L$ and, through matrix decomposition, obtains the first $k_1$ and the last $k_2$ Laplacian eigenvalues $\lambda$ and Laplacian eigenvectors $u$, capturing both the local and the global information of the parking graph. This procedure is illustrated in Formulas 10 and 11, wherein $L$ signifies the normalized Laplacian matrix, with $I$ and $D$ being the identity matrix and the degree matrix, respectively.
Then the algorithm iterates over different eigenvalue intervals, performs parking clustering on the corresponding eigenvectors $u$, groups parking lots with similar features into a single parking hypernode, and generates a preliminary coarsened graph $G'$ based on the clustering results; the clustering principle is shown in Equation 12.
The Laplacian eigenvectors $u$ computed from the ParkingRank attention matrix carry rich parking-graph topology information as well as parking lot node feature information, which provides a more accurate and intuitive characterization of parking lot nodes for the coarsening process.
The process then repeatedly calculates the spectral distance $SD$ between the Laplacian eigenvalues of the coarsened graph $G'$ and those of the original graph, selecting the coarsening outcome with the minimum error. Finally, the coarsened graph $G'$ along with the corresponding index matrix $Q$ is returned. This procedure is depicted in Formula 13, with the coarsening specifics provided in Algorithm 2.
Algorithm 2: Parking Graph Coarsening
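The clustering step of Algorithm 2 can be sketched as follows. This is a heavily simplified stand-in for SGC: it embeds nodes with the leading Laplacian eigenvectors and assigns them to hypernodes with a crude nearest-seed rule, whereas the real algorithm iterates over eigenvalue intervals and minimizes the spectral distance. The function name, seed choice, and toy graph are all hypothetical.

```python
import numpy as np

def coarsen(A, ratio=0.5):
    """Cluster nodes by their leading Laplacian eigenvectors and merge
    each cluster into a hypernode (a simplification of SGC)."""
    N = len(A)
    n_super = max(1, int(N * ratio))
    L = np.diag(A.sum(1)) - A                 # combinatorial Laplacian
    _, vecs = np.linalg.eigh(L)               # eigenvectors, ascending order
    emb = vecs[:, :n_super]                   # spectral node embedding
    # Crude clustering: assign each node to the nearest of n_super seed rows.
    seeds = emb[np.linspace(0, N - 1, n_super).astype(int)]
    labels = np.argmin(((emb[:, None] - seeds[None]) ** 2).sum(-1), axis=1)
    Q = np.zeros((n_super, N))                # index (assignment) matrix
    Q[labels, np.arange(N)] = 1
    A_c = Q @ A @ Q.T                         # aggregate edge weights
    np.fill_diagonal(A_c, 0)
    return A_c, Q

# Toy graph: two 3-node communities joined by one edge.
A = np.array([[0,1,1,0,0,0],
              [1,0,1,0,0,0],
              [1,1,0,1,0,0],
              [0,0,1,0,1,1],
              [0,0,0,1,0,1],
              [0,0,0,1,1,0]], float)
A_c, Q = coarsen(A, ratio=0.5)
print(A_c.shape)
```

The index matrix `Q` records which original lots each hypernode absorbs; it is exactly this mapping that the TCN-AE module in Section 3.4 consumes.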
3.4. Prediction Framework Based on Coarsened Parking Graphs
Since the coarsening process described above does not reduce the dimensionality of the parking lot node features, the feature data corresponding to each parking hypernode remains a concatenation of the merged parking lot data, formalized as follows:
If this is used directly as the feature input to the parking prediction models, it will not significantly improve the overall training efficiency of the framework and may even lose the intrinsic connections between nodes. We therefore adopt a symmetric TCN-AE, for the following reasons:
TCN is able to mine the intrinsic laws behind parking time-series data [44], such as tidal characteristics, which helps in the compression and reconstruction of the parking data.
The encoder is able to embed the high-dimensional sparse parking data into a low-dimensional dense tensor form, which reduces the computational overhead of training the model.
The decoder is able to achieve an approximately lossless restoration and can reconstruct the spatial structure of the original parking graph.
To maintain the consistency of the prediction results between the coarsened and the original parking graphs, we train a set of symmetric TCN-based AEs for each parking hypernode, utilizing the index matrix generated by the graph coarsening module. TCNs typically consist of multiple layers of dilated causal convolution, with each layer performing one-dimensional convolution. Moreover, to enhance the network’s expressive capacity, residual connections are often added following multiple convolutional layers, organizing these layers into several residual blocks. Each residual block primarily includes two dilated causal convolution layers, interspersed with normalization, activation, and dropout operations, before the output is added to the input of the next residual block. This design not only optimizes the flow of information but also enhances the model’s ability to capture features of parking data effectively.
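The residual-block structure described above can be sketched in NumPy for a single channel. The filter lengths, dilation, and the omission of normalization and dropout are simplifications; the causal left-padding is the essential idea.

```python
import numpy as np

def causal_dilated_conv(x, f, dilation=1):
    """1-D causal convolution: the output at time t uses only x[t],
    x[t - dilation], ..., never future steps."""
    k, T = len(f), len(x)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])   # left-pad to stay causal
    return np.array([sum(f[i] * xp[t + pad - i * dilation] for i in range(k))
                     for t in range(T)])

def residual_block(x, f1, f2, dilation):
    """Two dilated causal convolutions with a ReLU in between, plus a
    skip connection (normalization/dropout omitted for brevity)."""
    h = np.maximum(causal_dilated_conv(x, f1, dilation), 0)
    h = causal_dilated_conv(h, f2, dilation)
    return x + h

x = np.arange(8, dtype=float)
y = residual_block(x, f1=np.array([0.5, 0.5]),
                   f2=np.array([1.0, -1.0]), dilation=2)
print(y.shape)
```

Because of the causal padding, perturbing a future input never changes earlier outputs, which is what makes the block suitable for time-series encoding.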
The procedure is detailed as follows. As depicted in Figure 2, the parking data first undergoes encoding with the residual blocks within the encoder [44], as indicated in Formula 15. Because the number of features decreases as the number of residual blocks increases, the feature data corresponding to the coarsened graph can be represented within a low-dimensional feature space.
where $f$ denotes a filter of length $k$ and $*$ denotes the convolution operation. This formula ensures that only information prior to the current time step is considered, and future information is not used.
Next, we input the adjacency matrix $A'$ of the coarsened graph and the encoded feature matrix into a spatio-temporal graph convolutional model. This paper takes T-GCN [33] as an example, which captures the spatio-temporal dependencies of the coarsened parking graph through two layers of GCN and one layer of GRU, finally obtaining the coarsened graph's predicted results through a linear transformation, as seen in Formulas 16-20, where $W$ and $b$ represent the weights and biases of T-GCN, respectively; $r_t$, $u_t$, $c_t$, and $h_t$ denote the reset gate, update gate, candidate hidden state, and current hidden state at time $t$, respectively; $g(\cdot)$ denotes the graph convolution operation; and tanh is the activation function.
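A minimal NumPy sketch of one such recurrent step, in the spirit of Formulas 16-20, replaces the GRU's input transforms with graph convolutions. The toy adjacency, weight shapes, and random data are assumptions, and biases as well as the second GCN layer are omitted.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def graph_conv(A_hat, X, W):
    # One GCN layer: propagate over the normalized adjacency, then project.
    return A_hat @ X @ W

def tgcn_cell(A_hat, x_t, h_prev, Wr, Wu, Wc):
    """GRU cell whose linear transforms are graph convolutions
    (a simplification of the T-GCN update)."""
    xh = np.concatenate([x_t, h_prev], axis=1)
    r = sigmoid(graph_conv(A_hat, xh, Wr))    # reset gate
    u = sigmoid(graph_conv(A_hat, xh, Wu))    # update gate
    c = np.tanh(graph_conv(A_hat, np.concatenate([x_t, r * h_prev], 1), Wc))
    return u * h_prev + (1 - u) * c           # current hidden state

rng = np.random.default_rng(2)
N, F, H = 4, 2, 3
A_hat = np.full((N, N), 1.0 / N)              # toy normalized adjacency
Wr = rng.standard_normal((F + H, H)) * 0.1
Wu = rng.standard_normal((F + H, H)) * 0.1
Wc = rng.standard_normal((F + H, H)) * 0.1
h = np.zeros((N, H))
for _ in range(5):                            # unroll over 5 time slices
    h = tgcn_cell(A_hat, rng.random((N, F)), h, Wr, Wu, Wc)
print(h.shape)
```

The graph convolution mixes information across hypernodes at every step, while the gating carries the temporal state, which is the core of the T-GCN design.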
Because the dimensions of the coarsened graph's predicted results do not align with the size of the original parking graph, these results are not directly interpretable, and decoding is necessary. Based on the prediction length, we select a particular pre-trained decoder and input the prediction of each hypernode to reconstruct the original parking graph's prediction results, thereby fulfilling the prediction task for the entire urban parking graph. The decoding procedure is depicted in Formula 21.
Here, $f'$ denotes the filters within the decoder; in contrast to the encoding phase, the feature count in the decoder's residual blocks grows as the number of residual blocks increases, so as to reconstruct the parking data within a high-dimensional feature space.
The loss functions for the pre-trained TCN-AE and the parking prediction model are Mean Squared Error (MSE) and Huber Loss, respectively, as detailed in Formulas 22 and 23. In this context, $\hat{X}$ represents the outcome reconstructed by the autoencoder, $Y$ signifies the real observed data, and $\delta$ represents the threshold parameter of the Huber loss, which controls the switch between its quadratic and linear regimes.
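The two losses can be stated compactly; the following sketch uses the standard definitions of MSE and Huber loss (Formulas 22 and 23 are assumed to match these forms), with toy vectors for illustration.

```python
import numpy as np

def mse(y_hat, y):
    """Mean squared error, used for the pre-trained TCN-AE."""
    return np.mean((y_hat - y) ** 2)

def huber(y_hat, y, delta=1.0):
    """Huber loss for the prediction model: quadratic for small errors,
    linear for large ones; delta is the switching threshold."""
    err = np.abs(y_hat - y)
    quad = 0.5 * err ** 2
    lin = delta * (err - 0.5 * delta)
    return np.mean(np.where(err <= delta, quad, lin))

y = np.array([1.0, 2.0, 3.0])
y_hat = np.array([1.5, 2.0, 6.0])
print(mse(y_hat, y), huber(y_hat, y, delta=1.0))
```

On the outlier at the last position the Huber loss grows only linearly, which is why it is the more robust choice for the prediction stage.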