1. Introduction
Traffic congestion and frequent traffic accidents have become serious problems affecting urban traffic as cities develop rapidly [1]. The Intelligent Transportation System (ITS) has been proposed to alleviate these traffic problems. Location prediction of the vehicle trajectory is one of the research topics in ITS [2]. Effective location prediction can help alleviate traffic congestion, which is of great significance for optimizing traffic management. The goal of this task is to predict the next location according to the observed trajectory [3]. Deep learning methods have been widely used in vehicle trajectory location prediction. A vehicle trajectory is usually considered a time series since it is a sequence of trajectory points [1]. Long Short-Term Memory (LSTM) is mainly used to deal with time sequences. Li et al. [4] apply an LSTM to capture temporal features between vehicles and design a convolutional social pooling network to capture spatial features to predict vehicle trajectory. However, complex roads and surrounding buildings can also affect prediction. Therefore, how to integrate external features with vehicle trajectory to improve the accuracy and robustness of the model needs further research.
Knowledge graphs, GAT, and temporal attention mechanisms are among the latest methods used in the field of transportation and have shown better performance in improving the accuracy and robustness of models [5]. A knowledge graph can help fuse features. GAT can be used as a spatial attention mechanism to effectively extract spatial features, and a temporal attention mechanism can effectively extract temporal features. The application of knowledge graphs has developed rapidly in the field of transportation. A multi-layer traffic knowledge graph model is proposed by Wang [6] to realize destination prediction by modeling the complex traffic network, weighted summing, and fusing different node features. The driving intention in the traffic network is affected by the functional area and the surrounding Points of Interest (POI) [7]. Therefore, it is important to predict the location of vehicle trajectory by designing a traffic knowledge graph. The accuracy can be improved by modeling the interaction between the vehicle trajectory and the surrounding environment [7]. POI can be considered as the spatial feature and can be fused with trajectory points to improve prediction accuracy [8]. A trajectory may change when there are traffic jams or traffic accidents, and the POI knowledge around trajectory points has an impact on the future trend. By integrating POI knowledge and vehicle trajectory, the prediction can be improved to help optimize traffic management. Graph Attention Network (GAT) [9] and various temporal attention mechanisms [10] are also widely used to extract spatial and temporal features. A new GAT is proposed by Su et al. [11] to allocate attention to explain and express the spatial correlation of the road network, as well as to effectively capture the dynamic update of the adjacent transformation matrix. LSTM combined with the attention mechanism is proposed by Ali et al. [12] to capture temporal features. However, how to fuse surrounding environment features such as POI, and how knowledge graphs, GAT, and temporal attention mechanisms work together on the location prediction task, still lack research.
A Local Dynamic Graph Spatiotemporal-Long Short-Term Memory (LDGST-LSTM) model is proposed in this paper to predict the next location of vehicle trajectory. POI knowledge is extracted and fused by constructing a traffic knowledge graph. GAT and a temporal attention mechanism are combined to improve the accuracy, robustness, and interpretability of the prediction model. The main contributions of this paper are as follows.
A knowledge graph and a spatiotemporal attention mechanism are combined in this paper to predict the vehicle location at the next moment. POI is integrated with the historical trajectory, and the POI weights that affect the prediction are additionally visualized. Regions that have a great impact on the prediction are explored, and the interpretability of the model is enhanced.
A global traffic knowledge graph is constructed to learn and represent POI semantic information. POI nodes are considered as the entities, and the connections between POI are considered as the relationships. The representation vector of each node is obtained by the Translating Embedding (TransE) algorithm and is considered as the feature vector for vehicle location prediction.
A spatiotemporal attention mechanism is designed to allocate weights for spatial and temporal features, thus enhancing the interpretability and accuracy of the model. The weight distribution of spatial features is achieved through GAT to obtain the corresponding graph representation vector. LSTM combined with a multi-head attention mechanism is used to allocate weights to trajectory points at different timestamps to improve the prediction accuracy.
2. Related Work
This section reviews related work on vehicle trajectory location prediction, knowledge graphs, and attention mechanisms.
2.1. Vehicle Trajectory Location Prediction
Deep learning methods are widely used in location prediction of vehicle trajectory. A vehicle trajectory prediction model combining a deep encoder-decoder and a neural network is designed by Fan et al. [13]. Multiple segments are processed and corrected by the deep neural network model, which improves the accuracy of the prediction and realizes long-term prediction [14]. Kalatan and Farooq [15] propose a new multi-input LSTM to fuse sequence data with context information of the environment. It can reduce prediction error, but other external features that could improve the prediction performance are ignored. A Graph Convolutional Network (GCN) combined with its variant model is proposed by An et al. [16] to realize vehicle trajectory prediction in urban traffic systems. Through this combination, the model can effectively capture spatiotemporal characteristics, thereby improving the effectiveness of vehicle trajectory prediction.
Methods based on deep learning algorithms can effectively improve the prediction accuracy and reduce error. Through feature mining and extraction, a deep learning model can effectively improve its accuracy and robustness. However, the impact of surrounding buildings, that is, Points of Interest (POI), on prediction under complex traffic environments is usually ignored in existing research. Therefore, how to better integrate external factors such as POI features needs further research to help improve the prediction performance.
The efficiency and accuracy of vehicle trajectory location prediction can be improved by combining vehicle trajectory with the road network through data preprocessing. For example, original vehicle trajectory points are replaced with marked nodes or roads of the road network by Fan et al. [13] to realize location prediction. However, only the location information is considered in that method, and external factors such as driving intentions or preferences that have significant impacts on prediction results are ignored.
2.2. Knowledge Graph
The knowledge graph is one of the most popular representation methods. It describes entities and their relationships as structured triples, each composed of a head entity, a tail entity, and their relationship. A knowledge graph can be combined with deep learning models to learn the representation of the entities and relationships to realize prediction tasks. It can make contributions in the fields of medical treatment, national defense, transportation, and network information security [17,18].
Translating Embedding (TransE) is one of the typical knowledge graph embedding models [19]. A TransGraph model based on TransE is proposed by Chen et al. [20] to learn the structural features. Moreover, the knowledge graph is also widely used in machine learning. DGL-KE, an open-source tool that efficiently computes embedded representations of knowledge graphs, is proposed by Zheng et al. [21] to execute the TransE algorithm. This tool can speed up the training of entities and relationships in the knowledge graph through multi-processing, multi-GPU, and distributed parallelism. A traffic mode knowledge graph framework based on historical traffic data is proposed by Ji and Jin [22] to capture the traffic status and congestion propagation mode of road segments. This framework is a pioneering application of knowledge graphs to the prediction of traffic congestion propagation. Wang et al. [23] propose a method of embedding the temporal knowledge graph through a sparse transfer matrix. Based on static and temporal dynamic knowledge graphs, the global and local embedding knowledge are captured respectively, which alleviates the problem of inconsistent parameter scalability when the model learns embeddings from different datasets.
More semantic knowledge can be considered in the knowledge graph to enhance the interpretability of complex models [18]. In addition, how to better learn low-frequency entities and relationships, how to better integrate context into graph embedding learning, and how to combine the graph embedding algorithm with other algorithms are future research directions.
2.3. Attention Mechanism
Graph Attention Network (GAT) is widely used in the field of transportation to obtain the spatial correlation between nodes. Wang et al. [24] propose a trend GAT model to predict traffic flow. By constructing a spatial graph structure, the transmission among similar nodes is realized and the problem of spatial heterogeneity is addressed. A spatio-temporal multi-head GAT model is proposed by Wang and Wang [25] to realize traffic prediction. Spatial features are captured through a multi-head attention mechanism and temporal features are captured through a full-volume transformation linear gated unit. The spatiotemporal correlation between traffic flow and the traffic network is integrated to reduce the prediction error. Wang et al. [26] propose a dynamic GAT model to realize traffic prediction. In this model, spatial features are extracted by a node embedding algorithm based on the dynamic attention mechanism, and temporal features are extracted by a gated temporal CNN.
Temporal attention mechanisms are also widely applied in traffic research. The temporal correlation between research objects can be better learned by using a temporal attention mechanism. A temporal attention perception dual GCN is proposed by Cai et al. [27] to realize air traffic prediction. The historical flight and time evolution patterns are characterized through the temporal attention mechanism. Yan et al. [28] design a gated self-attention mechanism module to realize the interaction between the current memory state and the long-term relationship. Chen et al. [29] apply a multi-head attention mechanism to mine the features of fine-grained spatiotemporal dynamics. A one-dimensional convolution LSTM model based on the attention mechanism is proposed by Wang et al. [30] to realize traffic prediction. Multi-source data are fused as various features through the attention mechanism.
A spatiotemporal attention mechanism can obtain reasonable weights of the features under complex traffic environments, thereby improving the interpretability of the model. GAT can be considered as a spatial attention mechanism to capture spatial features, and the multi-head attention mechanism can be considered as a temporal attention mechanism to capture temporal features. Therefore, it is meaningful to combine GAT and temporal attention mechanisms for vehicle trajectory location prediction.
3. Problem Statement
In this section, some basic concepts and notations used in this paper are first introduced, and then the studied problem is formally defined.
Definition 1
(Raw trajectory ). A raw vehicle trajectory is usually represented by a sequence of points continuously sampled by the Global Positioning System (GPS). Given a raw trajectory dataset , the trajectory is defined as a sequence of sampled points , where , . A sampled point is defined as a tuple which represents the longitude and latitude of the GPS point at timestamp . Due to the different sampling rate, trajectory data may have the characteristics of irregular distribution, such as being sparse in one area while dense in another area.
Definition 2
(Point of Interest ). Point-Of-Interests (POI) can be denoted as spatial representation of urban infrastructure such as school, hospital, restaurant and so on. It can reflect land use and urban functional characteristics, and has potential semantic information. Its distribution influences the intentions of travel. POI can be classified into different types. Given a POI dataset with types , the type of POI is defined as a set of points , where . A POI is a five-tuple, which represents the semantic information including the name, address, longitude, latitude, as well as attribute of the POI. 11 types of POI information are considered in this paper, including shopping, entertainment, food, hotel, scenic spot, finance, government, company, hospitals, life services, and sports.
Definition 3
(Road network ). Vehicle trajectory combined with POI are matched to a global map in this paper. A road network is defined as a graph , where is the set of nodes, each denotes a road segment. is the set of edges connecting nodes and each denotes the connectivity between road segment and . is an adjacency matrix, and semantic information of various POI classes are considered as features. Each element in this matrix is a binary value, which is 1 when road segment is adjacent to and 0 otherwise.
Definition 4
(Normalized trajectory ). Raw trajectory will be processed through a data conversion layer in this paper and converted into a normalized trajectory . Due to the measurement error, it is not appropriate to directly input raw trajectory to the model. Given a raw trajectory dataset , the trajectory . Match each point to the nearest node of road network. A normalized point is defined as a tuple which represents the longitude and latitude of node. A normalized trajectory is then defined as a sequence of road segments after projection.
Definition 5
(Normalized POI ). Match each POI to the nodes of road network which has the shortest projection distance. A normalized POI is defined as the corresponding road segment after projection and POI semantic information is assigned to the matched nodes as node features. refers to the normalized POI of type and represents semantic information of the normalized POI of type.
Definition 6
(POI Knowledge graph ). Defining a knowledge graph , where is the POI entity set, where is the number of the POI entities. is the POI relationship set, where is the number of the POI relationship. is a triple set, which refers to head entity, relation, and tail entity respectively.
Problem (Vehicle trajectory location prediction). Given a raw trajectory dataset and a road network , for current point of the trajectory , the task aims to predict the next location , which is a tuple consisting of the longitude and latitude of GPS point at timestamp .
4. Methodology
A vehicle trajectory is a sequence of continuously sampled points, and its intention is closely related to POI. The semantic information of POI near each trajectory point may have an impact on the next location prediction. Therefore, a Local Dynamic Graph Spatiotemporal-Long Short-Term Memory model (LDGST-LSTM) is proposed in this paper to realize location prediction for vehicle trajectory and explore the impact weight of the nearby POI.
Raw vehicle trajectory and POI points are first matched to the traffic network through the data conversion layer in the proposed model. Then a global POI knowledge graph is constructed to obtain the representation vectors of the POI entities in the global POI knowledge extraction layer. Based on the global knowledge graph, a local graph related to each trajectory point is generated and the graph representation vector is captured through GAT in the local dynamic graph generation module. Finally, trajectory points with the related graph representation vectors are input into LSTM with a multi-head attention mechanism in the trajectory prediction layer to predict the next location.
In this section, details of the proposed LDGST-LSTM model are provided. The model consists of four major components: a data conversion layer, a global POI knowledge extraction layer, a local dynamic graph generation module, and a trajectory prediction layer. The overall framework is first described in Section 4.1. Then, each component of the model is introduced in Section 4.2 to Section 4.5.
4.1. Overall Framework
The overall framework of LDGST-LSTM is shown in
Figure 1. There are four major components in the proposed model: (1) a data conversion layer, (2) a global POI knowledge extraction layer, (3) a local dynamic graph generation module, and (4) a trajectory prediction layer. Additionally, the local dynamic graph generation module is contained in the trajectory prediction layer. The observed vehicle trajectory
is considered as the input of the model, and the predicted next location
of the trajectory at timestamp
is considered as the output.
As shown in
Figure 1, vehicle trajectory
is considered as the input of the model. Firstly, normalized trajectory
and normalized POI
are obtained through map matching and the proximity algorithm, respectively, in the data conversion layer. Then the representation vector
of every POI is trained through TransE algorithm based on the knowledge graph in the global POI knowledge extraction layer. It will be considered as semantic features of trajectory points. In the trajectory prediction layer, local graphs
related to trajectory points
are first generated, and the corresponding graph representation vectors
are obtained through GAT in the local dynamic graph generation module. Normalized trajectory points
and corresponding graph representation vectors
are then concatenated and input into LSTM with multi-head attention mechanism. The model finally outputs the predicted next location
of the trajectory. The overall framework can be denoted by the following formula (1)-(4), where
.
4.2. Data Conversion Layer
This section will introduce road network data, trajectory data conversion and POI data conversion specifically.
4.2.1. Road Network
The urban road network data used in this paper is downloaded from OpenStreetMap (OSM), an open-source map website. Taking Chengdu as an example, the road network is shown in Figure 2. Only the roads and nodes within the Third Ring Road of Chengdu are retained as the research scope, which mainly represents the urban district.
4.2.2. Trajectory Data Conversion
In the data conversion layer, raw trajectory data is converted into normalized trajectory. Each point is represented by the latitude and longitude of the matched road network node. The trajectory coordinates are converted from the original coordinate system (GCJ-02) to the road network coordinate system (WGS84) in this paper. Based on the previous map-matching work of Lou et al. [31], a certain threshold is used as the radius to search for candidate points, and the candidate point with the shortest distance is selected as the normalized trajectory point. Trajectory points that are not successfully matched to a node are deleted as redundant points.
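The nearest-node matching described above can be sketched as follows. This is a minimal illustration and not the map-matching procedure of [31]: the function names, the 50 m search radius, and the node coordinates are assumptions made for the example, and a brute-force search stands in for whatever spatial index is used in practice.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def normalize_trajectory(points, nodes, radius_m=50.0):
    """Snap each (lon, lat) point to the nearest node within radius_m.

    Points with no candidate node inside the radius are dropped, mirroring
    the removal of unmatched points described in the text.
    """
    normalized = []
    for lon, lat in points:
        best_id, best_d = None, radius_m
        for node_id, (nlon, nlat) in nodes.items():
            d = haversine_m(lon, lat, nlon, nlat)
            if d <= best_d:
                best_id, best_d = node_id, d
        if best_id is not None:
            normalized.append(best_id)
    return normalized


# Hypothetical road nodes and raw GPS points (already in WGS84).
nodes = {"a": (104.0600, 30.6600), "b": (104.0700, 30.6600)}
points = [(104.0601, 30.6600), (104.0699, 30.6601), (104.2000, 30.9000)]
print(normalize_trajectory(points, nodes))
```

The third point lies far outside the search radius of every node, so it is discarded rather than forced onto the network.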
4.2.3. POI Data Conversion
POI data of Chengdu in 2018 is used in this paper. The proximity algorithm is used in ArcGIS to search and select the nearest nodes in the road network for each POI point based on distance. The semantic information of POI points is allocated to the corresponding nodes as the normalized POIs. As shown in
Figure 3, POI points of different types are matched to the road network nodes, taking life services, food, and hospitals as examples.
4.3. Global POI Knowledge Extraction Layer
A global graph
and a knowledge graph
are constructed in the global POI knowledge extraction layer, as shown in
Figure 4. The entity set, relationship set, and related triples are defined to learn the representation vector of each POI entity through the TransE algorithm. Each node in the global graph carries the related POI knowledge assigned in the data conversion layer.
As shown in
Figure 4, the normalized POI is considered as the entity, and the link between the normalized POI is considered as the relationship, which in this paper denotes that there is a connection between two POI nodes. Moreover, the triplet
is considered as the training set of TransE algorithm, where
represents the head entity,
represents the tail entity, and
represents the relationship;
and
belong to the entity set
, and
belongs to the relationship set
. The TransE algorithm treats the relationship as a translation from the head entity to the tail entity, that is, it aims to make
equal to
as much as possible. The potential energy of a triplet is defined by the L2 norm of the difference between
and
, as shown in formula (5), where
is the number of the triplets, and
represents the
triplet.
Incorrect triplets are constructed and used as negative samples in the TransE algorithm through uniform sampling. A negative sample is generated by randomly replacing any one element of a positive sample with another entity or relationship. Training reduces the potential energy of the positive samples and increases the potential energy of the negative samples. The objective function is defined as below.
where is the set of positive samples, and is the set of negative samples. is a constant margin, usually set to 1, that represents the distance between positive and negative samples. and are the triplets of positive and negative samples, respectively. is the potential energy function and is .
The distributed representation vector of the current head and tail entities are considered as the representation vectors
. Pseudocode of the global POI knowledge extraction layer is shown in
Table 1.
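The energy function and margin-based objective described in this section can be illustrated with the sketch below. It follows the standard TransE formulation and is not the pseudocode of Table 1. The names `energy`, `corrupt`, and `margin_loss`, the two-dimensional toy embeddings, and the choice to corrupt only the head or the tail entity are assumptions made for the example.

```python
import math
import random


def energy(h, r, t):
    """L2 norm of (h + r - t); a low energy means the triple is plausible."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def corrupt(triple, entities, rng):
    """Build a negative sample by replacing the head or the tail entity."""
    h, r, t = triple
    if rng.random() < 0.5:
        return (rng.choice([e for e in entities if e != h]), r, t)
    return (h, r, rng.choice([e for e in entities if e != t]))


def margin_loss(pos, neg, ent_vec, rel_vec, margin=1.0):
    """Hinge loss max(0, margin + E(pos) - E(neg)) for one sample pair."""
    e_pos = energy(ent_vec[pos[0]], rel_vec[pos[1]], ent_vec[pos[2]])
    e_neg = energy(ent_vec[neg[0]], rel_vec[neg[1]], ent_vec[neg[2]])
    return max(0.0, margin + e_pos - e_neg)


# Toy embeddings for three hypothetical POI entities and one relationship.
ent_vec = {"food": [0.0, 0.0], "hotel": [1.0, 0.0], "bank": [0.0, 3.0]}
rel_vec = {"connected": [1.0, 0.0]}
pos = ("food", "connected", "hotel")
neg = corrupt(pos, list(ent_vec), random.Random(0))
print(neg, margin_loss(pos, neg, ent_vec, rel_vec))
```

The loss is zero once the negative sample's energy exceeds the positive sample's energy by at least the margin, so only violating pairs contribute to the gradient during training.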
4.4. Local Dynamic Graph Generation Module
Local graphs are generated for each trajectory point in the local dynamic graph generation module. Graph Attention Network (GAT) is used as a spatial attention mechanism to allocate weights and update the parameters of every trajectory point and its neighbors. The corresponding graph representation vector can be obtained through GAT.
As shown in
Figure 5, every node in the global graph is embedded with the POI representation vector
based on the POI knowledge graph in
Section 4.3. The feature matrix
is constructed by all the embeddings of nodes,
, where
is the number of the graph nodes,
is the embedding dimension of the feature,
represents the feature matrix of the
trajectory point of the
trajectory. The related local graphs
are generated for each normalized trajectory point
in the normalized trajectory
, where
is the set of the current point
and its neighbor nodes,
is the set of edges among the current point and its neighbors,
is the local adjacency matrix of the current point, and it includes the features of both the current point and its neighbor nodes.
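As an illustration of how a local graph can be cut out of the global graph, the sketch below collects a node and its one-hop neighbors and slices the adjacency and feature matrices accordingly. The function name `local_graph`, the list-of-lists matrix layout, and the restriction to one-hop neighbors are assumptions of this example.

```python
def local_graph(adj, features, center):
    """Return (node ids, local adjacency, local features) for a node.

    adj is a square 0/1 adjacency matrix of the global graph and features
    holds one embedding per node. The centre node is listed first.
    """
    nodes = [center] + [j for j, linked in enumerate(adj[center]) if linked and j != center]
    local_adj = [[adj[u][v] for v in nodes] for u in nodes]
    local_feat = [features[u] for u in nodes]
    return nodes, local_adj, local_feat


# Toy global graph: a path 0 - 1 - 2 - 3 with one-dimensional features.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
features = [[0.1], [0.2], [0.3], [0.4]]
print(local_graph(adj, features, center=1))
```

Because the local graph is rebuilt for every trajectory point, its size follows the degree of the matched road node, which is what makes the graph "dynamic" along a trajectory.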
GAT is used to calculate the attention weights between each trajectory point and its neighbor nodes and to fuse the features of the local graphs. The adjacency matrix is used to check whether nodes are connected, and attention is allocated by calculating the weights of the neighbor nodes in GAT. It can be considered as a spatial attention mechanism that enhances the interpretability of the proposed model. The definition of the attention mechanism is shown in formula (7).
where
is the original information, and it is formed by the
pairs.
represents the information extraction through weight allocation from the
under the condition of
. The aim of the GAT is to learn the relevance among target nodes and their neighbor nodes through the parameter matrix. In this paper,
is set as the feature vector
of the current point
,
is set as the feature vectors of all the neighbor nodes of
,
and
are respectively the
neighbor node and its feature vector. The relevance coefficients among every trajectory point and its neighbor nodes are calculated in GAT.
Where
represents concatenation,
is the feature vector of the
point of the
trajectory, and
is the feature vector of the
neighbor node of the
point.
is a learnable weight parameter.
is a learnable weight of the linear layer and
is the activation function. Moreover, all the coefficients are normalized by the function
in GAT to obtain the attention weight.
is the attention weight between node
and its
neighbor node, where
denotes the neighborhood of node
in the
trajectory.
The aggregated and updated feature vector is then calculated, as shown in formula (10). Multi-head attention mechanism is used to enhance the feature fusion of the neighbor nodes.
where
is the number of the heads,
is the activation function and
is the graph representation vector. Pseudocode of the local dynamic graph generation module is shown in
Table 2.
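The attention computation described above (relevance coefficients, softmax normalization, and weighted aggregation) can be sketched for a single head as follows. This follows the standard GAT formulation and is not the pseudocode of Table 2. The parameter names `W` and `a`, the LeakyReLU slope of 0.2, and the omission of multiple heads and of the output activation are assumptions made to keep the example short.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def gat_attention(h_center, h_neighbors, W, a):
    """Softmax-normalised attention weights of one node over its neighbours."""
    wh_i = matvec(W, h_center)
    scores = []
    for h_j in h_neighbors:
        concat = wh_i + matvec(W, h_j)  # list concatenation [W h_i || W h_j]
        scores.append(leaky_relu(sum(ak * ck for ak, ck in zip(a, concat))))
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(h_neighbors, W, alphas):
    """Attention-weighted sum of transformed neighbour features."""
    out = [0.0] * len(W)
    for alpha, h_j in zip(alphas, h_neighbors):
        for k, v in enumerate(matvec(W, h_j)):
            out[k] += alpha * v
    return out


# Toy parameters: identity transform, and an attention vector that scores
# each neighbour by the first component of its transformed feature.
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.0, 0.0, 1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0]]
alphas = gat_attention([1.0, 0.0], neighbors, W, a)
print(alphas, aggregate(neighbors, W, alphas))
```

With several heads, each head would have its own `W` and `a`, and the per-head outputs would be concatenated or averaged before the activation function is applied.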
4.5. Trajectory Prediction Layer
In the trajectory prediction layer, trajectory points and the corresponding graph representation vectors are taken as input, and the coordinates of the next location are obtained after passing through an LSTM with a multi-head attention mechanism, as shown in Figure 6.
As shown in
Figure 6, the trajectory points
and the corresponding graph representation vectors
are concatenated and input into LSTM, as shown in formula (11)-(17).
where
,
,
and
are respectively the weight of the forget gate, the input gate, the cell state, and the output gate.
,
,
and
are the corresponding bias. The trajectory points and corresponding graph representation vectors are concatenated as the input
.
goes through the forget, input, and output gates, and generates the cell state
. Necessary information is processed by the input gate and the updated information is activated by the function
to obtain the
. The current cell state
is shown in formula (16), where
and
are respectively the output of the forget gate and the input gate.
is the cell state of the previous vector and
is the activated state updated by the input gate. The hidden state
of the current vector is obtained by multiplying the activated cell state
and the output of the output gate
.
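The gate computations described above correspond to one step of a standard LSTM cell, which can be sketched as below. The function `lstm_step`, the `params` dictionary keyed by gate name, and the convention that every weight matrix acts on the concatenation of the previous hidden state and the current input are assumptions of this example and may differ in detail from formulas (11)-(17).

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, b, v):
    """Compute W v + b for a matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bk for row, bk in zip(W, b)]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step; params maps 'f', 'i', 'c', 'o' to (W, b) pairs."""
    z = h_prev + x  # list concatenation [h_{t-1}, x_t]
    f = [sigmoid(v) for v in affine(*params["f"], z)]          # forget gate
    i = [sigmoid(v) for v in affine(*params["i"], z)]          # input gate
    c_tilde = [math.tanh(v) for v in affine(*params["c"], z)]  # candidate state
    o = [sigmoid(v) for v in affine(*params["o"], z)]          # output gate
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, c_tilde)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


# Hidden size 1, input size 2, all weights and biases zero: every gate
# outputs 0.5 and the candidate state is 0, so c = 0.5 * c_prev.
params = {k: ([[0.0, 0.0, 0.0]], [0.0]) for k in "fico"}
h, c = lstm_step([0.3, -0.1], [0.0], [2.0], params)
print(h, c)
```

In the proposed model the input `x` would be the concatenation of a normalized trajectory point and its graph representation vector.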
A multi-head attention mechanism is used based on LSTM to allocate weight and enhance the interpretability of the proposed model.
Where , , . is the hidden state of the current vector and is considered as the query, denotes the hidden states of the vectors and is considered as the key and value, , , , are learnable weight parameters, is the dimension, is the number of the heads.
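A simplified view of this temporal attention step is sketched below: the last hidden state acts as the query and all hidden states act as keys and values under scaled dot-product attention. To stay short, the sketch omits the learnable projection matrices and forms heads by slicing the vectors into equal parts. Both simplifications, and the function names, are assumptions of this example.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    context = [sum(w * v[k] for w, v in zip(weights, values)) for k in range(len(values[0]))]
    return context, weights


def multi_head(query, states, n_heads):
    """Attend separately on n_heads equal slices and concatenate the results."""
    size = len(query) // n_heads
    out, all_weights = [], []
    for h in range(n_heads):
        sl = slice(h * size, (h + 1) * size)
        sliced = [s[sl] for s in states]
        ctx, w = attention(query[sl], sliced, sliced)
        out.extend(ctx)
        all_weights.append(w)
    return out, all_weights


# Two hypothetical hidden states of dimension 4; the last one is the query.
states = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
context, weights = multi_head(states[-1], states, n_heads=2)
print(context, weights)
```

The per-timestamp weights returned for each head are the quantities that can be inspected to see which past trajectory points the prediction relies on.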
The predicted result is obtained by adding a multilayer perceptron composed of two fully connected layers.
where
is the predicted result.
,
,
, and
are respectively the weights and biases of the two fully connected layers.
is the output calculated by the multi-head attention mechanism. Mean Square Error (MSE) is used as the loss function to measure the difference between the predicted results and the ground truth, where
is the number of trajectories and
is the length of the trajectory.
The pseudocode of the trajectory prediction layer is shown in
Table 3.
5. Experiments
This section presents the details of the experiments, including datasets, experimental settings, and result analysis. Accuracy and robustness experiments are conducted to evaluate the performance of the proposed model against the benchmarks. Additionally, an ablation experiment is conducted to explore the effectiveness of the proposed model by removing the spatial and temporal attention mechanisms. Moreover, the POI weights that influence the prediction results are visualized on the map to demonstrate the significance of urban functional regions to vehicle trajectories.
5.1. Datasets
Trajectory dataset: Chengdu taxi trajectory data from October 2018, obtained from Didi Gaiya, is used in this paper. It includes the attributes driver ID, order ID, timestamp, longitude, and latitude. The dataset covers one month and contains over 380k trajectories, of which 270k were generated on holidays and 110k on working days. Spatially, the dataset mainly covers the urban area of Chengdu, from 30.65283°N to 30.72649°N and from 104.04210°E to 104.12907°E. Each trajectory is sampled every 4 s, which gives a relatively high data density on the city road network.
Road Network dataset: POI and trajectory information are combined to construct a global traffic graph. The road network data of Chengdu is downloaded from OSM. The node vector data is composed of road node ID and its corresponding longitude and latitude, and each road vector data is composed of road ID and sequence of the node coordinates.
POI dataset: POI data of 11 categories of Chengdu in AutoNavi map is obtained from the Chinese Software Developer Network (CSDN). The original data format is POI name, longitude and latitude coordinates, address, district, and POI class.
5.2. Experimental Settings
The experimental settings are as follows. The CPU is an AMD Ryzen 7 5800H and the GPU is an NVIDIA GeForce RTX 3060. Windows 10 is used as the operating system, Python as the development language, and PyTorch as the deep learning framework. The parameter settings are as follows. The input data are divided into training, validation, and test sets in the proportion 7:2:1. The batch size is set to 16, the initial learning rate is set to 0.0001, and Adam is used as the optimizer in the training process.
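The 7:2:1 division can be sketched as below. The function name, the fixed random seed, and the decision to shuffle before splitting are assumptions of this example, since the text does not state how the split is drawn.

```python
import random


def split_7_2_1(items, seed=0):
    """Shuffle items and split them into training, validation and test sets."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 7 // 10  # integer arithmetic avoids float rounding
    n_val = len(items) * 2 // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test


# Hypothetical trajectory ids 0..99.
train, val, test = split_7_2_1(range(100))
print(len(train), len(val), len(test))
```

Any remainder left by the integer division is assigned to the test set, so every item ends up in exactly one subset.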
5.2.1. Benchmark Models
Five benchmark models are used for the comparative experiments: LSTM [32], GRU [33], BiLSTM [34], AttnLSTM [35], and AttnBiLSTM [36].
LSTM: It is usually applied in time series prediction. Compared with the traditional RNN models, LSTM makes up for the defect that traditional RNN models only consider the recent state and cannot deal with long-term memory. It can determine which states are retained or forgotten through the internal cell state and forget gate. Moreover, LSTM can avoid some gradient vanishing problems of traditional RNN models.
GRU: Compared with LSTM, GRU has only two gates and three fully connected layers, which reduces the amount of computation to a certain extent and lowers the risk of overfitting. GRU has an update gate and a reset gate, which determine what information is output and discard irrelevant information while retaining historical information.
BiLSTM: Based on LSTM, BiLSTM processes the input sequence in both the forward and backward directions, so each timestamp in the input sequence has access to both past and future information.
AttnLSTM: An attention mechanism can assign greater weight to more important information when computation is limited. It is a resource allocation scheme for dealing with information overload. Combining LSTM with an attention mechanism can improve model performance by increasing the weight of the most important features or filtering out useless feature information.
AttnBiLSTM: BiLSTM combined with the attention mechanism fuses information through bidirectional propagation and by increasing the weight of important features. It can improve the computational efficiency and accuracy of the model when abundant semantic information is available.
5.2.2. Evaluation Metrics
The evaluation metrics used in this paper are Mean Absolute Error (MAE), Mean Square Error (MSE), Root Mean Square Error (RMSE), Haversine function (HSIN), and Accuracy. The definitions and formulas of these metrics are given as follows, where $y_i$ and $\hat{y}_i$ are the true and predicted values respectively, and $n$ is the number of predictions.
MAE: It represents the average of the absolute error between the real and the predicted value, which can be calculated as follows.
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
MSE: It represents the expected value of the square of the difference between the predicted and the real value, which can be calculated as follows.
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$
RMSE: It represents the sample standard deviation of the difference between the real and the predicted value. RMSE reflects the degree of dispersion of the sample. For nonlinear model fitting, the smaller the RMSE is, the better the model fits. Compared with MAE, RMSE penalizes large differences more heavily. It can be calculated as follows, where $\mathrm{SSE} = \sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$ represents the Sum of the Squares of Error.
$$\mathrm{RMSE} = \sqrt{\frac{\mathrm{SSE}}{n}} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
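HSIN: It represents the distance between the predicted point and the ground truth. The smaller the distance error, the higher the accuracy of the model prediction. It can be calculated as follows, where $(\lambda_1, \varphi_1)$ is the true longitude and latitude, $(\lambda_2, \varphi_2)$ is the predicted longitude and latitude, and $R$ is the radius of the Earth.
$$\mathrm{HSIN} = 2R\arcsin\left(\sqrt{\sin^2\left(\frac{\varphi_2 - \varphi_1}{2}\right) + \cos\varphi_1\cos\varphi_2\sin^2\left(\frac{\lambda_2 - \lambda_1}{2}\right)}\right)$$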
Accuracy: It indicates the proportion of accurately predicted outputs in the total output. The higher the accuracy, the better the model training effect. It can be calculated as follows, where $N_c$ is the number of correctly predicted samples, $N_e$ is the number of incorrectly predicted samples, and $N = N_c + N_e$ represents the total number of objects.
$$\mathrm{Accuracy} = \frac{N_c}{N_c + N_e} = \frac{N_c}{N}$$
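The metrics described in this section can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than the evaluation code used in the experiments: the function names, the Earth radius constant, the use of kilometres, and the reading of Accuracy as correct / (correct + incorrect) are all assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius; the paper does not state a value


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Square Error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error, i.e. sqrt(SSE / n)."""
    return math.sqrt(mse(y_true, y_pred))


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two (longitude, latitude) points
    given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def accuracy(n_correct, n_incorrect):
    """Share of correct predictions (assumed form: correct / total)."""
    return n_correct / (n_correct + n_incorrect)


# Hypothetical values for illustration only.
y_true = [1.0, 2.0, 3.0]
y_pred = [2.0, 2.0, 5.0]
errors = (mae(y_true, y_pred), mse(y_true, y_pred), rmse(y_true, y_pred))
```

Because RMSE squares each error before averaging, the single error of 2 in the hypothetical example dominates it more than it dominates MAE, which matches the remark that RMSE penalizes large differences more heavily.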
5.3. Result Analysis
The accuracy experiment and the robustness experiment are discussed in this paper. The LDGST-LSTM model is compared with the benchmark models to explore its performance in terms of accuracy and robustness. The ablation experiment is discussed to verify the contribution of the spatial and temporal attention mechanisms of the proposed model and to enhance interpretability. Moreover, the POI weights calculated by the temporal attention mechanism are visualized on the map. The POI is considered as a feature of the vehicle trajectory, and the influence of the POI on the vehicle position prediction is revealed.
5.3.1. Accuracy Experiment
All models are trained, and the accuracy comparison results on the training set are shown in Figure 7. It can be seen that on both holidays and working days, the accuracy of LDGST-LSTM is significantly higher than that of the benchmarks.
As shown in Figure 7a,b, the accuracy of the proposed model on both holidays and working days is nearly 4.5 times higher than that of the benchmarks on the October 2018 dataset. As shown in Figure 7c,d, the accuracy of the proposed model on both weekends and working days is nearly 3.5-4.5 times higher than that of the benchmarks on the August 2014 dataset. Therefore, the accuracy of location prediction can be effectively improved by considering POI knowledge and combining GAT with the temporal attention mechanism.
The accuracy comparison results of different models are shown in Table 4.
In the October 2018 dataset, the accuracy of LDGST-LSTM is 21% higher compared to LSTM during holidays and 61% higher compared to Attn-LSTM during weekdays. In the August 2014 dataset, LDGST-LSTM improved by 64% compared to Attn-BiLSTM on weekends. Therefore, compared to the benchmarks, the model proposed in this paper can improve performance by selecting appropriate POI categories on weekdays and holidays, respectively. More importantly, integrating POI knowledge and combining spatial and temporal attention mechanisms can greatly improve the accuracy of vehicle trajectory location prediction.
5.3.2. Robustness Experiment
The results of the robustness experiment are discussed in this section. The convergence speed and the metrics after convergence of the proposed model and the benchmarks are compared to analyze the robustness. The convergence results are shown in Figure 8.
As shown in Figure 8a,b, the loss of LDGST-LSTM drops slowly compared with the benchmark models before the 200th iteration on the October 2018 dataset. The convergence of the proposed model becomes faster from the 250th to the 300th iteration, and the model converges at around the 350th iteration. As shown in Figure 8c,d, the loss of LDGST-LSTM also drops slowly compared with the benchmark models before the 200th iteration on the August 2014 dataset. The convergence of the proposed model becomes faster from the 250th to the 350th iteration, and the model converges at around the 450th iteration.
The performance of the different models on MAE, MSE, RMSE and HSIN is shown in Table 5. As shown in Table 5a,b, the proposed model achieves the best values on the evaluation metrics compared with the benchmarks on the October 2018 dataset. On holidays, the MAE, MSE, RMSE and HSIN of the proposed model are respectively 7.73%, 28.57%, 11.11% and 15.72% lower than those of LSTM, which is the most robust among the benchmarks. On working days, the MAE, MSE, RMSE and HSIN of the proposed model are respectively 6.19%, 33.33%, 27.27% and 41.79% lower than those of GRU, which is the most robust among the benchmarks. As shown in Table 5c,d, the proposed model still achieves the best values on the evaluation metrics on the Chengdu August 2014 dataset. On weekends, the MAE, MSE, RMSE and HSIN of the proposed model are respectively 6.09%, 33.33%, 15.64% and 41.66% lower than those of GRU. On working days, the MAE, MSE, RMSE and HSIN of the proposed model are respectively 4.88%, 28.57%, 18.55% and 15.99% lower than those of LSTM. Therefore, it can be seen that the robustness of the proposed model is the best compared with all the benchmark models.
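Statements of the form "X% lower than" are read here as relative reductions with respect to the benchmark value; this reading is an assumption, since the text does not state how the percentages were derived. A minimal sketch with hypothetical numbers (not taken from Table 5) is:

```python
def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`,
    i.e. (baseline - proposed) / baseline * 100."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - proposed) / baseline * 100.0


# Hypothetical error values for illustration only.
benchmark_error = 0.40
proposed_error = 0.30
reduction = relative_reduction(benchmark_error, proposed_error)
```

Under this reading, an error that drops from 0.40 to 0.30 is reported as 25% lower, regardless of the absolute scale of the metric.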
5.3.3. Ablation Experiment
The ablation experiment is discussed to analyze the effect of the major components of the proposed model by removing GAT and the temporal attention mechanism. In addition to the proposed model, three ablation models are analyzed: (1) Local Dynamic Graph Convolutional Network-Long Short-Term Memory (LDGCN-LSTM), (2) Local Dynamic Graph Convolutional Network-Temporal Attention Long Short-Term Memory (LDGCN-TAttnLSTM), and (3) Local Dynamic Graph Attention-Long Short-Term Memory (LDGAT-LSTM).
As shown in Figure 9, the accuracy of LDGST-LSTM is higher than that of the three ablation models on both datasets. The accuracy of LDGST-LSTM is almost the same as that of LDGAT-LSTM. A possible reason is that on holidays, the intention of taxis is more focused on some functional regions, so spatial features outweigh temporal features in importance.
The ablation results are shown in Figure 6. As shown in Figure 6a,b, the evaluation metrics of LDGST-LSTM are the best among the ablation models on the 2018 dataset. On holidays, the MAE, MSE, RMSE and HSIN of the proposed model are respectively 9.91%, 28.57%, 12.97% and 15.46% lower, and the accuracy is 4.17% higher, than those of LDGAT-LSTM, which performs best among the ablation models. On working days, the MAE, MSE, RMSE and HSIN of the proposed model are respectively 3.9%, 14.29%, 22.18% and 20.12% lower, and the accuracy is 40% higher, than those of LDGAT-LSTM. As shown in Figure 6c,d, LDGST-LSTM also performs best on the 2014 dataset. On weekends, the MAE, MSE, RMSE and HSIN of LDGST-LSTM are respectively 25.40%, 33.33%, 18.31% and 18.52% lower, and the accuracy is 7.69% higher, than those of LDGAT-LSTM, which performs best among the ablation models. On working days, the MAE, MSE, RMSE and HSIN of LDGST-LSTM are respectively 10.55%, 28.57%, 11.81% and 10.42% lower, and the accuracy is 3.64% higher, than those of LDGAT-LSTM. In conclusion, the proposed model performs best compared with the other ablation models. Therefore, the combination of GAT and the temporal attention mechanism can enhance interpretability and also improve the accuracy and robustness of the model.
5.3.4. POI Weights Visualization
The predicted coordinates of the next location and the corresponding weights are calculated by the proposed model. The POI weights are visualized through kernel density analysis in ArcGIS.
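The idea of a weighted density surface can be sketched as follows. This is only an illustration of the principle: it assumes a Gaussian kernel, planar coordinates and a hand-picked bandwidth, whereas the ArcGIS tool applies its own kernel and bandwidth settings. The function name `weighted_kde` and the sample points and weights are hypothetical.

```python
import math


def weighted_kde(points, weights, query, bandwidth):
    """Weighted Gaussian kernel density at `query`.

    points    -- list of (x, y) POI locations in planar coordinates
    weights   -- one non-negative weight per point (e.g. an attention weight)
    query     -- (x, y) location at which the density is evaluated
    bandwidth -- kernel standard deviation, in the same unit as the coordinates
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    density = 0.0
    for (x, y), w in zip(points, weights):
        d2 = (x - query[0]) ** 2 + (y - query[1]) ** 2
        density += w * norm * math.exp(-d2 / (2.0 * bandwidth ** 2))
    return density


# Two hypothetical POIs: one with a high weight, one with a low weight.
pois = [(0.0, 0.0), (10.0, 10.0)]
poi_weights = [0.9, 0.1]
near_heavy = weighted_kde(pois, poi_weights, query=(0.0, 0.0), bandwidth=1.0)
near_light = weighted_kde(pois, poi_weights, query=(10.0, 10.0), bandwidth=1.0)
```

Evaluating such a function on a regular grid yields a heat map in which regions around highly weighted POIs appear as hot spots.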
The visualization of the POI weights is of positive significance for vehicle trajectory planning, traffic optimization and vehicle location prediction. As shown in Figure 10, there are some regions where POI information affects the vehicle location prediction, for example, the southwestern regions in Figure 10a,c, and the right and top sides in Figure 10b,d. It can be seen that on holidays and weekends, POI in the southwestern regions can be considered important information that influences the vehicle trajectories. This may indicate that these regions are close to the center of the city and have high traffic flow on holidays and weekends. Therefore, it is important to plan the driving path in these regions on holidays and weekends. Moreover, the POI regions that influence the location prediction are more dispersed on working days, and the trajectory decision can be more flexible.
6. Conclusions
A Local Dynamic Graph Spatiotemporal-Long Short-Term Memory (LDGST-LSTM) model is proposed in this paper to predict the next location of a vehicle trajectory. The data conversion layer, the POI global knowledge extraction layer, the local dynamic graph generation module and the trajectory prediction layer are the major components of the proposed model. Raw taxi trajectories and POI semantic information are matched to the road network through a map matching algorithm and a proximity algorithm in the data conversion layer. The representation vectors of POI are learned through the TransE algorithm by constructing a knowledge graph in the POI global knowledge extraction layer. Based on the global knowledge graph, a local graph related to each trajectory point is generated, and the graph representation vector is captured through GAT in the local dynamic graph generation module. Finally, the trajectory points with the related graph representation vectors are input into an LSTM with a multi-head attention mechanism in the trajectory prediction layer to predict the next location.
However, this paper has limitations. Firstly, because of the non-uniform sampling of GPS and the existence of GPS signal shielding areas, accurate trajectory recovery is a direction for future research. Moreover, only POI knowledge is considered as the external feature in this paper, while the vehicle trajectory can also be affected by other external features, such as morning and evening peaks and weather. Therefore, it is essential to integrate multi-source external features to predict the next location. Furthermore, the scenario in this paper is macro-level traffic roads; more specific traffic scenarios, such as intersections, can be further studied in the future.
Author Contributions
Funding acquisition, J.C.; methodology, J.C.; project administration, J.C.; software, Q.F. and X.X.; supervision, J.C.; validation, X.X.; visualization, Q.F. and X.X.; writing—original draft, J.C.; writing—review & editing, Q.F. and X.X. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by National Natural Science Foundation of China, grant number 61104166.
Data Availability Statement
The research data is unavailable due to privacy or ethical restrictions.
Acknowledgments
The authors would like to thank the reviewers for useful suggestions.
Conflicts of Interest
The authors declare that there is no conflict of interest regarding the publication of this paper. The authors have no financial or personal relationships with other people or organizations that could inappropriately influence this work.
References
- Guo, L. Research and Application of Location Prediction Algorithm Based on Deep Learning. Doctoral Thesis, Lanzhou University, Lanzhou, China, 2018.
- Havyarimana, V.; Hanyurwimfura, D.; Nsengiyumva, P.; Xiao, Z. A novel hybrid approach based-SRG model for vehicle position prediction in multi-GPS outage conditions. Inf. Fusion 2018, 41, 1–8.
- Wu, Y.; Hu, Q.; Wu, X. Motor vehicle trajectory prediction model in the context of the Internet of Vehicles. J. Southeast Univ. (Nat. Sci. Ed.) 2022, 52, 1199–1208.
- Li, L.; Xu, Z. Review of the research on the motion planning methods of intelligent networked vehicles. J. China Highw. Transp. 2019, 32, 20–33.
- Wang, K.; Wang, Y.; Deng, X.; et al. Review of the impact of uncertainty on vehicle trajectory prediction. Automot. Technol. 2022, 7, 1–14.
- Wang, L. Trajectory Destination Prediction Based on Traffic Knowledge Map. Doctoral Thesis, Dalian University of Technology, Dalian, China, 2021.
- Guo, H.; Meng, Q.; Zhao, X.; et al. Map-enhanced generative adversarial trajectory prediction method for automated vehicles. Inf. Sci. 2023, 622, 1033–1049.
- Xu, H.; Yu, J.; Yuan, S.; et al. Research on taxi parking location selection algorithm based on POI. High-Tech. Commun. 2021, 31, 1154–1163.
- Velickovic, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio’, P.; Bengio, Y. Graph Attention Networks. arXiv 2017, arXiv:1710.10903.
- Li, L.; Ping, Z.; Zhu, J.; et al. Space-time information fusion vehicle trajectory prediction for group driving scenarios. J. Transp. Eng. 2022, 22, 104–114.
- Su, J.; Jin, Z.; Ren, J.; Yang, J.; Liu, Y. GDFormer: A Graph Diffusing Attention based approach for Traffic Flow Prediction. Pattern Recognit. Lett. 2022, 156, 126–132.
- Ali, A.; Zhu, Y.; Zakarya, M. Exploiting dynamic spatio-temporal correlations for citywide traffic flow prediction using attention based neural networks. Inf. Sci. 2021, 577, 852–870.
- Fan, H. Research and Implementation of Vehicle Motion Tracking Technology based On Internet of Vehicles. Doctoral Thesis, Beijing University of Posts and Telecommunications, Beijing, China, 2017.
- Hui, F.; Wei, C.; Shangguan, W.; Ando, R.; Fang, S. Deep encoder-decoder-NN: A deep learning-based autonomous vehicle trajectory prediction and correction model. Phys. A Stat. Mech. Its Appl. 2022.
- Kalatian, A.; Farooq, B. A context-aware pedestrian trajectory prediction framework for automated vehicles. Transp. Res. Part C: Emerg. Technol. 2022, 134.
- An, J.; Liu, W.; Liu, Q.; Guo, L.; Ren, P.; Li, T. DGInet: Dynamic graph and interaction-aware convolutional network for vehicle trajectory prediction. Neural Netw. 2022, 151, 336–348.
- Yang, D.; He, T.; Wang, H.; et al. Research progress in graph embedding learning for knowledge map. J. Softw. 2022, 33, 21.
- Xia, Y.; Lan, M.; Chen, X.; et al. Overview of interpretable knowledge map reasoning methods. J. Netw. Inf. Secur. 2022, 8, 1–25.
- Zhang, Z.; Qian, Y.; Xing, Y.; et al. Overview of TransE-based representation learning methods. Comput. Appl. Res. 2021, 3, 656–663.
- Chen, W.; Wen, Y.; Zhang, X.; et al. An improved TransE-based knowledge map representation method. Computer Engineering 2020, 46, 8.
- Zheng, D.; Song, X.; Ma, C.; Tan, Z.; Ye, Z.; Dong, J.; Xiong, H.; Zhang, Z.; Karypis, G. DGL-KE: Training Knowledge Graph Embeddings at Scale. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2020.
- Ji, Q.; Jin, J. Reasoning Traffic Pattern Knowledge Graph in Predicting Real-Time Traffic Congestion Propagation. IFAC-Pap. 2020, 53, 578–581.
- Wang, X.; Lyu, S.; Wang, X.; Wu, X.; Chen, H. Temporal knowledge graph embedding via sparse transfer matrix. Inf. Sci. 2022, 623, 56–69.
- Wang, C.; Tian, R.; Hu, J.; Ma, Z. A trend graph attention network for traffic prediction. Inf. Sci. 2023, 623, 275–292.
- Wang, B.; Wang, J. St-Mgat: Spatio-Temporal Multi-Head Graph Attention Network for Traffic Flow Prediction. SSRN Electron. J. 2022.
- Wang, T.; Ni, S.; Qin, T.; Cao, D. TransGAT: A dynamic graph attention residual networks for traffic flow forecasting. Sustain. Comput. Informatics Syst. 2022, 36, 100779.
- Cai, K.; Shen, Z.; Luo, X.; Li, Y. Temporal attention aware dual-graph convolution network for air traffic flow prediction. J. Air Transp. Manag. 2023, 106.
- Yan, X.; Gan, X.; Wang, R.; Qin, T. Self-attention eidetic 3D-LSTM: Video prediction models for traffic flow forecasting. Neurocomputing 2022, 509, 167–176.
- Chen, L.; Shi, P.; Li, G.; Qi, T. Traffic flow prediction using multi-view graph convolution and masked attention mechanism. Comput. Commun. 2022, 194, 446–457.
- Wang, K.; Ma, C.; Qiao, Y.; Lu, X.; Hao, W.; Dong, S. A hybrid deep learning model with 1DCNN-LSTM-Attention networks for short-term traffic flow prediction. Phys. A-Stat. Mech. Its Appl. 2021, 583, 126293.
- Lou, Y.; Zhang, C.; Zheng, Y.; Xie, X.; Wang, W.; Huang, Y. Map-matching for low-sampling-rate GPS trajectories. In Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. 2009.
- Cai, Y. Research on Vehicle Trajectory Prediction Based on RNN-LSTM Network. Doctoral Thesis, Jilin University, Jilin, China, 2021.
- Zhang, H.; Huang, C.; Xuan, Y.; et al. Real time prediction of air combat flight trajectory using gated cycle unit. Syst. Eng. Electron. Technol. 2020, 42, 7.
- Guo, Y.; Zhang, R.; Chen, Y. Vehicle trajectory prediction based on potential characteristics of observation data and bidirectional short-term and long-term memory network. Automot. Technol. 2022, 3.
- Liu, C.; Liang, J. Vehicle trajectory prediction based on attention mechanism. J. Zhejiang Univ. (Eng. Ed.) 2020, 54, 8.
- Guan, D. Research on Modeling and Prediction of Vehicle Moving Trajectory in the Internet of Vehicles. Doctoral Thesis, Beijing University of Posts and Telecommunications, Beijing, China, 2020.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).