Vehicle Trajectory Prediction Based on Local Dynamic Graph Spatiotemporal- Long Short Term Memory Model

Preprint

Article


A peer-reviewed article of this preprint also exists.

Juan Chen^*

Qinxuan Feng,Daiqian Fan


This version is not peer-reviewed

Submitted:

23 November 2023

Posted:

23 November 2023


Abstract

Traffic congestion and frequent traffic accidents have become main problems affecting urban traffic. Effective location prediction of vehicle trajectory can help alleviate traffic congestion, reduce the occurrence of traffic accidents, and optimize the urban traffic system. Vehicle trajectory is closely related to the surrounding Points of Interest (POI). POI can be considered as the spatial feature and can be fused with trajectory points to improve prediction accuracy. A Local Dynamic Graph Spatiotemporal- Long Short-Term Memory (LDGST-LSTM) is proposed in this paper to extract and fuse the POI knowledge and realize next location prediction. POI semantic information is learned by constructing the traffic knowledge graph, and spatial and temporal features are extracted by combining Graph Attention Network (GAT) and temporal attention mechanism. Moreover, the weights of POI that influence location prediction are visualized to improve the interpretability of the proposed model. The effectiveness of LDGST-LSTM is verified on two datasets, including Chengdu taxi trajectory data in October 2018 and August 2014. The accuracy and robustness of the proposed model are significantly improved compared with the benchmark models.

Keywords:

Subject: Computer Science and Mathematics - Artificial Intelligence and Machine Learning

1. Introduction

Traffic congestion and frequent traffic accidents have become serious problems affecting urban traffic with the rapid development of cities [1]. The Intelligent Transportation System (ITS) has been proposed to alleviate these problems, and location prediction of vehicle trajectories is one of its research topics [2]. Effective location prediction can help alleviate traffic congestion, which is of great significance for optimizing traffic management. The goal of this task is to predict the next location from the observed trajectory [3]. Deep learning methods have been widely used in vehicle trajectory location prediction. A vehicle trajectory is usually treated as a time series, since it is a sequence of trajectory points [1], and Long Short-Term Memory (LSTM) is mainly used to process such sequences. Li et al. [4] apply an LSTM to capture temporal features between vehicles and design a convolutional social pooling network to capture spatial features for vehicle trajectory prediction. However, complex roads and surrounding buildings can also affect prediction. How to integrate such external features with the vehicle trajectory to improve the accuracy and robustness of the model therefore needs further research.

Knowledge graphs, GAT, and temporal attention mechanisms are recent methods in the field of transportation and have been shown to improve the accuracy and robustness of models [5]. A knowledge graph can help fuse features, GAT can serve as a spatial attention mechanism to extract spatial features, and a temporal attention mechanism can extract temporal features. The application of knowledge graphs has developed rapidly in transportation. Wang [6] proposes a multi-layer traffic knowledge graph model that realizes destination prediction by modeling the complex traffic network and fusing different node features through weighted summation. The driving intention in the traffic network is affected by the functional area and the surrounding Points of Interest (POI) [7]. It is therefore important to design a traffic knowledge graph for vehicle trajectory location prediction, and accuracy can be improved by modeling the interaction between the vehicle trajectory and the surrounding environment [7]. POI can be considered as a spatial feature and fused with trajectory points to improve prediction accuracy [8]. A trajectory may change when there are traffic jams or accidents, and the POI knowledge around trajectory points has an impact on its future trend. Integrating POI knowledge with the vehicle trajectory can thus improve prediction and help optimize traffic management. Graph Attention Network (GAT) [9] and various temporal attention mechanisms [10] are also widely used to extract spatial and temporal features. Su et al. [11] propose a new GAT that allocates attention to express the spatial correlation of the road network and to capture the dynamic update of the adjacent transformation matrix. Ali et al. [12] propose an LSTM combined with an attention mechanism to capture temporal features. However, how to fuse surrounding environment features such as POI, and how knowledge graphs, GAT, and temporal attention mechanisms work together on the location prediction task, still lack research.

A Local Dynamic Graph Spatiotemporal- Long Short-Term Memory (LDGST-LSTM) is proposed in this paper to predict the next location of vehicle trajectory. POI knowledge is extracted and fused by constructing a traffic knowledge graph. GAT and temporal attention mechanism are combined to improve the accuracy, robustness, and interpretability of the prediction model. The main contributions of this paper are as follows.

(1) A knowledge graph and a spatiotemporal attention mechanism are combined in this paper to predict the vehicle location at the next moment. POI is integrated with the historical trajectory, and the POI weights that affect the prediction are visualized. Regions that have a great impact on the prediction are explored, and the interpretability of the model is enhanced.
(2) A global traffic knowledge graph is constructed to learn and represent POI semantic information. POI nodes are considered as entities, and the connections between POI are considered as relationships. The representation vector of each node is obtained by the Translating Embedding (TransE) algorithm and is used as the feature vector for vehicle location prediction.
(3) A spatiotemporal attention mechanism is designed to allocate weights to spatial and temporal features, thus enhancing the interpretability and accuracy of the model. The weights of spatial features are allocated through GAT to obtain the corresponding graph representation vector. An LSTM combined with a multi-head attention mechanism is used to allocate weights to trajectory points at different timestamps to improve the prediction accuracy.

2. Related Work

This section will provide a literature review on related works, including vehicle trajectory location prediction, knowledge graph, and attention mechanism.

2.1. Vehicle Trajectory Location Prediction

Deep learning methods are widely used in location prediction of vehicle trajectories. Fan et al. [13] design a vehicle trajectory prediction model that combines a deep encoder-decoder with a neural network. Multiple segments are processed and corrected by the deep neural network model, which improves prediction accuracy and realizes long-term prediction [14]. Kalatan and Farooq [15] propose a new multi-input LSTM to fuse sequence data with context information of the environment. It reduces prediction error, but other external features that could improve prediction performance are ignored. An et al. [16] combine a Graph Convolutional Network (GCN) with its variant models to realize vehicle trajectory prediction in the urban traffic system. Through this combination, the model can effectively capture spatiotemporal characteristics and thereby improve the effectiveness of vehicle trajectory prediction.

Methods based on deep learning can effectively improve prediction accuracy and reduce error. Through feature mining and extraction, a deep learning model can improve its accuracy and robustness. However, the impact of surrounding buildings, that is, Points of Interest (POI), on prediction in complex traffic environments is usually ignored in existing research. How to better integrate external factors such as POI features therefore needs further research to help improve prediction performance.

The efficiency and accuracy of vehicle trajectory location prediction can be improved by combining the vehicle trajectory with the road network through data preprocessing. For example, Fan [13] replaces the original vehicle trajectory points with marked nodes or roads of the road network to realize location prediction. However, only the location information is considered in that method, and external factors such as driving intentions or preferences, which have significant impacts on prediction results, are ignored.

2.2. Knowledge Graph

Knowledge graph is one of the most popular representation methods, which describes entities and their relationships in a structured triple. Each triple is composed of a head entity, a tail entity, and their relationship. The knowledge graph can be combined with deep learning models to learn the representation of the entities and relationships to realize prediction tasks. It can make contributions in the fields of medical treatment, national defense, transportation, and network information security [17,18].

Translating Embedding (TransE) is one of the typical knowledge graph models [19]. Chen et al. [20] propose a TransGraph model based on TransE to learn structural features. The knowledge graph is also widely used in machine learning. Zheng et al. [21] propose DGL-KE, an open-source tool that efficiently computes knowledge graph embeddings and can execute the TransE algorithm. This tool speeds up the training of entities and relationships in the knowledge graph through multi-processing, multi-GPU, and distributed parallelism. Ji and Jin [22] propose a traffic mode knowledge graph framework based on historical traffic data to capture the traffic status and congestion propagation mode of road segments. This framework is pioneering in applying the knowledge graph to the prediction of traffic congestion propagation. Wang et al. [23] propose a method of embedding the temporal knowledge graph through a sparse transfer matrix. Global and local embedding knowledge are captured from static and temporal dynamic knowledge graphs respectively, which alleviates the problem of inconsistent parameter scalability when the model learns embeddings from different datasets.

More semantic knowledge can be considered in the knowledge graph to enhance the interpretability of complex models [18]. In addition, how to learn entities and relationships with low frequency better, how to integrate context into graph embedding learning better, and how to combine the graph embedding algorithm with other algorithms are future research directions.

2.3. Attention mechanism

Graph Attention Network (GAT) is widely used in the field of transportation to obtain the spatial correlation between nodes. Wang et al. [24] propose a trend GAT model to predict traffic flow. By constructing a spatial graph structure, the transmission among similar nodes is realized and the problem of spatial heterogeneity is solved. A spatio-temporal multi-head GAT model is proposed by Wang and Wang [25] to realize traffic prediction. Spatial features are captured through multi-head attention mechanism and temporal features are captured through full-volume transformation linear gated unit. The spatiotemporal correlation between traffic flow and traffic network is integrated to reduce the prediction error. Wang et al. [26] propose a dynamic GAT model to realize traffic prediction. Spatial feature is extracted by a node embedding algorithm based on the dynamic attention mechanism, and temporal feature is extracted by gated temporal CNN in this model.

Temporal attention mechanism is also widely applied in traffic research. The time correlation between research objectives can be better learned by using the time attention mechanism. A temporal attention perception dual GCN is proposed by Cai et al. [27] to realize air traffic prediction. The historical flight and time evolution pattern are characterized through temporal attention mechanism. Yan et al. [28] design a gated self-attention mechanism module to realize the interaction between the current memory state and the long-term relationship. Chen et al. [29] apply multi-head attention mechanism to mine the features of fine-grained spatiotemporal dynamics. A one-dimensional convolution LSTM model based on the attention mechanism is proposed by Wang et al. [30] to realize traffic prediction. Multi-source data are fused as various features through attention mechanism.

Spatiotemporal attention mechanism can obtain reasonable weights of the features under complex traffic environment, therefore improving the interpretability of model. GAT can be considered as a spatial attention mechanism to capture spatial features and the multi-head attention mechanism can be considered as temporal attention mechanism to capture temporal features. Therefore, it is meaningful and important to combine GAT and temporal attention mechanisms when realizing vehicle trajectory location prediction.

3. Problem Statement

In this section, some basic concepts and notations used in this paper are first introduced, and then the studied problem is formally defined.

Definition 1 (Raw trajectory $T_i$). A raw vehicle trajectory is usually represented by a sequence of points continuously sampled by the Global Positioning System (GPS). Given a raw trajectory dataset $\mathit{Traj} = \{T_1, T_2, \dots, T_n\}$, the $i^{th}$ trajectory $T_i = \{P_{i1}, P_{i2}, \dots, P_{ik}\}$ is defined as a sequence of sampled points $P_{ij}$, where $P_{ij} \in \mathbb{R}^2$, $i \in \{1, 2, \dots, n\}$, $j \in \{1, 2, \dots, k\}$. A sampled point $P_{ij}$ is defined as a tuple $(\mathit{lng}_{ij}, \mathit{lat}_{ij})$, which represents the longitude and latitude of the GPS point at timestamp $j$. Because sampling rates differ, trajectory data may be irregularly distributed, for example sparse in one area and dense in another.

Definition 2 (Point of Interest $Q_u$). A Point of Interest (POI) is a spatial representation of urban infrastructure such as a school, hospital, or restaurant. It reflects land use and urban functional characteristics and carries potential semantic information, and its distribution influences travel intentions. POI can be classified into different types. Given a POI dataset with $h$ types, $\mathit{POI} = \{Q_1, Q_2, \dots, Q_h\}$, the $u^{th}$ type of POI $Q_u = \{I_{u1}, I_{u2}, \dots, I_{um}\}$ is defined as a set of points $I_{uj}$, where $u \in \{1, \dots, h\}$, $j \in \{1, \dots, m\}$. A POI $I_{uj} = (\mathit{name}_{uj}, \mathit{address}_{uj}, \mathit{lng}_{uj}, \mathit{lat}_{uj}, \mathit{attribute}_{uj})$ is a five-tuple, which represents the semantic information including the name, address, longitude, latitude, and attribute of the $j^{th}$ POI. Eleven types of POI are considered in this paper: shopping, entertainment, food, hotel, scenic spot, finance, government, company, hospital, life services, and sports.

Definition 3 (Road network $G$). Vehicle trajectories and POI are matched to a global map in this paper. A road network is defined as a graph $G = (V, E, A)$, where $V$ is the set of nodes and each $v_i$ denotes a road segment. $E$ is the set of edges connecting nodes, and each $e_{i,j} = (v_i, v_j)$ denotes the connectivity between road segments $v_i$ and $v_j$. $A \in \mathbb{R}^{|V| \times |V|}$ is an adjacency matrix, and the semantic information of the various POI classes is considered as node features. Each element $a_{i,j}$ of this matrix is a binary value, which is 1 when road segment $v_i$ is adjacent to $v_j$ and 0 otherwise.
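The adjacency matrix of Definition 3 can be illustrated with a minimal sketch. The function name `build_adjacency`, the toy edge list, and the choice to treat connectivity as symmetric are assumptions made for illustration; the paper does not state whether the road graph is directed.

```python
def build_adjacency(num_nodes, edges):
    """Build a binary |V| x |V| adjacency matrix as a list of lists."""
    A = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        # Assumption: connectivity between road segments is undirected.
        A[i][j] = 1
        A[j][i] = 1
    return A


# Hypothetical toy network: four road-segment nodes connected in a chain.
toy_edges = [(0, 1), (1, 2), (2, 3)]
toy_A = build_adjacency(4, toy_edges)
for row in toy_A:
    print(row)
```

For a city-scale network a sparse representation (for example an adjacency list) would likely be preferable, since most entries of $A$ are zero.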

Definition 4 (Normalized trajectory $NT_i$). Because of measurement error, it is not appropriate to input a raw trajectory directly into the model. A raw trajectory $T_i$ is therefore processed by a data conversion layer in this paper and converted into a normalized trajectory $NT_i$. Given a raw trajectory dataset $\mathit{Traj} = \{T_1, T_2, \dots, T_n\}$ and its $i^{th}$ trajectory $T_i = \{P_{i1}, P_{i2}, \dots, P_{ik}\}$, each point $P_{ij}$ is matched to the nearest node of the road network. A normalized point $NP_{ij}$ is defined as a tuple $(\mathit{vlng}_{ij}, \mathit{vlat}_{ij})$, which represents the longitude and latitude of the node matched to the $j^{th}$ point. A normalized trajectory $NT_i$ is then defined as the sequence of road segments after projection.

Definition 5 (Normalized POI $NQ_u$). Each POI $I_{uj}$ is matched to the node of the road network that has the shortest projection distance. A normalized POI $NI_{uj}$ is defined as the corresponding road segment after projection, and the POI semantic information is assigned to the matched node as node features. $NQ_u$ refers to the normalized POI of type $u$, and $NI_{uj}$ represents the semantic information of the $j^{th}$ normalized POI of the $u^{th}$ type.

Definition 6 (POI knowledge graph $KG^{POI}$). A knowledge graph is defined as $KG^{POI} = (E^{POI}, R^{POI}, T^{POI})$, where $E^{POI} = \{e_1, e_2, e_3, \dots, e_m\}$ is the POI entity set and $m$ is the number of POI entities; $R^{POI} = \{r_1, r_2, r_3, \dots, r_n\}$ is the POI relationship set and $n$ is the number of POI relationships; and $T^{POI} = \{(h_{POI}, l_{POI}, t_{POI})\}$ is the triple set, whose elements refer to the head entity, relation, and tail entity respectively.
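As an illustration of Definition 6, the sketch below assembles the entity set, relationship set, and triple set from a handful of normalized POI nodes. The node identifiers, the single relation name `connected_to`, and the helper `build_poi_triples` are hypothetical; the paper only states that a relationship denotes a connection between two POI nodes.

```python
def build_poi_triples(poi_nodes, links, relation="connected_to"):
    """Return (entities, relations, triples) for a POI knowledge graph.

    poi_nodes: dict mapping a node id to its POI type (semantic label).
    links: iterable of (head_id, tail_id) pairs between road-network nodes.
    """
    entities = sorted(poi_nodes)
    relations = [relation]
    # Keep only links whose two endpoints both carry POI information.
    triples = [
        (h, relation, t) for h, t in links if h in poi_nodes and t in poi_nodes
    ]
    return entities, relations, triples


# Hypothetical toy data: three POI nodes and one link to a non-POI node "n9".
toy_poi = {"n1": "food", "n2": "hospital", "n3": "hotel"}
toy_links = [("n1", "n2"), ("n2", "n3"), ("n3", "n9")]
E_poi, R_poi, T_poi = build_poi_triples(toy_poi, toy_links)
print(T_poi)
```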

Problem (Vehicle trajectory location prediction). Given a raw trajectory dataset $\mathit{Traj} = \{T_1, T_2, \dots, T_n\}$ and a road network $G$, for the current point $P_{ij}$ of the $i^{th}$ trajectory $T_i$, the task aims to predict the next location $P_{i,j+1}$, which is a tuple $(\mathit{lng}_{i,j+1}, \mathit{lat}_{i,j+1})$ consisting of the longitude and latitude of the GPS point at timestamp $j + 1$.

4. Methodology

A vehicle trajectory is a sequence of continuously sampled points, and the driving intention behind it is closely related to POI. The semantic information of the POI near each trajectory point may have an impact on the next location prediction. Therefore, a Local Dynamic Graph Spatiotemporal- Long Short-Term Memory model (LDGST-LSTM) is proposed in this paper to realize location prediction for vehicle trajectories and to explore the impact weights of the nearby POI.

Raw vehicle trajectories and POI points are first matched to the traffic network through the data conversion layer. A global POI knowledge graph is then constructed in the global POI knowledge extraction layer to obtain the representation vectors of the POI entities. Based on the global knowledge graph, a local graph related to each trajectory point is generated, and its graph representation vector is captured through GAT in the local dynamic graph generation module. Finally, the trajectory points and the related graph representation vectors are input into an LSTM with a multi-head attention mechanism in the trajectory prediction layer to predict the next location.

In this section, details of the proposed model LDGST-LSTM are provided, which consists of four major components: a data conversion layer, a global POI knowledge extraction layer, a local dynamic graph generation module, and a trajectory prediction layer. The overall framework is first described in Section 4.1. Then, each component in this model is specifically introduced in Section 4.2 to Section 4.5.

4.1. Overall Framework

The overall framework of LDGST-LSTM is shown in Figure 1. There are four major components in the proposed model: (1) a data conversion layer, (2) a global POI knowledge extraction layer, (3) a local dynamic graph generation module, and (4) a trajectory prediction layer. The local dynamic graph generation module is included in the trajectory prediction layer. The observed vehicle trajectory $T_i = \{P_{i,j-t}, P_{i,j-t+1}, \dots, P_{ij}\}$ is the input of the model, and the predicted next location $P_{i,j+1}$ of the trajectory at timestamp $j + 1$ is the output.
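The input/output relationship described here can be sketched as a sliding window over one trajectory: each sample pairs the observed points $P_{i,j-t}, \dots, P_{ij}$ with the target $P_{i,j+1}$. The helper `make_windows` and the toy coordinates are assumptions for illustration and are not taken from the paper's pipeline.

```python
def make_windows(trajectory, t):
    """Split one trajectory into (observed window, next point) samples.

    Each observed window holds t + 1 consecutive points ending at index j,
    and the target is the point at index j + 1.
    """
    samples = []
    for j in range(t, len(trajectory) - 1):
        observed = trajectory[j - t : j + 1]
        target = trajectory[j + 1]
        samples.append((observed, target))
    return samples


# Hypothetical trajectory of five (lng, lat) points.
toy_traj = [
    (104.01, 30.61),
    (104.02, 30.62),
    (104.03, 30.63),
    (104.04, 30.64),
    (104.05, 30.65),
]
toy_samples = make_windows(toy_traj, t=2)
for observed, target in toy_samples:
    print(observed, "->", target)
```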

As shown in Figure 1, the vehicle trajectory $T_i$ is the input of the model. First, the normalized trajectory $NT_i$ and the normalized POI $NQ_u$ are obtained through map matching and the proximity algorithm, respectively, in the data conversion layer. Then the representation vector $V^{POI}_{ij}$ of every POI is trained through the TransE algorithm based on the knowledge graph in the global POI knowledge extraction layer; it is used as the semantic feature of the trajectory points. In the trajectory prediction layer, local graphs $G'_{i,j-t:j}$ related to the trajectory points $P_{i,j-t:j}$ are first generated, and the corresponding graph representation vectors $NPV_{i,j-t:j}$ are obtained through GAT in the local dynamic graph generation module. The normalized trajectory points $NP_{i,j-t:j}$ and the corresponding graph representation vectors $NPV_{i,j-t:j}$ are then concatenated and input into an LSTM with a multi-head attention mechanism. The model finally outputs the predicted next location $P_{i,j+1}$ of the trajectory. The overall framework can be denoted by formulas (1)-(4), where $P_{j-t:j} = (P_{j-t}, P_{j-t+1}, \dots, P_j)$.

Data conversion layer:

$$T_i \overset{\text{map matching}}{\to} NT_i, \quad Q_u \overset{\text{proximity alg.}}{\to} NQ_u \tag{1}$$

Global POI knowledge extraction layer:

$$NQ_u \overset{\text{TransE}}{\to} V^{POI}_{ij} \tag{2}$$

Local dynamic graph generation module:

$$P_{i,j-t:j} \overset{V^{POI}_{ij}}{\to} G'_{i,j-t:j} \overset{\text{GAT}}{\to} NPV_{i,j-t:j} \tag{3}$$

Trajectory prediction layer:

$$NP_{i,j-t:j};\ NPV_{i,j-t:j} \overset{\text{LSTM}}{\to} P_{i,j+1} \tag{4}$$

4.2. Data Conversion Layer

This section will introduce road network data, trajectory data conversion and POI data conversion specifically.

4.2.1. Road Network

The urban road network data used in this paper is downloaded from the Open Street Map (OSM), an open-source map website. Taking Chengdu as an example, the road network is shown in Figure 2. This paper only maintains the roads and nodes within the Third Ring Road of Chengdu as the research scope, which can mainly represent the urban district.

4.2.2. Trajectory Data Conversion

In the data conversion layer, raw trajectory data is converted into normalized trajectories. Each point is represented by the latitude and longitude of the matched road network node. The original trajectory coordinate system (GCJ-02) is converted to the road network coordinate system (WGS84) in this paper. Following the map-matching work of Lou et al. [31], a distance threshold is used as the radius to search for candidate points, and the candidate point with the shortest distance is selected as the normalized trajectory point. Trajectory points that cannot be matched to a node are deleted as redundant points.
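The candidate search can be sketched as a nearest-node lookup within a radius. This is a deliberately simplified stand-in, not the map-matching algorithm of Lou et al. [31]: it assumes the points have already been converted to WGS84, uses the haversine great-circle distance, and scans all nodes linearly. The 50 m threshold, the function names, and the toy coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lng1, lat1, lng2, lat2):
    """Great-circle distance in metres between two (lng, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def normalize_trajectory(points, nodes, radius_m=50.0):
    """Replace each point by its nearest node within radius_m; drop unmatched points."""
    normalized = []
    for lng, lat in points:
        best, best_d = None, radius_m
        for node in nodes:
            d = haversine_m(lng, lat, node[0], node[1])
            if d <= best_d:
                best, best_d = node, d
        if best is not None:  # unmatched points are deleted as redundant
            normalized.append(best)
    return normalized


# Hypothetical road-network nodes and raw GPS points (lng, lat).
toy_nodes = [(104.06, 30.66), (104.07, 30.66)]
toy_points = [(104.0601, 30.6601), (104.0699, 30.6599), (104.20, 30.80)]
toy_normalized = normalize_trajectory(toy_points, toy_nodes)
print(toy_normalized)
```

On a real road network a spatial index (for example a k-d tree or grid) would replace the linear scan.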

4.2.3. POI Data Conversion

POI data of Chengdu in 2018 is used in this paper. The proximity algorithm is used in ArcGIS to search and select the nearest nodes in the road network for each POI point based on distance. The semantic information of POI points is allocated to the corresponding nodes as the normalized POIs. As shown in Figure 3, POI points of different types are matched to the road network nodes, taking life services, food, and hospitals as examples.

4.3. Global POI Knowledge Extraction Layer

A global graph $G = (V, E, A)$ and a knowledge graph $KG^{POI} = (E^{POI}, R^{POI}, T^{POI})$ are constructed in the global POI knowledge extraction layer, as shown in Figure 4. The entity set, relationship set, and related triples are defined so that the representation vector of each POI entity can be learned through the TransE algorithm. Each node in the global graph contains the related POI knowledge assigned by the data conversion layer.

As shown in Figure 4, each normalized POI is considered as an entity, and the link between two normalized POI is considered as the relationship, which in this paper denotes that there is a connection between the two POI nodes. The triplet set $T^{POI} = \{(h, l, t)\}$ is the training set of the TransE algorithm, where $h$ represents the head entity, $t$ represents the tail entity, and $l$ represents the relationship; $h$ and $t$ belong to the entity set $E^{POI}$, and $l$ belongs to the relationship set $R^{POI}$. The TransE algorithm treats the relationship as a translation from the head entity to the tail entity, that is, it makes $h + l$ as close to $t$ as possible. The potential energy of a triplet is defined as the L2 norm of the difference between $h + l$ and $t$, as shown in formula (5), where $N$ is the number of triplets, $i \in \{1, \dots, N\}$ indexes the $i^{th}$ triplet, and the sum runs over the $d$ components of the embedding vectors.

$$f(h_i, l_i, t_i) = \| h_i + l_i - t_i \|_2 = \sqrt{\sum_{k=1}^{d} (h_{i,k} + l_{i,k} - t_{i,k})^2} \tag{5}$$

Corrupted triplets are used as negative samples in the TransE algorithm and are drawn by uniform sampling. A negative sample is generated by randomly replacing any one element of a positive sample with another entity or relationship. TransE reduces the potential energy of the positive samples and increases the potential energy of the negative samples. The objective function is defined as

$$L = \sum_{(h, l, t) \in \Delta} \sum_{(h', l', t') \in \Delta'} [f(h, l, t) + \gamma - f(h', l', t')]_+ \tag{6}$$

where $\Delta$ is the set of positive samples and $\Delta'$ is the set of negative samples. $\gamma$ is a constant margin, usually set to 1, that represents the distance between positive and negative samples. $(h, l, t)$ and $(h', l', t')$ are the triplets of positive and negative samples respectively, $f(\cdot)$ is the potential energy function, and $[\cdot]_+ = \max(0, \cdot)$.

The distributed representation vectors of the head and tail entities are taken as the POI representation vectors $V^{POI}_{ij} \in \mathbb{R}^d$. Pseudocode of the global POI knowledge extraction layer is shown in Table 1.
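A minimal sketch of the quantities in formulas (5) and (6) follows: the potential energy of one triplet, a corrupted (negative) triplet, and the hinge term for one positive/negative pair. It is not the paper's Table 1 pseudocode. The embeddings are hand-picked toy vectors, the function names are hypothetical, and negatives are formed by replacing only the head or the tail entity, which is one common variant and an assumption here (the text also allows replacing the relationship).

```python
import math
import random


def energy(h, l, t):
    """Potential energy ||h + l - t||_2 of one triplet of embedding vectors."""
    return math.sqrt(sum((hk + lk - tk) ** 2 for hk, lk, tk in zip(h, l, t)))


def corrupt(triple, entities, rng):
    """Replace the head or the tail with a random other entity (needs >= 3 entities)."""
    h, l, t = triple
    candidates = [e for e in entities if e not in (h, t)]
    replacement = rng.choice(candidates)
    if rng.random() < 0.5:
        return (replacement, l, t)
    return (h, l, replacement)


def margin_loss(pos, neg, ent_vec, rel_vec, gamma=1.0):
    """Hinge term [f(pos) + gamma - f(neg)]_+ for one positive/negative pair."""
    f_pos = energy(ent_vec[pos[0]], rel_vec[pos[1]], ent_vec[pos[2]])
    f_neg = energy(ent_vec[neg[0]], rel_vec[neg[1]], ent_vec[neg[2]])
    return max(0.0, f_pos + gamma - f_neg)


# Hypothetical 2-dimensional embeddings for three POI entities and one relation.
ent_vec = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 3.0]}
rel_vec = {"connected_to": [1.0, 0.0]}
pos = ("a", "connected_to", "b")   # a + l equals b exactly, so the energy is 0
neg = ("a", "connected_to", "c")   # a + l is far from c

print(energy(ent_vec["a"], rel_vec["connected_to"], ent_vec["b"]))
print(margin_loss(pos, neg, ent_vec, rel_vec))
print(corrupt(pos, list(ent_vec), random.Random(0)))
```

In training, the summed hinge terms would be minimized by gradient descent over the embeddings; a tool such as DGL-KE [21] handles that step at scale.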

4.4. Local dynamic graph generation module

Local graphs are generated for each trajectory point in the local dynamic graph generation module. Graph Attention Network (GAT) is used as a spatial attention mechanism to allocate weights and update the parameters of every trajectory point and its neighbors. The corresponding graph representation vector $NPV_{ij}$ is obtained through GAT.

As shown in Figure 5, every node in the global graph is embedded with the POI representation vector $V^{POI}_{ij}$ based on the POI knowledge graph in Section 4.3. The feature matrix $F \in \mathbb{R}^{N \times k}$ is constructed from the embeddings of all nodes, with $F_{ij} = V^{POI}_{ij}$, where $N$ is the number of graph nodes, $k$ is the embedding dimension of the feature, and $F_{ij}$ is the feature vector of the $j^{th}$ trajectory point of the $i^{th}$ trajectory. The related local graphs $G'_{i,j-t:j} = (V_{i,j-t:j}, E_{i,j-t:j}, A'_{i,j-t:j})$ are generated for the normalized trajectory points $NP_{i,j-t:j}$ in the normalized trajectory $NT_i$, where $V_{ij}$ is the set consisting of the current point $NP_{ij}$ and its neighbor nodes, $E_{ij}$ is the set of edges among the current point and its neighbors, and $A'_{ij}$ is the local adjacency matrix of the current point, which includes the features of both the current point and its neighbor nodes.
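The local graph of one trajectory point can be sketched as the subgraph induced by the point and its neighbors in the global adjacency matrix. The restriction to one-hop neighbors and the helper name `local_graph` are assumptions; the text does not state the neighborhood radius.

```python
def local_graph(A, center):
    """Return (nodes, edges, local_A) for the one-hop neighborhood of `center`.

    A is the global binary adjacency matrix as a list of lists.
    """
    neighbors = [m for m, a in enumerate(A[center]) if a == 1 and m != center]
    nodes = [center] + neighbors
    # Edges among the current point and its neighbors (each pair listed once).
    edges = [
        (u, v)
        for idx, u in enumerate(nodes)
        for v in nodes[idx + 1:]
        if A[u][v] == 1
    ]
    # Local adjacency matrix: rows and columns follow the order of `nodes`.
    local_A = [[A[u][v] for v in nodes] for u in nodes]
    return nodes, edges, local_A


# Hypothetical global graph: a chain 0 - 1 - 2 - 3.
global_A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
nodes_1, edges_1, local_A_1 = local_graph(global_A, center=1)
print(nodes_1, edges_1, local_A_1)
```

Because a new local graph is extracted at every timestamp, the graph fed to GAT changes as the vehicle moves, which matches the "local dynamic" naming of the module.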

GAT is used to calculate the attention weights between each trajectory point and its neighbor nodes and to fuse the features of the local graph. The adjacency matrix is used to check whether nodes are connected, and attention is allocated by calculating the weights of the neighbor nodes. GAT can therefore be considered as a spatial attention mechanism that enhances the interpretability of the proposed model. The attention mechanism is defined in formula (7).

$$\mathit{Attention}(\mathit{query}, \mathit{source}) = \sum_{m} \mathit{similarity}(\mathit{query}, \mathit{key}_m) \cdot \mathit{value}_m \tag{7}$$

where $\mathit{source}$ is the original information, formed by $\mathit{key}$-$\mathit{value}$ pairs. $\mathit{Attention}(\mathit{query}, \mathit{source})$ represents the information extracted from the $\mathit{source}$ through weight allocation under the condition of $\mathit{query}$. The aim of GAT is to learn the relevance between target nodes and their neighbor nodes through the parameter matrix. In this paper, $\mathit{query}$ is set as the feature vector $F_{ij}$ of the current point $NP_{ij}$, $\mathit{source}$ is set as the feature vectors of all the neighbor nodes of $NP_{ij}$, and $\mathit{key}_m$ and $\mathit{value}_m$ are respectively the $m^{th}$ neighbor node and its feature vector. The relevance coefficient between every trajectory point and each of its neighbor nodes is calculated in GAT as

$$e_{jm} = \mathrm{LeakyReLU}(w^T [W F_{ij} \,\|\, W F_{jm}]) \tag{8}$$

where $\|$ represents concatenation, $F_{ij}$ is the feature vector of the $j^{th}$ point of the $i^{th}$ trajectory, and $F_{jm}$ is the feature vector of the $m^{th}$ neighbor node of the $j^{th}$ point. $W$ is a learnable weight parameter, $w$ is a learnable weight of the linear layer, and $\mathrm{LeakyReLU}$ is the activation function. All the coefficients are then normalized by the $\mathrm{softmax}$ function in GAT to obtain the attention weights, as shown in formula (9), where $a_{jm}$ is the attention weight between node $j$ and its $m^{th}$ neighbor node, and $N_{ij}$ denotes the neighborhood of node $j$ in the $i^{th}$ trajectory.

$$a_{jm} = \mathrm{softmax}(e_{jm}) = \frac{\exp(e_{jm})}{\sum_{n \in N_{ij}} \exp(e_{jn})} \tag{9}$$

The aggregated and updated feature vector is then calculated as shown in formula (10). A multi-head attention mechanism is used to enhance the feature fusion of the neighbor nodes.

$$NPV_{ij} = \Big\Vert_{h=1}^{H_1} \sigma \Big( \sum_{m \in N_{ij}} a_{jm}^{(h)} W^{(h)} F_{jm} \Big) \tag{10}$$

where $H_1$ is the number of heads, $\sigma$ is the activation function, and $NPV_{ij}$ is the graph representation vector. Pseudocode of the local dynamic graph generation module is shown in Table 2.
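Formulas (8)-(10) can be traced with a small pure-Python sketch for one trajectory point and its neighbors. It is not the paper's Table 2 pseudocode. The parameters are fixed toy values rather than learned ones, the LeakyReLU slope of 0.2 and the use of ELU for $\sigma$ are assumptions (the text does not name them), and the neighborhood contains only the neighbors that the caller passes in.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def leaky_relu(z, slope=0.2):
    return z if z > 0 else slope * z


def elu(z):
    return z if z > 0 else math.exp(z) - 1.0


def gat_head(F_center, F_neighbors, W, w):
    """One attention head: returns (attention weights, aggregated feature)."""
    Wc = matvec(W, F_center)
    Wn = [matvec(W, f) for f in F_neighbors]
    # Formula (8): e_jm = LeakyReLU(w^T [W F_ij || W F_jm])
    e = [leaky_relu(sum(wk * v for wk, v in zip(w, Wc + wn))) for wn in Wn]
    # Formula (9): softmax over the neighborhood (shifted by max for stability)
    mx = max(e)
    exps = [math.exp(v - mx) for v in e]
    total = sum(exps)
    alpha = [v / total for v in exps]
    # Inner part of formula (10): sigma(sum_m a_jm W F_jm)
    agg = [sum(a * wn[k] for a, wn in zip(alpha, Wn)) for k in range(len(Wc))]
    return alpha, [elu(v) for v in agg]


def gat_multi_head(F_center, F_neighbors, heads):
    """Formula (10): concatenate the outputs of all heads."""
    out = []
    for W, w in heads:
        out.extend(gat_head(F_center, F_neighbors, W, w)[1])
    return out


# Hypothetical 2-dimensional features for one point and two neighbors.
identity = [[1.0, 0.0], [0.0, 1.0]]
F_c = [1.0, 0.0]
F_n = [[0.0, 1.0], [1.0, 1.0]]
alpha_uniform, out_uniform = gat_head(F_c, F_n, identity, [0.0, 0.0, 0.0, 0.0])
alpha_biased, _ = gat_head(F_c, F_n, identity, [0.0, 0.0, 1.0, 0.0])
npv = gat_multi_head(F_c, F_n, [(identity, [0.0] * 4), (identity, [0.0, 0.0, 1.0, 0.0])])
print(alpha_uniform, out_uniform, alpha_biased, len(npv))
```

With a zero attention vector `w` both neighbors receive weight 0.5; a non-zero `w` shifts weight toward the second neighbor, which is the kind of per-POI weight the paper visualizes for interpretability.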

4.5. Trajectory prediction layer

In the trajectory prediction layer, the trajectory points and the corresponding graph representation vectors are the input, and the coordinates of the next location are obtained after passing through an LSTM with a multi-head attention mechanism, as shown in Figure 6.

As shown in Figure 6, the trajectory points $NP_{i,j-t:j}$ and the corresponding graph representation vectors $NPV_{i,j-t:j}$ are concatenated and input into the LSTM, as shown in formulas (11)-(17).

$$x_j = \{NP_{ij}, NPV_{ij}\} \tag{11}$$

$$f_j = \sigma(W_f \cdot [h_{i,j-1}, x_j] + b_f) \tag{12}$$

$$i_j = \sigma(W_i \cdot [h_{i,j-1}, x_j] + b_i) \tag{13}$$

$$\tilde{C}_j = \tanh(W_C \cdot [h_{i,j-1}, x_j] + b_C) \tag{14}$$

$$o_j = \sigma(W_o \cdot [h_{i,j-1}, x_j] + b_o) \tag{15}$$

$$C_j = f_j * C_{j-1} + i_j * \tilde{C}_j \tag{16}$$

$$h_{ij} = o_j * \tanh(C_j) \tag{17}$$

where $W_{f}$, $W_{i}$, $W_{C}$ and $W_{o}$ are respectively the weights of the forget gate, the input gate, the cell state, and the output gate, and $b_{f}$, $b_{i}$, $b_{C}$ and $b_{o}$ are the corresponding biases. The trajectory points and corresponding graph representation vectors are concatenated as the input $x_{j}$. $x_{j}$ goes through the forget, input and output gates to generate the cell state $C_{j}$. Necessary information is processed by the input gate, and the updated information is activated by the $\tanh$ function to obtain the candidate state $\tilde{C}_{j}$. The current cell state $C_{j}$ is shown in formula (16), where $f_{j}$ and $i_{j}$ are respectively the outputs of the forget gate and the input gate, $C_{j-1}$ is the cell state of the previous vector, and $\tilde{C}_{j}$ is the activated state updated by the input gate. The hidden state $h_{ij}$ of the current vector is obtained by multiplying the activated cell state $\tanh(C_{j})$ by the output of the output gate $o_{j}$.
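Formulas (11)-(17) can be traced step by step with a small pure-Python sketch of a single LSTM step. The parameter dictionary, the helper names (`affine`, `lstm_step`, `const`) and all numeric values are hypothetical and chosen only so that the arithmetic is easy to follow; they are not the settings used in the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, v):
    """Return W v + b for a matrix W (list of rows), bias b and vector v."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]

def lstm_step(params, h_prev, c_prev, x):
    """One LSTM step following formulas (12)-(17)."""
    z = h_prev + x  # [h_{ij-1}, x_j]: list addition is the concatenation
    f = [sigmoid(v) for v in affine(params["W_f"], params["b_f"], z)]          # (12)
    i = [sigmoid(v) for v in affine(params["W_i"], params["b_i"], z)]          # (13)
    c_tilde = [math.tanh(v) for v in affine(params["W_C"], params["b_C"], z)]  # (14)
    o = [sigmoid(v) for v in affine(params["W_o"], params["b_o"], z)]          # (15)
    c = [f_k * cp_k + i_k * ct_k
         for f_k, cp_k, i_k, ct_k in zip(f, c_prev, i, c_tilde)]               # (16)
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]                       # (17)
    return h, c

# Toy setup: hidden size 2, input of size 3 (2 coordinates + 1 graph feature).
hidden, inp = 2, 3

def const(value):
    """Constant weight matrix of shape hidden x (hidden + inp)."""
    return [[value] * (hidden + inp) for _ in range(hidden)]

params = {"W_f": const(0.1), "W_i": const(0.2), "W_C": const(0.3), "W_o": const(0.4),
          "b_f": [0.0] * hidden, "b_i": [0.0] * hidden,
          "b_C": [0.0] * hidden, "b_o": [0.0] * hidden}
np_ij, npv_ij = [0.5, -0.5], [1.0]
x_j = np_ij + npv_ij  # formula (11)
h, c = lstm_step(params, [0.0, 0.0], [0.0, 0.0], x_j)
print(h, c)
```

Because the previous cell state is zero in this toy run, the new cell state reduces to $i_j * \tilde{C}_j$, which makes the role of the input gate in formula (16) visible.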

A multi-head attention mechanism is used based on LSTM to allocate weight and enhance the interpretability of the proposed model.

\begin{matrix} A_{h}(Q_{h}, K_{h}, V_{h}) = \mathrm{softmax}\left(\frac{Q_{h} K_{h}^{T}}{\sqrt{d}}\right) V_{h} \end{matrix}

(18)

\begin{matrix} \mathrm{MultiAtt}(Q, K, V) = (A_{1} \| \dots \| A_{H_{2}}) W^{O} \end{matrix}

(19)

where $Q_{h} = W_{i}^{Q} h_{ij}$, $K_{h} = W_{i}^{K} h_{ij-t:j}$, and $V_{h} = W_{i}^{V} h_{ij-t:j}$. $h_{ij}$ is the hidden state of the current vector and is considered as the query; $h_{ij-t:j}$ denotes the hidden states of the vectors $x_{j-t:j}$ and is considered as the key and value; $W_{i}^{Q}$, $W_{i}^{K}$, $W_{i}^{V}$ and $W^{O}$ are learnable weight parameters; $d$ is the dimension; and $H_{2}$ is the number of heads.
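The scaled dot-product attention of formula (18) can be sketched for one head as follows. To keep the example short, the learnable projections $W_{i}^{Q}$, $W_{i}^{K}$ and $W_{i}^{V}$ are assumed to be identity matrices, so the hidden states are used directly as query, keys and values; the function name `attention_head` and the toy hidden states are hypothetical.

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_head(query, keys, values):
    """Formula (18) for a single query vector: softmax(q K^T / sqrt(d)) V."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    context = [sum(w * v[k] for w, v in zip(weights, values))
               for k in range(len(values[0]))]
    return weights, context

# Hidden states of a window of three steps; the last one plays the role of the query.
hidden_states = [[0.1, 0.0], [0.4, 0.2], [0.9, 0.5]]
weights, context = attention_head(hidden_states[-1], hidden_states, hidden_states)
print(weights, context)
```

The returned `weights` are the temporal attention weights over the window. In formula (19), the context vectors of the $H_{2}$ heads would be concatenated and multiplied by $W^{O}$.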

The predicted result is obtained by adding a multilayer perceptron composed of two fully connected layers.

\begin{matrix} \hat{Y}_{j} = W_{F}^{2} \, \mathrm{ReLU}(W_{F}^{1} h_{j}' + b_{F}^{1}) + b_{F}^{2} \end{matrix}

(20)

where $\hat{Y}_{j}$ is the predicted result, $W_{F}^{1}$, $W_{F}^{2}$, $b_{F}^{1}$, and $b_{F}^{2}$ are respectively the weights and biases of the two fully connected layers, and $h_{j}'$ is the output calculated by the multi-head attention mechanism. Mean Square Error (MSE) is used as the loss function to measure the difference between the predicted results and the ground truth, where $N$ is the number of trajectories and $K$ is the length of the trajectory.

\begin{matrix} L = \sum_{i = 1}^{N} \sum_{j = 1}^{K} {({\hat{Y}}_{j} - N P_{i j})}^{2} \end{matrix}

(21)
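A minimal sketch of the output head in formula (20) and the loss in formula (21) is given below. The weights, the attention output `h_att` and the target point are made-up toy values, and summing the squared error over both coordinates of a point is an assumption, since formula (21) does not spell out how the two coordinates are combined.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def affine(W, b, v):
    """Return W v + b for a matrix W (list of rows), bias b and vector v."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]

def mlp_head(h_att, W1, b1, W2, b2):
    """Two fully connected layers as in formula (20)."""
    return affine(W2, b2, relu(affine(W1, b1, h_att)))

def squared_error_loss(predictions, targets):
    """Formula (21): squared error summed over trajectories i and points j."""
    return sum((p - t) ** 2
               for traj_p, traj_t in zip(predictions, targets)
               for point_p, point_t in zip(traj_p, traj_t)
               for p, t in zip(point_p, point_t))

# Toy values (hypothetical): one trajectory with one predicted point.
W1, b1 = [[1.0, 0.0], [0.0, -1.0]], [0.0, 0.0]
W2, b2 = [[1.0, 1.0], [1.0, -1.0]], [0.1, 0.0]
h_att = [0.5, 0.2]
y_hat = mlp_head(h_att, W1, b1, W2, b2)
loss = squared_error_loss([[y_hat]], [[[0.5, 0.5]]])
print(y_hat, loss)
```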

The pseudocode of the trajectory prediction layer is shown in Table 3.

5. Experiments

This section presents the details of the experiments, including datasets, experimental settings and result analysis. Accuracy and robustness experiments are conducted to evaluate the performance of the proposed model against the benchmarks. Additionally, an ablation experiment is conducted to explore the effectiveness of the proposed model by removing the spatial and temporal attention mechanisms. Moreover, the POI weights that influence the prediction results are visualized on the map to demonstrate the significance of urban functional regions to vehicle trajectories.

5.1. Datasets

Trajectory dataset: Chengdu taxi trajectory data in October 2018 from Didi Gaiya is used in this paper. It includes the attributes of driver ID, order ID, timestamp, longitude, and latitude. Although it covers only one month, it contains over 380k trajectories, of which 270k are generated on holidays and 110k on working days. Concerning the data distribution, the dataset mainly covers the urban area of Chengdu, from 30.65283° N to 30.72649° N and from 104.04210° E to 104.12907° E. The sampling interval of each trajectory is 4 s, which gives a relatively high data density on the city road network.

Road Network dataset: POI and trajectory information are combined to construct a global traffic graph. The road network data of Chengdu is downloaded from OSM. The node vector data is composed of road node ID and its corresponding longitude and latitude, and each road vector data is composed of road ID and sequence of the node coordinates.

POI dataset: POI data of 11 categories of Chengdu in AutoNavi map is obtained from the Chinese Software Developer Network (CSDN). The original data format is POI name, longitude and latitude coordinates, address, district, and POI class.

5.2. Experimental settings

The experimental settings in this paper are as follows. The CPU used in the experiments is an AMD Ryzen 7 5800H, and the GPU is an NVIDIA GeForce RTX 3060. Windows 10 is used as the operating system, Python as the development language, and PyTorch as the deep learning framework. The parameter settings are as follows. The input data are divided into training, validation, and test sets in the proportion 7:2:1. The batch size is set to 16, the initial learning rate is set to 0.0001, and Adam is used as the optimizer in the training process.
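The 7:2:1 division can be sketched as below. Whether the data are shuffled before splitting is not stated in the paper, so the seeded shuffle and the function name `split_7_2_1` are assumptions of this example.

```python
import random

def split_7_2_1(items, seed=0):
    """Shuffle (assumed) and split items into training, validation and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 7 // 10  # integer arithmetic avoids rounding surprises
    n_val = len(shuffled) * 2 // 10
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

train, val, test = split_7_2_1(range(100))
print(len(train), len(val), len(test))
```

Any remainder from the integer division falls into the test set, so no sample is dropped.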

5.2.1. Benchmark models

Five benchmark models are used in the comparative experiments: LSTM [32], GRU [33], BiLSTM [34], AttnLSTM [35], and AttnBiLSTM [36].

LSTM: It is usually applied in time series prediction. Compared with the traditional RNN models, LSTM makes up for the defect that traditional RNN models only consider the recent state and cannot deal with long-term memory. It can determine which states are retained or forgotten through the internal cell state and forget gate. Moreover, LSTM can avoid some gradient vanishing problems of traditional RNN models.

GRU: Compared with LSTM, GRU only has two gates and three fully connected layers, which reduces the amount of calculation to a certain extent and reduces the risk of overfitting. GRU has an update gate and a reset gate, which determine the output of information and delete irrelevant information while retaining historical information.

BiLSTM: Based on LSTM, the input in BiLSTM is propagated both forward and backward, so each time step in the input sequence can retain both past and future information.

AttnLSTM: Attention mechanism can assign greater weight to more important tasks when the amount of computation is limited. It is a resource allocation scheme to solve information overload. Combining LSTM with attention mechanism, the model performance can be improved by increasing the weight of the most important features, or filtering useless feature information.

AttnBiLSTM: BiLSTM combined with the attention mechanism can fuse information by increasing the weight of important features, and bi-directional propagation. It can improve the calculation efficiency and accuracy of the model with abundant semantic information.

5.2.2. Evaluation Metrics

Evaluation metrics used in this paper are Mean Absolute Error (MAE), Mean Square Error (MSE), Root Mean Square Error (RMSE), Haversine distance (HSIN), and Accuracy. The definitions and formulas of these metrics are as follows, where $Y_{i}$ and $\hat{Y}_{i}$ are the true and predicted values respectively, and $N$ is the number of predictions.

MAE: It represents the average of the absolute error between the real and the predicted value, which can be calculated as follows.

\begin{matrix} M A E = \frac{1}{N} \sum_{i = 1}^{N} | Y_{i} - {\hat{Y}}_{i} | \end{matrix}

(22)

MSE: It represents the expected value of the square of the difference between the predicted and the real value, which can be calculated as follows.

\begin{matrix} M S E = \frac{1}{N} \sum_{i = 1}^{N} {(Y_{i} - {\hat{Y}}_{i})}^{2} \end{matrix}

(23)

RMSE: It represents the sample standard deviation of the difference between the real and the predicted values, that is, the degree of dispersion of the errors. For nonlinear model fitting, the smaller the RMSE is, the better the model fits. Compared with MAE, RMSE penalizes large differences more heavily. It can be calculated as follows, where $SSE = \sum_{i=1}^{N} (Y_{i} - \hat{Y}_{i})^{2}$ represents the Sum of Squared Errors.

\begin{matrix} R M S E = \sqrt{M S E} = \sqrt{\frac{S S E}{N}} = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} {(Y_{i} - {\hat{Y}}_{i})}^{2}} \end{matrix}

(24)
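Formulas (22)-(24) translate directly into a few lines of Python. The sample values below are made up to show how a single large error affects RMSE more than MAE.

```python
import math

def mae(y_true, y_pred):
    """Formula (22): mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Formula (23): mean square error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Formula (24): square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

# Three exact predictions and one prediction that is off by 4.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 8.0]
print(mae(y_true, y_pred), mse(y_true, y_pred), rmse(y_true, y_pred))
```

In this toy case the MAE is 1.0 while the RMSE is 2.0, which reflects the heavier penalty on the single large difference.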

HSIN: It represents the great-circle distance between the predicted points and the ground truth. The smaller the distance error, the higher the accuracy of the model prediction. It can be calculated as follows, where $Y_{i} = (lat_{i}, lon_{i})$ is the true latitude and longitude, $\hat{Y}_{i} = (\hat{lat}_{i}, \hat{lon}_{i})$ is the predicted latitude and longitude, and $r$ is the radius of the Earth.

\begin{matrix} HSIN = 2r \arcsin\left(\sqrt{\sin^{2}\left(\frac{lat_{i} - \hat{lat}_{i}}{2}\right) + \cos(lat_{i}) \cos(\hat{lat}_{i}) \sin^{2}\left(\frac{lon_{i} - \hat{lon}_{i}}{2}\right)}\right) \end{matrix}

(25)
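Formula (25) can be sketched as follows. The Earth radius of 6371 km and the use of kilometers are assumptions, since the paper does not state the unit of HSIN; the two sample points are the corners of the study area given in Section 5.1.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumed value for r

def haversine_km(lat1, lon1, lat2, lon2, r=EARTH_RADIUS_KM):
    """Great-circle distance of formula (25); inputs are in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# Distance between the south-west and north-east corners of the study area.
diagonal = haversine_km(30.65283, 104.04210, 30.72649, 104.12907)
print(diagonal)
```

Note that the trigonometric functions expect radians, so coordinates in degrees must be converted first.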

Accuracy: It indicates the proportion of accurately predicted outputs in the total output. The higher the accuracy, the better the model training effect. It can be calculated as follows, where $TP$ is the set of correctly predicted samples, $NP$ is the set of incorrectly predicted samples, and $\mathrm{count}(\cdot)$ represents the number of objects.

\begin{matrix} \mathrm{Accuracy} = \frac{\mathrm{count}(TP)}{\mathrm{count}(TP) + \mathrm{count}(NP)} \end{matrix}

(26)

5.3. Result analysis

The accuracy experiment and the robustness experiment are discussed in this paper. The LDGST-LSTM model is compared with the benchmark models to explore the performance on accuracy and robustness. The ablation experiment is discussed to verify the spatial and temporal attention mechanisms of the proposed model to enhance the interpretability. Moreover, the POI weights calculated by the temporal attention mechanism are visualized on the map. The POI is considered as the feature of the vehicle trajectory and the influence of the POI on the vehicle position prediction is revealed.

5.3.1. Accuracy Experiment

All models are trained, and the accuracy comparison results on the training set are shown in Figure 7. It can be seen that on both holidays and working days, the accuracy of LDGST-LSTM is significantly higher than that of the benchmarks.

As shown in Figure 7a,b, the accuracy of the proposed model on both holidays and working days is nearly 4.5 times higher than that of the benchmarks on the October 2018 dataset. As shown in Figure 7c,d, the accuracy of the proposed model on both weekends and working days is nearly 3.5-4.5 times higher than that of the benchmarks on the August 2014 dataset. Therefore, the accuracy of location prediction can be effectively improved by considering POI knowledge and combining GAT with the temporal attention mechanism.

The accuracy comparison results of different models are shown in Table 4.

In the October 2018 dataset, the accuracy of LDGST-LSTM is 21% higher than that of LSTM on holidays and 61% higher than that of Attn-LSTM on working days, measured as the absolute difference between the accuracy values in Table 4 (0.30 versus 0.09, and 0.75 versus 0.14). In the August 2014 dataset, LDGST-LSTM improves on Attn-BiLSTM by 64% on weekends. Therefore, compared with the benchmarks, the model proposed in this paper can improve performance by selecting appropriate POI categories on working days and holidays, respectively. More importantly, integrating POI knowledge and combining spatial and temporal attention mechanisms can greatly improve the accuracy of vehicle trajectory location prediction.

5.3.2. Robustness experiment

The results of robustness experiment are discussed in this section. The convergence speed and the metrics after convergence of the proposed model and benchmarks are compared to analyze the robustness. The results of convergence speed are shown in Figure 8.

As shown in Figure 8a,b, the loss of LDGST-LSTM drops slowly compared with the benchmark models before the 200th iteration on the October 2018 dataset. The convergence of the proposed model becomes faster from the 250th to the 300th iteration, and it converges at around the 350th iteration. As shown in Figure 8c,d, the loss of LDGST-LSTM also drops slowly compared with the benchmark models before the 200th iteration on the August 2014 dataset. The convergence of the proposed model becomes faster from the 250th to the 350th iteration, and it converges at around the 450th iteration.

The performance of the different models on MAE, MSE, RMSE and HSIN is shown in Table 5. As shown in Table 5a,b, the proposed model achieves the best evaluation metrics among all models on the October 2018 dataset. On holidays, the MAE, MSE, RMSE and HSIN of the proposed model are respectively 7.73%, 28.57%, 11.11% and 15.72% lower than those of LSTM, the most robust of the benchmarks. On working days, they are respectively 6.19%, 33.33%, 27.27% and 41.79% lower than those of GRU, the most robust of the benchmarks. As shown in Table 5c,d, the proposed model still performs best on the Chengdu August 2014 dataset. On weekends, the MAE, MSE, RMSE and HSIN of the proposed model are respectively 6.09%, 33.33%, 15.64% and 41.66% lower than those of GRU. On working days, they are respectively 4.88%, 28.57%, 18.55% and 15.99% lower than those of LSTM. Therefore, the robustness of the proposed model is the best among all the compared models.
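The percentages quoted above are relative reductions with respect to the benchmark. As a worked check, the following sketch recomputes the holiday figures from the LSTM and LDGST-LSTM rows of Table 5(a); the function name `relative_reduction` is chosen for this example.

```python
def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return (baseline - proposed) / baseline * 100.0

# Values from Table 5(a): holidays in Oct. 2018.
lstm = {"MAE": 0.0207, "MSE": 0.0007, "RMSE": 0.0234, "HSIN": 2.4433}
ldgst = {"MAE": 0.0191, "MSE": 0.0005, "RMSE": 0.0208, "HSIN": 2.0591}

reductions = {name: round(relative_reduction(lstm[name], ldgst[name]), 2)
              for name in lstm}
print(reductions)  # {'MAE': 7.73, 'MSE': 28.57, 'RMSE': 11.11, 'HSIN': 15.72}
```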

5.3.3. Ablation Experiment

An ablation experiment is conducted to analyze the effect of the major components of the proposed model by removing the GAT and the temporal attention mechanism. In addition to the proposed model, three ablation models are analyzed: (1) Local Dynamic Graph Convolutional Network- Long Short-Term Memory (LDGCN-LSTM), (2) Local Dynamic Graph Convolutional Network- Temporal Attention Long Short-Term Memory (LDGCN-TAttnLSTM), and (3) Local Dynamic Graph Attention- Long Short-Term Memory (LDGAT-LSTM).

As shown in Figure 9, the accuracy of LDGST-LSTM is higher than that of the three ablation models on both datasets. On holidays, the accuracy of LDGST-LSTM is almost the same as that of LDGAT-LSTM. A possible reason is that on holidays the intention of the taxis is more focused on certain functional regions, so the spatial features outweigh the temporal features in importance.

The ablation results are shown in Table 6. As shown in Table 6a,b, the evaluation metrics of LDGST-LSTM are the best among the ablation models on the 2018 dataset. On holidays, the MAE, MSE, RMSE and HSIN of the proposed model are respectively 9.91%, 28.57%, 12.97% and 15.46% lower, and the accuracy is 4.17% higher, than those of LDGAT-LSTM, which performs best among the ablation models. On working days, the MAE, MSE, RMSE and HSIN of the proposed model are respectively 3.9%, 14.29%, 22.18% and 20.12% lower, and the accuracy is 40% higher, than those of LDGAT-LSTM. As shown in Table 6c,d, LDGST-LSTM also performs best on the 2014 dataset. On weekends, the MAE, MSE, RMSE and HSIN of LDGST-LSTM are respectively 25.40%, 33.33%, 18.31% and 18.52% lower, and the accuracy is 7.69% higher, than those of LDGCN-TAttnLSTM, which performs best among the ablation models in Table 6c. On working days, the MAE, MSE, RMSE and HSIN of LDGST-LSTM are respectively 10.55%, 28.57%, 11.81% and 10.42% lower, and the accuracy is 3.64% higher, than those of LDGAT-LSTM. In conclusion, the proposed model performs best compared with the other ablation models. Therefore, the combination of GAT and the temporal attention mechanism can enhance the interpretability and also improve the accuracy and robustness of the model.

5.3.4. POI weights visualization

The predicted coordinates of the next location and the corresponding weights are calculated by the proposed model. The visualization of POI weights is realized through kernel density analysis in ArcGIS.
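The idea behind a weighted density surface can be illustrated with a small kernel density estimate in pure Python. This sketch does not reproduce the ArcGIS tool: a Gaussian kernel is substituted for illustration, and the point coordinates, weights, grid and bandwidth are hypothetical.

```python
import math

def weighted_kde(points, weights, grid, bandwidth):
    """Weighted Gaussian kernel density evaluated at each grid location."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    density = []
    for gx, gy in grid:
        total = 0.0
        for (px, py), w in zip(points, weights):
            d2 = (gx - px) ** 2 + (gy - py) ** 2
            total += w * norm * math.exp(-d2 / (2.0 * bandwidth ** 2))
        density.append(total)
    return density

# Two POI with different attention weights, evaluated at their own locations.
points = [(0.0, 0.0), (1.0, 1.0)]
weights = [0.9, 0.1]
density = weighted_kde(points, weights, grid=points, bandwidth=0.5)
print(density)
```

Locations near highly weighted POI receive a higher density, which is what the heat maps of this kind display.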

The visualization of the POI weights is of positive significance for vehicle trajectory planning, traffic optimization and vehicle location prediction. As shown in Figure 10, POI information in some regions affects the vehicle location prediction, for example the southwestern regions in Figure 10a,c, and the right and top sides in Figure 10b,d. It can be seen that on holidays and weekends, POI in the southwestern regions can be considered important information that influences the vehicle trajectories. This may indicate that these regions are close to the city center and have high traffic flow on holidays and weekends. Therefore, it is important to plan the driving path in these regions on holidays and weekends. Moreover, the POI regions that influence the location prediction are more dispersed on working days, so the trajectory decision can be more flexible.

6. Conclusions

A Local Dynamic Graph Spatiotemporal- Long Short-Term Memory (LDGST-LSTM) model is proposed in this paper to predict the next location of vehicle trajectory. Data conversion layer, POI global knowledge extraction layer, local dynamic graph generation module and trajectory prediction layer are major components in the proposed model. Raw taxi trajectory and POI semantic information are matched to the road network through map matching algorithm and proximity algorithm in the data conversion layer. The representation vectors of POI are learned through TransE algorithm by constructing knowledge graph in the POI global knowledge extraction layer. Based on the global knowledge graph, Local graph related to each trajectory point is generated and the graph representation vector is captured through GAT in the local dynamic graph generation module. Finally, trajectory points with the related graph representation vectors are input into LSTM with multi-head attention mechanism in the trajectory prediction layer to predict the next location.

However, there are limitations in this paper. Firstly, because of the non-uniform sampling of GPS and the existence of GPS signal shielding areas, accurate trajectory recovery can be a future research direction. Moreover, only POI knowledge is considered as the external feature in this paper, while the vehicle trajectory can also be affected by other external features, such as morning and evening peaks and weather. Therefore, it is essential to integrate multi-source external features to predict the next location. Furthermore, the scenario in this paper is the macro-level road network, so more specific traffic scenarios, such as intersections, can be further studied in the future.

Author Contributions

Funding acquisition, J.C.; methodology, J.C.; project administration, J.C.; software, Q.F. and X.X.; supervision, J.C.; validation, X.X.; visualization, Q.F. and X.X.; writing—original draft, J.C.; writing—review & editing, Q.F. and X.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Natural Science Foundation of China, grant number 61104166.

Data Availability Statement

The research data is unavailable due to privacy or ethical restrictions.

Acknowledgments

The authors would like to thank the reviewers for useful suggestions.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper. The authors have no financial or personal relationships with other people or organizations that could inappropriately influence this work.

References

1. Guo, L. Research and Application of Location Prediction Algorithm Based on Deep Learning. Doctoral Thesis, Lanzhou University, Lanzhou, China, 2018.
2. Havyarimana, V.; Hanyurwimfura, D.; Nsengiyumva, P.; Xiao, Z. A novel hybrid approach based-SRG model for vehicle position prediction in multi-GPS outage conditions. Inf. Fusion 2018, 41, 1–8.
3. Wu, Y.; Hu, Q.; Wu, X. Motor vehicle trajectory prediction model in the context of the Internet of Vehicles. J. Southeast Univ. (Nat. Sci. Ed.) 2022, 52, 1199–1208.
4. Li, L.; Xu, Z. Review of the research on the motion planning methods of intelligent networked vehicles. J. China Highw. Transp. 2019, 32, 20–33.
5. Wang, K.; Wang, Y.; Deng, X.; et al. Review of the impact of uncertainty on vehicle trajectory prediction. Automot. Technol. 2022, 7, 1–14.
6. Wang, L. Trajectory Destination Prediction Based on Traffic Knowledge Map. Doctoral Thesis, Dalian University of Technology, Dalian, China, 2021.
7. Guo, H.; Meng, Q.; Zhao, X.; et al. Map-enhanced generative adversarial trajectory prediction method for automated vehicles. Inf. Sci. 2023, 622, 1033–1049.
8. Xu, H.; Yu, J.; Yuan, S.; et al. Research on taxi parking location selection algorithm based on POI. High-Tech. Commun. 2021, 31, 1154–1163.
9. Velickovic, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio’, P.; Bengio, Y. Graph Attention Networks. arXiv 2017, arXiv:1710.10903.
10. Li, L.; Ping, Z.; Zhu, J.; et al. Space-time information fusion vehicle trajectory prediction for group driving scenarios. J. Transp. Eng. 2022, 22, 104–114.
11. Su, J.; Jin, Z.; Ren, J.; Yang, J.; Liu, Y. GDFormer: A Graph Diffusing Attention based approach for Traffic Flow Prediction. Pattern Recognit. Lett. 2022, 156, 126–132.
12. Ali, A.; Zhu, Y.; Zakarya, M. Exploiting dynamic spatio-temporal correlations for citywide traffic flow prediction using attention based neural networks. Inf. Sci. 2021, 577, 852–870.
13. Fan, H. Research and Implementation of Vehicle Motion Tracking Technology based On Internet of Vehicles. Doctoral Thesis, Beijing University of Posts and Telecommunications, Beijing, China, 2017.
14. Hui, F.; Wei, C.; Shangguan, W.; Ando, R.; Fang, S. Deep encoder-decoder-NN: A deep learning-based autonomous vehicle trajectory prediction and correction model. Phys. A Stat. Mech. Its Appl. 2022.
15. Kalatian, A.; Farooq, B. A context-aware pedestrian trajectory prediction framework for automated vehicles. Transp. Res. Part C: Emerg. Technol. 2022, 134.
16. An, J.; Liu, W.; Liu, Q.; Guo, L.; Ren, P.; Li, T. DGInet: Dynamic graph and interaction-aware convolutional network for vehicle trajectory prediction. Neural Netw. 2022, 151, 336–348.
17. Yang, D.; He, T.; Wang, H.; et al. Research progress in graph embedding learning for knowledge map. J. Softw. 2022, 33, 21.
18. Xia, Y.; Lan, M.; Chen, X.; et al. Overview of interpretable knowledge map reasoning methods. J. Netw. Inf. Secur. 2022, 8, 1–25.
19. Zhang, Z.; Qian, Y.; Xing, Y.; et al. Overview of TransE-based representation learning methods. Comput. Appl. Res. 2021, 3, 656–663.
20. Chen, W.; Wen, Y.; Zhang, X.; et al. An improved TransE-based knowledge map representation method. Computer Engineering 2020, 46, 8.
21. Zheng, D.; Song, X.; Ma, C.; Tan, Z.; Ye, Z.; Dong, J.; Xiong, H.; Zhang, Z.; Karypis, G. DGL-KE: Training Knowledge Graph Embeddings at Scale. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2020.
22. Ji, Q.; Jin, J. Reasoning Traffic Pattern Knowledge Graph in Predicting Real-Time Traffic Congestion Propagation. IFAC-Pap. 2020, 53, 578–581.
23. Wang, X.; Lyu, S.; Wang, X.; Wu, X.; Chen, H. Temporal knowledge graph embedding via sparse transfer matrix. Inf. Sci. 2022, 623, 56–69.
24. Wang, C.; Tian, R.; Hu, J.; Ma, Z. A trend graph attention network for traffic prediction. Inf. Sci. 2023, 623, 275–292.
25. Wang, B.; Wang, J. St-Mgat:Spatio-Temporal Multi-Head Graph Attention Network for Traffic Flow Prediction. SSRN Electron. J. 2022.
26. Wang, T.; Ni, S.; Qin, T.; Cao, D. TransGAT: A dynamic graph attention residual networks for traffic flow forecasting. Sustain. Comput. Informatics Syst. 2022, 36, 100779.
27. Cai, K.; Shen, Z.; Luo, X.; Li, Y. Temporal attention aware dual-graph convolution network for air traffic flow prediction. J. Air Transp. Manag. 2023, 106.
28. Yan, X.; Gan, X.; Wang, R.; Qin, T. Self-attention eidetic 3D-LSTM: Video prediction models for traffic flow forecasting. Neurocomputing 2022, 509, 167–176.
29. Chen, L.; Shi, P.; Li, G.; Qi, T. Traffic flow prediction using multi-view graph convolution and masked attention mechanism. Comput. Commun. 2022, 194, 446–457.
30. Wang, K.; Ma, C.; Qiao, Y.; Lu, X.; Hao, W.; Dong, S. A hybrid deep learning model with 1DCNN-LSTM-Attention networks for short-term traffic flow prediction. Phys. A-Stat. Mech. Its Appl. 2021, 583, 126293.
31. Lou, Y.; Zhang, C.; Zheng, Y.; Xie, X.; Wang, W.; Huang, Y. Map-matching for low-sampling-rate GPS trajectories. In Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. 2009.
32. Cai, Y. Research on Vehicle Trajectory Prediction Based on RNN-LSTM Network. Doctoral Thesis, Jilin University, Jilin, China, 2021.
33. Zhang, H.; Huang, C.; Xuan, Y.; et al. Real time prediction of air combat flight trajectory using gated cycle unit. Syst. Eng. Electron. Technol. 2020, 42, 7.
34. Guo, Y.; Zhang, R.; Chen, Y. Vehicle trajectory prediction based on potential characteristics of observation data and bidirectional short-term and long-term memory network. Automot. Technol. 2022, 3.
35. Liu, C.; Liang, J. Vehicle trajectory prediction based on attention mechanism. J. Zhejiang Univ. (Eng. Ed.) 2020, 54, 8.
36. Guan, D. Research on Modeling and Prediction of Vehicle Moving Trajectory in the Internet of Vehicles. Doctoral Thesis, Beijing University of Posts and Telecommunications, Beijing, China, 2020.

Figure 1. The overall framework of LDGST-LSTM.

Figure 2. Road network of Chengdu: (a) Roads vector map; (b) Nodes vector map; (c) Satellite projection map.

Figure 3. Partial visualization of normalized POI.

Figure 4. Global POI knowledge extraction layer.

Figure 5. Local graph generation module.

Figure 6. The research framework of the trajectory prediction layer.

Figure 7. Accuracy comparison of different models on training set: (a) holidays in Oct. 2018; (b) working days in Oct. 2018; (c) weekends in Aug. 2014; (d) working days in Aug. 2014.

Figure 8. Convergence speed comparison of different models: (a) holidays in Oct. 2018; (b) working days in Oct. 2018; (c) weekends in Aug. 2014; (d) working days in Aug. 2014.

Figure 9. Accuracy comparison of different ablation modules: (a) holidays in Oct. 2018; (b) working days in Oct. 2018; (c) weekends in Aug. 2014; (d) working days in Aug. 2014.

Figure 10. Visualization of the POI weights: (a) holidays in Oct. 2018; (b) working days in Oct. 2018; (c) weekends in Aug. 2014; (d) working days in Aug. 2014.

Table 1. Pseudocode of the global POI knowledge extraction layer.

Algorithm 1: Global POI knowledge extraction layer
Input entity, relation, and training sets $E^{POI}, R^{POI}, T^{POI} = \{(h, l, t)\}$, embedding dim $k$
1:	initialize $l \leftarrow$ uniform$(-\frac{6}{\sqrt{k}}, \frac{6}{\sqrt{k}})$ for each $l \in R^{POI}$
2:	$l \leftarrow l / \|l\|$ for each $l \in R^{POI}$
3:	$e \leftarrow$ uniform$(-\frac{6}{\sqrt{k}}, \frac{6}{\sqrt{k}})$ for each $e \in E^{POI}$
4:	loop
5:	$e \leftarrow e / \|e\|$ for each $e \in E^{POI}$
6:	$S_{batch}^{POI} \leftarrow$ sample$(T^{POI}, b)$ //sample a minibatch of size $b$
7:	$T_{batch}^{POI} \leftarrow \emptyset$ //initialize the set of triplet pairs
8:	for $(h, l, t) \in S_{batch}^{POI}$ do
9:	$(h', l', t') \leftarrow$ sample$(S'^{POI}_{(h, l, t)})$ //extract a negative sample
10:	$T_{batch}^{POI} \leftarrow T_{batch}^{POI} \cup \{((h, l, t), (h', l', t'))\}$ //pair the positive and negative samples
11:	end for
12:	Update embeddings w.r.t. the loss $L = \sum_{(h, l, t) \in \Delta} \sum_{(h', l', t') \in \Delta'} [f_{r}(h, t) + \gamma - f_{r'}(h', t')]_{+}$
13:	end loop
Output representation vector $V_{i j}^{P O I}$ of the current entity
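The hinge term in step 12 of Algorithm 1 can be sketched for a single pair of triplets as follows. The sketch assumes that the scoring function is the standard TransE dissimilarity, the L2 norm of $h + l - t$; the paper's own definition of $f_{r}$ is given elsewhere and may differ. The embeddings and the margin $\gamma = 1$ are toy values.

```python
import math

def transe_score(h, l, t):
    """TransE dissimilarity ||h + l - t||_2 (smaller means more plausible)."""
    return math.sqrt(sum((h_k + l_k - t_k) ** 2 for h_k, l_k, t_k in zip(h, l, t)))

def margin_loss(pos, neg, gamma=1.0):
    """Hinge term [score(pos) + gamma - score(neg)]_+ for one triplet pair."""
    return max(0.0, transe_score(*pos) + gamma - transe_score(*neg))

# Toy embeddings (hypothetical): the positive triplet satisfies h + l = t exactly.
h, l, t = [0.0, 0.0], [1.0, 0.0], [1.0, 0.0]
far_negative = (h, l, [3.0, 0.0])   # score 2.0, already beyond the margin
near_negative = (h, l, [1.5, 0.0])  # score 0.5, still inside the margin
loss_far = margin_loss((h, l, t), far_negative)
loss_near = margin_loss((h, l, t), near_negative)
print(loss_far, loss_near)
```

Only negative samples that score within the margin of the positive triplet contribute to the loss, so the update concentrates on the pairs that are hard to separate.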

Table 2. Pseudocode of the local dynamic graph generation module.

Algorithm 2: Local dynamic graph generation module
Input normalized trajectory points $NP_{ij-t:j}$, POI feature vectors $F_{ij} = V_{ij}^{POI}$
1:	$G'_{ij-t:j} \leftarrow (V_{ij-t:j}, E_{ij-t:j}, A'_{ij-t:j})$ for each trajectory point // generate local graphs
2:	Target node vector $F_{ij} \leftarrow V_{ij}$ and neighbor node vector $F_{jm} \leftarrow V_{jm}$ // according to local graphs
3:	$e_{jm} \leftarrow \mathrm{LeakyReLU}(w^{T} [W F_{ij} \| W F_{jm}])$
4:	$a_{jm} \leftarrow \mathrm{softmax}(e_{jm})$
5:	weighted sum:
6:	for $h \leq H_{1}$ do
7:	$NPV_{ij} \leftarrow NPV_{ij} \| \sigma(\sum_{m \in N_{ij}} a_{jm}^{(h)} W^{(h)} F_{jm})$
8:	end for
Output graph representation vector $NPV_{ij}$ for each trajectory point

Table 3. Pseudocode of the trajectory prediction layer.

Algorithm 3: Trajectory prediction layer
Input normalized trajectory points $NP_{ij-t:j}$, graph representation vectors $NPV_{ij-t:j}$
1:	LSTM Module:
2:	loop
3:	$x_{j} \leftarrow [NP_{ij}, NPV_{ij}]$
4:	$f_{j}, i_{j}, \tilde{C}_{j}, o_{j} \leftarrow$ calculated by forget gate, input gate and output gate
5:	$C_{j}, h_{ij} \leftarrow$ cell state and hidden state are calculated by $f_{j}, i_{j}, \tilde{C}_{j}, o_{j}$
6:	Temporal attention mechanism:
7:	$Q_{h} \leftarrow W_{i}^{Q} h_{ij}$, $K_{h} \leftarrow W_{i}^{K} h_{ij-t:j}$, $V_{h} \leftarrow W_{i}^{V} h_{ij-t:j}$
8:	for $h \leq H_{2}$ do
9:	$A_{h} \leftarrow \mathrm{softmax}(\frac{Q_{h} K_{h}^{T}}{\sqrt{d}}) V_{h}$
10:	$h_{j}' \leftarrow h_{j}' \| A_{h} W^{O}$
11:	end for
12:	MLP:
13:	$\hat{Y}_{j} \leftarrow W_{F}^{2} \mathrm{ReLU}(W_{F}^{1} h_{j}' + b_{F}^{1}) + b_{F}^{2}$
14:	end loop
Output coordinate $\hat{Y}_{j}$ of the next location

Table 4. Accuracy comparison of different models: (a) holidays in Oct. 2018; (b) working days in Oct. 2018; (c) weekends in Aug. 2014; (d) working days in Aug. 2014.

(a)		(b)
Model	Accuracy (%)	Model	Accuracy (%)
LSTM	0.09	LSTM	0.10
GRU	0.07	GRU	0.08
BiLSTM	0.06	BiLSTM	0.07
Attn-LSTM	0.05	Attn-LSTM	0.14
Attn-BiLSTM	0.06	Attn-BiLSTM	0.09
LDGST-LSTM	0.30	LDGST-LSTM	0.75
(c)		(d)
Model	Accuracy (%)	Model	Accuracy (%)
LSTM	0.12	LSTM	0.10
GRU	0.12	GRU	0.07
BiLSTM	0.11	BiLSTM	0.05
Attn-LSTM	0.15	Attn-LSTM	0.09
Attn-BiLSTM	0.16	Attn-BiLSTM	0.05
LDGST-LSTM		LDGST-LSTM	0.57

Table 5. Performance comparison of different models: (a) holidays in Oct. 2018; (b) working days in Oct. 2018; (c) weekends in Aug. 2014; (d) working days in Aug. 2014.

(a)
Model	MAE	MSE	RMSE	HSIN
LSTM	0.0207	0.0007	0.0234	2.4433
GRU	0.0218	0.0008	0.0243	2.8943
BiLSTM	0.0302	0.0012	0.0329	4.7287
Attn-LSTM	0.0234	0.0009	0.0258	3.7313
Attn-BiLSTM	0.0492	0.0035	0.0567	8.4362
LDGST-LSTM	0.0191	0.0005	0.0208	2.0591
(b)
Model	MAE	MSE	RMSE	HSIN
LSTM	0.0212	0.0009	0.0295	3.5067
GRU	0.0210	0.0009	0.0275	3.4957
BiLSTM	0.0305	0.0010	0.0305	3.1503
Attn-LSTM	0.0220	0.0008	0.0250	3.4728
Attn-BiLSTM	0.0345	0.0012	0.0355	3.7504
LDGST-LSTM	0.0197	0.0006	0.0200	2.0348
(c)
Model	MAE	MSE	RMSE	HSIN
LSTM	0.0202	0.0007	0.0305	2.9507
GRU	0.0197	0.0006	0.0275	2.6054
BiLSTM	0.0255	0.0008	0.0335	3.0504
Attn-LSTM	0.0199	0.0006	0.0290	2.6595
Attn-BiLSTM	0.0301	0.0012	0.0403	3.7955
LDGST-LSTM	0.0185	0.0004	0.0232	2.0634
(d)
Model	MAE	MSE	RMSE	HSIN
LSTM	0.0205	0.0007	0.0275	2.5047
GRU	0.0227	0.0009	0.0294	2.8643
BiLSTM	0.0269	0.0011	0.0327	3.0457
Attn-LSTM	0.0235	0.0009	0.0310	3.0137
Attn-BiLSTM	0.0312	0.0014	0.0343	3.3189
LDGST-LSTM	0.0195	0.0005	0.0224	2.1042

Table 6. Ablation results of the proposed model: (a) holidays in Oct. 2018; (b) working days in Oct. 2018; (c) weekends in Aug. 2014; (d) working days in Aug. 2014.

(a)
Model	MAE	MSE	RMSE	HISN	Accuracy (%)
LDGST-LSTM	0.0191	0.0005	0.0208	2.0591	0.25
LDGAT-LSTM	0.0212	0.0007	0.0239	2.4357	0.24
LDCN-TAttnLSTM	0.0225	0.0008	0.0244	2.6329	0.20
LDGST-LSTM	0.0195	0.0005	0.0224	2.1042	0.13
(b)
Model	MAE	MSE	RMSE	HISN	Accuracy (%)
LDGST-LSTM	0.0197	0.0006	0.0200	2.0348	0.70
LDGAT-LSTM	0.0205	0.0007	0.0257	2.5473	0.50
LDCN-TAttnLSTM	0.0220	0.0008	0.0295	2.6904	0.45
LDGST-LSTM	0.0237	0.0010	0.0305	2.9754	0.40
(c)
Model	MAE	MSE	RMSE	HISN	Accuracy (%)
LDGST-LSTM	0.0185	0.0004	0.0232	2.0634	0.70
LDGAT-LSTM	0.0253	0.0007	0.0301	2.8751	0.64
LDCN-TAttnLSTM	0.0248	0.0006	0.0284	2.5323	0.65
LDGST-LSTM	0.0297	0.0009	0.0328	2.9107	0.50
(d)
Model	MAE	MSE	RMSE	HISN	Accuracy (%)
LDGST-LSTM	0.0195	0.0005	0.0224	2.1042	0.57
LDGAT-LSTM	0.0218	0.0007	0.0254	2.3490	0.55
LDCN-TAttnLSTM	0.0261	0.0009	0.0278	2.5983	0.40
LDGST-LSTM	0.0284	0.0009	0.0290	2.9841	0.30

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.


