Preprint
Review

Transcending Time and Space: A Survey Methods, Uncertainty, and Development in Human Migration Prediction

This version is not peer-reviewed.

Submitted: 06 April 2023

Posted: 06 April 2023


Abstract
As a fundamental, overall, and strategic issue facing human society, human migration is a key factor affecting the development of countries and cities given constantly changing population numbers. The fuzziness of the spatiotemporal attributes of human migration limits the pool of open-source data for human migration prediction, leading to a relative lag in human migration prediction algorithm research. This study expands the definition of human migration research, reviews the progress of research into human migration prediction, and classifies and compares human migration algorithms based on open-source data. It also explores the critical uncertainty factors restricting the development of human migration prediction. Given the effect of human migration prediction, in combination with artificial intelligence and big data technology, the paper concludes with specific suggestions and countermeasures aimed at enhancing human migration prediction research results to serve economic and social development and national strategy.
Subject: Social Sciences - Demography

1. Introduction

With rapid economic and social development worldwide, human migration and mobility between urban and rural areas, between cities, and between countries have become more convenient, and human migration (HM) has become a universal phenomenon. For instance, in China alone, as of November 2021, the inter-provincial mobile population was 124,837,153 and the intra-provincial mobile population was 250,979,606 [1]. A report issued by the United Nations Population Division (United Nations, 2020) predicted that the number of global migrants in 2020 would be 281 million [2]. While the pace of HM slowed due to the COVID-19 pandemic, the gradual easing of the pandemic and the adjustment of relevant policies have once again led to an increase in domestic and international HM [3]. Given the significant decrease in the global population growth rate and the gradual increase in the ageing population, HM has become a significant component of population growth in many countries and regions. Labor, health care, education, and other such policies must fully consider HM development dynamics and trends to ensure that they are forward-looking, targeted, and effective. To accurately understand HM trends, extensive research has been conducted by governmental organizations, academics, and industry. HM forecasting has, therefore, become a vital research hotspot in the field of population studies.
International migration has received greater attention than internal migration due to its multifaceted effects and policy importance, leading to greater theoretical and empirical results in that field. Early studies of human migration prediction (HMP) mainly focused on the analysis of HM drivers and the establishment of a relationship equation between HM drivers and the number of migrants to predict future HM development trends [4]. Studies show that prediction models used in domestic and international HM forecasting are remarkably similar, differing only in data sources and policies; models from both types of HM can be used to support each other in the development of improved forecasting methods [5]. In the 1940s, the sociologist Zipf first proposed a gravitational model to predict HM [6]. In the late 1950s, the demographer Bogue proposed the famous "push-pull theory" to describe the motives of HM from a kinematic perspective [7]. Later, the American scholar Lee added to and improved the push-pull theory of HM based on Bogue's thesis [8]. In recent years, scholars have conducted empirical analyses of these models and have continuously optimized them and improved their accuracy based on applied examples [9,10]. With the development of information technology and statistics, many forecasting methods have emerged, and these mainly focus on econometrics, time series, Bayesian statistics, etc. In recent years, with the rapid development of artificial intelligence (AI) technology with big data and machine learning (ML) at its core, some scholars have tried to use ML technology for HMP [67,68,69,70,71,72,73]; however, this work is limited by the small amount of HM data, inconsistent standards, and difficulties in access. In addition, the uncertainty of HM drivers and the difficulty of their quantification have led to the slow development of HMP research [4].
Several scholars have conducted systematic reviews of HMP research, which provide a comprehensive overview of its current status and a reference for further investigation [11,12,13]. However, these reviews mainly focused on traditional HMP methods and ignored the application and development of AI technology in HMP research. In this regard, this paper tracks the development of the application of AI technology to HMP and reviews both traditional and AI technology-based HMP research to provide a scientific basis for future HMP research.
This paper first reviews the development of HM theory and summarizes and analyzes the classification of HMP. Next, focus is placed on traditional HM forecasting methods, and their successes and shortcomings are analyzed. Then, the application and development of AI technology-based HMP are reviewed, and the uncertainties that lead to the slow growth of HM are analyzed. Finally, conclusions are drawn by analyzing the current status of HMP research and pointing out the development direction.

2. Human Migration Theory and Research Progress

2.1. Human Migration Theoretical Review

The geographical or spatial flow of population between two regions usually involves the change of a permanent residence from the place of departure to the location of arrival, which is called permanent migration [14]. With the rapid development of transportation and information networks and the acceleration of globalization, people's travel frequency and time are increasing, the travel distance is growing, and permanent residence is decreasing [20]. In HM projection, it is common practice to ignore internal migration projection studies and concentrate on international migration, despite the latter comprising only a small proportion of the mobile population. Because of this, the general public's comprehension of the entire process of population movement is less thorough and accurate [15]. Despite significant differences in the scale of these two phenomena, studies have shown that the evolution of international migration can be explained to a large extent by internal migration, as there is significant complementarity between the two [16,17,18]. In addition, tourism migration, as another form of population flow, overlaps with migration in many respects and has a causal relationship with it [19]. It is difficult to draw a line between these two phenomena [20]. It is therefore necessary to dismiss the generally limited definition of HM and adopt a systematic approach to properly understand the interconnections in HM and ultimately better support policy. This will aid in the comprehension of the reasons for HM and the accurate prediction of its development trend [21].
Compared with other areas of population research, the research on HM theory is relatively unitary and can be traced back to the Laws of Migration written by Ravenstein [22], a British statistician. Since then, researchers and specialists in fields including population geography, socioeconomics, and political economy have combined their research to propose several related theories, including neoclassical economic theory, labor market theory, world system theory, migration network theory, and cumulative causality [23,24]. These theories have helped to lay the groundwork for HMP. In traditional theoretical research, a relatively systematic and mature academic system has been established for HM. Many empirical studies have been carried out and have focused on analyzing the relationship between HM and urbanization, as well as the spatial characteristics, policies, causes, and influencing factors of HM. With the fast growth of technologies like big data and AI, researchers have increasingly been looking into this topic. Based on many empirical studies [25,26], they have developed theories like the spatiotemporal network of HM, which constantly adds to the theoretical system of HM.

2.2. Research Progress

The basic task of HMP is to predict the future scale, characteristics, and development of HM, based on assumptions about its influencing factors, current situation, and development trend. Another task is to predict future changes in HM via space-time series abstraction, based on spatiotemporal information, population characteristics, events, and other information. In this regard, the traditional migration prediction methods mainly target the size of the migrant population (flow, stock, migration rate, etc.) and the migration probability. Specifically, the human migration prediction process includes data acquisition and pre-processing, feature extraction and correlation analysis, model construction, model application, and prediction result output, as shown in Figure 1.
In the early stage, due to the limitation of technical capacity, HMP used a single mathematical method. With the development of information technology and computerized statistics, HMP based on statistical means has become mainstream. The use of mathematical tools, the accumulation of relevant data, and the use of small sample data have become the main means by which to predict HM. However, incomplete information may lead to prediction bias [27]. In recent years, prediction methods based on big data and AI technology have been proposed [28]. However, despite the difficulty in obtaining government data and consequently their incomplete utilization, these data can provide the government with personal migration preference information to a certain extent, thereby providing a reference for policy formulation and immigration governance. Figure 2 presents the development trend of the use of critical technologies for HMP in recent years. The results revealed that the use of classical prediction models has exhibited a downward trend in recent years, whereas the use of AI models has gradually exhibited an upward trend.
In this review, HMP methods are divided into traditional and AI-based prediction methods. The conventional HMP methods include the cohort factor, scene judgment, expert, gravity, Bayesian statistical, and parameter models [11]. The AI prediction methods are mainly divided into ML and deep learning. According to the classification of the prediction results, the traditional prediction models can be divided into deterministic and stochastic approaches [13,26]. Table 1 in the Supplementary Material lists the main methods, data sources, and spatiotemporal characteristics of migration prediction. It can be seen that HM data are mainly government statistics. With the application of big data and AI technology, social media data with geographic characteristic information have been applied to HMP [28]. Based on the extant literature, it is common to see that many studies focus their attention on the prediction of international HM. Moreover, while most scholars have used the dichotomy method to forecast domestic and international HM, some have used the comprehensive prediction method. Due to the limitation of data, most scholars have chosen the use of traditional forecasting methods, mainly econometric models, for HMP, and have primarily focused on the long-term changes of HM. In contrast, new AI methods mainly focus on short-term forecasting.

3. Traditional Human Migration Forecasting Approaches

3.1. Deterministic Methods

As a traditional population forecasting method, deterministic approaches are based on assumptions about relevant influencing factors, the independent judgment of experts, or simple extrapolation, and they output a limited set of predictions ('high', 'medium', and 'low'). The deterministic prediction method has the advantages of a simple concept and low data dependency, and it can fully utilize the subjective prediction results of experts. Moreover, the technique is simple to operate, suitable for medium- and long-term prediction, and widely used in many countries and international institutions, such as the United Nations (UN) and the European Union (EU) [77]. However, deterministic methods often ignore the influence of many uncertain factors. The limited forecast results do not answer questions such as "To what extent will the future development of population migration be medium (high or low)?" or "What is the probability of high (medium, low) population migration in the future?", which leads to an insufficient ability to interpret the prediction results. In addition, forecasting results are largely dependent on the knowledge of experts and can be influenced by the scope of expert knowledge, political stance, or social attitudes, which can easily be misleading [78].

3.2. Stochastic Methods

In contrast to deterministic methods, the model parameters in stochastic methods are not fixed, but are considered as random variables. It is found that population migration has many random elements, so in order to get closer to reality, some unavoidable random factors are considered in the population migration model to build a random model. In this regard, many scholars use stochastic models to effectively predict population migration [57,58,59]. Of course, the analysis of stochastic systems is much more difficult than that of relatively deterministic systems. When using stochastic models to describe the migration process, the parameters of the model are difficult to deal with due to the lack of data, and econometric models based on sample data have come to be widely used to predict population migration. In addition, the development of linear and nonlinear migration theory has promoted the research of gravitational model, time series model, Bayesian model and other methods in population migration prediction, making the stochastic model one of the most effective methods for population migration prediction.

3.2.1. Econometric Forecasting Models

Econometric models are mainly used to study the causal relationships between variables of interest, and they are convenient for revealing the relationships between the relative amounts of change of variables; thus, they are widely used in HM forecasting studies. In the 1980s, Plaut used econometric models to forecast the number of net population movements in Texas, USA [64]. Later, in response to the problem of forecasting HM under EU enlargement, Fertig and Schmidt provided a simple econometric model to forecast the rate of HM from the Czech Republic, Estonia, Hungary, and Poland (four candidate countries) to Germany [79]. Based on this, Dustmann et al. improved this method by adding a relative per capita income variable to predict European migration after the enlargement of the EU in 2004 [80]. However, because the model they used did not fully take into account the non-stationarity of HM, but instead treated it as a stationary process, it was subject to significant errors in post hoc tests under the influence of relevant policies. Nevertheless, it was found to be valid for forecasting in Germany, where temporary restrictions on access to its labour market were imposed. A similar study was carried out by Alvarez-Plata et al. using actual income levels, employment rates, population size, and geographic and cultural similarity dummy variables to build a predictive model for HM from CEE10 to EU15 countries [81].
The determinants of HM can be selected and quantified. According to the theory of HM, Cappelen et al. from Statistics Norway used econometric models to set variables such as the income level, unemployment rate, and population size of Norway and the emigration countries, as well as the number of immigrants already living in Norway, to predict the overview of immigrants moving to Norway [82]. In addition, the change in income under the influence of exogenous variables was fully considered, and the corresponding prediction results were provided. In addition, econometric models have also been successfully applied to the prediction of various types of HM (such as skilled talents, college graduates, the labor force, etc.).
While many achievements have been made, the econometric model is also characterized by shortcomings when applied to HMP; in particular, the problem of missing variables often occurs during variable selection. In addition, to reduce the research complexity, researchers tend to ignore some essential characteristics of the population, such as the population size and age structure. Moreover, variable selection error is often a vital source of prediction error.
Given the shortcomings of the econometric model, Dao et al. selected and parameterized appropriate driving factors for international HM, built a structural equation model based on the social economy, solved the parameters via historical data, and then predicted the two-way trend of international HM [83]. The prediction results of the model were found to be consistent with the actual situation, which proved the correctness and feasibility of the model.
Burzynski et al. proposed a comparable institutional equation model by optimizing the model by Dao et al. [84]. The model expanded the scope of international migration drivers by introducing new factors such as internal migration, technological change, and education. The quantitative analysis of the factors influencing the global distribution of highly skilled people (i.e., domestic educational opportunities, the sectoral distribution of workers, and international migration) demonstrated that the uneven distribution of labor is a significant contributor to global inequality, that HM is one of the most powerful ways to alter the global distribution of highly skilled people, and that economic inequality within regions can affect the global distribution of highly skilled people. According to the researchers, the model is equally appropriate for predicting the amount of international population flow and the working-age population.
Compared with other methods, the parametric model can be used to analyze the overall effects of individual indicators and the relationships between them, and has strong adaptability to complex systems.

3.2.2. Gravity Model

The gravity model is named for its form, which resembles Newton's law of gravitation, and it can effectively explain spatial interaction. The model is suitable for the analysis of regional flow, and has been widely used in the research and prediction of HM. In the 1940s, Zipf proposed a gravity model for HMP; the model holds that the HM between region i and region j is directly proportional to the populations of the two areas and inversely proportional to the distance between them, as given by the following equation:
$$M_{ij} \sim \alpha \frac{P_i P_j}{d_{ij}^{\beta}},$$
where $M_{ij}$ is the number of people migrating from region i to region j, $P_i$ is the number of people in region i, $P_j$ is the number of people in region j, $d_{ij}$ is the distance between the two areas, and $\alpha$ and $\beta$ are two parameters.
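As a minimal illustration of the Zipf-type gravity equation above, the following Python sketch evaluates the flow estimate for a few region pairs. The region names, populations, distances, and the values chosen for the parameters α and β are invented for illustration and do not come from the cited studies; in practice both parameters would be estimated from observed flows.

```python
def gravity_flow(pop_i, pop_j, distance, alpha=1.0, beta=2.0):
    """Gravity estimate of the flow between two regions:
    alpha * P_i * P_j / d_ij ** beta."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return alpha * pop_i * pop_j / distance ** beta


# Hypothetical regions: populations and pairwise distances (km) are made up.
populations = {"A": 1_000_000, "B": 500_000, "C": 200_000}
distances = {("A", "B"): 100.0, ("A", "C"): 400.0, ("B", "C"): 250.0}

# alpha = 1e-6 is an arbitrary scaling constant chosen for readability.
flows = {
    pair: gravity_flow(populations[pair[0]], populations[pair[1]], d, alpha=1e-6)
    for pair, d in distances.items()
}

for pair, flow in flows.items():
    print(pair, round(flow, 2))
```

With β = 2, quadrupling the distance reduces the estimated flow by a factor of 16, which is why the large but distant pair (A, C) receives a smaller estimate than the smaller but closer pair (B, C).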
Based on the classical gravity model, Beine et al. optimized the model with the actual HM data [85]. The model is as follows:
$$E(m_{ij}) = S_i D_j \phi_{ij},$$
where $E(m_{ij})$ represents the expected number of migrations from country i to country j, $S_i$ represents the ability of country i to export immigrants, $\phi_{ij}$ represents bilateral accessibility, which can also be understood as the migration cost between countries i and j, and $D_j = y_j / \Omega_i$ denotes the relative attractiveness of destination j, which depends on the potential income $y_j$ (wages or GDP) in country j and the relative cost $\Omega_i$ of emigrating to other destinations.
In gravity models, attractiveness generally refers to the economic appeal of a particular destination as compared to that of other countries. Because data on expected income are difficult to come by, economists usually use GDP levels or related indices instead [86,87]. When considering the migration cost, various possible factors are taken into account, such as the monetary cost, the psychological cost, and the cost of learning a new language [88]. However, in the classical gravity model, the distance parameter incorporates all of these factors, i.e., an increase in the distance between two countries leads to a rise in the cost of migration. In this regard, the fixed effects of similar countries can be regarded as dummy variables.
In addition, the gravity model can also be transformed into a multiple regression equation, and the migration flow at time t can be expressed as follows [85].
$$\ln(m_{ij,t}) = \beta_0 + \beta_1 \ln(\mathrm{GDP}_{i,t}) + \beta_2 \ln(\mathrm{GDP}_{j,t}) + \beta_3 \ln(\mathrm{distance}_{ij,t}) + \beta_4\, \mathrm{dummies}_{ij,t} + \varepsilon_{ij,t}$$
In the formula, the logarithm of the migration flow is linear in all coefficients, and each variable is assumed to be independent of the others. While the principle of the model is straightforward, a critical problem is ignored, i.e., the individual and unobservable characteristics in the process of HM are not considered. In this regard, Backhaus et al. studied the impact of climate change on bilateral migration by using the gravity model, and added the average temperature and precipitation of the immigration country to the classical model [89]. In addition, Friebel et al. added changes in smuggling routes to the gravity model and studied the immigration costs that affect willingness to migrate to a specific location [90].
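Because the log-linear gravity equation is linear in its coefficients, it can be estimated by ordinary least squares. The sketch below is an illustration only: it generates synthetic country pairs with invented "true" coefficients, omits the dummy variables for brevity, and recovers the coefficients through the normal equations using the Python standard library. The helper names (`solve_linear_system`, `ols`) are hypothetical and not taken from any cited work.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic country pairs; every number here is invented for illustration.
random.seed(0)
true_beta = [1.0, 0.8, 0.6, -1.2]  # intercept, ln GDP_i, ln GDP_j, ln distance
rows, log_flows = [], []
for _ in range(200):
    x = [
        1.0,
        math.log(random.uniform(1e3, 5e4)),   # ln GDP of origin i
        math.log(random.uniform(1e3, 5e4)),   # ln GDP of destination j
        math.log(random.uniform(100, 10_000)),  # ln distance between i and j
    ]
    noise = random.gauss(0.0, 0.05)
    rows.append(x)
    log_flows.append(sum(b * v for b, v in zip(true_beta, x)) + noise)

beta_hat = ols(rows, log_flows)
print([round(b, 3) for b in beta_hat])
```

The estimated distance coefficient is negative, matching the interpretation that a greater distance raises the cost of migration. Real applications would add the dummy variables and typically use an estimator that handles zero flows, which this sketch does not attempt.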
In summary, based on the interpretation of attraction and distance, the gravity model is applicable to the study of HM-related issues and has strong robustness. The flexible selection of parameters, such as the environment, politics, sociology, micro/macro-economy, geography, etc., enables a better understanding of the driving factors of cross-border migration flows. Although the gravity model provides a reasonable explanation for the spatial pattern of HM, Beyer et al. found that the gravity model does not perform well along the time dimension [91]. The existing methods are all based on historical data, and the relevant variables are assumed to be stable over the long term. Unpredictable impacts, such as those of financial crises, war, climate change, or technological progress, are not fully considered, and it is therefore difficult for the results to be convincing.
Because the gravity model contains parameters to be estimated, parameter estimation requires a large amount of historical data, and the calculation process is complex. Simini et al. therefore proposed the radiation model [9]:
$$M_{ij} \sim O_i \frac{P_i P_j}{(P_i + d_{ij})(P_i + P_j + d_{ij})},$$
where $M_{ij}$ is the number of migrations from region i to region j, $P_i$ is the number of people in region i, $P_j$ is the number of people in region j, $d_{ij}$ is the distance term between the two regions (in the formulation of Simini et al., the total population living within a circle centred on region i whose radius is the distance between i and j, excluding the populations of i and j), and $O_i$ is the total number of trips in area i. This method has its limitations because it only focuses on the flow between two specific points.
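The radiation model can be sketched in a few lines of Python. The version below follows the form published by Simini et al., in which the second factor of the denominator is the sum of the origin population, the destination population, and the intervening term (the term written d_ij in the text, which in their formulation is the population living between the two regions). The function name and the input numbers are hypothetical.

```python
def radiation_flow(trips_i, pop_i, pop_j, intervening):
    """Radiation-model flow estimate from region i to region j.

    trips_i     -- total number of trips leaving region i (O_i)
    pop_i       -- population of the origin (P_i)
    pop_j       -- population of the destination (P_j)
    intervening -- the d_ij term: population between i and j
    """
    denominator = (pop_i + intervening) * (pop_i + pop_j + intervening)
    if denominator <= 0:
        raise ValueError("populations must be positive")
    return trips_i * pop_i * pop_j / denominator


# Invented numbers: the same origin and destination, with a growing
# intervening population between them.
for between in (0, 100, 1000):
    print(between, round(radiation_flow(1000, 100, 50, between), 2))
```

Unlike the gravity model, no parameter has to be fitted: the estimate falls as the intervening population grows, because more opportunities lie closer to the origin than the destination does.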

3.2.3. Time-Series Models

In the traditional HMP method, the migration problem can also be abstracted into a time-series problem to be solved. The classical models of time-series prediction mainly include the autoregressive (AR) model, the moving average (MA) model, and the autoregressive integrated moving average (ARIMA) model [48]. Generally, the HM forecast is expressed by the AR (1) model as
$$m_{ij,t+1} = c + \varphi\, m_{ij,t} + \varepsilon_{t+1},$$
where $m_{ij,t+1}$ represents the amount of HM between i and j at time t+1, $m_{ij,t}$ is the HM quantity between i and j at time t, and $\varepsilon_{t+1}$ represents the error at time t+1. In the linear model, assuming that the error term follows a normal distribution, the parameters $c$ and $\varphi$ can be estimated from historical data. To account for the uncertainty and randomness of HM, the non-seasonal ARIMA model is applied, and its differenced form is expressed as follows.
$$y_t = c + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q} + \varepsilon_t.$$
In demographic applications, the order of the ARIMA model used for prediction usually does not exceed (1,1,1). Bijak used the ARIMA (0,1,0) model to predict international migration, and the model is expressed as follows [92]:
$$\ln m_{t+1} = c + \ln m_t + \varepsilon_t,$$
where $m_{t+1}$ is the predicted number of immigrants, $m_t$ is the number of previous immigrants, $c$ is the drift constant, and $\varepsilon_t$ is the error term satisfying the independent normal distribution.
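The ARIMA(0,1,0) specification above is a random walk with drift on the logarithm of the migration count, so it can be fitted by hand: the drift c is the mean of the log differences, and the error standard deviation is their sample standard deviation. The following sketch assumes an invented annual series and produces point forecasts with approximate 95% intervals; the intervals ignore the estimation uncertainty of the drift and are therefore somewhat too narrow.

```python
import math


def fit_drift(series):
    """Estimate the drift c and the error standard deviation from log differences."""
    logs = [math.log(v) for v in series]
    diffs = [b - a for a, b in zip(logs, logs[1:])]
    drift = sum(diffs) / len(diffs)
    variance = sum((d - drift) ** 2 for d in diffs) / (len(diffs) - 1)
    return drift, math.sqrt(variance)


def forecast(series, horizon):
    """Return (point, lower, upper) forecasts for 1..horizon steps ahead."""
    drift, sigma = fit_drift(series)
    last_log = math.log(series[-1])
    results = []
    for h in range(1, horizon + 1):
        centre = last_log + h * drift
        half_width = 1.96 * sigma * math.sqrt(h)  # uncertainty grows with sqrt(h)
        results.append(
            (math.exp(centre), math.exp(centre - half_width), math.exp(centre + half_width))
        )
    return results


# Hypothetical annual immigration counts (invented for illustration).
counts = [10000, 10400, 11100, 11300, 12050, 12600]
for step, (point, low, high) in enumerate(forecast(counts, 3), start=1):
    print(step, round(point), round(low), round(high))
```

The widening intervals make visible why this family of models is better suited to short horizons: the forecast variance accumulates with every step.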
The time-series prediction model can determine the characteristics, trends, and development rules of HM changes from the time series and thereby predict future changes in HM. However, because the time-series forecasting method does not consider outside factors, it is prone to prediction error; when significant changes take place in HM policy, the forecasts tend to deviate more substantially and do not tally with the actual situation [12]. Therefore, the time-series prediction method performs better for short-term prediction than for long-term forecasts.

3.2.4. Bayesian Prediction Model

Bayesian models are considered to be an extension of univariate time-series models, which use probabilistic methods as inputs. In HMP, the number of historical population movements is the only influence; therefore, the method is also considered a purely data-driven approach. Research results have shown that the Bayesian model is more flexible and practical for migration data deficiencies [93]. Due to the incompleteness of HMP data, Bayesian models can all be represented in a probabilistic manner, in which historical trends, expert judgments, and various models are combined in a probabilistic way. In the combined Bayesian model, expert judgment can be used as a prior distribution of the different parameters [33]. The parameters are then updated according to the data.
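The idea of treating expert judgment as a prior that is then updated with data can be illustrated with the simplest conjugate case. The sketch below assumes a normal prior on a mean net migration rate and normally distributed observations with known standard deviation; both the setting and the numbers are invented for illustration and are far simpler than the hierarchical models used in the cited studies.

```python
def update_normal_prior(prior_mean, prior_sd, observations, obs_sd):
    """Posterior mean and standard deviation for a normal mean with known obs_sd."""
    n = len(observations)
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / obs_sd ** 2
    posterior_precision = prior_precision + data_precision
    sample_mean = sum(observations) / n
    posterior_mean = (
        prior_precision * prior_mean + data_precision * sample_mean
    ) / posterior_precision
    return posterior_mean, posterior_precision ** -0.5


# Hypothetical expert prior: net migration rate of about 2 per 1,000, sd 1.
# Hypothetical observed rates from four years of data.
observed_rates = [3.0, 3.4, 2.6, 3.0]
post_mean, post_sd = update_normal_prior(2.0, 1.0, observed_rates, obs_sd=1.0)
print(round(post_mean, 3), round(post_sd, 3))
```

The posterior mean lies between the expert's prior and the sample mean, weighted by their precisions, and the posterior standard deviation is smaller than the prior one. With few observations the expert prior dominates, which is the property that makes the approach attractive when migration data are scarce.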
Azose and Raftery used a Bayesian hierarchical first-order autoregressive model or AR (1) model to achieve a fitted forecast of global HM rates [36]. The model is expressed as
$$r_{c,t} - \mu_c = \phi_c \left(r_{c,t-1} - \mu_c\right) + \varepsilon_{c,t},$$
where $r_{c,t}$ denotes the HM rate of country c at time t, and $\varepsilon_{c,t}$ is a normally distributed random deviation with mean 0 and variance $\sigma_c^2$. The uncertainty of international migration is quantified based on the posterior distribution by inputting demographic variables. The model enables the long-term forecasting of international migration without the forecast uncertainty growing explosively.
The advantage of Bayesian models is that probabilistic, rather than quantitative, assessment is used for the estimation of the model parameters, i.e., there is a complete distribution in the Bayesian analysis, not just a parameter. In a Bayesian model, the parameters are considered variables of a random distribution that are extracted from a specific distribution, and the type of distribution of the parameters is used as an additional input variable for the input data. By using this distribution, it is possible to simulate the data following a stochastic process and to derive possible values for the parameters from the assumed distribution using a data generation process.

3.2.5. Expert Prediction Model

In the context of traditional temporal probabilistic prediction, Lutz and Goldstein proposed an expert-based probabilistic population prediction method [94], which can be expressed as
$$v_t = \bar{v}_t + \varepsilon_t,$$
where $v_t$ is the number of population movements studied, $\bar{v}_t$ is the average trajectory of the population movement process, which is an a priori assumption derived from the subjective judgment of experts, and $\varepsilon_t$ is the chosen stochastic process. However, for exceptional cases (e.g., war, disaster, etc.), expert experience may lose its usefulness and lead to invalid or opposite prediction results. In response, an expert-based algorithm for forecasting population composition (including net migration, etc.) under Bayesian models was proposed by Billari et al. [95]. The method was then further extended by the researchers to population in-migration and out-migration forecasting [96]. However, purely expert methods are limited by the use of too little data, and rely entirely on the subjective judgments of experts. The problem of bimodality may arise when experts make errors of judgment or when there are differences of opinion among expert groups.
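The expert-based decomposition into a central trajectory plus a stochastic term can be simulated directly. In the sketch below, the expert trajectory is an invented sequence, and the stochastic process is assumed to be first-order autoregressive noise; this choice, the parameter values, and the function names are illustrative assumptions rather than the specification used by Lutz and Goldstein.

```python
import random
import statistics


def simulate_paths(expert_path, noise_sd, n_paths, rho=0.8, seed=42):
    """Simulate trajectories v_t = expert_path[t] + eps_t with AR(1) noise eps_t."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        eps, path = 0.0, []
        for centre in expert_path:
            eps = rho * eps + rng.gauss(0.0, noise_sd)
            path.append(centre + eps)
        paths.append(path)
    return paths


def prediction_band(paths, lower=0.1, upper=0.9):
    """Empirical (lower quantile, median, upper quantile) at each time step."""
    bands = []
    for step_values in zip(*paths):
        ordered = sorted(step_values)
        low = ordered[int(lower * (len(ordered) - 1))]
        high = ordered[int(upper * (len(ordered) - 1))]
        bands.append((low, statistics.median(ordered), high))
    return bands


# Hypothetical expert view of net migration (thousands) over five periods.
expert_path = [100.0, 110.0, 120.0, 130.0, 140.0]
paths = simulate_paths(expert_path, noise_sd=10.0, n_paths=2000)
bands = prediction_band(paths)
for t, (low, median, high) in enumerate(bands, start=1):
    print(t, round(low, 1), round(median, 1), round(high, 1))
```

The resulting bands attach probabilities to the "low" and "high" outcomes, but their centre is still the expert trajectory, so any systematic error in the expert view carries over unchanged.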
In addition, given the lack of a large amount of temporal data on HM, some scholars use the grey model, taking partially known HM information as the research object [66,67]. A grey model is established by extracting useful information from the known data to achieve an accurate description and grasp of HM development trends.

4. Machine Learning Prediction Methods

Machine learning (ML) is an important branch of artificial intelligence (AI). Its basic principle is to study how computers simulate human learning patterns to automatically acquire knowledge and continuously upgrade their performance [97]. Compared with the traditional research model based on statistics and simulation, the machine learning method can quickly and accurately extract effective associative information from historical data. In recent years, it has been widely used in population migration prediction research.
According to different learning methods, machine learning can be divided into classical machine learning and deep learning. As shown in Figure 3, a range of machine learning methods have been applied in population migration prediction research, including illegal migration prediction, conventional migration prediction, labour migration prediction, migration flow data generation, migration trend prediction, international migration drivers, and asylum seeker prediction [68,69,70,71,72,73,74,75,76].

4.1. Classical Machine Learning Prediction Method

4.1.1. Artificial Neural Network

As a model that simulates the structure, function, and computation of biological neural networks, the artificial neural network aims to achieve certain functions, such as image recognition and speech recognition, by simulating some mechanisms of the brain. Its main structure includes an input layer, hidden layers, and an output layer. After years of research, machine learning has shown strong advantages in the field of population migration prediction. The neural-network-based prediction method takes the original data, or features extracted from the original measurement data, as the input of the neural network, repeatedly adjusts the structure and parameters of the network using a training algorithm, and uses the optimized network to predict the development trend of population migration.
Robinson and Dilkina were probably the first to use machine learning models to predict population migration. They addressed the inability of traditional linear models to capture the non-linear relationship between population migration and its characteristics, and proposed a comprehensive solution to the problems of data imbalance, hyperparameter tuning, and performance evaluation in model training [98]. The study successfully used machine learning as an emerging tool to predict the development trend of domestic and international population migration, demonstrating the advancement and generalizability of the approach and providing a new and reliable tool for assessing the future development of population migration and evaluating migration management policies.
Tarasyev et al. constructed a multi-regional migration-unemployment-wage model by selecting the distribution of migrants, age structure of migrants, wage level, cost of migrating, labor market conditions, regional employment and unemployment information, climatic conditions, distance between countries of origin and destination, etc. [99]. An inductive machine learning approach was used to explore the trend of labour migration.
Subsequently, to improve the interpretability of machine learning models, Kiossou et al. used an interpretable machine learning approach to study the drivers of international migration, achieving higher accuracy than the classical gravity model [100]. Their approach also provides a deeper understanding of how migration is affected by its drivers, revealing non-linear relationships between covariates and outcome variables. To address the problem of predicting illegal migration, Azizi and Yektansani established a machine learning model that differs from the traditional one [101]. They used eight machine learning techniques to predict the legal status of individuals from Mexico in the United States, using data available from Princeton University's Mexican Migration Project. Based on an adaptive machine learning algorithm, Carammia et al. developed a Dynamic Elastic Net Model that integrates government statistics and social media data to predict asylum-related migration flows [72]. Giang et al. proposed a back-propagation neural network (BPNN) model for forecasting labour production and labour migration; their results show that this method improves forecasting performance compared with K-nearest neighbour (kNN) and Random Forest Regression (RFR) models [70].

4.1.2. Random Forest

Random forest is an algorithm that trains multiple decision trees and combines them to classify or predict samples. It was proposed by Breiman in 2001 and is mainly applied to regression and classification [102]. The predicted value of a random forest is computed from multiple decision trees (the forest), usually as the mean (for regression) or the mode (for classification) of the outputs of all trees. Random forest is simple to operate, fast to train, and not prone to overfitting, so it has become a popular tool for population migration modelling. Population migration prediction research mainly uses random forest to solve regression problems. The basic idea is as follows. First, bootstrap sampling (sampling with replacement) is used to draw k samples from the original training set, each of the same size as the original training set. Then, a decision tree is built on each of the k samples, yielding k regression results. Finally, the k results are combined by taking their average.
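The three steps above (bootstrap sampling, one tree per sample, averaging) can be sketched in a few lines. This is a simplified illustration under stated assumptions: each "tree" is a depth-1 regression stump on a single feature, so the random feature selection that a full random forest performs at each split is omitted, and the distance-versus-flow data are invented for the example.

```python
import random
from statistics import mean


def fit_stump(rows):
    """Fit a depth-1 regression tree (one split on x) to (x, y) pairs."""
    xs = sorted({x for x, _ in rows})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2
        left = [y for x, y in rows if x <= thr]
        right = [y for x, y in rows if x > thr]
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, thr, ml, mr)
    if best is None:  # the sample held a single distinct x: predict its mean
        m = mean(y for _, y in rows)
        return lambda x: m
    _, thr, ml, mr = best
    return lambda x: ml if x <= thr else mr


def fit_forest(rows, k=25, seed=0):
    """Bagging: k bootstrap samples, one stump each, predictions averaged."""
    rng = random.Random(seed)
    trees = []
    for _ in range(k):
        # Bootstrap sample: same size as the training set, drawn with replacement.
        sample = [rng.choice(rows) for _ in rows]
        trees.append(fit_stump(sample))
    return lambda x: mean(tree(x) for tree in trees)


# Hypothetical data: (distance, migration flow); flows drop beyond distance 5.
rows = [(float(d), 100.0 if d <= 5 else 20.0) for d in range(1, 11)]
forest = fit_forest(rows)
print(forest(2.0), forest(9.0))
```

Averaging over bootstrap samples is what reduces the variance of the individual trees; a library implementation would additionally grow deeper trees and sample candidate features at each split.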
To address the problem of forecasting environmental migration, Best et al. proposed a random forest model that can identify significant variables from large social surveys [71]. The model identifies the most important predictors of migration from around 2000 original factors, allowing regression analysis with fewer variables and more degrees of freedom.
Aoga et al. proposed a tree-based machine learning (ML) method to predict the impact of weather shocks on individual migration tendencies in six agriculture-dependent economies: Burkina Faso, Cote d'Ivoire, Mali, Mauritania, Niger and Senegal [103]. The results show that climatic factors have positive and significant effects on the predictive performance of individual migration intentions.

4.1.3. Support Vector Machine

The support vector machine (SVM) was proposed by Cortes and Vapnik in 1995 [104]. It is mainly used to solve classification and regression problems in machine learning and is suitable for analysing small samples and multidimensional data. Unlike traditional neural network learning methods, SVM is built on Vapnik-Chervonenkis (VC) dimension theory and the structural risk minimization (SRM) principle. In addition to minimizing the empirical risk, it controls model complexity, so it generalizes well to future samples even when few training samples are available. The main idea of SVM-based HM forecasting is to train the SVM model on actual HM data, determine the model parameters (insensitivity coefficient, penalty factor, kernel function parameters, etc.), forecast the future state with the trained model, and obtain the predicted value of HM by comparing the model output with a pre-set threshold.
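For the regression case, the role of the parameters named above can be made concrete with the standard ε-insensitive support vector regression problem. The notation is an assumption introduced here for illustration: \(\varepsilon\) is the insensitivity coefficient, \(C\) the penalty factor, \(\phi\) the feature map induced by the kernel, and \(\xi_i, \xi_i^{*}\) slack variables for the \(n\) training pairs \((x_i, y_i)\).

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^{*}} \quad
  & \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \left( \xi_i + \xi_i^{*} \right) \\
\text{s.t.} \quad
  & y_i - \langle w, \phi(x_i) \rangle - b \le \varepsilon + \xi_i, \\
  & \langle w, \phi(x_i) \rangle + b - y_i \le \varepsilon + \xi_i^{*}, \\
  & \xi_i \ge 0, \quad \xi_i^{*} \ge 0, \qquad i = 1, \dots, n .
\end{aligned}
```

The first term limits model complexity (the structural risk part), while the second penalizes only those training errors larger than \(\varepsilon\) (the empirical risk part); \(C\) sets the trade-off between the two.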
Zhang et al. employed the support vector machine (SVM) algorithm to create a classification model for the migration of the unregistered resident population in Beijing, and conducted an empirical analysis of migration data from various surveys in Beijing [105]. The results show that SVM performs better than the BP neural network and logistic regression in accuracy and generalization for these specific classification tasks, and can forecast migration trends with higher accuracy.

4.2. Deep Learning Prediction Method

Traditional shallow machine learning algorithms rely heavily on expert prior knowledge and signal processing technology, and they struggle to process and explore massive monitoring data automatically. As a technology developed from neural networks, deep learning offers a solution for training on massive data through its powerful feature extraction capability. However, in migration research, differences in the relevant concepts and insufficient human, material, and financial resources mean that the data collected on population migration are sparse or missing, which leaves the understanding of population migration lagging and inaccurate. Social media data offer a new way to obtain timely and more complete information on population migration. With the acceleration of globalization, social media data and traditional data complement each other and can better meet the needs of efficient population migration management.

4.2.1. Recurrent Neural Networks

A recurrent neural network (RNN) is a neural network with short-term memory. In an RNN, a neuron receives information not only from other neurons but also from itself, forming a network structure with loops. Compared with feedforward neural networks, RNNs are more consistent with the structure of biological neural networks. Because of the feedforward and feedback connections between neurons at each layer, RNNs are suited to processing sequence data with dependencies between earlier and later elements, as in language modelling and natural language generation. The parameters of an RNN can be learned with the back-propagation through time (BPTT) algorithm, which passes the error information backwards step by step in reverse time order. If the input sequence is relatively long, exploding and vanishing gradient problems arise. To address these problems, improved RNN structures have been proposed. Among them, the gated recurrent unit (GRU) and long short-term memory (LSTM) are typical representatives. They alleviate the vanishing and exploding gradient problems of traditional RNNs, so that the network can both extract the deep characteristics of a time series and account for its long-term dependence, which yields better prediction results.
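The recurrence and the vanishing gradient problem described above can be seen in a one-unit example. The sketch below is an illustration only: the weights, the input series, and the helper names `rnn_forward` and `state_sensitivity` are assumptions, and a plain tanh unit stands in for the networks used in the cited work.

```python
import math


def rnn_forward(xs, w_in, w_rec, bias=0.0, h0=0.0):
    """Run a one-unit tanh RNN over a sequence and return every hidden state."""
    h = h0
    states = []
    for x in xs:
        # The new state depends on the current input and the previous state.
        h = math.tanh(w_in * x + w_rec * h + bias)
        states.append(h)
    return states


def state_sensitivity(states, w_rec):
    """Return d h_T / d h_0, the product of the per-step derivatives
    d h_t / d h_{t-1} = w_rec * (1 - h_t**2) that BPTT multiplies together."""
    grad = 1.0
    for h in states:
        grad *= w_rec * (1.0 - h ** 2)
    return grad


# Hypothetical input series and weights.
series = [0.1 * (t % 5) for t in range(50)]
w_in, w_rec = 0.8, 0.5

short_grad = state_sensitivity(rnn_forward(series[:5], w_in, w_rec), w_rec)
long_grad = state_sensitivity(rnn_forward(series, w_in, w_rec), w_rec)
print(f"5 steps: {short_grad:.3e}, 50 steps: {long_grad:.3e}")
```

With a recurrent weight below 1 in magnitude, the product shrinks geometrically with sequence length, which is the vanishing gradient; with a large recurrent weight it can instead grow. The gating in GRU and LSTM units is designed to keep this product closer to 1 over long spans.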
Golenvaux et al. used a long short-term memory (LSTM) network to predict international migration from Google Trends data, adding one-hot encoded vectors as input labels so that the model can account for more complex time-invariant factors (such as the distance between the two countries and a common language) [76]. The results show that the LSTM method is significantly superior to the standard artificial neural network and the traditional gravity model. In addition, the paper adjusts the structure of the LSTM model and adds crisis information to improve accuracy in years with abnormal impacts.
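How one-hot labels can be attached to a time series before it is passed to a recurrent model is sketched below. This is an assumed input layout for illustration, not the exact pipeline of the cited paper; the country codes, the series values, and the function names are hypothetical.

```python
def one_hot(label, vocabulary):
    """Return a vector with a single 1.0 at the position of `label`."""
    return [1.0 if label == item else 0.0 for item in vocabulary]


def build_inputs(series, origin, destination, countries):
    """Append one-hot origin and destination labels to every time step."""
    static = one_hot(origin, countries) + one_hot(destination, countries)
    return [[value] + static for value in series]


# Hypothetical example: a search-interest series for the corridor ITA -> DEU.
countries = ["DEU", "FRA", "ITA"]
series = [0.2, 0.4, 0.3]
inputs = build_inputs(series, "ITA", "DEU", countries)
for step in inputs:
    print(step)
```

Because the labels are constant over time, the network can learn a separate offset for each origin and destination, which is one way to absorb time-invariant factors without listing them explicitly as features.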

4.2.2. Graph Neural Networks

Graph neural networks (GNNs), as a generalization of recurrent neural networks, are widely used because of their powerful ability to process complex graph data. By applying update strategies to the nodes and edges of a graph, a GNN converts graph-structured data into a standardized representation that can be fed into a variety of neural networks for training, achieving good results in node classification, link prediction, graph clustering, and other tasks. A real-world network can be mapped to a set of nodes and the edges that relate them, and a GNN can also be used to generate a graph from unstructured data. The output of a GNN does not change with the input order of the nodes. The edges represent the dependency between two nodes, and each node updates its state based on the states of its neighbours.
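The neighbour-based state update and the independence from node order can be shown with one round of message passing. The sketch below is a simplification under stated assumptions: it uses fixed mixing weights of 0.5 and mean aggregation instead of the trainable transformations of a real GNN layer, and the three-region graph is invented.

```python
def message_pass(features, edges):
    """One round of mean-aggregation message passing on an undirected graph.

    features: dict mapping node -> scalar state
    edges:    iterable of (u, v) pairs
    """
    neighbours = {node: [] for node in features}
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)

    updated = {}
    for node, value in features.items():
        incoming = [features[n] for n in neighbours[node]]
        # Aggregate neighbour states; isolated nodes receive no message.
        agg = sum(incoming) / len(incoming) if incoming else 0.0
        # Combine the node's own state with the aggregated message.
        updated[node] = 0.5 * value + 0.5 * agg
    return updated


# Hypothetical regions A - B - C with an initial state per region.
features = {"A": 1.0, "B": 3.0, "C": 5.0}
edges = [("A", "B"), ("B", "C")]
result = message_pass(features, edges)

# The same graph listed in a different node and edge order.
shuffled = message_pass({"C": 5.0, "A": 1.0, "B": 3.0}, [("C", "B"), ("B", "A")])
print(result, shuffled == result)
```

Because the mean does not depend on the order in which neighbours are listed, each node receives the same updated state however the graph is enumerated, which is the order-independence property mentioned above.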
Terroso-Sáenz and Muñoz proposed a method for predicting population movement at the national level [73]. The method uses a graph neural network (GNN) to capture the potential relationships between large geographical regions and predicts population movement between cities at a nationwide spatial granularity. In addition, the authors introduced the impact of climatic factors on population flows. The results show that the effect of weather factors is not obvious, owing to the mismatch between the climate data and the model data.
Although ML methods have made some achievements in HM research in recent years, they are currently at a preliminary stage overall, and have had a limited impact on HMP research. To give full play to the predictive power of ML, massive amounts of data are required, and the current sample size of HM research is far from reaching the lower limit that can allow it to yield accurate predictions. In addition, with the complexity of cross-border HM policy changes, the related models lack robustness, and the generalization ability of ML models is poor.
Moreover, the feasibility of the use of Twitter data for predicting domestic HM has been explored to address the problem of an insufficient data volume for deep learning prediction models [63,74]. The results showed that Twitter data have considerable value in HMP. In future studies, focus will be placed on the selection of pertinent data and the design of efficient feature models to further the research on deep learning-based HMP.

5. Uncertainty in Population Migration Projections

Datasets are the key to the success of HMP, and they are traditionally drawn from official statistics or survey data compiled and published by relevant organizations. However, different countries use different definitions when collecting HM statistics. Collection often follows government convenience and uses tools developed for other purposes rather than tools specifically designed to measure HM and its outcomes [106]. Even in developed regions such as the EU member states, information on migrating populations remains sketchy. As a result, inconsistent HM definitions and data quality issues often make forecasting methods incomparable, forecasts inaccurate, and forecasting models fragile, and they introduce significant uncertainty [106,107]. Moreover, the numerous social, political, demographic, economic, environmental, and technological drivers in HM forecasting are highly uncertain and difficult to quantify. Their interactions lead to different migration outcomes and bring significant uncertainty to HMP.
Although theories and empirical studies related to HM have accumulated, no single theory has proven comprehensive enough to cover the multiple forms of migration. The push and pull factors (determinants), or the drivers of migration and non-migration, interact with each other, so a comprehensive explanation of the migration process is impossible even if these factors cover most cases. HM theory therefore plays a limited role in interpreting forecast results. Moreover, HM forecasting involves different disciplines, and experts from different disciplines hold entirely different expectations about HM changes. The cumulative effect of all these uncertainties hampers the development of HMP. Uncertainty should therefore be discussed in all HM projections; otherwise, such uncertainties will propagate.

6. Conclusion and Outlook

This paper examined the work on HM forecasting in the past 80 years. The main conclusions include the following: (1) the lack of necessary data will cause misjudgement and affect the development of the industry; (2) deep learning is the future development trend; (3) currently, more consideration is given to the macro level than to the individual level; more research is conducted on the static process than on the dynamic process; finally, there has been more research on international HM than on domestic HM. While the scale and frequency of HM are increasing, the development of HM forecasting research is slow due to the constraints of data and existing technology. The future development of HMP can be carried out from the following aspects.
First, global population growth is gradually slowing down. Population growth in many countries relies mainly on HM, which has consequently become an important issue for the population and political security of some countries. Only by mastering the actual data of HM can scientific and practical measures be taken to ensure the stable development of economies and societies. To ensure the comparability and uniformity of data, the statistical authorities of all countries should strengthen cooperation and follow uniform statistical standards. In addition, to ensure the comprehensiveness and adequacy of data, data from transit countries or regions should be collected in addition to data from the countries or areas that people move into and out of. Data should be made freely available to the public, and government agencies, industry, and academia should collaborate to facilitate rapid research on migration prediction.
Second, with the development of big data and AI technology, the volume, variety, and granularity of data will continue to improve, which will challenge the ability of prediction models. The dynamic change of HM depends not only on differences in the temporal dimension but also on the transformation of characteristics in the spatial dimension [108]. Traditional model-driven methods cannot capture the hidden nonlinear features in spatiotemporal series, so they are unsuitable for processing these series. As an efficient deep learning framework based on the graph data structure, the GNN is widely used in various fields and has achieved remarkable results. HM flow data naturally have a graph structure, so the application of GNNs will be an inevitable choice in the future. HMP models based on GNNs will therefore be an essential direction for future development.
Third, research on HMP will lead to peaks and valleys in the application development of big data and AI technology. Countries should use this opportunity for growth to solve the talent shortage that has long been an issue in research on HM. Governments should plan to facilitate the migration of skilled people, rely on universities and institutions that carry out population research, and work hard to train reserve talent with an AI technology base for the research and management of HM. In addition, different training models can be used to improve the on-the-job skills and abilities of government management teams, thereby creating a talent pool that can manage HM and carry out research in the new era.
Finally, in the traditional process of HM management, although relevant government agencies have obtained a large amount of stock data, they are unable to make full use of these data due to technical limitations. Moreover, no technical support is formed, and the management method is often based on the personal intentions and subjective opinions of managers. With the development of big data and AI technology, the acquisition, preservation, and processing of big data on HM have become possible. Important directions for future development will be to use big data to improve the management level of HM, to establish and improve the mechanism of the scientific decision-making and social management of HM, to promote the innovation of government management and the social governance mode, and to strengthen the research and development of HMP and intelligent auxiliary decision-making systems.

Author Contributions

Tong Zhen Pu, Chong Xin Huang, Jing Jing Yang, and Ming Huang contributed equally to the authorship of the manuscript, including the research design, conducting the research, performing the analysis, and writing the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (NSFC), grant numbers 61963037, 61863035, and 62261059.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. China's State Council. China Population Census Yearbook 2020; China Statistics Press: Beijing, China, 2022; pp. 10–23.
  2. United Nations. International Migrant Stock 2019; (United Nations Database, POP/DB/MIG/Stock/Rev.2019); Department of Economic and Social Affairs, United Nations: Geneva, Switzerland, 2019.
  3. Sharma, D.R.; Kandpal, D.V. COVID 19 pandemic and international migration: An initial view. Sustain. Oper. Comput. 2021, 2, 122–126.
  4. Willekens, F.; Massey, D.; Raymer, J.; Beauchemin, C. International migration under the microscope. Science 2016, 352, 897–899.
  5. King, R.; Skeldon, R. “Mind the gap!” integrating approaches to internal and international migration. J. Ethn. Migr. Stud. 2010, 36, 1619–1646.
  6. Zipf, G.K. The P1 P2/D hypothesis: On the intercity movement of persons. Am. Sociol. Rev. 1946, 11, 677–686.
  7. Bogue, D.J. Internal migration. In The Study of Population; Hauser, P.M., Duncan, O.D., Eds.; University of Chicago Press: Chicago, IL, USA, 1959.
  8. Lee, E.S. A theory of migration. Demography 1966, 3, 47–57.
  9. Simini, F.; González, M.C.; Maritan, A.; Barabási, A.L. A universal model for mobility and migration patterns. Nature 2012, 484, 96–100.
  10. Brockmann, D.; Hufnagel, L.; Geisel, T. The scaling laws of human travel. Nature 2006, 439, 462–465.
  11. Disney, G.; Wiśniowski, A.; Forster, J.J.; et al. Evaluation of existing migration forecasting methods and models. Report for the Migration Advisory Committee, 2015; Southampton: University of Southampton. https://www.gov.uk/government/publications/evaluation-of-existing-migration-forecasting-methods-and-models.
  12. Sardoschau, S. The Future of Migration to Germany. Assessing Methods in Migration Forecasting. DeZIM Project Report #DPR 1|20 Berlin, 2020.
  13. Patrizio, V.; Deschermeier, P.; Wilke, C.B. An Overview of Population Projections—Methodological Concepts, International Data Availability, and Use Cases. Forecasting 2020, 2, 346–363.
  14. Van de Walle, E.; Henry, L. Multilingual Demographic Dictionary; Ordina: Liege, Belgium, 1982.
  15. Skeldon, R. International Migration, Internal Migration, Mobility and Urbanization: Towards more Integrated Approaches; United Nations: New York, NY, USA, 2018.
  16. Otoiu, A.; Titan, E.; Dumitrescu, R. Internal and international migration: Is a dichotomous approach justified? Procedia-Soc. Behav. Sci. 2014, 109, 1011–1015.
  17. Cirillo, M.; Cattaneo, A.; Miller, M.; Sadiddin, A. Establishing the link between internal and international migration: Evidence from Sub-Saharan Africa. World Dev. 2022, 157, 105943.
  18. Bernard, A.; Perales, F. Linking internal and international migration in 13 European countries: Complementarity or substitution? J. Ethn. Migr. Stud. 2022, 48, 655–675.
  19. Provenzano, D.; Baggio, R. The contribution of human migration to tourism: The VFR travel between the EU 28 member states. Int. J. Tour. Res. 2017, 19, 412–420.
  20. Möhring, M. Tourism and Migration: Interrelated Forms of Mobility. Comparativ 2014, 24, 116–123.
  21. Marschall, S. (Ed.) Memory, Migration and Travel; Routledge: London, UK, 2018; pp. 1–23.
  22. Ravenstein, E.G. The laws of migration. J. Stat. Soc. Lond. 1885, 48, 167–235.
  23. O’Reilly, K. Migration Theories: A Critical Overview; Routledge Handbook of Immigration and Refugee Studies; Routledge: Abingdon, UK, 2015; pp. 25–33.
  24. Arango, J. Theories of International Migration; International Migration in the New Millennium; Ashgate: Aldershot, UK, 2017; pp. 25–45.
  25. Lewis, G.J. Human Migration: A Geographical Perspective; Croom Helm: London, UK, 1982.
  26. Sohst, R.; Tjaden, J.; de Valk, H.; Melde, S. The Future of Migration to Europe: A Systematic Review of the Literature on Migration Scenarios and Forecasts; International Organization for Migration: Geneva, Switzerland, 2020.
  27. Demirel, D.F.; Basak, M. A fuzzy bi-level method for modeling age-specific migration. Socio-Econ. Plan. Sci.
  28. Beduschi, A. International migration management in the age of artificial intelligence. Migr. Stud. 2021, 9, 576–596.
  29. Smith, S.K. Accounting for migration in cohort-component projections of state and local populations. Demography 1986, 23, 127–135.
  30. Hyndman, R.J.; Booth, H. Stochastic population forecasts using functional data models for mortality, fertility and migration. Int. J. Forecast. 2008, 24, 323–342.
  31. Fuchs, J.; Söhnlein, D.; Vanella, P. Migration Forecasting—Significance and Approaches. Encyclopedia 2021, 1, 689–709.
  32. Gorbey, S.; James, D.; Poot, J. Population forecasting with endogenous migration: An application to trans-Tasman migration. Int. Reg. Sci. Rev. 1999, 22, 69–101.
  33. Bijak, J.; Wiśniowski, A. Bayesian forecasting of immigration to selected European countries by using expert knowledge. J. R. Stat. Soc. A 2010, 173, 775–796.
  34. Abel, G.; Bijak, J.; Findlay, A.; McCollum, D.; Wiśniowski, A. Forecasting environmental migration to the United Kingdom: An exploration using Bayesian models. Popul. Environ. 2013, 35, 183–203.
  35. Wiśniowski, A.; Bijak, J.; Shang, H.L. Forecasting Scottish migration in the context of the 2014 constitutional change debate. Popul. Space Place 2014, 20, 455–464.
  36. Azose, J.J.; Raftery, A.E. Bayesian probabilistic projection of international migration. Demography 2015, 52, 1627–1650.
  37. Raymer, J.; Wiśniowski, A. Applying and testing a forecasting model for age and sex patterns of immigration and emigration. Popul. Stud. 2018, 72, 339–355.
  38. Frees, E.W. Short-Term Forecasting of Internal Migration. Environ. Plan. A Econ. Space 1993, 25, 1593–1606.
  39. Ramos, R.; Surinach, J. A Gravity Model of Migration between ENC and EU; IZA Discussion Papers, No. 7700, Institute for the Study of Labor (IZA): Bonn, Germany, 2013.
  40. Campos, R.G. Migratory Pressures in the Long Run: International Migration Projections to 2050. Banco Esp. Artic. 2017, 38, 17.
  41. Iancu, N.; Badulescu, A.; Urziceanu, R.M.; Iancu, E.A.; Simut, R. The use of the gravity model in forecasting the flows of emigrants in EU countries. Technol. Econ. Dev. Econ. 2017, 23, 392–409.
  42. Böhme, M.H.; Gröger, A.; Stöhr, T. Searching for a better life: Predicting international migration with online search keywords. J. Dev. Econ. 2020, 142, 102347.
  43. Frees, E.W. Forecasting state-to-state migration rates. J. Bus. Econ. Stat. 1992, 10, 153–167.
  44. Beer, J.D. Forecast intervals of net migration: The case of the Netherlands. J. Forecast. 1993, 12, 585–599.
  45. García-Guerrero, V.M. A probabilistic method to forecast the international migration of Mexico by age and sex. Papeles Población 2016, 22, 113–140.
  46. Schoumaker, B.; Beauchemin, C. Reconstructing trends in international migration with three questions in household surveys: Lessons from the MAFE project. Demogr. Res. 2015, 32, 983–1030.
  47. Vanella, P.; Deschermeier, P. A stochastic Forecasting Model of international Migration in Germany. In Familie—Bildung—Migration. Familienforschung Im Spannungsfeld Zwischen Wissenschaft, Politik Und Praxis. Tagungsband Zum 5. Europäischen Fachkongress Familienforschung; Kapella, O., Schneider, N.F., Rost, H., Eds.; Verlag Barbara Budrich: Berlin, Germany; Toronto, ON, Canada, 2018; pp. 261–280.
  48. Bijak, J.; Disney, G.; Findlay, A.M.; Forster, J.J.; Smith, P.W.; Wiśniowski, A. Assessing time series models for forecasting international migration: Lessons from the United Kingdom. J. Forecast. 2019, 38, 470–487.
  49. Vollset, S.E.; Goren, E.; Yuan, C.W.; Cao, J.; Smith, A.E.; Hsiao, T.; Murray, C.J. Fertility, mortality, migration, and population scenarios for 195 countries and territories from 2017 to 2100: A forecasting analysis for the Global Burden of Disease Study. Lancet 2020, 396, 1285–1306.
  50. Shimizu, S.; Shin, S. Applicability of SARIMA Model in Tokyo Population Migration Forecast. In Proceedings of 2021 14th International Conference on Human System Interaction (HSI), Gdańsk-Wrzeszcz, Poland, 2021. pp. 1–4.
  51. Fantazzini, D.; Pushchelenko, J.; Mironenkov, A.; Kurbatskii, A. Forecasting Internal Migration in Russia Using Google Trends: Evidence from Moscow and Saint Petersburg. Forecasting 2021, 3, 774–803.
  52. Kupiszewski, M. How trustworthy are forecasts of international migration between Poland and the European Union? J. Ethn. Migr. Stud. 2002, 28, 627–645.
  53. Brücker, H.; Siliverstovs, B. On the estimation and forecasting of international migration: How relevant is heterogeneity across countries? Empir. Econ. 2006, 31, 735–754.
  54. Bahna, M. Predictions of Migration from the New Member States after Their Accession into the European Union: Successes and Failures. Int. Migr. Rev. 2008, 42, 844–860.
  55. Rogers, T.W. Migration Prediction On The Basis Of Prior Migratory Behavior: A Methodological Note. Int. Migr. 1969, 7, 13–19.
  56. Zagheni, E.; Garimella, V.R.K.; Weber, I.; State, B. Inferring international and internal migration patterns from Twitter data. In Proceedings of the 23rd International Conference on World Wide Web, Seoul, Korea, 7–11 April 2014; pp. 439–444.
  57. Cappelen, Å.; Skjerpen, T.; Tønnessen, M. Forecasting Immigration in Official Population Projections Using an Econometric Model. Int. Migr. Rev. 2015, 49, 945–980.
  58. Azose, J.J.; Sevcikova, H.; Raftery, A.E. Probabilistic population projections with migration uncertainty. Proc. Natl. Acad. Sci. USA 2016, 113, 6460–6465.
  59. Vasilyeva, A.V. The Forecast of Labour Migration, Reproduction of the Population and Economic Development of Russia. Econ. Reg. 2017, 13, 812–826.
  60. Böhme, M.H.; Gröger, A.; Stöhr, T. Searching for a better life: Predicting international migration with online search keywords. J. Dev. Econ. 2020, 142, 102347.
  61. Ovchynnikova, O.; Nahornova, O.; Mylko, I.; Begun, S.; Buniak, N.; Kolenda, N. Forecasting Regional Migration Flows. Proceedings of 10th International Conference on Advanced Computer Information Technologies (ACIT), Deggendorf, Germany; 2020; pp. 165–169.
  62. Shayegh, S.; Emmerling, J.; Tavoni, M. International Migration Projections across Skill Levels in the Shared Socioeconomic Pathways. Sustainability 2022, 14, 4757.
  63. Jurdak, R.; Zhao, K.; Liu, J.; AbouJaoude, M.; Cameron, M.; Newth, D. Understanding Human Mobility from Twitter. PLoS ONE 2015, 10, e0131469.
  64. Plaut, T.R. An econometric model for forecasting regional population growth. Int. Reg. Sci. Rev. 1981, 6, 53–70.
  65. Sun, Y.; Pan, K. Prediction of the intercity migration of Chinese graduates. J. Stat. Mech. Theory Exp. 2014, 12, P12022.
  66. Geng, Y.; Wang, R.; Wei, Z.; Zhai, Q. Temporal-spatial measurement and prediction between air environment and inbound tourism: Case of China. J. Clean. Prod. 2021, 287, 125486.
  67. Pu, T.; Huang, M.; Yang, J. Forecasting international migrants using grey model with heat label. In Proceedings of the 5th International Conference on Computer Science and Software Engineering CSSE; Guilin, China; 2022; pp. 652–656.
  68. Weber, H. How well can the migration component of regional population change be predicted? A machine learning approach applied to German municipalities. Comp. Popul. Stud. 2020, 45, 143–178.
  69. Azizi, S.; Yektansani, K. Artificial intelligence and predicting illegal immigration to the USA. Int. Migr. 2020, 58, 183–193.
  70. Giang, N.H.; Nguyen, T.-T.; Tay, C.C.; Phuong, L.A.; Dang, T.-T. Towards Predictive Vietnamese Human Resource Migration by Machine Learning: A Case Study in Northeast Asian Countries. Axioms 2022, 11, 151.
  71. Best, K.; Gilligan, J.; Baroud, H.; Carrico, A.; Donato, K.; Mallick, B. Applying machine learning to social datasets: A study of migration in southwestern Bangladesh using random forests. Reg. Environ. Chang. 2022, 22, 1–12.
  72. Carammia, M.; Iacus, S.M.; Wilkin, T. Forecasting asylum-related migration flows with machine learning and data at scale. Sci. Rep. 2022, 12, 1–16.
  73. Terroso-Sáenz, F.; Muñoz, A. Nation-wide human mobility prediction based on graph neural networks. Appl. Intell. 2022, 52, 4144–4160.
  74. Terroso-Saenz, F.; Flores, R.; Muñoz, A. Human mobility forecasting with region-based flows and geotagged Twitter data. Expert Syst. Appl. 2022, 117477.
  75. Terroso-Sáenz, F.; Muñoz, A.; Fernández-Pedauye, J.; Cecilia, J.M. Human Mobility Prediction With Region-Based Flows and Water Consumption. IEEE Access 2021, 9, 88651–88663.
  76. Golenvaux, N.; Alvarez, P.G.; Kiossou, H.S.; Schaus, P. An LSTM approach to Forecast Migration using Google Trends. arXiv 2020, arXiv:2005.09902.
  77. Gaigbe-Togbe, V.; Bassarsky, L.; Gu, D.; Spoorenberg, T.; Zeifman, L. World Population Prospects 2022; United Nations: New York, NY, USA, 2022; ISBN 978-92-1-148373-4.
  78. Kupiszewski, M. How trustworthy are forecasts of international migration between Poland and the European Union? J. Ethn. Migr. Stud. 2002, 28, 627–645.
  79. Michael, F. ; Christoph; Schmidt, M. Aggregate-level migration studies as a tool for forecasting future migration streams. In International Migration: Trends, Policy and Economic Impact, Ed.; London and New York: Institute for the Study of Labor: London, UK; New York, NY, USA, 2005; pp. 110–136. [Google Scholar]
  80. Dustmann, C.; Casanova, M.; Fertig, M.; Preston, I.; Schmidt, C.M. The Impact of EU Enlargement on Migration Flows; Home Office Online Report 25/03; Research Development and Statistics Directorate, Home Office: London, UK, 2003; pp. 1–76.
  81. Alvarez-Plata, P.; Brücker, H.; Siliverstovs, B. Potential migration from Central and Eastern Europe into the EU-15: An update. European Commission, Directorate-General for Employment and Social Affairs, 2003, Unit A. 1.
  82. Cappelen, Å.; Skjerpen, T.; Tønnessen, M. Forecasting Immigration in Official Population Projections Using an Econometric Model. Int. Migr. Rev. 2015, 49, 945–980.
  83. Dao, T.H.; Docquier, F.; Maurel, M.; Schaus, P. Global migration in the twentieth and twenty-first centuries: The unstoppable force of demography. Rev. World Econ. 2021, 157, 417–449.
  84. Burzynski, M.; Deuster, C.; Docquier, F. Geography of skills and global inequality. J. Dev. Econ. 2020, 142, 102333.
  85. Beine, M.; Bertoli, S.; Fernández-Huertas Moraga, J. A practitioners’ guide to gravity models of international migration. World Econ. 2016, 39, 496–512.
  86. Hanson, G.; McIntosh, C. Is the Mediterranean the new Rio Grande? US and EU immigration pressures in the long run. J. Econ. Perspect. 2016, 30, 57–82.
  87. Bertoli, S.; Brücker, H.; Moraga, J.F.H. The European crisis and migration to Germany. Reg. Sci. Urban Econ. 2016, 60, 61–72.
  88. Sjaastad, L.A. The costs and returns of human migration. J. Political Econ. 1962, 70, 80–93.
  89. Backhaus, A.; Martinez-Zarzoso, I.; Muris, C. Do climate variations explain bilateral migration? A gravity model analysis. IZA J. Migr. 2015, 4, 1–15.
  90. Friebel, G.; Manchin, M.; Mendola, M.; Prarolo, G. International migration intentions and illegal costs: Evidence using Africa-to-Europe smuggling routes. CEPR Discussion Paper No. DP13326, 2018. Available at SSRN: https://ssrn.com/abstract=3290517.
  91. Beyer, R.M.; Schewe, J.; Lotze-Campen, H. Gravity models do not explain, and cannot predict, international migration dynamics. Humanit. Soc. Sci. Commun. 2022, 9, 1–10.
  92. Bijak, J. Migration Assumptions in the UK National Population Projections: Methodology Review; University of Southampton: Southampton, UK, 2012.
  93. Bijak, J. Forecasting International Migration in Europe: A Bayesian View; Springer Science+Business Media: Dordrecht, The Netherlands; Heidelberg, Germany; London, UK; New York, NY, USA, 2011.
  94. Lutz, W.; Goldstein, J.R. Introduction: How to deal with uncertainty in population forecasting? Int. Stat. Rev. 2004, 72, 1–4.
  95. Billari, F.C.; Graziani, R.; Melilli, E. Stochastic population forecasts based on conditional expert opinions. J. R. Stat. Soc. Ser. A (Stat. Soc.) 2012, 175, 491–511.
  96. Billari, F.C.; Graziani, R.; Melilli, E. Stochastic Population Forecasting Based on Combinations of Expert Evaluations Within the Bayesian Paradigm. Demography 2014, 51, 1933–1954.
  97. Mitchell, T. Machine Learning; McGraw Hill: New York, NY, USA, 1997.
  98. Robinson, C.; Dilkina, B. A machine learning approach to modeling human migration. In Proceedings of the 1st ACM SIGCAS Conference on Computing and Sustainable Societies, Menlo Park and San Jose, CA, USA; pp. 1–8.
  99. Tarasyev, A.A.; Agarkov, G.A.; Hosseini, S.I. Machine learning in labor migration prediction. AIP Publ. LLC 2018, 1978, 440004.
  100. Kiossou, H.S.; Schenk, Y.; Docquier, F.; Houndji, V.R.; Nijssen, S.; Schaus, P. Using an interpretable Machine Learning approach to study the drivers of International Migration. arXiv 2020, arXiv:2006.03560.
  101. Azizi, S.; Yektansani, K. Artificial Intelligence and Predicting Illegal Immigration to the USA. Int. Migr. 2020, 58, 183–193.
  102. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
  103. Aoga, J.; Bae, J.; Veljanoska, S.; Nijssen, S.; Schaus, P. Impact of weather factors on migration intention using machine learning algorithms. arXiv 2020, arXiv:2012.02794.
  104. Cherkassky, V.; Ma, Y. Practical selection of SVM parameters and noise estimation for SVM regression. Neural Netw. 2004, 17, 113–126.
  105. Zhang, L.; Luo, L.; Hu, L.; et al. An SVM-based classification model for migration prediction of Beijing. Eng. Lett. 2020, 28, 1023–1030.
  106. Bijak, J. Migration forecasting: Beyond the limits of uncertainty. Global Migration Data Analysis Centre Data Briefing Series, 2016, (6).
  107. Azose, J.J.; Ševčíková, H.; Raftery, A.E. Probabilistic population projections with migration uncertainty. Proc. Natl. Acad. Sci. 2016, 113, 6460–6465.
  108. Sîrbu, A.; Andrienko, G.; Andrienko, N.; Boldrini, C.; Sharma, R. Human migration: The big data perspective. Int. J. Data Sci. Anal. 2021, 11.
Figure 1. The flow of the human migration forecasting problem.
Figure 2. Key technology roadmap for population migration prediction.
Figure 3. Human migration research using various machine learning methods.
Table 1. Classification of human migration forecasting methods.
| Method | Data Source | Space-time Attributes and References |
|---|---|---|
| Deterministic Method | US Department of Health and Human Services and US Bureau of the Census statistics; Human Mortality Database; Federal Statistical Office of Germany. | internal migration, long-term, [29]; international migration, long-term, [30]; international migration, long-term, [31]. |
| Bayesian Model | Statistics New Zealand online database INFOS and Australian Bureau of Statistics database; Eurostat, United Nations Statistics Division statistics and the Council of Europe's Demographic Yearbooks; Office for National Statistics census data; National Records of Scotland and the Office for National Statistics census data; United Nations Population Division's biennial World Population Prospects report; Statistics Sweden census data; Korean Statistical Information Service statistics, Australian Bureau of Statistics census data. | international migration, short-term, [32]; international migration, long-term, [33]; international migration, long-term, [34]; international migration, long-term, [35]; international migration, long-term, [36]; international migration, long-term, [37]. |
| Gravity Model | Internal Revenue Service statistics; Internal Revenue Service statistics; World Bank's Global Bilateral Migration database; Eurostat statistics, World DataBank statistical data (2016), Organization for Economic Co-Operation and Development statistics, French Centre d'Etudes Prospectives et d'Informations Internationales statistics; Google Trends data, Organization for Economic Co-Operation and Development statistics. | internal migration, long-term, [38]; international migration, long-term, [39]; international migration, long-term, [40]; international migration, long-term, [41]; international migration, short-term, [42]. |
| Time Series Model | Internal Revenue Service statistics; Netherlands Central Bureau of Statistics census data; Mexican demographic surveys of households and American Community Survey data; Household surveys data; Federal Statistical Office of Germany census data; Internal Revenue Service statistics; International migration report (2017); Author-collected datasets; Google Trends data. | internal migration, short-term, [43]; international migration, short-term, [44]; international migration, long-term, [45]; international migration, long-term, [46]; international migration, long-term, [47]; international migration, long and short-term, [48]; international migration, long-term, [49]; internal migration, short-term, [50]; internal migration, short and long-term, [51]. |
| Econometrics Method | Trends in International Migration statistics, Migration Potential in Central and Eastern Europe statistics; Organization for Economic Co-Operation and Development statistics, World Bank statistics, Federal Statistical Office census data; The Candidate Country Eurobarometer survey series; Household surveys data; Twitter; Statistics Norway's "Statbank"; World Population Prospects (2015); Russian Federation and Commonwealth of Independent States countries statistics; Google Trends data; Site of the Main Department of Statistics in the Khmelnytskyi Region statistics; World Bank Global Bilateral Migration; Twitter; United States Bureau of the Census data; Data Collected by Author. | international migration, long-term, [52]; international migration, short-term, [53]; international migration, short-term, [54]; international migration, short-term, [55]; international migration, short-term, [56]; international migration, long-term, [57]; international migration, long-term, [58]; international migration, long-term, [59]; international migration, short-term, [60]; internal migration, short-term, [61]; international migration, long-term, [62]; internal migration, short-term, [63]; internal migration, long-term, [64]; internal migration, short-term, [65]. |
| Grey Model | China Statistical Yearbook; US Department of Health and Human Services and US Bureau of the Census statistics. | international migration, long-term, [66]; international migration, long-term, [67]. |
| Machine Learning Method | Data Collected by Author; Mexican Migrant Project statistics; Department of Overseas Labor statistics; Household survey data; Google Trends Index. | internal migration, long-term, [68]; international migration, short-term, [69]; international migration, long-term, [70]; internal migration, --, [71]; international migration, short-term, [72]. |
| Deep-Learning Algorithm | Nationwide Human Mobility Dataset released by the Spanish Ministry of Transportation; Nationwide Human Mobility Dataset released by the Spanish Ministry of Transportation; Nationwide Human Mobility Dataset released by the Spanish Ministry of Transportation; Google Trends Index, Organization for Economic Co-Operation and Development International Migration Database. | internal migration, short-term, [73]; internal migration, short-term, [74]; internal migration, short-term, [75]; international migration, long-term, [76]. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.