
Forecasting Lattice and Point Spatial Data: Comparison of Unilateral and Multilateral SAR Models

Abstract
Spatial auto-regressive (SAR) models are widely used in the geosciences for spatial analyses; their main feature is the presence of weight (W) matrices, which define the neighboring relationships between the spatial units. The statistical properties of parameter and forecast estimates strongly depend on the structure of such matrices. The least squares (LS) method is the most flexible and can estimate systems of large dimensions; however, it is not consistent in the presence of multilateral (sparse) matrices. Instead, the unilateral specification of SAR models provides triangular weight matrices, which confer good statistical properties on LS and enable the implementation of sequential prediction functions. In this paper we show the better out-of-sample forecasting performance of unilateral SAR models and LS with respect to multilateral models and maximum likelihood (ML) methods. This conclusion is supported by extensive numerical simulations and by applications to real geological data, both on regular lattices and in irregular polygonal form.
Subject: Computer Science and Mathematics - Probability and Statistics

1. Introduction

Spatial data are nowadays present in many social and natural phenomena. They are mainly available in two formats: on regular lattices (as digital images and remotely-sensed data), and as irregular polygons and sparse points (as in zonings and geological surveys). These data can be represented on maps by geographic information systems (GIS), and statistical models may be developed for practical purposes, such as filtering, prediction and decision. In digital images the main goals are denoising and sharpening (e.g. Grillenzoni, 2008) to support image segmentation and classification; in polygonal data the aims are structural analyses, evaluation and out-of-sample prediction (see Barbiero and Grillenzoni, 2022).
One of the main features of spatial data is auto-correlation (ACR), which arises from the interaction between the spatial units (pixels in images, and centroids in polygons). The general rule is that the smaller the distance between units, the greater the size of the ACR. This dependence must be properly represented in spatial regression models in order to satisfy the basic assumption of independent residuals. This, in turn, is necessary for having unbiased and efficient estimates of the parameters and of their standard errors, that is, for performing statistical inference and prediction. It can also be shown that, by correlating two independent but strongly auto-correlated series, spurious dependence between them may be found (Granger, 2012); this effect also occurs in spatial data.
On the other hand, ACR is useful for forecasting the character under study in areas where it is not measured, especially when exogenous covariates are not available. It follows that representing the ACR in spatial systems is one of the major concerns of regression modelling. As in time series analysis, this may be accomplished by introducing suitable lagged terms into the equations, i.e. the value of the dependent variable in nearby units. This leads to the spatial auto-regressive (SAR) models, which resemble the AR schemes of time series and dynamical systems. However, while in time there is only a single direction (from past to present), in the plane there are virtually infinite directions.
The specification of the models is twofold, unilateral or multilateral, depending on whether or not the linkage between nearby units reflects a causal ordering. As an example, in lattice data one has the triangular and rook schemes in Figure 1a,b; in the first, the dynamic is unidirectional since, starting from the upper-left corner, it follows the sequence of writing a text (see Tjøstheim, 1983). Instead, in Figure 1b the dynamic is multi-directional, as each cell involves parts of the text that are not yet written (Bustos et al., 2009). Figure 1c,d show the analogous situations in polygonal data, where the position of each unit is defined by its center (geometric or institutional). In Figure 1d the link is multi-directional with four nearest neighbors (NN), whereas in Figure 1c it is in the north direction only.
The statistical consequences of the two specifications for SAR models are important. Multidirectional linkages violate the condition of independence between regressors and residuals, and therefore make least squares (LS) estimates biased and inconsistent. To solve this problem, many alternative estimators have been proposed, such as maximum likelihood (ML, Anselin, 2003; LeSage and Pace, 2009), the generalized method of moments (GM, Kelejian and Prucha, 1999), two-stage least squares (Lee, 2003) and, recently, indirect inference (Bao et al., 2020). These methods use iterative optimization algorithms, which may not converge, and matrices of spatial contiguity which are hindered by the system/sample dimension. However, Lee (2002) showed that, under suitable conditions of contiguity, the ordinary LS estimator is consistent even for multilateral models.
In this paper we focus on unilateral SAR models for both lattice and polygonal data, and we show their ability to preserve the optimal properties of LS estimates even with respect to the efficient ML and GM methods. This feature is important because the LS method is linear and can manage datasets of large dimensions, as it may avoid the direct use of contiguity matrices (i.e. the square arrays that connect all spatial units). Further, unilateral models applied to data sorted in the same spatial direction as the models have a recursive structure which enables effective forecasting functions. This feature confirms the close connection between parametric identifiability and model predictability.
Spatial prediction of missing data (inside the perimeter of the observed area) has been treated by LeSage and Pace (2004, 2008), Kelejian and Prucha (2007) and Goulard et al. (2017) for multilateral models, and by Mojiri et al. (2021) in the lattice case. They show that the naive predictor based on the reduced form of SAR models must be corrected with the best linear unbiased predictor (BLUP) of classical statistics. However, the recursive forecasts of unilateral models do not need the BLUP correction and outperform the others. We show evidence of these results with Monte Carlo experiments on synthetic data and out-of-sample forecasting on real geoscience data, such as digital elevation models (e.g. USGS, 2017) and the spatial diffusion of water isotopes (e.g. Moreiras Reynaga et al., 2021).
The paper is organized as follows: Section 2 deals with lattice models; it reviews the conditions of identification and consistency and evaluates their forecasts on random surfaces and digital images. Section 3 deals with SAR models for polygonal and sparse point data; it compares the forecasts of unilateral and multilateral models with various algorithms. Two Appendices provide computational details.

2. Regular Lattice Data

Regular lattice data are mostly present in remote sensing and digital images. These data come in the form of rectangular arrays $Y=\{y_{ij}\}$, with $i=1,2\dots n$, $j=1,2\dots m$ the position indexes, which may be transformed into latitude and longitude. The values $y_{ij}$ are usually autocorrelated (e.g. clustered), and one of the main goals is to filter the array $Y$ with its own values, to obtain interpolates and forecasts $\hat y_{n+h,m+k}$. To accomplish this task, as in time series analysis (Fingleton, 2009), SAR modeling puts each cell in relation to the contiguous ones, as $y_{ij}=g(y_{i\pm h,j\pm k})+e_{ij}$, where $h,k=1,2\dots p$ are spatial lags, $p$ is the order of dependence and $\{e_{ij}\}$ is an unpredictable sequence.
In time series, the unidirectional (past-present) ordering of the data is also called causal and allows one to define causality between stochastic processes (e.g. Herrera et al., 2012). In lattice data, a causal ordering can be established by following the lexicographic way of reading/writing a text, i.e. processing the cells of $Y$ starting from the upper-left corner. Unilateral dynamics which satisfy this feature are the row-wise $y_{i,j-k}$, the half-plane $y_{i-h,j\pm k}$, $h>0$, and the one-quadrant (or triangular) $y_{i-h,j-k}$, $(h+k)>0$, in Fig. 1a (see Tjøstheim, 1983; Bustos et al., 2009). Although causality is naturally related to the dynamic of events and to aspects of human learning, digital filters for denoising and sharpening may follow non-causal models, such as the rook scheme in Fig. 1b (e.g. Grillenzoni, 2008). However, the unidirectional approach remains the favorite solution, as it confers properties of sequentiality and predictability on the models.
Under linear and unilateral constraints, the triangular SAR(p) model becomes
$$y_{ij}=\sum_{h=0}^{p}\ \sum_{\substack{k=0\\(h+k)>0}}^{p}\phi_{hk}\,y_{i-h,j-k}+e_{ij},\qquad e_{ij}\sim\mathrm{IN}(0,\sigma_e^2), \tag{1}$$
with independent normal (IN) residuals. In statistics, the model (1) is usually estimated with maximum likelihood (ML); this requires the vectorization of the data matrix, $\mathbf{y}=\mathrm{vec}_j(Y)$, of length $N=n\times m$, and the inversion of auto-covariance matrices $\Gamma_y$ of size $N\times N$, see Basu and Reinsel (1993, Sect. 4) and Awang and Shitan (2006). This solution is statistically efficient but computationally expensive, and can be implemented only for moderate dimensions of the lattice field.
Instead, the LS method can be applied even for large values of $n,m$; rewriting the model (1) in regression form as
$$y_{ij}=\boldsymbol{\phi}'\mathbf{x}_{ij}+e_{ij},\qquad \mathbf{x}_{ij}=\big[\,y_{i,j-1}\ \cdots\ y_{i-h,j-k}\ \cdots\ y_{i-p,j-p}\,\big]',\qquad \boldsymbol{\phi}=\big[\,\phi_{01}\ \cdots\ \phi_{hk}\ \cdots\ \phi_{pp}\,\big]',$$
the LS estimator becomes
$$\hat{\boldsymbol{\phi}}_N=\Big(\sum_{i=p+1}^{n}\sum_{j=p+1}^{m}\mathbf{x}_{ij}\mathbf{x}_{ij}'\Big)^{-1}\sum_{i=p+1}^{n}\sum_{j=p+1}^{m}\mathbf{x}_{ij}\,y_{ij} \tag{2a}$$
$$\phantom{\hat{\boldsymbol{\phi}}_N}=\boldsymbol{\phi}+\Big(\sum_{i=p+1}^{n}\sum_{j=p+1}^{m}\mathbf{x}_{ij}\mathbf{x}_{ij}'\Big)^{-1}\sum_{i=p+1}^{n}\sum_{j=p+1}^{m}\mathbf{x}_{ij}\,e_{ij}. \tag{2b}$$
At the computational level, the double summation in Eq. (2) can be implemented in recursive form as follows:
$$R_{i,m}=R_{i-1,m}+P_{im},\qquad P_{ij}=P_{i,j-1}+\mathbf{x}_{ij}\mathbf{x}_{ij}',\qquad P_{i0}=O,$$
$$s_{i,m}=s_{i-1,m}+q_{im},\qquad q_{ij}=q_{i,j-1}+\mathbf{x}_{ij}\,y_{ij},\qquad q_{i0}=\mathbf{0},$$
$$\hat{\boldsymbol{\phi}}_N=R_{nm}^{-1}\,s_{nm},\qquad \hat\Sigma_N=R_{nm}^{-1}\,\hat\sigma_e^2,$$
where $O$ and $\mathbf{0}$ are the null matrix and vector of dimension $(p+1)^2-1$, reset at the start of each row $i$, and $N=(n-p)(m-p)$. Unlike the ML approach, this algorithm only involves the inversion of the matrix $R$, of size $(p+1)^2-1$, for any values of $n,m$. As regards the statistical properties, the LS estimator (2) is unbiased and consistent under the assumption of unilateral dynamics in the model (1), because $E(\mathbf{x}_{ij}e_{ij})=\mathbf{0}$.
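To make the recursion concrete, the following NumPy sketch (our own illustration: the function name `ls_triangular_sar` and the cell-by-cell accumulation loop are implementation choices, not taken from the paper) accumulates the cross-products over the lattice and solves the normal equations once at the end, so that only a $((p+1)^2-1)$-dimensional matrix is inverted, whatever the size of $n,m$.

```python
import numpy as np

def ls_triangular_sar(Y, p=1):
    """LS estimate of the triangular SAR(p) of Eq. (1) on a lattice Y (n x m).

    Regressors are the one-quadrant lags y[i-h, j-k] with (h, k) in {0..p}^2,
    (h, k) != (0, 0); their cross-products are accumulated cell by cell as in
    Eq. (2), and a single ((p+1)^2 - 1)-dimensional system is solved at the end.
    """
    n, m = Y.shape
    lags = [(h, k) for h in range(p + 1) for k in range(p + 1) if (h, k) != (0, 0)]
    d = len(lags)                          # (p+1)^2 - 1 lagged regressors
    R = np.zeros((d, d))                   # running sum of x x'
    s = np.zeros(d)                        # running sum of x y
    for i in range(p, n):
        for j in range(p, m):
            x = np.array([Y[i - h, j - k] for (h, k) in lags])
            R += np.outer(x, x)
            s += x * Y[i, j]
    return np.linalg.solve(R, s)           # the vector phi_hat of Eq. (2a)
```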
SAR(1). As an example, let $p=1$; then the model (1) is $y_{ij}=(\phi_1 y_{i,j-1}+\phi_2 y_{i-1,j}+\phi_3 y_{i-1,j-1})+e_{ij}$. Applying the formula of Basu and Reinsel (1993, p. 634) to the lagged term $y_{i,j-1}$, one obtains the moving average (MA) representation
$$y_{i,j-1}=e_{i,j-1}+\sum_{\substack{h=1\\(h+l)<i}}^{i}\ \sum_{\substack{k=1\\(k+l)<j}}^{j-1}\ \sum_{l=0}^{\max(i,j-1)}\psi_{hkl}\,e_{i-h-l,\,j-1-k-l}, \tag{3}$$
from which $E(y_{i,j-1}e_{ij})=0$. Similarly, in Eq. (2b) it follows that $E(\mathbf{x}_{ij}e_{ij})=\mathbf{0}$, because all the entries of $\mathbf{x}_{ij}=[y_{i,j-1},y_{i-1,j},y_{i-1,j-1}]'$ do not depend on $e_{ij}$, and the LS estimator (2a) is unbiased. This result does not apply to multilateral SAR models, because the decomposition (3) would involve $e_{i+h,j+k}$; for the rook-type scheme in Fig. 1b the ML estimator is necessary. Finally, as in time series, the consistency of LS does not need the stability of the model (1); rather, it improves with the signal-to-noise ratio $\sigma_y^2/\sigma_e^2$ (e.g. Bhattacharyya et al., 1997).
SAR($p,q,r$). Empirical models often have a subset structure, i.e. they have missing lagged regressors or they aggregate the $y_{i-k,j-h}$ under a common parameter. The proper definition of such models is SAR($p,q,r$), where $p$ = maximum lag of the regressors, $q$ = number of spatial units involved and $r$ = number of parameters. Two parsimonious models that will be extensively applied in the paper are the triangular SAR(1,3,1) and the rook SAR(1,4,1) with drift $\alpha$, defined as
$$y_{ij}=\alpha+\phi\big(y_{i-1,j}+y_{i,j-1}+y_{i-1,j-1}\big)/3+e_{ij},\qquad y_{i0}=y_{0j}=0, \tag{4a}$$
$$y_{ij}=\alpha+\phi\big(y_{i,j-1}+y_{i,j+1}+y_{i-1,j}+y_{i+1,j}\big)/4+e_{ij},\qquad y_{i,m+1}=y_{n+1,j}=0, \tag{4b}$$
see Fig. 1a,b. These models can also be written as $y_{ij}=\alpha+\phi\,\bar y_{ij}+e_{ij}$, where $\bar y_{ij}$ is an average of lagged local terms.
The models (4) are first-order both in the spatial lag and in the number of coefficients; their stability condition is $|\phi|<1$, as a corollary of Proposition 1 of Basu and Reinsel (1993). The rook model (4b) has a multilateral structure, with problems of LS estimation and forecasting; however, it may represent circular dynamics and is a natural competitor of the triangular model (4a). The latter admits an MA representation similar to Eq. (3), with weights
$$\psi_{hkl}=\frac{(h+k+l)!}{h!\,k!\,l!}\left(\frac{\phi}{3}\right)^{h+k+l},$$
see Bustos et al. (2009). If $e_{ij}\sim\mathrm{IN}(0,\sigma_e^2)$, it follows that $E(y_{11})=\alpha$; the definition of the border values is crucial in simulating the rook model, as it involves missing values. For this reason, we assume the general starting condition $Y_0=\alpha$, which is the simplest out-of-sample forecast.

2.1. The Vector Form

The previous representations are termed raster in GIS software; when the lattice dimension is moderate (e.g. $n\times m<1000$), it may be useful to pass to the vector form. Basu and Reinsel (1993), Awang and Shitan (2006) and Mojiri et al. (2021) consider complex forms of vectorization for ML estimation. We follow the spatial econometric approach, which vectorizes the data matrix by columns and builds the contiguity matrix of all cells, see LeSage and Pace (2009). Thus, let $\mathbf{y}=\mathrm{vec}_j(Y)$ be the $N=n\times m$ data vector; the first-order vector model with drift is
$$\mathbf{y}=\alpha\mathbf{1}_N+\phi\,W\mathbf{y}+\mathbf{e},\qquad \mathbf{e}\sim N(\mathbf{0},\sigma_e^2 I_N), \tag{5}$$
where $\mathbf{1}_N$ is a unit vector of length $N$ and $W$ is an $N\times N$ contiguity matrix. Its structure depends on the spatial dynamics and on the marginal values: the models (4a,b) have matrices $W$ with sub-diagonal entries smaller than 1 and sparse 1s on the main diagonal, in correspondence of the border values $y_{i1},y_{1j},y_{im},y_{nj}$. For $n=4$, $m=7$ one has the arrays in Figure 2 (see Appendix A); the distinctive feature of the unilateral model (4a) is that $W$ is lower triangular.
The arrays in Figure 2 are obtained by solving the static system $\mathbf{y}=W\mathbf{y}$, where $m$ determines the number of blocks of size $n\times n$, with matrices $I_n$ at the corners (see Appendix A). However, when $W$ is inserted into the dynamic model (5), its diagonal elements $w_{ll}=1$ must be replaced by 0, obtaining $W_0$. Each sub-diagonal corresponds to a specific regressor $y_{i-h,j-k}$ of the models (4); thus, the array in Fig. 2a can be decomposed as $W_0=(W_1+W_2+W_3)$, providing a SAR(3) model. The data generation of $\{y_l\}$ proceeds by rows, starting from an initial $\mathbf{y}=\mathbf{y}_0$, as
$$y_l=\alpha+\phi\,\mathbf{w}_l'\mathbf{y}+e_l,\qquad l=1,2\dots N, \tag{6}$$
where $\mathbf{w}_l'$ is the $l$-th row of $W_0$. Unilateral processes are insensitive to $\mathbf{y}_0$, since a triangular $W_0$ fills $\mathbf{y}$ recursively; instead, the model (4b), having the forward terms $y_{i,j+1},y_{i+1,j}$, needs a non-null initial vector, e.g. $\mathbf{y}_0=\alpha\mathbf{1}_N$.
Using matrix algebra, one may obtain the reduced (MA) form of the system (5)
$$\big(I_N-\phi W\big)\mathbf{y}=\alpha\mathbf{1}_N+\mathbf{e},\qquad \mathbf{y}=\big(I_N-\phi W\big)^{-1}\big(\alpha\mathbf{1}_N+\mathbf{e}\big); \tag{7}$$
this provides an automatic way to generate SAR data, independent of $\mathbf{y}_0$. Differences with the AR method (6) arise from the choice of the border values and of the contiguity matrices; however, in models with $W$ triangular and $\mathrm{diag}(W)=\mathbf{0}$, the algorithms (6) and (7) are equivalent. In this case, the impulse-response matrix $(I_N-\phi W)$ is always invertible and triangular, and leads to the MA decomposition
$$M=\big(I_N-\phi W\big)^{-1}=I_N+\sum_{k=1}^{\infty}\phi^k W^k,\qquad \mathbf{y}=\frac{\alpha}{(1-\phi)}\,\mathbf{1}_N+\sum_{k=1}^{\infty}\phi^k W^k\mathbf{e}+\mathbf{e},$$
which converges when $|\phi|<1$. The expression of the constant term follows from the row normalization, for which $W\mathbf{1}_N=\mathbf{1}_N$ and hence $W^k\mathbf{1}_N=\mathbf{1}_N$. Further, each element of $\mathbf{y}$ becomes a linear combination of $\mathbf{e}$ and, in unidirectional models, a function of the one-quadrant errors only, providing the MA decomposition
$$y_{ij}=\frac{\alpha}{(1-\phi)}+\sum_{h=1}^{i-1}\sum_{k=1}^{j-1}\psi_{hk}\,e_{i-h,j-k}+e_{ij},$$
where the weights $\psi_{hk}\ge 0$ are complex functions of $\phi^k W^k$.
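As an illustration of how the matrices in Figure 2 and the generating algorithms (6)-(7) work in practice, the following NumPy sketch (our own code: the helper `triangular_W` and its conventions are implementation choices) builds the lower-triangular matrix of model (4a) under the column vectorization described in Appendix A, and verifies that the reduced form (7) and the row-wise recursion (6) produce the same realization.

```python
import numpy as np

def triangular_W(n, m):
    """Contiguity matrix W_0 (null diagonal) of the triangular model (4a) under
    the column vectorization y = vec_j(Y): cell (i, j) maps to l = j*n + i and
    its one-quadrant neighbours (i-1,j), (i,j-1), (i-1,j-1) get weight 1/3;
    neighbours outside the lattice are dropped (zero border values)."""
    N = n * m
    W = np.zeros((N, N))
    for j in range(m):
        for i in range(n):
            l = j * n + i
            for di, dj in [(-1, 0), (0, -1), (-1, -1)]:
                ii, jj = i + di, j + dj
                if ii >= 0 and jj >= 0:
                    W[l, jj * n + ii] = 1.0 / 3.0
    return W

rng = np.random.default_rng(0)
n, m, alpha, phi = 4, 7, 1.0, 0.75
W = triangular_W(n, m)

# reduced-form generation (7): y = (I - phi W)^{-1} (alpha 1 + e)
e = rng.normal(size=n * m)
y = np.linalg.solve(np.eye(n * m) - phi * W, alpha + e)

# row-wise recursion (6): identical realization, since W is strictly lower triangular
y_rec = np.zeros(n * m)
for l in range(n * m):
    y_rec[l] = alpha + phi * (W[l] @ y_rec) + e[l]
print(np.allclose(y, y_rec))   # True
```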
Finally, the representation (5) enables one to define the LS estimator of the parameters $\boldsymbol\delta=[\alpha,\phi]'$ of the models (4), using the entire sample $Z=[\mathbf{1}_N,\,W\mathbf{y}]$, as
$$\hat{\boldsymbol\delta}_N=\big(Z'Z\big)^{-1}Z'\mathbf{y},\qquad \hat\Sigma_N=\big(Z'Z\big)^{-1}\hat\sigma_e^2; \tag{9}$$
this improves on the estimator (2), which is based on $N_p=(n-p)(m-p)$ observations. The matrix $\hat\Sigma_N$ is the dispersion of the estimates and, if $W$ is triangular as in Fig. 2a, then the LS estimator (9) is consistent; its analysis will be developed in Sect. 3.
Spatial Forecasting. Prediction is one of the central goals of SAR modeling; typical examples are restoring parts of remote-sensing data hidden by clouds, or extrapolating an image beyond its perimeter. However, the existing literature on lattice data has mostly been concerned with in-sample filtering and interpolation, focusing on techniques of image sharpening (Grillenzoni, 2008), robust denoising (Bustos et al., 2009) and trend estimation (Mojiri et al., 2021), also in conjunction with non-parametric smoothing.
This section aims to forecast data segments which are external to the estimation perimeter $n\times m$, e.g. on the right-hand side as $\{y_{i,m+k}\}$ or, in vector form, $y_{N+l}$ on units placed beyond $N$. Defining $h,k,l=1,2\dots H,K,L$ as the prediction indexes, the forecast function depends on the SAR representation used, see Kelejian and Prucha (2007). Here we have the lattice form (4), the vector AR (6) and the reduced MA (7); for the triangular model (4a), the three predictors are
$$\hat y_{n+h,m+k}=E\big[\,y_{n+h,m+k}\,\big|\,y_{n-i+1,m-j+1}\,\big],\qquad h,k=1,2\dots H,K,$$
$$\hat y_{n+h,m+k}=\alpha+\phi\big(\hat y_{n+h,m+k-1}+\hat y_{n+h-1,m+k-1}+\hat y_{n+h-1,m+k}\big)/3, \tag{10a}$$
$$\hat y_{N+l}=\alpha+\phi\,\mathbf{w}_{N+l}'\hat{\mathbf{y}},\qquad \hat{\mathbf{y}}=\big[\,y_{11},y_{21}\dots y_{nm},\ \hat y_{N+1}\dots\hat y_{N+l-1},\ \mathbf{0}\,\big]', \tag{10b}$$
$$\hat{\mathbf{y}}=\big(I_{N+L}-\phi\,W_{N+L}\big)^{-1}\alpha\mathbf{1}_{N+L},\qquad l=1,2\dots L, \tag{10c}$$
where $\mathbf{w}_{N+l}'$ is the $(N+l)$-th row of the augmented matrix $W_{N+L}$ and the vector $\hat{\mathbf{y}}$ in (10b) is updated with the forecasts. The predictor (10c) is nearly automatic, but in the absence of exogenous variables it provides constant forecasts; on the other hand, the functions (10a,b) need border values to start. These can be set as $\hat y^{(0)}_{1,m+k}=\hat\alpha$ for all $k$, and may be improved by iterating the forecasts $r$ times with the nearby values, as $\hat y^{(r+1)}_{1,m+k}=\hat y^{(r)}_{2,m+k-1}$, etc. This approach can also be applied to the rook model (4b), to provide initial values for its forward elements $y_{i,j+1}$ as well.
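The following sketch (our own Python illustration; the function name and the simple border initialization at $\hat\alpha$, without the $r$-iteration refinement, are implementation choices) applies the recursive predictor (10a) of the triangular model to extrapolate $K$ extra columns on the right of an observed lattice.

```python
import numpy as np

def forecast_triangular(Y, alpha, phi, K):
    """Out-of-sample forecasts (10a) of K extra columns for the triangular
    SAR(1,3,1): each new cell combines observed values and previously
    computed forecasts; missing border terms are initialized at alpha."""
    n, m = Y.shape
    Z = np.empty((n, m + K))
    Z[:, :m] = Y                                  # observed lattice
    for k in range(m, m + K):                     # new columns, left to right
        for i in range(n):                        # cells, top to bottom
            up     = Z[i - 1, k]     if i > 0 else alpha
            left   = Z[i, k - 1]
            upleft = Z[i - 1, k - 1] if i > 0 else alpha
            Z[i, k] = alpha + phi * (up + left + upleft) / 3.0
    return Z[:, m:]                               # the K forecast columns
```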
The variance of the predictors is useful for building confidence intervals of the forecasts (10) and for testing hypotheses; as in time series models, the process variance $\sigma_y^2=\mathrm{var}(y_{ij})$ represents an upper bound. Applying the formula of Mojiri et al. (2018, p. 187) to the triangular model (4a), one obtains
$$\sigma_y^2=\sigma_e^2\Big/\Big\{1-\tfrac{2\phi^2}{9}\big[\tfrac{3}{2}+\rho(1,1)+\rho(0,1)+\rho(1,0)\big]\Big\}\;\le\;\sigma_e^2\big/\big(1-\phi^2\big),$$
where $|\rho(\cdot,\cdot)|<1$ are autocorrelation coefficients. The last expression is the variance of an AR(1) process; hence, for the model (4a), the approximate variance of the predictor (10a) may be $\sigma_e^2\big[1-\phi^{2(h+k)}\big]\big/\big(1-\phi^2\big)$. A more general result can be obtained for the representations (5) and (10c), namely
$$V(\hat{\mathbf{y}})=\big(I_{N+L}-\phi\,W_{N+L}\big)^{-1}\big(I_{N+L}-\phi\,W_{N+L}\big)^{-1\prime}\sigma_e^2,$$
whose diagonal elements increase with the index l.

2.2. Simulations and Applications

We develop simulation experiments to test the performance of the LS (9) and ML (19) estimators (which are based on the same matrices $W$), applied to unilateral and multilateral SAR models. We consider the models (4a,b) with parameters $\alpha_0=1$, $\phi_0=0.5,\,0.75,\,1.0,\,1.025$ and border values $(\alpha+e_{ij})$, i.e. $y_{00}=0$, etc. Owing to the forward terms $y_{i+1,j},y_{i,j+1}$, the generation of the rook process (4b) is not fully possible with the AR equation (6) and needs the MA reduced form (7), with the matrix in Fig. 2b without the diagonal entries. Instead, for the unilateral model (4a), the generating methods (6) and (7) are equivalent.
200 replications on a $32\times 32$ lattice ($N=1024$ cells) are generated with Normal disturbances and $\sigma_e^2=1$. The performance statistics are the relative bias $|\hat\phi-\phi_0|/\phi_0$, the relative root mean squared error (the square root of $\mathrm{MSE}(\hat\phi)/\phi_0^2$) and the p-values of the Jarque and Bera (1987) Normality test. Since the relative statistics of $\hat\alpha,\hat\phi$ do not depend on the parameter size, the overall performance of the LS and ML methods can be compared by summing the index values.
The results are displayed in Table 1 and Figure 3; the main findings are as follows. LS (9) is uniformly better than ML (19) when the matrix $W$ is triangular. Its efficiency improves as $\phi_0\to 1$, as a consequence of the super-consistency property of LS estimates in AR models (e.g. Grillenzoni, 2004). Its performance slightly worsens when using LS (2), owing to the smaller sample size, which excludes the border values. The LS estimates of multilateral SAR models are biased for $\phi<1$, but still benefit from the super-consistency. It is interesting to note that, unlike in time series, the Normal distribution of the LS estimate $\hat\phi_N$ of the AR parameter holds even in the presence of the unit root $\phi_0=1$ (see Figure 3b).
As regards the ML method (19), we use the implementation of LeSage and Pace (2009), which is computationally demanding and time consuming. However, it significantly improves the estimates of multilateral models, providing good levels of unbiasedness and efficiency when $\phi_0<1$. On the other hand, it requires conditions of stationarity, and its performance in triangular models is poor, meaning that ML is mainly designed for sparse weight matrices. We will resume this topic in Sect. 3 for non-lattice data; in conclusion, we can state that LS is suitable for unilateral models, whereas ML must be used in multilateral SAR.
We also carry out two empirical applications to test the performance of the various SAR models in out-of-sample forecasting of spatial data. In comparing predictions, one cannot proceed by simulation as in the experiments of Tab. 1, because simulated data depend on the generating models, which would influence the final results. Hence, we fit SAR models to data generated from external sources.
Random Surface. The first application considers rough random surfaces (Sheffield, 2022); these techniques are used in physics to study electromagnetic, fluid and plasma phenomena (see Sjodin, 2017), and are applied in engineering (e.g. 3D printing). We follow an approach based on trigonometric series, where a matrix of random numbers is convolved with a Gaussian filter to achieve spatial autocorrelation. Thus, let the fast Fourier transform (FFT) of the bivariate random sequence $u_{ij}$ on an $m\times n$ regular lattice be
$$F(u_{ij})=\sum_{h=0}^{n-1}\sum_{k=0}^{m-1}u_{hk}\,e^{-\imath 2\pi hi/n}\,e^{-\imath 2\pi kj/m},\qquad u_{ij}\sim\mathrm{IN}(0,\sigma_u^2),$$
where $\imath$ is the imaginary unit. Let $R_{ij}(\delta)=e^{-0.5\,(i^2+j^2)/\delta^2}$ be an exponential ACR function with $\delta>0$; then the surface generator becomes
$$y_{ij}=\alpha+\Re\Big[F^{-1}\big(R_{ij}(\delta)\,F(u_{ij})\big)\Big]+e_{ij},\qquad e_{ij}\sim\mathrm{IN}(0,\sigma_e^2), \tag{11}$$
where $\Re(\cdot)$ denotes the real part, $F^{-1}(\cdot)$ is the inverse FFT and $e_{ij}$ is a white noise.
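A minimal NumPy sketch of the generator (11) follows (our own implementation choices: fft2/ifft2 are used for the transforms, and the Gaussian ACR filter is evaluated on the wrapped wavenumbers returned by fftfreq so that it is symmetric; the default parameters mirror the values used in the experiment below).

```python
import numpy as np

def rough_surface(n=32, m=32, alpha=1.0, delta=2.5, sigma_u=1.0, sigma_e=0.067, seed=0):
    """Rough random surface as in Eq. (11): a Gaussian low-pass filter applied
    to white noise in the Fourier domain, plus a drift and measurement noise."""
    rng = np.random.default_rng(seed)
    U = rng.normal(0.0, sigma_u, size=(n, m))
    # wrapped wavenumbers, so that the exponential ACR filter is symmetric
    ii = np.fft.fftfreq(n) * n
    jj = np.fft.fftfreq(m) * m
    I, J = np.meshgrid(ii, jj, indexing="ij")
    R = np.exp(-0.5 * (I**2 + J**2) / delta**2)
    G = np.real(np.fft.ifft2(R * np.fft.fft2(U)))   # smooth component g_ij
    E = rng.normal(0.0, sigma_e, size=(n, m))
    return alpha + G + E
```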
Despite the randomness of $u_{hk}$, the rough surface $g_{ij}=\Re(\cdot)$ in Eq. (11) is still smooth; hence we add an external noise $e_{ij}$ to increase its realism. The resulting series $y_{ij}$ resembles real terrain models of geostatistics (see Figs. 4a,b). In the experiment we set $n=m=32$ (which yields $N=1024$), $\mathbf{e},\mathbf{u}$ independent Gaussian with $\sigma_u=1$, $\sigma_e=0.067$, and $\alpha=1$, $\delta=2.5$, which provide mild spatial ACR. The goal is to forecast the last $K=7$ columns (about 22%) of the surface $Y$ with the algorithms (10). Table 2 reports the in-sample parameter estimates (with the LS and ML methods) of the models (4a,b) and the mean absolute percentage error (MAPE) forecast statistics
$$\mathrm{MAPE}=\frac{1}{nK}\sum_{i=1}^{n}\sum_{j=m-K+1}^{m}\left|\frac{\hat y_{ij}-y_{ij}}{y_{ij}}\right|.$$
A triangular SAR(1,3,3) model (with 3 distinct AR parameters) is also fitted to the data, but it does not improve on the forecasts of the constrained model (4a). The rook model (4b) performs poorly with the (biased) LS estimates, whereas it improves significantly with the ML ones; even so, the path of the predicted surface in Figure 4d remains disappointing. The performance of the vector predictor (10b) is similar to that of the lattice method (10a) and is reported in Appendix B, together with that of the function (10c).
Remote Sensing. The second application regards a real case study: we consider high-resolution elevation data obtained with aerial laser scanners. The sample array $Y$ comes from USGS (2017) and has 340×455 pixels (see Fig. 5); this would yield a vector SAR model with 154,700 rows. Further, the image extrapolation may require the inclusion of spatial trend components (e.g. Grillenzoni, 2004), such as bivariate polynomials $g_d(i,j)$ of degree $d\ge 1$. Given the implied computational issues, we only perform the analysis with the lattice implementation (2); in the case of a triangular model with quadratic trend, the forecasting function becomes
$$\hat y_{i+h,j+k}=\alpha+g_2(i+h,j+k)+\phi\big(\hat y_{i+h-1,j+k-1}+\hat y_{i+h-1,j+k}+\hat y_{i+h,j+k-1}\big)/3,$$
$$g_2(i+h,j+k)=\beta_1(i+h)+\beta_2(j+k)+\beta_3(i+h)^2+\beta_4(j+k)^2+\beta_5(i+h)(j+k),$$
with $h,k=1,2\dots H,K$, which can be managed sequentially, cell by cell. The AR part of the rook model involves forward values on the lower-right side; these may be provided by the polynomial itself: $\tilde y_{i+h+1,j+k+1}=\hat g_2(i+h+1,j+k+1)$.
In the application to the USGS image, we leave the rightmost 55 columns (about 12% of the array) as out-of-sample data to forecast. The LS estimates of the parameters are reported in Table 3, and graphical results are displayed in Figure 5. Despite the better in-sample fit (with $R^2=0.98$), the rook model has a disappointing forecasting performance, both in terms of the MAPE statistic and of visual appearance (Figure 5b). The reason is partly due to the (biased) estimates of its parameters, with $\hat\alpha<0$, $\hat\phi>1$; however, adjusting their values reduces the MAPE statistics but does not change the (uniform) path of the predicted surface.
In real-life applications, where small and inner portions of images must be restored, the spatial polynomials $g(i,j)$ may be fitted locally (around the missing pixels) or may be replaced by nonparametric smoothers. In any event, the role of the AR components remains fundamental, as the coefficient $\hat\phi$ is the most significant in both models of Table 3, with statistic $T_{\hat\phi}>100$.

3. Point and Polygonal Data

When the data are irregularly distributed in space (either as point processes or as polygonal areas), notations and representations have to change. The spatial process is now defined as $\{y_s\}$, $s=1,2\dots N$, where the index $s$ also applies to the planar coordinates $\mathbf{x}_s=[i_s,j_s]$ (latitude and longitude) of the polygon centers (geometric or political), see Figure 1c,d. These variables are irregularly distributed in the plane but have a fixed (non-stochastic) nature; moreover, they may influence the level of the process $y_s$, according to spatial trends. Thus, the SAR(1) model becomes
$$y_s=\alpha+\phi\,y_{s-1}+\boldsymbol\beta'\mathbf{x}_s+e_s,\qquad s=1,2\dots N, \tag{13a}$$
$$\mathbf{y}=\alpha\mathbf{1}_N+\phi\,W\mathbf{y}+X\boldsymbol\beta+\mathbf{e},\qquad \mathbf{e}\sim N\big(\mathbf{0},\sigma_e^2 I_N\big), \tag{13b}$$
$$\phantom{\mathbf{y}}=\big(I_N-\phi W\big)^{-1}\big(\alpha\mathbf{1}_N+X\boldsymbol\beta+\mathbf{e}\big), \tag{13c}$$
where $s-1$ refers to the unit which is closest to the $s$-th one, and the vector of regressors $\mathbf{x}_s=[i_s,j_s,\dots]$ may include exogenous covariates.
Unlike in the previous section, the adjacency matrix $W$ has an irregular structure, which depends on the rule of contiguity. The most common rule is to put each observation $y_s$ in relation to its nearest-neighbor (NN) term $y_r$, according to the Euclidean distance $\min_r\{d_{sr}=[(i_s-i_r)^2+(j_s-j_r)^2]^{1/2}\}$. Furthermore, under the unilateral constraint, the north-west (NW) direction could be followed as in the lattice case; however, polygonal data do not have a lexicographic order, hence the unilateral constraint may simply be defined along the north direction. In this setting, the observations $\{y_s,\mathbf{x}_s\}$ are ordered according to the shortest distance from the northern border, i.e. according to the inverse of the latitude $1/i_s$. Such an ordering may be denoted by $(i_s)$, or simply $(s)$, and each equation of the model (13a) can be written in sequential form, as a time series model
$$y_{(s)}=\alpha+\phi\,y_{(s-1)}+\boldsymbol\beta'\mathbf{x}_{(s)}+e_{(s)},\qquad y_{(0)}=0, \tag{14}$$
where $y_{(s-1)}$ is the north NN of $y_{(s)}$. The property $E[y_{(s-1)}e_{(s)}]=0$, as in time series, is the basis for unbiased LS estimation and forecasting. However, it does not hold under the simple NN linkage, even when the data are ordered along the latitude.
As an illustration of the NN matrices, we simulate $N=30$ random points with $[i_s,j_s]\sim\mathrm{IU}(0,1)^2$ and consider $q=3$ contiguous terms, see Figure 6. Under the north ordering of the data, the entries of the matrices are concentrated around the null main diagonal $\mathrm{diag}(W)=\mathbf{0}$, and under the unilateral constraint the array is lower triangular. In order to obtain non-sparse matrices, LeSage and Pace (2004) ordered the data according to the sum $(i_s+j_s)$, but just used the simple NN rule. The matrices $W_k$ in Figure 6 yield SAR(3) models, but if they are combined as $\bar W=3^{-1}\sum_{k=1}^{3}W_k$ they provide a constrained SAR(3,1). The analogue of the lattice model (4a) then becomes
$$y_{(s)}=\alpha+\phi\,q^{-1}\sum_{k=1}^{q}y_{(s-k)}+\boldsymbol\beta'\mathbf{x}_{(s)}+e_{(s)}. \tag{15}$$
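The north ordering and the $q$-NN triangular contiguity matrix used in Eq. (15) can be sketched as follows (our own NumPy illustration; the function name `north_nn_weights`, the uniform $1/q$ weights and the handling of the first $q$ units are implementation choices).

```python
import numpy as np

def north_nn_weights(coords, q=3):
    """Lower-triangular q-nearest-neighbour contiguity matrix under the north
    ordering (units sorted by decreasing latitude), row-normalized with
    uniform weights as in the constrained model (15).

    coords : (N, 2) array of [latitude, longitude]; returns the sorting
    permutation and the N x N strictly lower-triangular matrix W.
    """
    coords = np.asarray(coords, dtype=float)
    order = np.argsort(-coords[:, 0])            # north -> south ordering
    xs = coords[order]
    N = len(xs)
    W = np.zeros((N, N))
    for s in range(1, N):
        d = np.linalg.norm(xs[:s] - xs[s], axis=1)   # distances to northern units only
        k = min(q, s)                                # fewer neighbours for the first units
        nn = np.argsort(d)[:k]
        W[s, nn] = 1.0 / k
    return order, W
```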
The econometric literature often discusses models with autocorrelated errors, such as $\mathbf{u}=(I_N-\theta W)^{-1}\mathbf{e}$, e.g. Kelejian and Prucha (2007). Apart from estimation difficulties, it should be noted that they are encompassed by the unconstrained SAR(2); indeed, the combined system $(I_N-\theta W)(I_N-\phi W)\mathbf{y}=(I_N-\theta W)(\alpha\mathbf{1}_N+X\boldsymbol\beta)+\mathbf{e}$ involves just two spatial lags. In place of $\mathbf{u}$, it is better to insert into the model (13b) a lagged term on the exogenous part, $+\,WX\boldsymbol\gamma$, which corresponds to $\boldsymbol\gamma=-\theta\boldsymbol\beta$ in the model with $\mathbf{u}$ errors. This is the Durbin model (LeSage and Pace, 2009), which is useful when the $\mathbf{x}_{(s)}$ are autocorrelated, i.e. when $(i_s,j_s)$ have a non-random pattern.
When the matrix $W$ is lower triangular, the transfer function $(I_N-\phi W)$ is always nonsingular and its inverse admits a convergent power expansion
$$\mathbf{y}=\frac{\alpha}{(1-\phi)}\,\mathbf{1}_N+X\boldsymbol\beta+\sum_{k=1}^{\infty}\phi^k W^k X\boldsymbol\beta+\sum_{k=1}^{\infty}\phi^k W^k\mathbf{e}+\mathbf{e}.$$
This expression shows that the process $y_{(s)}$, at the $s$-th location, depends on the values of the inputs $\mathbf{x}_{(r)}$ and of the shocks $e_{(r)}$ at the northern locations ($r<s$), with weights $\phi^k W^k$ which decline with the distance. To appreciate this diffusion effect, note that when $W$ is strictly lower triangular, then $W^2$ is also lower triangular, but with its nonzero entries shifted away from the main diagonal. Hence $W^2$ reflects the second-order contiguity (the neighbors of the neighbors), and so forth until $W^k\to O$, the null matrix. The same feature, however, does not hold for non-triangular matrices, such as those based on the simple NN.
Figure 6. Contiguity matrices and planar graphs of N=30 random centroids on the unit square [0,1]. Panels: (a,c) North nearest neighbors (NN); (b,d) Multidirection NN. Matrices: W 1 first NN (blue); W 2 second NN (red); W 3 third NN (green).
The unilateral model (14) corresponds to a time-series AR with unequally spaced observations. This feature is clearer in the presence of multiple lagged terms $y_{(s-k)}$; a suitable treatment is to weight the observations by the inverse Euclidean distance $1/d(s,s-k)$. The analogue of the constrained model (15), with north NN linkage and row normalization, is then given by
$$y_{(s)}=\alpha+\phi\sum_{k=1}^{q}w(s-k)\,y_{(s-k)}+\boldsymbol\beta'\mathbf{x}_{(s)}+e_{(s)},\qquad w(s-k)=\frac{d(s,s-k)^{-1}}{\sum_{k=1}^{q}d(s,s-k)^{-1}},$$
$$d(s,s-k)=\Big[\big(i_{(s)}-i_{(s-k)}\big)^2+\big(j_{(s)}-j_{(s-k)}\big)^2\Big]^{1/2}; \tag{16}$$
this also admits a vector representation as in (13b), with a triangular matrix $W$ having $q$ decreasing coefficients per row.
Alternatively, since the inverse distances decay too fast, one can use the exponential weights $w(s-k)\propto\lambda^{d(s,s-k)}$, with $0<\lambda<1$. In vector form, one has to define the diagonal matrices $\Lambda_k=\lambda^{D_k}$, where $D_k=\mathrm{diag}[0\dots 0,\,d(k,1),\,d(k+1,2)\dots d(N,N-k)]$ contains the $k$-th north NN distances of the centroids. These arrays must be multiplied by the $k$-lag contiguity matrices $W_k$, then summed as $W=\sum_k\Lambda_k W_k$ and finally normalized by rows. The resulting model is nonlinear in the parameters and requires iterative algorithms; however, the value of $\lambda$ may be selected a priori in the range $[0.9,1)$, and the estimators then compute the suitable value of $\phi$.
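The two weighting schemes just described can be sketched for a single unit as follows (our own toy example; the function `idw_row` and the sample distances are hypothetical).

```python
import numpy as np

def idw_row(distances, lam=None):
    """Row of weights for the q north nearest neighbours of one unit.

    With lam=None, inverse-distance weights as in Eq. (16); otherwise the
    exponential alternative, w_k proportional to lam**d_k with 0 < lam < 1.
    Both versions are normalized to sum to one (row normalization)."""
    d = np.asarray(distances, dtype=float)
    w = 1.0 / d if lam is None else lam ** d
    return w / w.sum()

# example: three northern neighbours at distances 0.10, 0.25, 0.40
print(idw_row([0.10, 0.25, 0.40]))           # inverse-distance weights
print(idw_row([0.10, 0.25, 0.40], lam=0.9))  # exponential weights
```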

3.1. Estimation Methods

Writing the models (14)-(16) in vector form as $y_{(s)}=\boldsymbol\delta'\mathbf{z}_{(s)}+e_{(s)}$, with $\boldsymbol\delta=[\alpha,\phi,\boldsymbol\beta']'$ and $\mathbf{z}_{(s)}=[1,\bar y_{(s-1)},\mathbf{x}_{(s)}']'$, where $\bar y_{(s-1)}$ is the average of the $q$ north NN terms, the LS estimator of the parameters $\boldsymbol\delta$ is given by
$$\hat{\boldsymbol\delta}_N=\Big(\sum_{s=q+1}^{N}\mathbf{z}_{(s)}\mathbf{z}_{(s)}'\Big)^{-1}\sum_{s=q+1}^{N}\mathbf{z}_{(s)}\,y_{(s)}=\boldsymbol\delta+\Big(\sum_{s=q+1}^{N}\mathbf{z}_{(s)}\mathbf{z}_{(s)}'\Big)^{-1}\sum_{s=q+1}^{N}\mathbf{z}_{(s)}\,e_{(s)}; \tag{17}$$
its unbiasedness follows from $E[\mathbf{z}_{(s)}e_{(s)}]=\mathbf{0}$, owing to the sequential structure of the models; in particular, $E[e_{(s)}\bar y_{(s-1)}]=0$ in the unilateral model.
To see what happens in detail, consider the LS estimator in matrix form
$$\hat{\boldsymbol\delta}_N=\big(Z'Z\big)^{-1}Z'\mathbf{y}=\boldsymbol\delta+\big(Z'Z\big)^{-1}Z'\mathbf{e},$$
with regressors $Z=[\mathbf{1}_N,\,W\mathbf{y},\,X]$; the crucial term is $W\mathbf{y}$. From Eq. (13c),
$$E\big[(W\mathbf{y})'\mathbf{e}\big]=E\Big\{\big[(I_N-\phi W)^{-1}(\alpha\mathbf{1}_N+X\boldsymbol\beta+\mathbf{e})\big]'W'\mathbf{e}\Big\}=E\big[\mathbf{e}'(I_N-\phi W)^{-1}W\,\mathbf{e}\big]=E\big(\mathbf{e}'H\mathbf{e}\big),\ \text{say};$$
this term vanishes only if the trace (tr) of $H=(I_N-\phi W)^{-1}W$ is zero, because
$$E\big(\mathbf{y}'W'\mathbf{e}\big)=E\big[\mathrm{tr}\big(\mathbf{e}'H\mathbf{e}\big)\big]=\mathrm{tr}\big[H\,E(\mathbf{e}\mathbf{e}')\big]=\mathrm{tr}\big[H\,\sigma_e^2 I_N\big]=\sigma_e^2\,\mathrm{tr}(H).$$
Now, $\mathrm{tr}(H)=0$ holds in general if $W$ is strictly lower triangular, because $H$ is then itself triangular with a null diagonal.
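A quick numerical check of this trace argument (a toy example with arbitrary row-normalized weights, not the matrices of the paper) illustrates why the unilateral structure removes the LS bias term:

```python
import numpy as np

rng = np.random.default_rng(1)
N, phi = 8, 0.6

# strictly lower-triangular W (unilateral): trace of H = (I - phi W)^{-1} W is null
W_tri = np.tril(rng.random((N, N)), k=-1)
W_tri /= np.maximum(W_tri.sum(axis=1, keepdims=True), 1e-12)  # row-normalize (first row stays 0)
H_tri = np.linalg.solve(np.eye(N) - phi * W_tri, W_tri)
print(np.trace(H_tri))   # 0 up to round-off

# symmetric NN-style W (multilateral): the trace is strictly positive, hence LS is biased
W_sym = W_tri + W_tri.T
W_sym /= W_sym.sum(axis=1, keepdims=True)
H_sym = np.linalg.solve(np.eye(N) - phi * W_sym, W_sym)
print(np.trace(H_sym))
```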
As regards the convergence of the LS estimator, multiplying Eq. (17) by $1/N$, under regularity conditions (stability and existence of the second-order moments) one has that $\mathrm{plim}_{N\to\infty}\,Z'Z/N=\Sigma_z$, the covariance matrix of $\mathbf{z}_{(s)}$, whereas
$$\mathrm{plim}_{N\to\infty}\,Z'\mathbf{e}/N=E\big[\mathbf{z}_{(s)}e_{(s)}\big]=\mathrm{tr}(H)\,\sigma_e^2,$$
which is 0 if $W$ is triangular. Finally, under regularity conditions ($\Sigma_z$ positive definite, $|\phi|<1$ and $\mathbf{x}_s$ stationary) one has
$$\sqrt{N}\,\big(\hat{\boldsymbol\delta}_N-\boldsymbol\delta\big)\ \xrightarrow{\ D\ }\ N\big(\mathbf{0},\,\Sigma_z^{-1}\sigma_e^2\big),$$
where the convergence is in distribution (D), see Yao and Brockwell (2006).
For multilateral SAR models ($W$ non-triangular), LS is generally biased and inconsistent, and alternative estimators must be sought. The ML method may be suitable if the sample size $N$ is moderate and the observations have a Gaussian distribution. This estimator maximizes the log-likelihood function
$$\ell(\alpha,\phi,\boldsymbol\beta,\sigma)\;\propto\;-N\log(\sigma)+\log\det\big(I_N-\phi W\big)-\mathbf{e}'\mathbf{e}/2\sigma_e^2,\qquad \mathbf{e}(\alpha,\phi,\boldsymbol\beta)=\big(I_N-\phi W\big)\mathbf{y}-\big(\alpha\mathbf{1}_N+X\boldsymbol\beta\big), \tag{19}$$
which requires $\det(\cdot)>0$, i.e. the stability condition $\phi\in[0,1)$ (LeSage and Pace, 2009, p. 63). This requirement is not necessary in the LS method; moreover, numerical issues may arise in the optimization of Eq. (19) when computing $\log\det(\cdot)=\sum_{i=1}^{N}\log(1-\phi\lambda_i)$, with $\lambda_i$ the eigenvalues of $W$. This technique cannot be applied to triangular matrices, because $\lambda_i=0$ for all $i$.
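For reference, a minimal sketch of the objective (19) with $\sigma_e$ concentrated out (the function name and the use of a dense log-determinant via slogdet are our own choices; this is not the implementation of the LeSage and Pace toolbox, which exploits sparse eigenvalue computations).

```python
import numpy as np

def sar_profile_loglik(phi, alpha, beta, y, W, X=None):
    """Profile log-likelihood of Eq. (19), up to an additive constant.

    Given phi and the regression part, the ML value of sigma_e^2 is e'e/N,
    so only the log-determinant and the residual sum of squares remain."""
    N = len(y)
    Xb = X @ beta if X is not None else 0.0
    e = (np.eye(N) - phi * W) @ y - alpha - Xb
    sig2 = e @ e / N
    sign, logdet = np.linalg.slogdet(np.eye(N) - phi * W)
    if sign <= 0:
        return -np.inf                   # outside the stability region
    return logdet - 0.5 * N * np.log(sig2)
```

In practice this profile function would be maximized over $\phi$ (and the regression coefficients) with a generic numerical optimizer.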
Another method which is suitable for parameter estimation with a multilateral $W$ is the generalized method of moments (GM, Kelejian and Prucha, 1999); it arises by applying the instrumental variables (IV) method to the LS estimator (17). The first step is the definition of valid IVs: on the basis of the predictor decomposition $\hat{\mathbf{y}}=\sum_k\phi^k W^k X\boldsymbol\beta+\alpha$, they may be given by $M=[X,\,WX,\,W^2X]$. This provides the projection matrix $P=M(M'M)^{-1}M'$ and the regressors $V=PZ$, which lead to the IV estimator $\tilde{\boldsymbol\delta}_N=(V'Z)^{-1}V'\mathbf{y}$. This solution is also of GM type, as it minimizes the functional $Q_N(\boldsymbol\delta)=\mathbf{m}'S^{-1}\mathbf{m}$, with the empirical moments $\mathbf{m}=M'\mathbf{e}/N$ and $S=M'\mathbf{e}\mathbf{e}'M/N$. In the following, we check the effectiveness of the ML and GM estimators with simulation experiments.
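A sketch of this IV/GM-type estimator in NumPy (our own minimal version; following the text, the instruments are $X$, $WX$, $W^2X$, to which we add the constant column so that the intercept in $Z$ is also instrumented).

```python
import numpy as np

def sar_iv(y, W, X):
    """Spatial two-stage least squares / GM-type estimator sketched in the text:
    regressors Z = [1, Wy, X], instruments M = [1, X, WX, W^2 X]."""
    N = len(y)
    Z = np.column_stack([np.ones(N), W @ y, X])
    M = np.column_stack([np.ones(N), X, W @ X, W @ (W @ X)])
    P = M @ np.linalg.solve(M.T @ M, M.T)        # projection onto the instrument space
    V = P @ Z
    delta = np.linalg.solve(V.T @ Z, V.T @ y)    # [alpha, phi, beta]
    return delta
```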

3.2. Forecasting Functions

In forecasting, we have $L$ out-of-sample units with unknown values $\mathbf{y}_L=\mathbf{0}$, whose coordinates and regressors $\mathbf{x}_l=[1,i_l,j_l,x_{1l}\dots x_{ml}]$ are known for all $l=1,2\dots L$. Their locations $(i_l,j_l)$ may be inside or outside the observed region; in both cases they are placed at the end of the data matrix $X_{N+L}$. If the data are ordered in a certain direction (e.g. north-south, with $i_{(s)}\ge i_{(s+1)}$), and all the $L$ units are outside the observed region, in the same direction, then the weight matrix is nearly block-diagonal: $W_{N+L}=\mathrm{diag}[W_N,W_L]$, where $W_L$ overlaps $W_N$ for the contiguity of the $L$ units with the in-sample ones. In the other cases, it has a more complex structure, with triangular sub-matrices under the unilateral constraint:
$$W_{N+L}=\begin{bmatrix} W_{NN} & O_{NL}\\[2pt] W_{LN} & W_{LL}\end{bmatrix}.$$
In any case, the estimation of the parameters $\boldsymbol\delta$ is performed with the in-sample data $\mathbf{y}_N$ and the matrix $W_{NN}$; the other blocks will be used in forecasting.
The forecasting function of $\mathbf{y}_L$ depends on the SAR representation used for $\mathbf{y}$; Kelejian and Prucha (2007) also considered models with autocorrelated errors $\mathbf{u}$. In the reduced (MA) form (13c), the in-sample fitted values and the out-of-sample forecasts are jointly computed as
$$\tilde{\mathbf{y}}_L=E\big[\mathbf{y}_L\,\big|\,X_{N+L},W_{N+L}\big],\qquad \tilde{\mathbf{y}}_{N+L}=\big(I_{N+L}-\phi\,W_{N+L}\big)^{-1}\big(\alpha\mathbf{1}_{N+L}+X_{N+L}\boldsymbol\beta\big)=\big[\tilde{\mathbf{y}}_N',\,\tilde{\mathbf{y}}_L'\big]', \tag{20}$$
where the contiguity matrix may have $q$ entries per row (the spatial lags) and may use non-uniform (IDW) weights. The matrix solution (20) is nearly automatic, but in the absence of exogenous variables it provides constant forecasts.
The second predictor comes from the structural (AR) representation (13b); it is not automatic and must be managed sequentially, row by row, as in Eq. (13a):
$$\hat{\mathbf{y}}_L=E\big[\mathbf{y}_L\,\big|\,\mathbf{y}_N,X_{N+L},W_{N+L}\big],\qquad \hat y_{N+l}=\alpha+\phi\,\mathbf{w}_{N+l}'\hat{\mathbf{y}}_{N+l-1}+\mathbf{x}_{N+l}'\boldsymbol\beta,\qquad l=1,2\dots L,$$
$$\hat{\mathbf{y}}_{N+l-1}=\big[\,y_1,y_2,\dots,y_N,\ \hat y_{N+1}\dots\hat y_{N+l-1},\ 0,\dots,0\,\big]', \tag{21}$$
where $\mathbf{w}_l',\mathbf{x}_l'$ are the $l$-th rows of the matrices $W_{N+L},X_{N+L}$. Note that, for non-triangular $W$, Eq. (21) involves missing values in the running vector $\hat{\mathbf{y}}_{N+l-1}$. If these values are provided by Eq. (20), then the vector of observations and forecasts at the $l$-th step becomes $\hat{\mathbf{y}}_{N+l-1}=[\,y_1\dots y_N,\ \hat y_{N+1}\dots\hat y_{N+l-1},\ \tilde y_{N+l}\dots\tilde y_{N+L}\,]'$ and, at the end ($l=L$), it will only contain the improved forecasts (21).
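A sketch of the sequential predictor (21) follows (our own Python illustration, assuming the out-of-sample units are placed at the end of the ordering; for a non-triangular $W_{N+L}$ the trailing zeros in the running vector should first be replaced by the reduced-form forecasts (20), a step omitted here).

```python
import numpy as np

def sar_forecast_seq(y_N, W_full, X_full, alpha, phi, beta):
    """Sequential (AR) forecasts of Eq. (21) for L out-of-sample units placed
    at the end of the ordering; W_full and X_full have N+L rows, y_N holds the
    N in-sample values.  With a triangular W_full each forecast only uses
    observed values and previously computed forecasts."""
    N = len(y_N)
    L = W_full.shape[0] - N
    y_run = np.concatenate([y_N, np.zeros(L)])    # running vector of data and forecasts
    for l in range(N, N + L):
        y_run[l] = alpha + phi * (W_full[l] @ y_run) + X_full[l] @ beta
    return y_run[N:]
```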
LeSage and Pace (2004) and Goulard et al. (2017) also discussed solutions based on the best linear unbiased predictor (BLUP) of Goldberger. This approach arises from the conditional mean and variance of Gaussian random vectors:
$$E\big[\mathbf{y}_L\,\big|\,\mathbf{y}_N\big]=\boldsymbol\mu_L+\Sigma_{LN}\Sigma_{NN}^{-1}\big(\mathbf{y}_N-\boldsymbol\mu_N\big),\qquad \hat{\tilde{\mathbf{y}}}_L=\tilde{\mathbf{y}}_L+\Sigma_{LN}\Sigma_{NN}^{-1}\big(\mathbf{y}_N-\tilde{\mathbf{y}}_N\big), \tag{22}$$
where $\tilde{\mathbf{y}}_L$ is a subvector of the predictor in Eq. (20), and $\Sigma_{LN},\Sigma_{NN}$ come from the partition of the joint covariance matrix of $\mathbf{y}_{N+L}$, given by
$$\Sigma_{N+L}=\big(I_{N+L}-\phi\,W_{N+L}\big)^{-1}\big(I_{N+L}-\phi\,W_{N+L}\big)^{-1\prime}\sigma_e^2=\begin{bmatrix}\Sigma_{NN} & \Sigma_{NL}\\[2pt] \Sigma_{LN} & \Sigma_{LL}\end{bmatrix}. \tag{23}$$
The solution (22)-(23) requires a double matrix inversion, first $(I_{N+L}-\phi W_{N+L})^{-1}$ and then $\Sigma_{NN}^{-1}$; in the presence of a large $N$, this may introduce approximation errors. To reduce numerical instability, one may partition the simpler array $\Omega_{N+L}=\Sigma_{N+L}^{-1}$ and exploit the algebraic relationship $\Sigma_{LN}\Sigma_{NN}^{-1}=-\Omega_{LL}^{-1}\Omega_{LN}$ (see LeSage and Pace, 2008). The latter involves a single matrix inversion, $\Omega_{LL}^{-1}$, of a submatrix of lower size $L\ll N$. Finally, the conditional variance $V[\mathbf{y}_L|\mathbf{y}_N]$ of the BLUP estimator may provide the dispersion of the forecasts (22),
$$V\big[\hat{\tilde{\mathbf{y}}}_L\big]=\Sigma_{LL}-\Sigma_{LN}\Sigma_{NN}^{-1}\Sigma_{NL};$$
this matrix requires an estimate of the variance $\sigma_e^2$, which can be obtained from the in-sample residuals $\hat{\mathbf{e}}_N=\mathbf{y}_N-\tilde{\mathbf{y}}_N$ of Eq. (20).
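A direct (non-optimized) sketch of Eqs. (20) and (22)-(23) follows (our own NumPy illustration: the full $(N+L)$-dimensional system is inverted explicitly, and the $\Omega$-based shortcut discussed above is omitted for brevity).

```python
import numpy as np

def sar_blup(y_N, W_full, X_full, alpha, phi, beta, sigma2=1.0):
    """Reduced-form forecasts (20) and their BLUP correction (22)-(23)."""
    NL = W_full.shape[0]
    N = len(y_N)
    A = np.eye(NL) - phi * W_full
    mu = np.linalg.solve(A, alpha + X_full @ beta)   # joint mean, Eq. (20)
    Ainv = np.linalg.inv(A)
    Sigma = Ainv @ Ainv.T * sigma2                    # joint covariance, Eq. (23)
    S_LN = Sigma[N:, :N]
    S_NN = Sigma[:N, :N]
    correction = S_LN @ np.linalg.solve(S_NN, y_N - mu[:N])
    return mu[N:] + correction                        # corrected forecasts, Eq. (22)
```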
As a final remark, notice that all the formulae displayed so far can be extended to unconstrained SAR($q$) models, by replacing the impulse-response matrix with one of order $q$. For the predictors (20),(21) this yields the following dispersion
$$V(\tilde{\mathbf{y}})=\Big(I_{N+L}-\sum_{k=1}^{q}\phi_k W_k\Big)^{-1}\Big(I_{N+L}-\sum_{k=1}^{q}\phi_k W_k\Big)^{-1\prime}\sigma_e^2,$$
where the k-th lag contiguity matrices W k have size N + L .

3.3. Simulations and Applications

To compare the various estimators, we perform simulation experiments with the SAR models (13)-(16) on a random grid of $N=150$ points, uniformly distributed in the unit square, $[i_s,j_s]\sim\mathrm{IU}(0,1)^2$, as in Figure 6c. We generate realizations of the process $\{y_s\}$ with independent inputs $x_s\sim\mathrm{IU}(0,2)$ (Uniform) and $e_s\sim\mathrm{IN}(0,1)$ (Normal); parameters $\alpha=1$, $\phi=0.65,0.95$, $\beta=1$; contiguity matrices $W$ with $q=1,3$ lags, with north NN and simple NN links. Subsequently, $M=500$ replications are fitted with the LS, ML and GM estimators, and the relative biases, relative RMSEs and p-values of the Normality test are computed for the coefficients $\hat\alpha,\hat\phi,\hat\beta$ and then averaged. Note that the relativization (e.g. $\mathrm{bias}(\hat\phi_N)/|\phi_0|$) allows one to sum the individual statistics so as to directly compare the various methods.
The results are reported in Table 4, where "variable grid" means that the centroids change at each replication and "variable weights" indicates that IDW are used in the matrices $W$. The Durbin model includes a lagged term on the exogenous part, $\beta_1\mathbf{x}+\beta_2 W\mathbf{x}$, with coefficients $[\beta_1,\beta_2]=[1,\,0.65]$. The software used for the ML and GM estimates is the Matlab package of LeSage and Pace (2009); it fits the Durbin model only with the ML method. The main conclusions from Table 4 are that the ML and GM methods do not improve on the LS estimates in the case of unidirectional (triangular) $W$ matrices. However, they significantly outperform LS in the multidirectional (e.g. simple NN) case; in particular, the ML method is the best. These results, however, are not homogeneous as regards the unbiasedness and efficiency of the estimates, where the GM estimator may locally have a smaller bias.
Forecasting. Unlike estimation, the prediction evaluation of SAR methods cannot be based on pre-specified models and contiguity matrices, but must refer to data generated from autonomous sources. As in the lattice case, we considered the random surfaces (11), sampled at random points $i_s\sim\mathrm{IU}[1,n]$, $j_s\sim\mathrm{IU}[1,m]$, with $s=1,2\dots N$; however, the results are too favourable to unilateral SAR models. We then turn to the real USGS (2017) data in Figure 5, with $n=340$, $m=455$; we sample $N=150$ cells and withhold $L=30$ outside cells for forecasting, see Figure 7a. The fitted model is the SAR (15) with $q=3$ and $\mathbf{x}_s=[i_s,j_s]$, with the forecast functions (20)-(22); the unilateral model is estimated with LS and the multilateral one with ML.
As a final application, we consider original point data concerning the measurement of the stable isotopes of oxygen ($\delta^{18}$O) and hydrogen ($\delta^{2}$H) in ground water. Mapping isoscapes is useful for the physical monitoring of the hydrological cycle and for archaeological and forensic investigations regarding the paths of people's movements. We consider the datasets of Gautam et al. (2020), recorded in South Korea in 2010, with $N=130$, and of Moreiras Reynaga et al. (2021), recorded in Mexico in 2007, with $N=234$. We estimate the SAR model (15) with $q=3$ for $y_s=\delta^{18}\mathrm{O},\,\delta^{2}\mathrm{H}$ and $\mathbf{x}_s=$ [latitude, longitude], to forecast about 20% of the observations; the results are in Tab. 6 and Fig. 8. They confirm the better performance of unilateral SAR models with the AR algorithm (21), especially when the data have significant autocorrelation (i.e. large $|\phi|$) and marked spatial trends (significant $\beta_{1,2}$). The reduction of the MAPE statistics in Tab. 6 ranges from 18% in Korea to 38% in Mexico.
Table 6. Parameter estimates (and T-statistics) and MAPE forecast statistics of the SAR model (15) with $q=3$, applied to the water-isotope data of South Korea, $\delta^{2}$H (first block), and Mexico, $\delta^{18}$O (second block).
Model | Method | $\hat\alpha$ | $\hat\phi$ | $\hat\beta_1$ | $\hat\beta_2$ | $\hat\sigma_e$ | MAPE (20) | MAPE (21) | MAPE (22)
Unilateral | LS (17) | 123.7 | 0.337 | -0.557 | -2.51 | 4.28 | 0.069 | 0.065 | 0.065
(T-stat., $R^2$) | " | (1.70) | (4.72) | (-1.04) | (-3.37) | 0.478 | . | . | .
Multilat. | ML (19) | 97.9 | 0.622 | -0.376 | -1.98 | 3.53 | 0.080 | 0.079 | 0.079
(T-stat., $R^2$) | " | (1.69) | (9.46) | (-0.86) | (-3.41) | 0.645 | . | . | .
Unilateral | LS (17) | 4.30 | 0.636 | 0.089 | 0.093 | 1.28 | 0.377 | 0.369 | 0.370
(T-stat., $R^2$) | " | (1.69) | (8.17) | (3.19) | (2.72) | 0.336 | . | . | .
Multilat. | ML (19) | -0.85 | 0.479 | 0.053 | 0.101 | 1.26 | 0.600 | 0.597 | 0.588
(T-stat., $R^2$) | " | (-0.36) | (7.41) | (1.88) | (2.89) | 0.351 | . | . | .
Figure 8. Water-isotope data and SAR forecasts (22): red = out-of-sample data, green = forecasts with $W$ triangular, black = forecasts with $W$ multilateral. Panels: a) South Korea sample locations; b) latitudinal view of the hydrogen isotope; c) Mexico sample locations; d) longitudinal view of the oxygen isotope.

4. Conclusions

In this paper we have compared unilateral and multilateral SAR models in forecasting spatial data of both lattice and polygonal type. SAR systems are natural extensions of classical autoregressions; their difference lies in the treatment of the spatial dependence: while unilateral models choose a single direction in space, as in time series, multilateral models consider multiple directions. These approaches lead to different contiguity matrices, triangular and sparse, usually computed with the nearest-neighbor approach. While triangular SAR models can be consistently estimated with least squares, multilateral SAR models require maximum likelihood or method-of-moments estimators; in the latter cases, the numerical complexity increases with the dimension of the contiguity matrices. Instead, LS is not sensitive to the size of the lattice or to the number of spatial units involved, and can manage even large-scale systems, see Eqs. (9) and (17). Simulation experiments on small and medium scale systems have shown that ML and GM are suitable for multilateral models, but LS is preferable for the unilateral ones, especially in the presence of unstable systems ($|\phi|\ge 1$).
As in time series, unilateral models with data ordered in a given direction yield recursive (sequential) calculations, which allow simple and unconditional forecasting functions. Instead, multilateral models involve forward values, which may be pre-estimated only with the reduced form (20). In the applications we have seen that the out-of-sample forecasting of multilateral models benefits from the BLUP improvement (22), but the performance of the unilateral models with (21) remains better, see Tabs. 5 and 6. As regards the practical use of SAR point forecasts, we mention the possibility of using them as alternatives to non-parametric smoothers, such as kriging and kernels, which are particularly sensitive to border effects. Further, in order to overcome the limits of unidirectionality, one can estimate unilateral models in various directions (e.g. NS, SN, WE, EW) and then combine their forecasts. This solution is suitable for points inside the perimeter of the investigated area, but it can also be applied to external points by using diagonal paths (e.g. NE).
These are open topics for further research; other important areas of application of triangular contiguity matrices are space-time (ST) data, in particular STAR and dynamic panel-data models (see Grillenzoni, 2004, and Copiello and Grillenzoni, 2021). In the latter reference it is shown how the unidirectional specification improves the performance even of robust (M-type) estimators, designed for data with outliers. Finally, the unilateral specification is also suitable for testing Granger causality between spatial processes (e.g. Herrera et al., 2012), because it is related to diffusion effects that follow specific directions in space.

Appendix A. Contiguity Matrices of SAR Lattice Models

We show the derivation of the vector representation of a SAR($p,q,r$) model for lattice data, where $p$ = spatial lag, $q$ = number of units and $r$ = number of parameters. Consider the array $X$ in Fig. A1, of size $n=4$, $m=5$, and the deterministic triangular scheme SAR(1,3,1): $x_{ij}=x_{i-1,j}+x_{i-1,j-1}+x_{i,j-1}$, $i,j=2\dots 4,5$. Let $\mathbf{x}=\mathrm{vec}_j(X)$ be the column vectorization of length $N=20$; then, subject to fixed border conditions (first row and column), the vector representation becomes $\mathbf{x}=W_x\mathbf{x}$, where $W_x$ is the contiguity matrix in Fig. A1. In general, $W_x$ has $m$ square blocks of size $n$ on the main diagonal and $m-1$ blocks on the lower sub-diagonal, with central rows having $q$ non-zero elements.
As regards the corresponding SAR(1,3,1) process $y_{ij}=\alpha+\phi\,(y_{i-1,j}+y_{i-1,j-1}+y_{i,j-1})/3+e_{ij}$, with border conditions $y_{00}=0$, i.e. $y_{i1}=y_{1j}=\alpha+e_{ij}$, its vector representation becomes $\mathbf{y}=\alpha\mathbf{1}_N+\phi\,\tfrac{1}{3}W_y\,\mathbf{y}+\mathbf{e}$, where the matrix $W_y$ is obtained by setting to zero all the diagonal entries of $W_x$ in Figure A1. If $r=q$, the matrix $W_y$ can be decomposed into $r$ triangular matrices, which provide $\mathbf{y}=\alpha+\sum_{k=1}^{r}\phi_k W_k\mathbf{y}+\mathbf{e}$. Analogously, Figure A2 shows the contiguity matrix $W_x$ of the rook model SAR(1,4,1); the corresponding $W_y$ is obtained by setting to zero all the diagonal entries.
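The same construction can be sketched in code (our own helper, using the column vectorization $l=j\cdot n+i$ and a weight of 1/4 for each existing neighbour, i.e. the matrix $W_y$ of the rook model with null diagonal; the triangular case only changes the list of neighbour offsets).

```python
import numpy as np

def rook_W(n, m):
    """Rook contiguity matrix W_y (null diagonal) of model (4b) under the
    column vectorization y = vec_j(Y): cell (i, j) maps to l = j*n + i and
    its four neighbours (i-1,j), (i+1,j), (i,j-1), (i,j+1) get weight 1/4;
    neighbours falling outside the lattice are dropped (zero border values)."""
    N = n * m
    W = np.zeros((N, N))
    for j in range(m):
        for i in range(n):
            l = j * n + i
            for di, dj in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
                ii, jj = i + di, j + dj
                if 0 <= ii < n and 0 <= jj < m:
                    W[l, jj * n + ii] = 1.0 / 4.0
    return W

W = rook_W(4, 5)   # the 20 x 20 array sketched in Figure A2 (without the diagonal 1s)
print(W.shape)
```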
Figure A1. Vector representation of the triangular scheme $x_{ij}=(x_{i-1,j}+x_{i-1,j-1}+x_{i,j-1})$ on a lattice of size 4×5, with fixed border conditions.
Figure A2. Vector representation of the rook model $x_{ij}=(x_{i-1,j}+x_{i+1,j}+x_{i,j-1}+x_{i,j+1})$ on a lattice of size 4×5, with fixed border conditions.

Appendix B. More Results on Surface Forecasting

We provide additional results for the experiment of Figure 4 and Table 2, which deals with forecasting a rough random surface (11) with the models (4a,b). Figure A3 shows the forecasts of the vector algorithms (10b,c); as one can see, the results of the AR method (10b) are similar to those of the lattice function (10a) in Figure 4c,d. Instead, the forecasts of the reduced (MA) form (10c) are nearly constant in both models, owing to the absence of exogenous variables, see Figure A3c,d.
Figure A3. Forecasts of the surface in Figure 4b provided by the models (4a,b) and the predictors (10b,c).
Table A1 provides the average values, over 10 replications of the surface (11), of the parameter estimates of the models (4a,b) and of their MAPE statistics. The results agree with those of Table 2, which regards a single replication; in particular, the forecasts of the triangular model (in lattice and vector form) are better than those of the rook scheme. The ML estimates are necessary for the multilateral model, whereas the unilateral one may also rely on the fast LS. In summary, the best methods, also from the graphical viewpoint, are the unilateral models with the AR predictors.
Table A1. Average values, over 10 replications of the surface (11), of parameter estimates (and T-statistics) and forecast statistics of the models (4a,b) with predictors (10a-c).
Model | Method | $\bar\alpha$ | $\bar\phi$ | $\bar R^2$ | MAPE
Triang. (10a) | LS (2) | 0.118 (4.87) | 0.879 (37.0) | 0.641 | 0.116
Rook (10a) | ML (19) | 0.231 (17.3) | 0.766 (58.2) | 0.738 | 0.183
AR-Triang. (10b) | LS (9) | 0.107 (4.78) | 0.891 (40.6) | 0.668 | 0.117
AR-Rook (10b) | ML (19) | " | " | " | 0.217
MA-Triang. (10c) | LS (9) | " | " | " | 0.231
MA-Rook (10c) | ML (19) | " | " | " | 0.584

References

  1. Anselin, L. Spatial Econometrics. In A Companion to Theoretical Econometrics; Baltagi, B.H., Ed.; Blackwell Publishing: Oxford, 2003; pp. 310-330.
  2. Awang, N.; Shitan, M. Estimating the parameters of the second order spatial unilateral autoregressive model. International Journal of Statistical Sciences 2006, 5, 37-58.
  3. Bao, Y.; Liu, X.; Yang, L. Indirect inference estimation for spatial autoregressions. Econometrics 2020, 8, 34.
  4. Barbiero, T.; Grillenzoni, C. Planning a novel regional methane network: Demand forecasting and economic evaluation. Energy Conversion and Management: X 2022, 16, 100294.
  5. Basu, S.; Reinsel, G.C. Properties of the spatial unilateral first-order ARMA model. Advances in Applied Probability 1993, 25, 631-648.
  6. Bhattacharyya, B.B.; Richardson, G.D.; Franklin, L.A. Asymptotic inference for near unit roots in spatial autoregression. Annals of Statistics 1997, 28, 173-179.
  7. Bustos, O.; Ojeda, S.; Vallejos, R. Spatial ARMA models and its applications to image filtering. Brazilian Journal of Probability and Statistics 2009, 23, 141-165.
  8. Copiello, S.; Grillenzoni, C. Robust space-time modeling of solar photovoltaic deployment. Energy Reports 2021, 7, 657-676.
  9. Fingleton, B. Spatial autoregression. Geographical Analysis 2009, 41, 385-391.
  10. Gautam, M.K.; Song, B.-Y.; Shin, W.-J.; Bong, Y.-S.; Lee, K.-S. Datasets for spatial variation of O and H isotopes in waters and hair across South Korea. Data in Brief 2020, 30, 105666.
  11. Goulard, M.; Laurent, T.; Thomas-Agnan, C. About predictions in spatial autoregressive models: optimal and almost optimal strategies. Spatial Economic Analysis 2017, 12, 304-325.
  12. Granger, C.W.J. Useful conclusions from surprising results. Journal of Econometrics 2012, 169, 142-146.
  13. Grillenzoni, C. Adaptive spatio-temporal models for satellite ecological data. Journal of Agricultural, Biological and Environmental Statistics 2004, 9, 158-180.
  14. Grillenzoni, C. Statistics for image sharpening. Statistica Neerlandica 2008, 62, 173-192.
  15. Herrera, M.; Ruiz, M.; Mur, J. Some examples of the use of a new test for spatial causality. Estadística Española 2012, 54, 53-70.
  16. Jarque, C.M.; Bera, A.K. A test for normality of observations and regression residuals. International Statistical Review 1987, 55, 163-172.
  17. Kelejian, H.H.; Prucha, I.R. A generalized moments estimator for the autoregressive parameter in a spatial model. International Economic Review 1999, 40, 509-533.
  18. Kelejian, H.H.; Prucha, I.R. The relative efficiencies of various predictors in spatial econometric models containing spatial lags. Regional Science and Urban Economics 2007, 37, 363-374.
  19. Lee, L.F. Consistency and efficiency of least squares estimation for mixed regressive, spatial autoregressive models. Econometric Theory 2002, 18, 252-277.
  20. Lee, L.F. Best spatial two-stage least squares estimators for a spatial autoregressive model with autoregressive disturbances. Econometric Reviews 2003, 22, 307-335.
  21. LeSage, J.P.; Pace, R.K. Models for spatially dependent missing data. Journal of Real Estate Finance and Economics 2004, 29, 233-254.
  22. LeSage, J.P.; Pace, R.K. Spatial econometric models, prediction. In Encyclopedia of Geographical Information Science; Shekhar, S., Xiong, H., Eds.; Springer: New York, 2008; pp. 1095-1100.
  23. LeSage, J.P.; Pace, R.K. Introduction to Spatial Econometrics; CRC Press, Taylor & Francis Group, 2009.
  24. Mojiri, A.; Waghei, Y.; Nili-Sani, H.R.; Mohtashami Borzadaran, H.R. The stationary regions for the parameter space of unilateral second-order spatial AR model. Random Operators and Stochastic Equations 2018, 26, 185-191.
  25. Mojiri, A.; Waghei, Y.; Nili-Sani, H.R.; Mohtashami Borzadaran, H.R. Non-stationary spatial autoregressive modeling for the prediction of lattice data. Communications in Statistics - Simulation and Computation 2021, 100, 1-13.
  26. Moreiras Reynaga, D.K.; Millaire, J.-F.; Chávez Balderas, X.; Román Berrelleza, J.A.; López Luján, L.; Longstaffe, F.J. Building Mexican isoscapes: Oxygen and hydrogen isotope data of meteoric water sampled across Mexico. Data in Brief 2021, 36, 107084.
  27. Sheffield, S. What is a random surface? Proc. Int. Cong. Math. 2022; International Mathematical Union, EMS Press, 2022.
  28. Sjodin, B. Generate Random Surfaces in COMSOL Multiphysics(R), 2017. Available at: https://www.comsol.
  29. Tjøstheim, D. Statistical spatial series modelling II: Some further results in unilateral processes. Advances in Applied Probability 1983, 15, 562-684.
  30. USGS. U.S. Geological Survey, Digital Elevation Models (DEMs), 2017. Available at: https://www.sciencebase.
  31. Yao, Q.; Brockwell, P.J. Gaussian maximum likelihood estimation for ARMA models II: spatial processes. Bernoulli 2006, 13, 403-429.
Figure 1. Examples of lattice (a,b) and polygonal (c,d) data, with unilateral (a,c) and multilateral (b,d) spatial dependence schemes, with respect to their nearest neighbors.
Figure 2. Contiguity matrices W of the models (4a,b) on a lattice of size n = 4, m = 7. Red dots have value 1 and correspond to border values, while the blue entries are 1/3 in Panel (a) and 1/4 in Panel (b); see Appendix A.
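For readers who wish to reproduce the contiguity structures in Figure 2, the following sketch builds both W matrices for an n × m lattice. It is a minimal illustration, not the paper's code: it assumes row-wise (lexicographic) ordering of the sites and places a unit self-weight on border cells as a simple stand-in for the border treatment of Appendix A; the function name lattice_W and these conventions are ours.

```python
import numpy as np

def lattice_W(n, m, scheme="triangular"):
    """Build an (n*m) x (n*m) contiguity matrix for an n x m lattice,
    with sites ordered row-wise: (i, j) -> k = i*m + j.
    'triangular': interior sites weight their W, NW and N neighbours by 1/3,
    giving a lower-triangular W (unilateral scheme, model (4a)).
    'rook': interior sites weight their N, S, W and E neighbours by 1/4
    (multilateral scheme, model (4b)).
    Border sites receive a unit self-weight here, an assumption standing in
    for the border values of Appendix A."""
    full = 3 if scheme == "triangular" else 4
    W = np.zeros((n * m, n * m))
    for i in range(n):
        for j in range(m):
            k = i * m + j
            if scheme == "triangular":
                nbrs = [(i, j - 1), (i - 1, j - 1), (i - 1, j)]          # W, NW, N
            else:
                nbrs = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]  # N, S, W, E
            nbrs = [(a, b) for a, b in nbrs if 0 <= a < n and 0 <= b < m]
            if len(nbrs) == full:            # interior site (blue entries)
                for a, b in nbrs:
                    W[k, a * m + b] = 1.0 / full
            else:                            # border site (red entries, value 1)
                W[k, k] = 1.0
    return W

W_tri = lattice_W(4, 7, "triangular")   # lower-triangular pattern, cf. Panel (a)
W_rook = lattice_W(4, 7, "rook")        # sparse banded pattern, cf. Panel (b)
```

Plotting the nonzero entries of W_tri and W_rook (e.g. with matplotlib's spy) gives sparsity patterns of the kind shown in Figure 2.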
Figure 3. Distribution of LS and ML estimates in 200 replications of the SAR models (4a,b) with α0 = 1, ϕ0 = 1 (green), and mean values of α̂, ϕ̂ (red): Panels (a,b) LS (9) of model (4a); Panels (c,d) ML (19) of model (4b).
Figure 4. Panels (a,b): Simulation of the random surface (11) of size n = m = 32. Panels (c,d): Out-of-sample forecasts of Y(:,26:32) with models (4a,b) and predictor (10a).
Figure 5. Application of SAR models (4a,b) with quadratic trend to the USGS (2017) data: out-of-sample forecasts of the image portion Y(:,401:455).
Figure 7. Graphical results of a single replication of the estimates in Table 5: (a) random sampling of the USGS surface (points to forecast are in red); (b) longitudinal display of data and forecasts (21) (unilateral = green, multilateral = black); (c) contiguity matrix of the unilateral model with q = 3; (d) contiguity matrix of the multilateral model.
Table 1. Performance of LS (9) and ML (19) estimators applied to triangular and rook SAR models (4a,b) with n = m = 32: R-bias = relative bias, R-RMSE = relative root mean squared error, N-test = p-value of the normality test. Bold type marks the better method (LS or ML) within each model class (Triang. and Rook).
Method        LS Triang.          LS Rook             ML Triang.          ML Rook
Stat.         α̂        ϕ̂        α̂        ϕ̂        α̂        ϕ̂        α̂        ϕ̂        (α0 = 1)

ϕ0 = 0.5
R-bias        0.015    0.013     0.279    0.325     0.295    0.324     0.014    0.014
R-RMSE        0.078    0.078     0.289    0.334     0.302    0.329     0.062    0.066
N-test        >0.5     0.415     0.456    0.267     0.319    0.065     0.221    0.138

ϕ0 = 0.75
R-bias        0.021    0.007     0.330    0.134     0.372    0.148     0.023    0.008
R-RMSE        0.088    0.032     0.339    0.136     0.384    0.153     0.074    0.028
N-test        >0.5     >0.5      0.413    0.135     >0.5     >0.5      0.192    0.052

ϕ0 = 1
R-bias        0.007    0.0003    0.004    0.0001    0.049    0.003     0.023    0.0002
R-RMSE        0.053    0.003     0.054    0.0004    0.067    0.004     0.043    0.0002
N-test        >0.5     >0.5      0.412    0.119     0.126    0.001     0.092    0.001

ϕ0 = 1.025
R-bias        0.0057   0.0001    0.003    0.0000    0.510    0.024     2.270    0.023
R-RMSE        0.047    0.002     0.035    0.0001    0.512    0.024     2.286    0.024
N-test        >0.5     >0.5      >0.5     0.001     >0.5     0.001     0.450    0.001
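As a rough guide to how the entries of Table 1 can be obtained, the sketch below computes the three summary statistics from a vector of replicated estimates. We read R-bias as |mean(θ̂) − θ0|/|θ0|, R-RMSE as the root mean squared error divided by |θ0|, and N-test as the p-value of the Jarque–Bera normality test of reference 16; these readings, and the function name replication_summary, are our assumptions rather than the paper's code.

```python
import numpy as np
from scipy import stats

def replication_summary(estimates, theta0):
    """Monte Carlo summaries in the spirit of Table 1, for one parameter.
    estimates: array of estimates over the replications; theta0: true value."""
    est = np.asarray(estimates, dtype=float)
    r_bias = abs(est.mean() - theta0) / abs(theta0)               # relative bias
    r_rmse = np.sqrt(np.mean((est - theta0) ** 2)) / abs(theta0)  # relative RMSE
    n_test = stats.jarque_bera(est)[1]                            # normality p-value
    return r_bias, r_rmse, n_test

# Example: 200 synthetic replications of an estimate of phi with phi0 = 0.5
rng = np.random.default_rng(0)
phi_hat = rng.normal(loc=0.5, scale=0.04, size=200)
print(replication_summary(phi_hat, 0.5))
```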
Table 2. Parameter estimates (with T-statistics in parentheses) of SAR models (4a,b) on the surface Y(:,1:25) in Figure 4b, and forecast statistics on the data Y(:,26:32).
Model              α̂               ϕ̂1              ϕ̂2              ϕ̂3              R²      MAPE
Triang. (4a) LS    0.096 (4.45)     0.893 (40.3)     .                .                0.686   0.143
Triang. (4a) ML    0.247 (15.2)     0.739 (45.1)     .                .                0.665   0.125
SAR(1,3,3) LS      0.094 (4.42)     0.448 (13.3)     0.097 (2.69)     0.352 (10.1)     0.701   0.140
Rook (4b) LS       –0.008 (–0.39)   1.008 (50.6)     .                .                0.788   0.864
Rook (4b) ML       0.198 (16.3)     0.795 (64.9)     .                .                0.753   0.224
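The MAPE reported in Tables 2, 3 and 5 is the mean absolute percentage error of the out-of-sample forecasts. A minimal sketch of the standard definition follows; we express it as a fraction (no ×100 factor), which matches the order of magnitude of the reported values, and any trimming or weighting used in the paper is not reproduced here.

```python
import numpy as np

def mape(y_obs, y_fcst):
    """Mean absolute percentage error, as a fraction rather than a percentage.
    y_obs, y_fcst: observed and forecast values over the out-of-sample cells,
    e.g. the columns Y(:,26:32) of Figure 4 (Y[:, 25:32] with 0-based indexing)."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_fcst = np.asarray(y_fcst, dtype=float)
    return float(np.mean(np.abs((y_obs - y_fcst) / y_obs)))
```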
Table 3. Estimates of SAR models with bivariate polynomials applied to the USGS (2017) image: Y(:,1:400) in-sample data, Y(:,401:455) out-of-sample.
Model          α̂        ϕ̂        β̂1       β̂2       β̂3       β̂4       β̂5       σ̂e²      MAPE
Triang. (4a)   8.50     0.961    –0.024   –0.011   11e-5    43e-7    –88e-7   338.4    0.848
T-stat., R²    25.56    104.9    –10.58   –5.47    22.55    0.735    –2.07    0.913    .
Rook (4b)      –6.12    1.026    0.007    0.016    –58e-6   17e-6    –17e-6   88.15    1.362
T-stat., R²    –35.82   214.0    6.29     15.87    –22.80   5.50     –7.82    0.977    .
Table 4. Results of the simulation experiments on the SAR systems (13)-(15) with spatial lags q = 1–3 and a single ϕ. The statistics are average values (over α̂, ϕ̂, β̂) of the relative biases, relative RMSEs and p-values of the N-test. Bold characters indicate the best results.
W                North NN                        Simple NN
Estimator        LS        ML        GM          LS        ML        GM

ϕ = 0.65, q = 1
Ave. R-bias      0.0205    0.1338    0.0235      0.1363    0.0287    0.0216
Ave. R-RMSE      0.1336    0.2402    0.1831      0.1887    0.1343    0.1581
Ave. N-test      0.3218    0.2983    0.1675      0.1561    0.3401    0.3337

ϕ = 0.65, q = 1 (variable grid)
Ave. R-bias      0.0077    0.1208    0.0098      0.1232    0.0400    0.0075
Ave. R-RMSE      0.1396    0.2445    0.1924      0.1873    0.1430    0.1609
Ave. N-test      0.3274    0.3554    0.2291      0.1637    0.2499    0.1626

ϕ = 0.65, q = 3
Ave. R-bias      0.0340    0.1330    0.0217      0.1324    0.0259    0.0376
Ave. R-RMSE      0.1586    0.2416    0.2365      0.1998    0.1390    0.2035
Ave. N-test      0.2442    0.2612    0.0936      0.1811    0.2559    0.0960

ϕ = 0.65, q = 3 (IDW weights)
Ave. R-bias      0.0298    0.1289    0.0228      0.1487    0.0208    0.0343
Ave. R-RMSE      0.1518    0.2360    0.2228      0.2088    0.1326    0.1963
Ave. N-test      0.2530    0.2561    0.1673      0.1805    0.2900    0.1736

ϕ = 0.95, q = 3
Ave. R-bias      0.0312    0.0980    0.0248      0.0581    0.0109    0.0164
Ave. R-RMSE      0.1475    0.2514    0.3190      0.1208    0.1086    0.1648
Ave. N-test      0.1516    0.1678    0.0010      0.2611    0.3033    0.1601

ϕ = 0.65, q = 3 (Durbin model)
Ave. R-bias      0.0663    0.3405    .           0.3333    0.0704    .
Ave. R-RMSE      0.2849    0.4659    .           0.4141    0.2764    .
Ave. N-test      0.2584    0.3891    .           0.1286    0.1313    .
Table 5. Average values, over 100 replications, of parameter estimates (with T-statistics) and MAPE forecast statistics of SAR model (15) on random samples of the USGS (2017) data.
Model           Method     ᾱ        ϕ̄        β̄1       β̄2       σ̄e       MAPE (20)   MAPE (21)   MAPE (22)
Unilateral      LS (17)    68.4     0.543    0.012    –0.018   54.3     0.948       0.815       0.830
(T̄-stat., R̄²)              (3.0)    (5.1)    (0.21)   (–0.31)  0.213    .           .           .
Multilat.       ML (19)    76.5     0.492    0.005    –0.009   50.2     0.996       0.951       0.899
(T̄-stat., R̄²)              (4.2)    (6.3)    (0.06)   (–0.19)  0.328    .           .           .
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.