Preprint
Article

A U-Statistic for Testing Lack of Dependence in Functional Partially Linear Regression Model


A peer-reviewed article of this preprint also exists. This version is not peer-reviewed.

Submitted: 18 July 2024
Posted: 18 July 2024

Abstract
The functional partially linear regression model contains a functional linear part and a non-parametric part; testing the linear relationship between the response and the functional predictor is of fundamental importance in this model. For the case in which the functional data cannot be approximated by a few principal components, we develop an order-two U-statistic based on a pseudo estimate of the unknown non-parametric component. Under some regularity conditions, we use the martingale central limit theorem to prove that the proposed test statistic is asymptotically normal. Simulation studies and a real data application are used to assess the finite sample performance of the proposed test procedure.
Keywords: 
Subject: Computer Science and Mathematics  -   Probability and Statistics

1. Introduction

In the past few decades, functional data analysis has been widely developed and applied in various fields such as medicine, biology, economics, environmetrics, and chemistry (see [1,2,3,4,5]). An important model in functional data analysis is the partial functional linear model, which includes a parametric linear part and a functional linear part. To make the relationships between variables more flexible, the parametric linear part is often replaced by a non-parametric part. The resulting model is known as the functional partially linear model and has been studied in [6,7,8]. The functional partially linear regression model is defined as follows:
$$Y = g(U) + \int_0^1 \alpha(s)X(s)\,ds + \varepsilon, \qquad (1)$$
where $Y$ is the response variable and $X(\cdot)$ is a functional predictor with mean $\mu_0(\cdot)$ and covariance operator $C$. The slope function $\alpha(\cdot)$ is an unknown function, $g(\cdot)$ is a general continuous function defined on a compact support $\Omega$, and $\varepsilon$ is a random error with mean zero and finite variance $\sigma^2$, independent of the predictor $X(\cdot)$. When $g(\cdot)$ is constant, model (1) reduces to a functional linear model; see [9,10,11]. When $g(\cdot)$ is a parametric linear part, model (1) becomes a partially functional linear model, which has been studied in [12,13,14].
Hypothesis testing plays a critical role in statistical inference. For testing the linear relationship between the response and the functional predictor in the functional linear model, functional principal component analysis (FPCA) of the functional predictor $X(\cdot)$ is the main idea used in the literature to construct test statistics; see [9,10,15]. Taking into account the flexibility of non-parametric functions, [6] introduced the functional partially linear regression model. [7] and [8] constructed estimators of the slope function based on splines and FPCA, respectively, with the non-parametric component estimated by B-splines in both papers. When the predictors are measured with additive error, [16] studied estimators of the slope function and the non-parametric function via FPCA and kernel smoothing techniques. [17] established estimators of the slope function, the non-parametric component, and the mean of the response variable when responses are missing at random.
However, testing the relationship between the response variable and the functional predictor in the functional partially linear regression model has rarely been considered so far. In this paper, the following hypothesis test for model (1) is considered:
$$H_0: \alpha(t) = \alpha_0(t) \quad \text{vs.} \quad H_1: \alpha(t) \neq \alpha_0(t), \qquad (2)$$
where $\alpha_0(t)$ is a prespecified function; without loss of generality, let $\alpha_0(t) = 0$. For testing (2), [18] constructed a chi-square test using FPCA for the case in which the functional data can be approximated by a few principal components. For the case in which the functional data cannot be approximated by a few principal components, only a few studies exist in functional data analysis. [19] constructed the FLUTE test, based on an order-four U-statistic, for testing in the functional linear model; it can be computationally very costly. To save computation time, [20] developed a faster test using an order-two U-statistic. Motivated by this, we propose a non-parametric U-statistic that combines functional data analysis with the classical kernel method for testing (2).
The paper is organized as follows. In Section 2, we construct a new test procedure for the functional partially linear regression model. The theoretical properties of the proposed test statistic under some regularity conditions are established in Section 3. A simulation study is conducted in Section 4 to assess the finite sample performance of the proposed test procedure. Section 5 reports the testing results for spectrometric data. All proofs of the main theoretical results are presented in the Appendix.

2. Test statistic

Suppose that $Y$ and $U$ are real-valued random variables and that $\{X(\cdot)\}$ is a stochastic process with sample paths in $L^2[0,1]$, the set of all square integrable functions defined on $[0,1]$. Let $\langle\cdot,\cdot\rangle$ and $\|\cdot\|$ represent the inner product and the norm in $L^2[0,1]$, respectively. $\{(Y_i, X_i(\cdot), U_i), i = 1,2,\dots,n\}$ is a random sample from model (1):
$$Y_i = \int_0^1 \alpha(s)X_i(s)\,ds + g(U_i) + \varepsilon_i, \quad i = 1,2,\dots,n. \qquad (3)$$
For any given $\alpha(t) \in L^2[0,1]$, moving the term $\langle X_i, \alpha\rangle$ to the left-hand side gives
$$Y_i - \langle X_i, \alpha\rangle = g(U_i) + \varepsilon_i, \quad i = 1,2,\dots,n. \qquad (4)$$
Model (4) is then a classical non-parametric model, and a pseudo kernel estimate of the non-parametric function can be constructed by the Nadaraya-Watson regression method as follows:
$$\hat g(U_i) = \frac{\sum_{j\neq i} K_h(U_j - U_i)\big(Y_j - \langle X_j, \alpha\rangle\big)}{\sum_{k\neq i} K_h(U_k - U_i)}, \quad i = 1,2,\dots,n, \qquad (5)$$
where $K_h(\cdot) = K(\cdot/h)/h$, with $K(\cdot)$ a preselected kernel function and $h$ a bandwidth whose optimal value can be determined by data-driven methods such as cross-validation. Note that the non-parametric function at $U_i$ is estimated without the $i$th sample (a leave-one-out estimate).
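As a concrete numerical illustration (not the authors' code), the leave-one-out estimate (5) can be sketched as follows; the Epanechnikov kernel is chosen here only because it is the kernel used later in the paper's simulations, and the helper name is our own:

```python
import numpy as np

def loo_nw_estimate(U, R, h):
    """Leave-one-out Nadaraya-Watson estimate of g at each U_i.

    U : (n,) scalar covariates; R : (n,) partial residuals Y_i - <X_i, alpha>;
    h : bandwidth.  Uses the Epanechnikov kernel K(u) = 0.75 (1 - u^2) on [-1, 1].
    """
    D = (U[:, None] - U[None, :]) / h          # D[i, j] = (U_i - U_j) / h
    K = 0.75 * np.maximum(1.0 - D**2, 0.0) / h  # K_h(U_i - U_j)
    np.fill_diagonal(K, 0.0)                    # leave the i-th sample out
    return K @ R / K.sum(axis=1)                # g_hat(U_i) as in (5)
```

For a linear $g$ and a symmetric kernel on a uniform design, the interior estimates reproduce $g$ almost exactly, which gives a quick sanity check of an implementation.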
Let
$$\mathbf{W}_i = (W_{i1},\dots,W_{i(i-1)},W_{i(i+1)},\dots,W_{in})^T,$$
$$\langle \mathbf{X}_i,\alpha\rangle = (\langle X_1,\alpha\rangle,\dots,\langle X_{i-1},\alpha\rangle,\langle X_{i+1},\alpha\rangle,\dots,\langle X_n,\alpha\rangle)^T,$$
$$\mathbf{Y}_i = (Y_1,\dots,Y_{i-1},Y_{i+1},\dots,Y_n)^T, \quad \mathbf{X}_i = (X_1,\dots,X_{i-1},X_{i+1},\dots,X_n)^T,$$
where $W_{ij} = K_h(U_j - U_i)/\sum_{k\neq i}K_h(U_k - U_i)$. The pseudo estimate (5) of the non-parametric function can then be rewritten in matrix form as
$$\check g(U_i) = \mathbf{W}_i^T\big(\mathbf{Y}_i - \langle \mathbf{X}_i,\alpha\rangle\big). \qquad (6)$$
If we replace $g(U_i)$ by $\check g(U_i)$ in model (3), we have
$$\check Y_i = \langle \check X_i, \alpha\rangle + \varepsilon_i, \qquad (7)$$
where $\check X_i(t) = X_i(t) - \mathbf{W}_i^T\mathbf{X}_i(t)$ and $\check Y_i = Y_i - \mathbf{W}_i^T\mathbf{Y}_i$. Denote $\mu_{it} \triangleq \mu(U_i,t) = E[X_1(t)|U_i]$, where "$\triangleq$" stands for "defined as". Then $\hat\mu_{it} = \mathbf{W}_i^T\mathbf{X}_i(t)$ is an estimator of the conditional expectation $\mu_{it}$ for any $t\in[0,1]$.
For an arbitrary orthonormal basis $\{\phi_j\}_{j=1}^\infty$ of $L^2[0,1]$, the predictor $X(\cdot)$ and the slope function $\alpha(\cdot)$ have the following expansions, where $p$ denotes the number of retained basis functions:
$$X_i(t) = \sum_{j=1}^{p}\xi_{ij}\phi_j(t) + \sum_{j=p+1}^{\infty}\xi_{ij}\phi_j(t), \quad \alpha(t) = \sum_{j=1}^{p}\beta_j\phi_j(t) + \sum_{j=p+1}^{\infty}\beta_j\phi_j(t), \qquad (8)$$
where $\xi_{ij} = \langle X_i, \phi_j\rangle$ and $\beta_j = \langle \alpha, \phi_j\rangle$. Let $\check\xi_{ij} = \langle \check X_i, \phi_j\rangle$; model (7) can then be rewritten as
$$\check Y_i = \sum_{j=1}^{\infty}\check\xi_{ij}\beta_j + \varepsilon_i = \sum_{j=1}^{p}\check\xi_{ij}\beta_j + \sum_{j=p+1}^{\infty}\check\xi_{ij}\beta_j + \varepsilon_i. \qquad (9)$$
Denote $\xi_i = (\xi_{i1}, \xi_{i2}, \dots, \xi_{ip})^T$, which has mean $\mu$ and covariance matrix $\Sigma$. Let
$$\boldsymbol{\xi}_i = (\xi_1, \dots, \xi_{i-1}, \xi_{i+1}, \dots, \xi_n)^T, \quad \check\xi_i = (\check\xi_{i1}, \check\xi_{i2}, \dots, \check\xi_{ip})^T,$$
$$\check{\boldsymbol{\xi}}_i = (\check\xi_1, \dots, \check\xi_{i-1}, \check\xi_{i+1}, \dots, \check\xi_n)^T, \quad \beta = (\beta_1, \beta_2, \dots, \beta_p)^T.$$
For model (3), we define the approximation error as
$$e_i = \int_0^1 \alpha(s)X_i(s)\,ds - \sum_{k=1}^{p}\xi_{ik}\beta_k. \qquad (10)$$
To analyze the effect of the approximation error, we impose the following condition on the functional predictors and the regression function.
(C1) The functional predictors $\{X_i, i = 1,\dots,n\}$ and the regression function $\alpha(t)$ satisfy:
(i) the functional predictors $\{X_i, i = 1,\dots,n\}$ belong to a Sobolev ellipsoid of order two, i.e., there exists a universal constant $C$ such that $\sum_{j=1}^{\infty}\xi_{ij}^2 j^4 \leq C^2$ for $i = 1,\dots,n$;
(ii) the regression function satisfies $\int \alpha^2(t)\,dt \leq K$, where $K$ is a constant.
Using the Cauchy-Schwarz inequality, we have
$$e_i^2 = \Big(\sum_{j=p+1}^{\infty}\xi_{ij}\beta_j\Big)^2 \leq \sum_{j=p+1}^{\infty}\xi_{ij}^2 j^4 \sum_{j=p+1}^{\infty} j^{-4}\beta_j^2 \leq C^2 K p^{-4}.$$
The approximation error can therefore be ignored as $p \to \infty$, and model (7) becomes
$$\check Y_i = \sum_{j=1}^{p}\check\xi_{ij}\beta_j + \varepsilon_i,$$
which is a high-dimensional partial linear model. When testing (2), the quantity
$$\Big\| E\big[(X_i - E[X_i|U_i])(Y_i - E[Y_i|U_i])\big] \Big\|^2 \qquad (11)$$
can be used as an effective measure of the distance between $\alpha(\cdot)$ and 0. Motivated by [21], we construct the following test statistic by estimating (11):
$$T_{np} = \frac{n^2}{(n-2)^2}\binom{n}{2}^{-1}\sum_{i=2}^{n}\sum_{j=1}^{i-1}\Delta_{ij}(\check X)\,\Delta_{ij}(\check Y), \qquad (12)$$
where
$$\Delta_{ij}(\check X) = \big\langle \check X_i - \bar{\check X}, \check X_j - \bar{\check X}\big\rangle + \frac{\big\langle \check X_i - \check X_j, \check X_i - \check X_j\big\rangle}{2n}, \quad \Delta_{ij}(\check Y) = \big(\check Y_i - \bar{\check Y}\big)\big(\check Y_j - \bar{\check Y}\big) + \frac{\big(\check Y_i - \check Y_j\big)^2}{2n},$$
and $\bar{\check X}(t)$ and $\bar{\check Y}$ are the sample means of $\check X_i(t)$ and $\check Y_i$. By some calculations, we can get $E[\Delta_{ij}(\check X)] = 0$ and $E[\Delta_{ij}(\check Y)] = 0$. Under the null hypothesis, $T_{np}$ serves as a measure of the distance between $\alpha(\cdot)$ and 0; large values of $T_{np}$ favor the alternative hypothesis and lead to rejection of the null hypothesis.
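Assuming the curves are observed on a common grid, so that $L^2[0,1]$ inner products reduce to quadrature sums, the statistic $T_{np}$ can be sketched in code. This is an illustrative sketch only: the function name, the discretization, and in particular the leading normalizing constant are our reading of the display above and should be checked against the published version:

```python
import numpy as np
from itertools import combinations

def t_np(Xc, Yc, grid):
    """Order-two U-statistic T_np from discretized adjusted curves.

    Xc : (n, m) array, rows are the adjusted curves X-check_i on `grid`;
    Yc : (n,) adjusted responses Y-check_i; grid : (m,) observation points.
    L2 inner products are approximated by quadrature on the grid.
    """
    n = len(Yc)
    w = np.gradient(grid)                 # quadrature weights on the grid
    G = (Xc * w) @ Xc.T                   # Gram matrix G[i, j] = <Xc_i, Xc_j>
    Xbar_ip = G.mean(axis=1)              # <Xc_i, Xc-bar>
    bb = G.mean()                         # <Xc-bar, Xc-bar>
    Ybar = Yc.mean()
    total = 0.0
    for i, j in combinations(range(n), 2):
        d_x = (G[i, j] - Xbar_ip[i] - Xbar_ip[j] + bb
               + (G[i, i] - 2 * G[i, j] + G[j, j]) / (2 * n))
        d_y = (Yc[i] - Ybar) * (Yc[j] - Ybar) + (Yc[i] - Yc[j])**2 / (2 * n)
        total += d_x * d_y
    coef = n**2 / ((n - 2)**2 * (n * (n - 1) / 2))   # assumed normalization
    return coef * total
```

Since $T_{np}$ is a U-statistic over unordered sample pairs, its value is invariant to permutations of the sample, which is a useful implementation check.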

3. Asymptotic theory

To derive the asymptotic properties of the proposed test, we first impose the following conditions based on [19] and [21]. Denote
$$\mu(U_i) = (\mu_1(U_i), \mu_2(U_i), \dots, \mu_p(U_i))^T \triangleq E[\xi_1|U_i],$$
$$\Sigma^*(U_i) = E[\xi_i\xi_i^T|U_i] - \mu(U_i)\mu^T(U_i), \quad \Sigma^* = \Sigma - E[\mu(U_1)\mu^T(U_1)].$$
A condition on the dimension of the matrix $\Sigma^*$ is:
(C2) As $n\to\infty$, $p\to\infty$; $\Sigma^* > 0$, and $\mathrm{tr}(\Sigma^{*4}) = o\big(\mathrm{tr}^2(\Sigma^{*2})\big)$.
(C3) There exists an $m$-dimensional random vector $Z_i = (Z_{i1},\dots,Z_{im})^T$, for a constant $m \geq p$, such that $\xi_i = E[\xi_i|U_i] + \Gamma(U_i)Z_i$. Here $Z_i$ satisfies $E(Z_i) = 0$, $\mathrm{var}(Z_i) = I_m$, and for any $U_i$, $\Gamma(U_i)$ is a $p\times m$ matrix such that $\Gamma(U_i)\Gamma^T(U_i) = \Sigma^*(U_i)$. Each random vector $\{Z_i, i = 1,\dots,n\}$ is assumed to have finite 4th moments with $E(Z_{ij}^4) = 3 + \Delta$ for some constant $\Delta$. Furthermore, we assume
$$E\big[Z_{ij_1}^{l_1}Z_{ij_2}^{l_2}\cdots Z_{ij_d}^{l_d}\big] = E\big[Z_{ij_1}^{l_1}\big]E\big[Z_{ij_2}^{l_2}\big]\cdots E\big[Z_{ij_d}^{l_d}\big]$$
for $\sum_{k=1}^d l_k \leq 4$ and $j_1 \neq j_2 \neq \cdots \neq j_d$, where $d$ is a positive integer.
(C4) $\beta^T\Sigma^*\beta = o(h^2)$, and $\beta^T\Sigma^{*3}\beta = o\big(\mathrm{tr}(\Sigma^{*2})/n\big)$.
(C5) The error term satisfies $E[\varepsilon^4] < +\infty$.
(C6) The random variable $U$ has a compact support $\Omega$, and its density function $f(\cdot)$ has a continuous second-order derivative and is bounded away from 0 on its support. The kernel function $K(\cdot)$ is a symmetric probability density function with compact support and is Lipschitz continuous.
(C7) $E(\xi_1|U_1)$ and $g(\cdot)$ are Lipschitz continuous and have continuous second-order derivatives.
(C8) The sample size $n$ and the smoothing parameter $h$ satisfy $\lim_{n\to\infty} h = 0$, $\lim_{n\to\infty} nh = \infty$, and $\lim_{n\to\infty} nh^4 = 0$.
(C9) The truncation number $p$ and the sample size $n$ satisfy $p = o(n^2h^2)$.
Condition (C2) has been adopted in many studies of high-dimensional data (see [21,22,23]). Condition (C3) resembles a factor model. To analyze the local power, we also impose condition (C4) on the coefficient vector $\beta$. In fact, (C4) serves as the local alternative, since it measures the distance between $\beta$ and 0; this type of local alternative can also be found in [21]. (C5) is a typical assumption on the error term $\varepsilon$. Conditions (C6)-(C8) are standard in non-parametric smoothing. (C9) is a technical condition needed to derive the theorems.
We will show the asymptotic theory of our proposed test statistic under the null hypothesis and the local alternative (C4) in the following two theorems.
Theorem 1. 
Suppose that conditions (C1) and (C3-C9) hold. Then we have
$$(i)\ E(T_{np}) = \|C^*(\alpha)\|^2 + o\big(\sqrt{\mathrm{tr}(\Sigma^{*2})}/n\big);$$
$$(ii)\ T_{np} - \|C^*(\alpha)\|^2 = \binom{n}{2}^{-1}\sum_{i=2}^{n}\sum_{j=1}^{i-1}\langle X_i - \mu_{it}, X_j - \mu_{jt}\rangle\varepsilon_i\varepsilon_j + o_p\big(\sqrt{\mathrm{tr}(\Sigma^{*2})}/n\big),$$
where $C^*(\alpha) = E[\langle X_i - \mu_{it}, \alpha\rangle(X_i - \mu_{it})]$, and $C^*$ can be regarded as the covariance operator of the random function $X_i - \mu_{it}$.
Theorem 2. 
Suppose that conditions (C1-C3) and (C5-C9) hold. Then under the null hypothesis or the local alternative (C4), we have
$$\frac{n\big(T_{np} - \|C^*(\alpha)\|^2\big)}{\sigma^2\sqrt{2\,\mathrm{tr}(\Sigma^{*2})}} \xrightarrow{\ D\ } N(0,1), \quad \text{as } n\to\infty,$$
where $\xrightarrow{\ D\ }$ represents convergence in distribution.
Theorem 2 indicates that, under the local alternative (C4), the proposed test statistic has the following asymptotic local power at nominal level $\alpha$:
$$\Psi(\beta) = \Phi\left(-z_\alpha + \frac{n\|C^*(\alpha)\|^2}{\sigma^2\sqrt{2\,\mathrm{tr}(\Sigma^{*2})}}\right),$$
where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution and $z_\alpha$ denotes its upper-$\alpha$ quantile. Let $\eta(\alpha) = \|C^*(\alpha)\|^2/\big(\sigma^2\sqrt{2\,\mathrm{tr}(\Sigma^{*2})}\big)$; this quantity can be viewed as a signal-to-noise ratio. When $\eta(\alpha) = o(1/n)$, the power converges to $\alpha$, while the power converges to 1 if $\eta(\alpha)$ is of larger order than $1/n$. This implies that the proposed test is consistent. The performance of the power will be shown by simulation in Section 4.
By Theorem 2, the proposed test rejects $H_0$ at significance level $\alpha$ if
$$\frac{nT_{np}}{\hat\sigma^2\sqrt{2\,\widehat{\mathrm{tr}(\Sigma^{*2})}}} \geq z_\alpha,$$
where $\hat\sigma^2$ and $\widehat{\mathrm{tr}(\Sigma^{*2})}$ are consistent estimators of $\sigma^2$ and $\mathrm{tr}(\Sigma^{*2})$, respectively. We use a method similar to that of [24] to estimate the trace. That is,
$$\widehat{\mathrm{tr}(\Sigma^{*2})} = Y_{1n} - 2Y_{2n} + Y_{3n},$$
where
$$Y_{1n} = \frac{1}{A_n^2}\sum_{i\neq j}\langle \check X_i, \check X_j\rangle^2, \quad Y_{2n} = \frac{1}{A_n^3}\sum_{i\neq j\neq k}\langle \check X_i, \check X_j\rangle\langle \check X_j, \check X_k\rangle, \quad Y_{3n} = \frac{1}{A_n^4}\sum_{i\neq j\neq k\neq l}\langle \check X_i, \check X_j\rangle\langle \check X_k, \check X_l\rangle,$$
with $A_n^m = n!/(n-m)!$. The simple estimator $\hat\sigma^2 = (n-1)^{-1}\sum_{i=1}^n(\check Y_i - \bar{\check Y})^2$ is used, which is consistent under the null hypothesis.
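A direct, deliberately naive implementation of this trace estimator (the helper name is ours) takes the Gram matrix of the adjusted curves as input; the $O(n^4)$ loops follow the formulas literally and are written for clarity, not speed:

```python
import numpy as np
from itertools import permutations

def trace_sigma2_hat(G):
    """Estimate tr(Sigma*^2) from the Gram matrix G[i, j] = <Xc_i, Xc_j>.

    Implements Y_1n - 2 Y_2n + Y_3n, where the sums run over tuples of
    distinct indices and A_n^m = n! / (n - m)!.
    """
    n = G.shape[0]
    a2 = n * (n - 1)
    a3 = a2 * (n - 2)
    a4 = a3 * (n - 3)
    y1 = sum(G[i, j] ** 2 for i, j in permutations(range(n), 2)) / a2
    y2 = sum(G[i, j] * G[j, k] for i, j, k in permutations(range(n), 3)) / a3
    y3 = sum(G[i, j] * G[k, l] for i, j, k, l in permutations(range(n), 4)) / a4
    return y1 - 2 * y2 + y3
```

In practice the three sums are usually vectorized, since the naive version is only feasible for small $n$.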

4. Simulation study

In this section, some simulation studies are conducted to evaluate the finite sample performance of the proposed test. We generate 1,000 Monte Carlo samples in each simulation. For basis expansion and FPCA, we use the implementation in the R package fda.
Here we compare the proposed test $T_{np}$ with the chi-square test $T_n$ constructed by [18]. The cumulative percentage of total variance (CPV) method is used to choose the number of principal components in $T_n$. The CPV explained by the first $m$ empirical functional principal components is defined as
$$\mathrm{CPV}(m) = \frac{\sum_{i=1}^m \hat\lambda_i}{\sum_{i=1}^p \hat\lambda_i}, \qquad (13)$$
where $\{\hat\lambda_i\}_{i=1}^p$ are the estimated eigenvalues of the covariance operator. We choose the minimal $m$ for which $\mathrm{CPV}(m)$ exceeds a desired level, taken as 95% in this section. We denote by $p$ the number of basis functions used to fit the curves. The simulated data are generated from the following model:
$$Y_i = \int_0^1 \alpha(s)X_i(s)\,ds + g(U_i) + \varepsilon_i, \quad i = 1,2,\dots,n,$$
where $g(U_i) = 2U_i$ or $g(U_i) = 2 + \sin(2\pi U_i)$, and $\{U_i, i = 1,2,\dots,n\}$ are independently generated from the uniform distribution $U(0,1)$. To analyze the impact of different error distributions, the following four distributions are considered: (1) $\varepsilon_i \sim N(0,1)$; (2) $\varepsilon_i \sim t(3)/\sqrt{3}$; (3) $\varepsilon_i \sim \Gamma(1,1) - 1$; (4) $\varepsilon_i \sim (\mathrm{lnorm}(0,1) - \sqrt{e})/\sqrt{e(e-1)}$. All results for $g(U_i) = 2U_i$ are presented in the supplementary materials.
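The CPV selection rule described above is a one-liner in code; a minimal sketch (the helper name is ours):

```python
import numpy as np

def choose_m_by_cpv(eigvals, level=0.95):
    """Smallest m whose cumulative proportion of total variance reaches `level`.

    eigvals : estimated eigenvalues of the covariance operator, in
    decreasing order. Returns a 1-based number of components.
    """
    cpv = np.cumsum(eigvals) / np.sum(eigvals)
    return int(np.argmax(cpv >= level)) + 1
```

For example, with eigenvalues $(5, 3, 1, 1)$ the cumulative proportions are $0.5, 0.8, 0.9, 1.0$, so the 95% rule keeps all four components while an 80% rule keeps two.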
We next report the simulation results for two data structures of the predictor $X(t)$. Because fitting curves with only a few basis functions is not reasonable for functional data that cannot be approximated by a few principal components, we compare the performance of $T_n$ and $T_{np}$ as the number of basis functions used to fit the curves changes, and we also present the performance of the proposed test $T_{np}$ when the number of basis functions is so large that $T_n$ can no longer be computed.
1. The predictor is $X(t) = \sum_{j=1}^{50}\xi_j\phi_j(t)$, where $\xi_j$ follows a normal distribution with mean zero and variance $\lambda_j = 10((j-0.5)\pi)^{-2}$, and $\phi_j(t) = \sqrt{2}\sin((j-0.5)\pi t)$ for $j = 1,2,\dots,50$. The slope function is $\alpha(t) = c\big(\sqrt{2}\sin(\pi t/2) + 3\sqrt{2}\sin(3\pi t/2)\big)$ with $c$ varying from 0 to 0.2; $c = 0$ corresponds to the null hypothesis. The number of basis functions used to fit the curves and the sample size are taken as $p = 11, 49$ and $n = 50, 100$. Under the different error distributions, Table 1 and Table 2 report the empirical size and power of both tests for the different non-parametric functions at nominal level $\alpha = 0.05$.
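Under the stated design, the first simulation setting can be generated as follows. This is a hedged sketch (the grid size, the quadrature, and the function name are our own choices), shown for the non-parametric function $g(u) = 2 + \sin(2\pi u)$ and standard normal errors:

```python
import numpy as np

def generate_structure1(n, c, n_grid=101, J=50, rng=None):
    """Simulate (Y_i, X_i(t), U_i) for the first simulation design.

    X(t) = sum_j xi_j phi_j(t), xi_j ~ N(0, 10 ((j - 0.5) pi)^-2),
    phi_j(t) = sqrt(2) sin((j - 0.5) pi t);
    alpha(t) = c (sqrt(2) sin(pi t / 2) + 3 sqrt(2) sin(3 pi t / 2));
    g(u) = 2 + sin(2 pi u); errors are N(0, 1).
    """
    rng = np.random.default_rng(rng)
    t = np.linspace(0.0, 1.0, n_grid)
    j = np.arange(1, J + 1)
    phi = np.sqrt(2) * np.sin((j[:, None] - 0.5) * np.pi * t[None, :])  # (J, m)
    lam = 10.0 * ((j - 0.5) * np.pi) ** -2
    xi = rng.standard_normal((n, J)) * np.sqrt(lam)   # principal scores
    X = xi @ phi                                      # curves on the grid
    alpha = c * (np.sqrt(2) * np.sin(np.pi * t / 2)
                 + 3 * np.sqrt(2) * np.sin(3 * np.pi * t / 2))
    U = rng.uniform(0.0, 1.0, n)
    eps = rng.standard_normal(n)
    lin = ((X * alpha) * np.gradient(t)).sum(axis=1)  # int alpha(s) X(s) ds
    Y = lin + 2 + np.sin(2 * np.pi * U) + eps
    return Y, X, U
```

Note that $\alpha(t)$ is a combination of the first two eigenfunctions here, so the signal is concentrated on the leading principal components, which is why FPCA-based tests do well in this design.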
From Table 1 and Table 2, it can be seen that: (i) for different error distributions and different non-parametric functions, the performance of both tests is stable; (ii) because the asymptotic distribution of $T_{np}$ is derived for functional data that cannot be approximated by a few principal components, the power of the proposed test is slightly lower than that of $T_n$; (iii) as the sample size $n$ increases, the power increases, while the power changes little as $p$ increases. In fact, for this data structure of the predictor, no matter how many basis functions are selected to fit the curves, the number of principal components finally selected is very small.
2. The functional predictor is generated based on the expansion (8), where the $\phi_k$'s are Fourier basis functions on $[0,1]$ defined as $\phi_1(t) = 1$, $\phi_2(t) = \sqrt{2}\sin(2\pi t)$, $\phi_3(t) = \sqrt{2}\cos(2\pi t)$, $\phi_4(t) = \sqrt{2}\sin(4\pi t)$, $\phi_5(t) = \sqrt{2}\cos(4\pi t)$, and so on. The first $p$ of these basis functions are used to generate the predictor and the slope function. Let $X_i(t) = \sum_{j=1}^{p}\xi_{ij}\phi_j(t)$ and $\alpha(t) = \sum_{j=1}^{p}\beta_j\phi_j(t)$, where $p = 11, 49, 201, 365$ and $n = 50, 100, 200$. The coefficients of the slope function are $\{\beta_i = \|\beta\|/\sqrt{p}, i = 1,\dots,p\}$ with $\|\beta\|^2 = c\cdot 10^{-2}$ and $c$ varying from 0 to 1; $c = 0$ corresponds to the case in which $H_0$ is true. The coefficients $\xi_{ij}$ of the predictor follow the moving average model:
$$\xi_{ij} = \rho_1 Z_{ij} + \rho_2 Z_{i(j+1)} + \cdots + \rho_T Z_{i(j+T-1)},$$
where the constant $T$ controls the range of dependence between the components of the predictor $X(t)$. The innovations $\{Z_{i1}, Z_{i2}, \dots, Z_{i(p+T-1)}\}$ are independently generated from the normal distribution $N(0, I_{p+T-1})$, with $T = 10$. The $(j,k)$ entry of the covariance matrix $\Sigma$ of the coefficient vector $\xi_i$ is $\sum_{l=1}^{T-|j-k|}\rho_l\rho_{l+|j-k|}I\{|j-k| < T\}$, where $\{\rho_k, k = 1,\dots,T\}$ are independently generated from the uniform distribution on $[0,1]$.
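The moving-average scores and their implied covariance can be sketched as follows (helper names are ours; $T$ and $p$ are kept small in the check below purely for illustration):

```python
import numpy as np

def ma_scores(n, p, T=10, rng=None):
    """Draw n coefficient vectors xi_i of length p from the MA(T) model."""
    rng = np.random.default_rng(rng)
    rho = rng.uniform(0.0, 1.0, T)               # MA weights rho_1..rho_T
    Z = rng.standard_normal((n, p + T - 1))      # iid N(0, 1) innovations
    # xi_ij = sum_l rho_l Z_{i, j + l - 1}  (window of length T starting at j)
    xi = np.stack([Z[:, j:j + T] @ rho for j in range(p)], axis=1)
    return xi, rho

def ma_covariance(rho, p):
    """Implied covariance: Sigma[j, k] = sum_l rho_l rho_{l+|j-k|}, |j-k| < T."""
    T = len(rho)
    S = np.zeros((p, p))
    for j in range(p):
        for k in range(p):
            d = abs(j - k)
            if d < T:
                S[j, k] = rho[: T - d] @ rho[d:]
    return S
```

Banded dependence of width $T$ is what makes the effective dimension of this design large, so the sample covariance should approach the banded matrix above as $n$ grows.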
The Epanechnikov kernel is adopted for estimating the non-parametric part $g(\cdot)$, and we select the bandwidth by cross-validation (CV). At significance level $\alpha = 0.05$, Table 3 and Table 4 show the empirical size and power of both tests for different non-parametric functions and error distributions.
From Table 3 and Table 4, the number of basis functions used for fitting the curves has a very important impact on the tests. Specifically: (i) under different error distributions, as $p$ increases, the empirical size of the test $T_n$ far exceeds the nominal level, while our proposed test $T_{np}$ remains stable; (ii) as the sample size $n$ increases, the power increases, while the power decreases as $p$ increases. In fact, for this data structure of the predictor, the number of selected principal components is too large for the FPCA-based test statistic to perform well. Instead, the proposed test has clear advantages (see the bold numbers in Table 3 and Table 4).
Furthermore, to verify the asymptotic theory of the proposed test, Figure 1 and Figure 2 draw the null distributions and the q-q plots of $T_{np}$ for $g(u) = 2u$ and $g(u) = 2 + \sin(2\pi u)$, respectively, when $(n,p) = (200, 365)$. The null distributions are represented by dashed lines, while the solid lines are the density curves of the standard normal distribution. For different $(n,p)$, Figure 3 and Figure 4 show the empirical power functions of the proposed test under the four error distributions when the non-parametric function is a linear function and a trigonometric function, respectively. For $(n,p) = (200,201), (100,201), (200,365)$, the empirical power functions are represented by solid, dashed, and dotted lines, respectively. Figures 3 and 4 show that the power increases rapidly even when $c$ increases only slightly; as the sample size $n$ increases, the power increases, while the power decreases as $p$ increases. The proposed test is stable under different error distributions. These findings are consistent with the conclusions from Table 3 and Table 4 (i.e., $p < n$).

5. Application

This section applies the proposed test to spectrometric data, which have been described and analyzed in the literature (see [25,26]). The dataset is available at http://lib.stat.cmu.edu/datasets/tecator. There are 215 meat samples of finely chopped pure meat; for each, an absorption spectrum and the fat, protein, and moisture contents were recorded. The spectral observations are curves, denoted by $X_i(\cdot)$, corresponding to the absorbance measured on a grid of 100 wavelengths ranging from 850 nm to 1050 nm. Fat, protein, and moisture content (in percent) were measured by chemical analysis. Denote the fat content as the response variable $Y_i$, the protein content as $Z_i$, and the moisture content as $U_i$. Similar to [27], the following two models are used to describe the relationships between them:
$$Y_i = \int_{850}^{1050} \alpha(t)X_i(t)\,dt + g(Z_i) + \varepsilon_i, \qquad (14)$$
$$Y_i = \int_{850}^{1050} \alpha(t)X_i(t)\,dt + g(U_i) + \varepsilon_i. \qquad (15)$$
Here we mainly test $H_0: \alpha(t) = 0$ in models (14) and (15). The number $p$ of basis functions used for fitting the curves is chosen as 129. Figure 5 shows the estimates of the slope function $\alpha(t)$ in models (14) and (15).
The calculation results are as follows: (i) for model (14), the value of the statistic is $T_{np} = 31.186$, with p-value 0; (ii) for model (15), the value of the statistic is $T_{np} = 0.867$, with p-value 0.386.
From this we can draw the conclusions: the test for model (14) is significant, while the test for model (15) is not. This can also be seen from Figure 5: the estimate of $\alpha(t)$ in the right panel of Figure 5 is much smaller than that in the left panel.

6. Conclusion

In this paper, we constructed an order-two U-statistic for testing the linear relationship between the functional predictor and the response variable in the functional partially linear regression model. The proposed test procedure does not depend on estimating the covariance operator of the predictor function. The asymptotic distribution of the proposed test statistic is normal both under the null hypothesis and under a local alternative. Furthermore, numerical simulations show that the proposed test performs well when the functional data cannot be approximated by a few principal components. Finally, the proposed test was applied to real data to verify its feasibility.

Author Contributions

Conceptualization, F.Z. and B.Z.; methodology, B.Z.; software, F.Z.; validation, B.Z.; formal analysis, F.Z.; investigation, F.Z.; resources, F.Z.; data curation, F.Z.; writing—original draft preparation, F.Z.; writing—review and editing, B.Z.; visualization, B.Z.; supervision, B.Z.; project administration, B.Z.; funding acquisition, F.Z. and B.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Nos.12271370,

Data Availability Statement

Not applicable.

Acknowledgments

The authors are grateful to the editor, associate editor, and referees for their constructive comments and suggestions on this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

We present some lemmas needed to complete the proofs of Theorems 1 and 2. Without loss of generality, we assume that $\mu_0(t) = 0$ and $E[g(U)] = 0$ in the sequel. Let $C_n = \sqrt{\log(1/h)/(nh)} + h^2$. By the asymptotic theory of non-parametric estimation, the pseudo estimate of the non-parametric function satisfies $\sup_{u\in\Omega}|\check g(u) - g(u)| = O_p(C_n)$. Denote $D_g(U_i) = g(U_i) - \check g(U_i)$ and $D_{\mu_{it}} = \mu_{it} - \hat\mu_{it}$, for $i = 1,2,\dots,n$. Similarly to the lemmas in [21], the following lemmas can be derived.
Lemma A1. 
If (C1), (C3) and (C4) hold, then for any square matrix $M$ it can be shown that
$$(i)\ E\big[Z_1Z_1^T M Z_1Z_1^T\big] = M + M^T + \mathrm{tr}(M)I_p + \Delta\,\mathrm{diag}(M);$$
$$(ii)\ E\big[Z_1Z_2^T M Z_2Z_1^T\big] = \mathrm{tr}(M)I_p;$$
$$(iii)\ E\big(\langle X_1-\mu_{1t}, \alpha\rangle\langle X_1-\mu_{1t}, X_2-\mu_{2t}\rangle\langle X_2-\mu_{2t}, \alpha\rangle\big)^2 = o\big(\mathrm{tr}(\Sigma^{*2})\big).$$
Lemma A2. 
If (C1-C3) and (C5-C9) hold, then we have
$$(i)\ E\langle \check X_1, \check X_2\rangle^4 = o\big(n\,\mathrm{tr}^2(\Sigma^{*2})\big); \quad (ii)\ E\langle C^*(\check X_1), \check X_1\rangle^2 = o\big(n\,\mathrm{tr}^2(\Sigma^{*2})\big).$$
Lemma A3. 
If (C1-C9) hold, we can get
$$(i)\ E\langle \check X_1, \check X_1\rangle = O\big(\mathrm{tr}(\Sigma^*)\big); \quad (ii)\ E[\check Y_1^2] = O(1); \quad (iii)\ E\langle \bar{\check X}_{12}, \bar{\check X}_{12}\rangle = O\big(\mathrm{tr}(\Sigma^*)/n\big);$$
$$(iv)\ E[\bar{\check Y}_{12}^2] = O(C_n^2); \quad (v)\ E\big[\langle \bar{\check X}_{12}, \bar{\check X}_{12}\rangle \bar{\check Y}_{12}^2\big] = O\big(\mathrm{tr}(\Sigma^*)/n^2\big),$$
where $\bar{\check X}_{ij}(t)$ and $\bar{\check Y}_{ij}$ represent the sample means of $\check X(t)$ and $\check Y$ without the $i$th and $j$th samples, for $i,j = 1,2,\dots,n$. That is,
$$\bar{\check X}_{ij}(t) = \frac{1}{n-2}\sum_{k\neq i,j}\check X_k(t), \quad \bar{\check Y}_{ij} = \frac{1}{n-2}\sum_{k\neq i,j}\check Y_k.$$
Proof of Theorem 1. Write
$$\frac{n}{n-2}\Delta_{ij}(\check X) \triangleq P_{ij}^{(1)} + P_{ij}^{(2)} + P_{ij}^{(3)} + P_{ij}^{(4)}, \quad \frac{n}{n-2}\Delta_{ij}(\check Y) \triangleq L_{ij}^{(1)} + L_{ij}^{(2)} + L_{ij}^{(3)} + L_{ij}^{(4)},$$
where
$$P_{ij}^{(1)} = \Big(1-\frac{1}{n}\Big)\langle \check X_i, \check X_j\rangle, \quad P_{ij}^{(2)} = -\frac{1}{2n}\Big(\langle \check X_i, \check X_i\rangle + \langle \check X_j, \check X_j\rangle - 2E\langle \check X_1, \check X_1\rangle\Big),$$
$$P_{ij}^{(3)} = -\Big(1-\frac{2}{n}\Big)\big\langle \check X_i + \check X_j, \bar{\check X}_{ij}\big\rangle, \quad P_{ij}^{(4)} = \Big(1-\frac{2}{n}\Big)\Big(\big\langle \bar{\check X}_{ij}, \bar{\check X}_{ij}\big\rangle - \frac{E\langle \check X_1, \check X_1\rangle}{n-2}\Big),$$
$$L_{ij}^{(1)} = \Big(1-\frac{1}{n}\Big)\check Y_i\check Y_j, \quad L_{ij}^{(2)} = -\frac{1}{2n}\Big(\check Y_i^2 + \check Y_j^2 - 2E[\check Y_1^2]\Big),$$
$$L_{ij}^{(3)} = -\Big(1-\frac{2}{n}\Big)\big(\check Y_i + \check Y_j\big)\bar{\check Y}_{ij}, \quad L_{ij}^{(4)} = \Big(1-\frac{2}{n}\Big)\Big(\bar{\check Y}_{ij}^2 - \frac{E[\check Y_1^2]}{n-2}\Big).$$
Then the expectation of the test statistic $T_{np}$ is
$$E[T_{np}] = \binom{n}{2}^{-1}\sum_{j<i}\sum_{l,k=1}^{4} E\big[P_{ij}^{(l)}L_{ij}^{(k)}\big].$$
To prove conclusion (i) of Theorem 1, the terms $E[P_{ij}^{(l)}L_{ij}^{(k)}]$ need to be calculated one by one for $l,k = 1,2,3,4$. Because the calculations for the different cases of $(l,k)$ are similar, here we mainly consider the case $(l,k) = (1,1)$:
$$E\big[P_{ij}^{(1)}L_{ij}^{(1)}\big] \triangleq G_1^{(1,1)} + G_2^{(1,1)} + G_3^{(1,1)} + G_4^{(1,1)} + G_5^{(1,1)} + G_6^{(1,1)},$$
where
$$G_1^{(1,1)} = \frac{(n-1)^2}{n^2}E\big[\langle X_1-\hat\mu_{1t}, X_2-\hat\mu_{2t}\rangle D_g(U_1)D_g(U_2)\big],$$
$$G_2^{(1,1)} = \frac{(n-1)^2}{n^2}E\big[\langle X_1-\hat\mu_{1t}, \alpha\rangle\langle X_1-\hat\mu_{1t}, X_2-\hat\mu_{2t}\rangle\langle X_2-\hat\mu_{2t}, \alpha\rangle\big],$$
$$G_3^{(1,1)} = \frac{(n-1)^2}{n^2}E\big[\langle X_1-\hat\mu_{1t}, X_2-\hat\mu_{2t}\rangle\varepsilon_1\varepsilon_2\big],$$
$$G_4^{(1,1)} = \frac{2(n-1)^2}{n^2}E\big[\langle X_1-\hat\mu_{1t}, X_2-\hat\mu_{2t}\rangle\langle X_2-\hat\mu_{2t}, \alpha\rangle D_g(U_1)\big],$$
$$G_5^{(1,1)} = \frac{2(n-1)^2}{n^2}E\big[\langle X_1-\hat\mu_{1t}, X_2-\hat\mu_{2t}\rangle D_g(U_1)\varepsilon_2\big],$$
$$G_6^{(1,1)} = \frac{2(n-1)^2}{n^2}E\big[\langle X_1-\hat\mu_{1t}, X_2-\hat\mu_{2t}\rangle\langle X_1-\hat\mu_{1t}, \alpha\rangle\varepsilon_2\big].$$
We analyze the above six terms one by one. Firstly, consider the first term. Since
$$\begin{aligned} &E\big[\langle X_1-\hat\mu_{1t}, X_2-\hat\mu_{2t}\rangle D_g(U_1)D_g(U_2)\big]\\ &\quad= 2E\big[\langle D_{\mu_{1t}}, X_2-\mu_{2t}\rangle D_g(U_1)D_g(U_2)\big] + E\big[\langle D_{\mu_{1t}}, D_{\mu_{2t}}\rangle D_g(U_1)D_g(U_2)\big]\\ &\quad= O\Big(E\big[W_{12}W_{21}\big(\xi_2^T\xi_2 - \mu^T(U_2)\xi_2\big)g^2(U_1)\big]\Big) = O\big(\mathrm{tr}(\Sigma^*)/(n^2h)\big), \end{aligned}$$
it follows that $G_1^{(1,1)} = o\big(\sqrt{\mathrm{tr}(\Sigma^{*2})}/n\big)$. For the second term, we have
$$\begin{aligned} &E\big[\langle X_1-\hat\mu_{1t}, \alpha\rangle\langle X_1-\hat\mu_{1t}, X_2-\hat\mu_{2t}\rangle\langle X_2-\hat\mu_{2t}, \alpha\rangle\big]\\ &= E\big[\langle X_1-\mu_{1t}, \alpha\rangle\langle X_1-\mu_{1t}, X_2-\mu_{2t}\rangle\langle X_2-\mu_{2t}, \alpha\rangle\big] + 2E\big[\langle D_{\mu_{1t}}, \alpha\rangle\langle D_{\mu_{1t}}, D_{\mu_{2t}}\rangle\langle D_{\mu_{2t}}, \alpha\rangle\big]\\ &\quad + 2E\big[\langle X_1-\mu_{1t}, \alpha\rangle\langle D_{\mu_{1t}}, D_{\mu_{2t}}\rangle\langle D_{\mu_{2t}}, \alpha\rangle\big] + 2E\big[\langle X_1-\mu_{1t}, \alpha\rangle\langle X_2-\mu_{2t}, D_{\mu_{1t}}\rangle\langle D_{\mu_{2t}}, \alpha\rangle\big]\\ &\quad + 2E\big[\langle X_1-\mu_{1t}, \alpha\rangle\langle X_1-\mu_{1t}, D_{\mu_{2t}}\rangle\langle D_{\mu_{2t}}, \alpha\rangle\big] + 2E\big[\langle D_{\mu_{1t}}, \alpha\rangle\langle X_1-\mu_{1t}, D_{\mu_{2t}}\rangle\langle D_{\mu_{2t}}, \alpha\rangle\big]\\ &\quad + 2E\big[\langle X_1-\mu_{1t}, \alpha\rangle\langle D_{\mu_{1t}}, D_{\mu_{2t}}\rangle\langle X_2-\mu_{2t}, \alpha\rangle\big]\\ &= \|C^*(\alpha)\|^2 + O\big(\beta^T\Sigma^{*2}\beta/(nh) + \beta^T\Sigma^*\beta\,\mathrm{tr}(\Sigma^*)/n^2\big). \end{aligned}$$
Combined with (C1), (C3) and (C9), $G_2^{(1,1)} = \|C^*(\alpha)\|^2 + o\big(\sqrt{\mathrm{tr}(\Sigma^{*2})}/n\big)$ holds. Since the error term $\varepsilon_i$ has mean zero and is independent of the predictor, it is easy to see that both the third term $G_3^{(1,1)}$ and the sixth term $G_6^{(1,1)}$ are zero. For the other two cross terms $G_4^{(1,1)}$ and $G_5^{(1,1)}$, we need to show that they are of smaller order than $\sqrt{\mathrm{tr}(\Sigma^{*2})}/n$. In fact,
$$\begin{aligned} &E\big[\langle X_1-\hat\mu_{1t}, X_2-\hat\mu_{2t}\rangle\langle X_2-\hat\mu_{2t}, \alpha\rangle D_g(U_1)\big]\\ &= E\big[\langle X_1-\mu_{1t}, \alpha\rangle\langle X_1-\mu_{1t}, D_{\mu_{2t}}\rangle D_g(U_2)\big] + E\big[\langle D_{\mu_{1t}}, \alpha\rangle\langle D_{\mu_{1t}}, X_2-\mu_{2t}\rangle D_g(U_2)\big]\\ &\quad + E\big[\langle D_{\mu_{1t}}, \alpha\rangle\langle X_1-\mu_{1t}, D_{\mu_{2t}}\rangle D_g(U_2)\big] + E\big[\langle X_1-\mu_{1t}, \alpha\rangle\langle D_{\mu_{1t}}, D_{\mu_{2t}}\rangle D_g(U_2)\big]\\ &\quad + E\big[\langle D_{\mu_{1t}}, \alpha\rangle\langle D_{\mu_{1t}}, D_{\mu_{2t}}\rangle D_g(U_2)\big]\\ &= O\Big(nE\big[W_{23}^2\beta^T\big(\xi_1\xi_1^T - \xi_1\mu^T(U_1)\big)\xi_3 g(U_3)\big]\Big) = O\Big(E\big[\beta^T\Sigma^*(U_1)\mu(U_2)g(U_2)f^{-1}(U_2)\big]/(nh)\Big) = o\big(\sqrt{\mathrm{tr}(\Sigma^{*2})}/n\big). \end{aligned}$$
Finally, for $G_5^{(1,1)}$, we have
$$\begin{aligned} E\big[\langle X_1-\hat\mu_{1t}, X_2-\hat\mu_{2t}\rangle D_g(U_1)\varepsilon_2\big] &= E\big[\langle X_1-\mu_{1t}+D_{\mu_{1t}}, X_2-\mu_{2t}+D_{\mu_{2t}}\rangle\big(-W_{12}\varepsilon_2^2\big)\big]\\ &= E\big[(\xi_1-\mu(U_1))^T\xi_1 W_{21}W_{12}\sigma^2\big] + E\big[W_{12}^2\xi_2^T(\xi_2-\mu(U_1))\sigma^2\big]\\ &= O\big(\mathrm{tr}(\Sigma^*)/(n^2h)\big). \end{aligned}$$
Using (C9) and the fact that $\mathrm{tr}^2(\Sigma^*) \leq p\,\mathrm{tr}(\Sigma^{*2})$, we obtain $\mathrm{tr}(\Sigma^*)/\sqrt{\mathrm{tr}(\Sigma^{*2})} = o(nh)$, i.e., $G_5^{(1,1)} = o\big(\sqrt{\mathrm{tr}(\Sigma^{*2})}/n\big)$. It then follows that
$$E\big[P_{ij}^{(1)}L_{ij}^{(1)}\big] = \|C^*(\alpha)\|^2 + o\big(\sqrt{\mathrm{tr}(\Sigma^{*2})}/n\big).$$
For the remaining cases of $(l,k)$, refer to the calculation of $E[P_{ij}^{(1)}L_{ij}^{(1)}]$ above and the proof of Theorem 1 in [21]. This completes the proof of conclusion (i) of Theorem 1. Conclusion (ii) of Theorem 1 follows from arguments in the proof of Theorem 2 and is omitted here. □
Proof of Theorem 2. By Theorem 1, we have
$$\frac{n\big(E(T_{np}) - \|C^*(\alpha)\|^2\big)}{\sigma^2\sqrt{2\,\mathrm{tr}(\Sigma^{*2})}} = o(1),$$
so we only need to prove that
$$\frac{n\big(T_{np} - ET_{np}\big)}{\sigma^2\sqrt{2\,\mathrm{tr}(\Sigma^{*2})}} \xrightarrow{\ D\ } N(0,1). \qquad (A3)$$
Denote $T_{np}^{(k,l)} = n\binom{n}{2}^{-1}\sum_{i>j}\big(P_{ij}^{(k)}L_{ij}^{(l)} - E[P_{ij}^{(k)}L_{ij}^{(l)}]\big)$, where $k,l = 1,2,3,4$. Then we have
$$n\big(T_{np} - ET_{np}\big) = \sum_{k=1}^{4}\sum_{l=1}^{4} T_{np}^{(k,l)}.$$
In order to obtain the asymptotic behavior of the above equation, we determine the asymptotic order of each term $T_{np}^{(k,l)}$. These terms are divided into the following two groups according to how they are treated.
  • Group 1: $(k,l) = (1,1), (1,2), (1,3), (1,4), (2,1), (2,3), (3,1), (3,3), (4,1), (4,3)$.
  • Group 2: $(k,l) = (2,2), (2,4), (3,2), (3,4), (4,2), (4,4)$.
Since the methods within each group are similar, the cases $(k,l) = (1,1)$ and $(k,l) = (2,2)$ will be considered in detail, one from each group. Firstly, $T_{np}^{(1,1)}$ can be rewritten as
$$T_{np}^{(1,1)} \triangleq T_{np,1}^{(1,1)} + T_{np,2}^{(1,1)} + T_{np,3}^{(1,1)} + T_{np,4}^{(1,1)} + T_{np,5}^{(1,1)} + T_{np,6}^{(1,1)} + T_{np,7}^{(1,1)} + T_{np,8}^{(1,1)} + T_{np,9}^{(1,1)}, \qquad (A4)$$
where
$$T_{np,1}^{(1,1)} = \frac{2(n-1)}{n^2}\sum_{j<i}\Big\{\langle X_i-\hat\mu_{it}, X_j-\hat\mu_{jt}\rangle D_g(U_i)D_g(U_j) - E\big[\langle X_i-\hat\mu_{it}, X_j-\hat\mu_{jt}\rangle D_g(U_i)D_g(U_j)\big]\Big\},$$
$$T_{np,2}^{(1,1)} = \frac{2(n-1)}{n^2}\sum_{j<i}\Big\{\langle X_i-\hat\mu_{it}, X_j-\hat\mu_{jt}\rangle\langle X_j-\hat\mu_{jt}, \alpha\rangle D_g(U_i) - E\big[\langle X_i-\hat\mu_{it}, X_j-\hat\mu_{jt}\rangle\langle X_j-\hat\mu_{jt}, \alpha\rangle D_g(U_i)\big]\Big\},$$
$$T_{np,3}^{(1,1)} = \frac{2(n-1)}{n^2}\sum_{j<i}\Big\{\langle X_i-\hat\mu_{it}, X_j-\hat\mu_{jt}\rangle\langle X_i-\hat\mu_{it}, \alpha\rangle D_g(U_j) - E\big[\langle X_i-\hat\mu_{it}, X_j-\hat\mu_{jt}\rangle\langle X_i-\hat\mu_{it}, \alpha\rangle D_g(U_j)\big]\Big\},$$
$$T_{np,4}^{(1,1)} = \frac{2(n-1)}{n^2}\sum_{j<i}\Big\{\langle X_i-\hat\mu_{it}, X_j-\hat\mu_{jt}\rangle D_g(U_i)\varepsilon_j - E\big[\langle X_i-\hat\mu_{it}, X_j-\hat\mu_{jt}\rangle D_g(U_i)\varepsilon_j\big]\Big\},$$
$$T_{np,5}^{(1,1)} = \frac{2(n-1)}{n^2}\sum_{j<i}\Big\{\langle X_i-\hat\mu_{it}, X_j-\hat\mu_{jt}\rangle D_g(U_j)\varepsilon_i - E\big[\langle X_i-\hat\mu_{it}, X_j-\hat\mu_{jt}\rangle D_g(U_j)\varepsilon_i\big]\Big\},$$
$$T_{np,6}^{(1,1)} = \frac{2(n-1)}{n^2}\sum_{j<i}\langle X_i-\hat\mu_{it}, X_j-\hat\mu_{jt}\rangle\langle X_j-\hat\mu_{jt}, \alpha\rangle\varepsilon_i, \quad T_{np,7}^{(1,1)} = \frac{2(n-1)}{n^2}\sum_{j<i}\langle X_i-\hat\mu_{it}, X_j-\hat\mu_{jt}\rangle\langle X_i-\hat\mu_{it}, \alpha\rangle\varepsilon_j,$$
$$T_{np,8}^{(1,1)} = \frac{2(n-1)}{n^2}\sum_{j<i}\Big\{\langle X_i-\hat\mu_{it}, \alpha\rangle\langle X_i-\hat\mu_{it}, X_j-\hat\mu_{jt}\rangle\langle X_j-\hat\mu_{jt}, \alpha\rangle - E\big[\langle X_i-\hat\mu_{it}, \alpha\rangle\langle X_i-\hat\mu_{it}, X_j-\hat\mu_{jt}\rangle\langle X_j-\hat\mu_{jt}, \alpha\rangle\big]\Big\},$$
$$T_{np,9}^{(1,1)} = \frac{2(n-1)}{n^2}\sum_{j<i}\langle X_i-\hat\mu_{it}, X_j-\hat\mu_{jt}\rangle\varepsilon_i\varepsilon_j.$$
To prove (A3), we shall show that
$$\frac{T_{np}^{(1,1)} - E\,T_{np}^{(1,1)}}{\sigma^2\sqrt{2\operatorname{tr}(\Sigma^{*2})}} = \frac{T_{np,91}^{(1,1)}}{\sigma^2\sqrt{2\operatorname{tr}(\Sigma^{*2})}} + o_p(1),$$
where $T_{np,91}^{(1,1)} = \frac{2(n-1)}{n^2}\sum_{j<i}\langle X_i-\mu_{it}, X_j-\mu_{jt}\rangle\varepsilon_i\varepsilon_j$.
It is easy to see that the nine terms on the right-hand side of (A4) all have mean zero. To determine their asymptotic orders, it therefore suffices to bound their second moments. Since the calculations for the first eight terms are similar, we take the first term $T_{np,1}^{(1,1)}$ as an example.
$$E\big(T_{np,1}^{(1,1)}\big)^2 = \frac{4(n-1)^2}{n^4}\sum_{i=2}^{n} E\,Q_{i,1}^{(1,1)}Q_{i,1}^{(1,1)} + \frac{4(n-1)^2}{n^4}\sum_{i=2}^{n}\sum_{j\neq i} E\,Q_{i,1}^{(1,1)}Q_{j,1}^{(1,1)} + o\big(\operatorname{tr}(\Sigma^{*2})\big),$$
where
$$Q_{i,1}^{(1,1)} = \sum_{j=1}^{i-1}\langle X_i-\hat\mu_{it}, X_j-\hat\mu_{jt}\rangle D_g(U_i)D_g(U_j).$$
We now calculate $E\,Q_{i,1}^{(1,1)}Q_{i,1}^{(1,1)}$ and $E\,Q_{i,1}^{(1,1)}Q_{j,1}^{(1,1)}$ ($i\neq j$):
$$\begin{aligned}
E\,Q_{i,1}^{(1,1)}Q_{i,1}^{(1,1)} &= (i-1)\,E\big[\langle X_1-\hat\mu_{1t}, X_2-\hat\mu_{2t}\rangle^2 D_g^2(U_1)D_g^2(U_2)\big]\\
&\quad + (i-1)(i-2)\,E\big[\langle X_1-\hat\mu_{1t}, X_2-\hat\mu_{2t}\rangle\langle X_1-\hat\mu_{1t}, X_3-\hat\mu_{3t}\rangle D_g^2(U_1)D_g(U_2)D_g(U_3)\big]\\
&\triangleq (i-1)B_{11}^{(1,1)} + (i-1)(i-2)B_{12}^{(1,1)},
\end{aligned}$$
$$\begin{aligned}
E\,Q_{i,1}^{(1,1)}Q_{j,1}^{(1,1)} &= \Xi_1\,E\big[\langle X_1-\hat\mu_{1t}, X_2-\hat\mu_{2t}\rangle\langle X_1-\hat\mu_{1t}, X_3-\hat\mu_{3t}\rangle D_g^2(U_1)D_g(U_2)D_g(U_3)\big]\\
&\quad + \Xi_2\,E\big[\langle X_1-\hat\mu_{1t}, X_2-\hat\mu_{2t}\rangle\langle X_3-\hat\mu_{3t}, X_4-\hat\mu_{4t}\rangle D_g(U_1)D_g(U_2)D_g(U_3)D_g(U_4)\big]\\
&\triangleq \Xi_1 B_{12}^{(1,1)} + \Xi_2 B_{13}^{(1,1)},
\end{aligned}$$
where $\Xi_1 = (i-1)\wedge(j-1)$ and $\Xi_2 = (i-1)(j-1) - (i-1)\wedge(j-1)$.
Using the Cauchy–Schwarz inequality and Lemma A.2, we obtain
$$B_{11}^{(1,1)} = O\big(C_n^4\, n\operatorname{tr}(\Sigma^{*2})\big).$$
For $B_{12}^{(1,1)}$ and $B_{13}^{(1,1)}$,
$$\begin{aligned}
B_{12}^{(1,1)} &= E\big[\langle X_1-\hat\mu_{1t}, X_2-\hat\mu_{2t}\rangle\langle X_1-\hat\mu_{1t}, X_3-\hat\mu_{3t}\rangle D_g^2(U_1)D_g(U_2)D_g(U_3)\big]\\
&= O\Big(E\big[W_{23}\operatorname{tr}(\Sigma^*(U_1)\Sigma^*(U_3))D_g^2(U_1)D_g(U_2)D_g(U_3)\big] + E\big[W_{13}W_{21}\operatorname{tr}(\Sigma^*(U_1)\Sigma^*(U_3))D_g^2(U_1)D_g(U_2)D_g(U_3)\big]\\
&\qquad + E\big[W_{12}W_{13}\operatorname{tr}(\Sigma^*(U_1))\operatorname{tr}(\Sigma^*(U_3))D_g^2(U_1)D_g(U_2)D_g(U_3)\big] + E\big[\mu^T(U_1)\mu(U_2)\,\mu^T(U_3)\mu(U_1)\,D_g^2(U_1)D_g(U_2)D_g(U_3)\big]\Big)\\
&= O\Big(E\big[\operatorname{tr}(\Sigma^*(U_1)\Sigma^*(U_2))g^2(U_1)g(U_2)\big]/n^2 + E^2\big[\operatorname{tr}(\Sigma^*(U_1))g^2(U_1)\big]/n^3\\
&\qquad + E\big[\mu^T(U_2)\mu(U_1)\mu^T(U_1)\mu(U_2)g(U_1)\big]/n^2 + E\big[\mu^T(U_2)\mu(U_1)\mu^T(U_1)\mu(U_2)g^2(U_1)g^2(U_2)\big]/n^2\Big),\\
B_{13}^{(1,1)} &= O\Big(E\big[D_\mu^T(U_1)D_\mu(U_2)\,\big(\mu(U_3)-\hat\mu(U_3)\big)^T\big(\mu(U_4)-\hat\mu(U_4)\big)D_g(U_1)D_g(U_2)D_g(U_3)D_g(U_4)\big]\Big)\\
&= O\Big(E\big[\mu^T(U_1)\mu(U_2)\,\mu^T(U_3)\mu(U_4)\,g(U_1)g(U_2)g(U_3)g(U_4)\big]/n^2\Big).
\end{aligned}$$
Hence $T_{np,1}^{(1,1)} = o_p\big(\sqrt{\operatorname{tr}(\Sigma^{*2})}\big)$. Applying similar arguments as for $T_{np,1}^{(1,1)}$, the terms $T_{np,k}^{(1,1)}$, $k=2,\dots,8$, are all $o_p\big(\sqrt{\operatorname{tr}(\Sigma^{*2})}\big)$. For $T_{np,9}^{(1,1)}$, rewrite
$$T_{np,9}^{(1,1)} \triangleq T_{np,91}^{(1,1)} + T_{np,92}^{(1,1)} + T_{np,93}^{(1,1)} + T_{np,94}^{(1,1)},$$
where
$$\begin{aligned}
T_{np,91}^{(1,1)} &= \frac{2(n-1)}{n^2}\sum_{i=2}^n\sum_{j=1}^{i-1}\langle X_i-\mu_{it}, X_j-\mu_{jt}\rangle\varepsilon_i\varepsilon_j, &
T_{np,92}^{(1,1)} &= \frac{2(n-1)}{n^2}\sum_{i=2}^n\sum_{j=1}^{i-1}\langle X_i-\mu_{it}, D_{\mu_{jt}}\rangle\varepsilon_i\varepsilon_j,\\
T_{np,93}^{(1,1)} &= \frac{2(n-1)}{n^2}\sum_{i=2}^n\sum_{j=1}^{i-1}\langle X_j-\mu_{jt}, D_{\mu_{it}}\rangle\varepsilon_i\varepsilon_j, &
T_{np,94}^{(1,1)} &= \frac{2(n-1)}{n^2}\sum_{i=2}^n\sum_{j=1}^{i-1}\langle D_{\mu_{it}}, D_{\mu_{jt}}\rangle\varepsilon_i\varepsilon_j.
\end{aligned}$$
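The four terms above come from a bilinear expansion of the estimated-centering inner product. Writing $D_{\mu_{it}} = \mu_{it} - \hat\mu_{it}$, so that $X_i - \hat\mu_{it} = (X_i - \mu_{it}) + D_{\mu_{it}}$, one has

```latex
\langle X_i-\hat\mu_{it},\, X_j-\hat\mu_{jt}\rangle
  = \langle X_i-\mu_{it},\, X_j-\mu_{jt}\rangle
  + \langle X_i-\mu_{it},\, D_{\mu_{jt}}\rangle
  + \langle X_j-\mu_{jt},\, D_{\mu_{it}}\rangle
  + \langle D_{\mu_{it}},\, D_{\mu_{jt}}\rangle,
```

and multiplying by $\varepsilon_i\varepsilon_j$ and summing over $j<i$ yields the decomposition of $T_{np,9}^{(1,1)}$.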
Since the four terms above all have mean zero, to prove that (A5) holds it suffices to verify that the second moments of $T_{np,9k}^{(1,1)}$, $k = 2, 3, 4$, are of higher-order smallness than $\operatorname{tr}(\Sigma^{*2})$. In fact,
$$\begin{aligned}
E\big[T_{np,92}^{(1,1)}\big]^2 = E\big[T_{np,93}^{(1,1)}\big]^2 &= \frac{4(n-1)^2\sigma^4}{n^4}\sum_{i=2}^n (i-1)\,E\langle X_1-\mu_{1t}, D_{\mu_{2t}}\rangle^2\\
&= O\big(E\big[\langle X_1-\mu_{1t}, D_{\mu_{2t}}\rangle\langle X_1-\mu_{1t}, D_{\mu_{2t}}\rangle\big]\big)
= O\big(nE\big[W_{23}^2\,\xi_3^T\Sigma^*(U_1)\xi_3\big]\big) = o\big(\operatorname{tr}(\Sigma^{*2})\big),
\end{aligned}$$
$$E\big[T_{np,94}^{(1,1)}\big]^2 = O\big(E\big[\langle D_{\mu_{it}}, D_{\mu_{jt}}\rangle\langle D_{\mu_{it}}, D_{\mu_{jt}}\rangle\big]\big) = O\big(E\big[\operatorname{tr}\big(\Sigma^*(U_3)\mu(U_1)\mu^T(U_1)f^{-1}(U_3)\big)\big]\big)/(nh) + O\big(\operatorname{tr}^2(\Sigma^*)/(n^2h)\big) = o\big(\operatorname{tr}(\Sigma^{*2})\big).$$
Then equation (A5) holds. Similarly, each term in Group 2, that is, $(k,l) = (2,2), (2,4), (3,2), (3,4), (4,2), (4,4)$, admits an analogous proof, so we only consider $T_{np}^{(2,2)}$ here. By careful calculation,
$$\begin{aligned}
&E\big[\big(\langle\check X_1,\check X_1\rangle + \langle\check X_2,\check X_2\rangle - 2E\langle\check X_1,\check X_1\rangle\big)^2\big]\\
&\quad= O\big(E\big[\langle X_1-\mu_{1t}, X_1-\mu_{1t}\rangle^2\big] + E^2\big[\langle X_1-\mu_{1t}, X_1-\mu_{1t}\rangle\big] + E\big[\langle X_1-\mu_{1t}, X_1-\mu_{1t}\rangle\langle X_2-\mu_{2t}, X_2-\mu_{2t}\rangle\big]\big)\\
&\quad= O\big(E\big[(\xi_1-\mu(U_1))^T(\xi_1-\mu(U_1))\,(\xi_1-\mu(U_1))^T(\xi_1-\mu(U_1))\big] + E\big[(\xi_1-\mu(U_1))^T(\xi_1-\mu(U_1))\,(\xi_2-\mu(U_2))^T(\xi_2-\mu(U_2))\big]\\
&\qquad\quad + E^2\big[(\xi_1-\mu(U_1))^T(\xi_1-\mu(U_1))\big]\big)\\
&\quad= O\Big(E\big[2\operatorname{tr}(\Sigma^{*2}(U_1)) + \operatorname{tr}^2(\Sigma^*) + \Delta\operatorname{tr}\big(\operatorname{diag}(\Gamma^T(U_1)\Gamma(U_1))\Gamma^T(U_1)\Gamma(U_1)\big)\big]\Big).
\end{aligned}$$
Using the fact that $E\big[\operatorname{tr}\big(\operatorname{diag}(\Gamma^T(U_1)\Gamma(U_1))\Gamma^T(U_1)\Gamma(U_1)\big)\big] = E\big[\operatorname{tr}(\Sigma^{*2}(U_1))\big] = O\big(\operatorname{tr}(\Sigma^{*2})\big)$, we have
$$E\big[\big(\langle\check X_1,\check X_1\rangle + \langle\check X_2,\check X_2\rangle - 2E\langle\check X_1,\check X_1\rangle\big)^2\big] = O\big(\operatorname{tr}^2(\Sigma^*) + \operatorname{tr}(\Sigma^{*2})\big).$$
In addition, by a simple calculation,
$$E\big[\big(\check Y_1^2 + \check Y_2^2 - 2E[\check Y_1^2]\big)^2\big] = 2E\big[\check Y_1^4\big] + 2E\big[\check Y_1^2\check Y_2^2\big] - 4E^2\big[\check Y_1^2\big] = O(1).$$
Combining (A6) and (A7) with the Cauchy–Schwarz inequality, we have
$$E\big|T_{np}^{(2,2)}\big| \le \frac{1}{4n}\sqrt{E\big[\big(\langle\check X_1,\check X_1\rangle + \langle\check X_2,\check X_2\rangle - 2E\langle\check X_1,\check X_1\rangle\big)^2\big]\,E\big[\big(\check Y_1^2 + \check Y_2^2 - 2E[\check Y_1^2]\big)^2\big]} = o\big(\sqrt{\operatorname{tr}(\Sigma^{*2})}\big).$$
Denote $\check T_{np} \triangleq \binom{n}{2}^{-1/2}\sum_{i=2}^n\sum_{j=1}^{i-1}\langle X_i-\mu_{it}, X_j-\mu_{jt}\rangle\varepsilon_i\varepsilon_j$. By condition (C1), we only need to consider $\check T_{np} = \binom{n}{2}^{-1/2}\sum_{i=2}^n\sum_{j=1}^{i-1}(\xi_i-\mu(U_i))^T(\xi_j-\mu(U_j))\varepsilon_i\varepsilon_j$. Then, by Slutsky's theorem, Theorem 2 will follow once we establish that
$$\frac{\check T_{np}}{\sqrt{\operatorname{var}(\check T_{np})}} \xrightarrow{\;D\;} N(0,1).$$
By some simple calculations, we have $\operatorname{var}(\check T_{np}) = \sigma^4\operatorname{tr}(\Sigma^{*2})$. Let $Z_{ni} = \binom{n}{2}^{-1/2}\sum_{j=1}^{i-1}\langle X_i-\mu_{it}, X_j-\mu_{jt}\rangle\varepsilon_i\varepsilon_j$, $v_{ni} = E[Z_{ni}^2\,|\,\mathcal F_{i-1}]$ and $X_{ui} = (\xi_i^T, U_i, \varepsilon_i)^T$, where $\mathcal F_i = \sigma\{X_{u1},\dots,X_{ui}\}$ is the $\sigma$-algebra generated by $\{X_{uk}, k=1,\dots,i\}$, and set $V_n = \sum_{i=2}^n v_{ni}$. It is easy to check that $E[Z_{ni}\,|\,\mathcal F_{i-1}] = 0$, so $\big\{\sum_{i=2}^j Z_{ni}, \mathcal F_j : 2\le j\le n\big\}$ is a mean-zero martingale. The martingale central limit theorem applies if we can check
$$\frac{V_n}{\operatorname{var}(\check T_{np})} \xrightarrow{\;P\;} 1, \quad \text{as } n\to\infty;$$
$$\sum_{i=2}^n \sigma^{-4}\operatorname{tr}^{-1}(\Sigma^{*2})\, E\Big\{Z_{ni}^2\, I\big(|Z_{ni}| > \eta\,\sigma^2\sqrt{\operatorname{tr}(\Sigma^{*2})}\big)\,\Big|\,\mathcal F_{i-1}\Big\} \xrightarrow{\;P\;} 0 \quad \text{for any } \eta > 0.$$
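Before verifying the two conditions just stated, the limiting behavior of $\check T_{np}$ can be illustrated numerically. The following Monte Carlo sketch (illustrative only, not the paper's code) draws Gaussian scores with a hypothetical diagonal covariance $\Sigma^* = 0.5\,I_p$ and errors with $\sigma^2 = 1$, and checks that the simulated statistic has mean near zero and variance near $\sigma^4\operatorname{tr}(\Sigma^{*2})$:

```python
import numpy as np

# Illustrative Monte Carlo check (not the paper's code): under the null,
# T_check = binom(n,2)^{-1/2} * sum_{j<i} (xi_i - mu_i)^T (xi_j - mu_j) * eps_i * eps_j
# is centered with variance sigma^4 * tr(Sigma*^2).
rng = np.random.default_rng(0)
n, p, reps = 100, 20, 500
Sigma = 0.5 * np.eye(p)                    # hypothetical Sigma* with equal eigenvalues
scale = 1.0 / np.sqrt(n * (n - 1) / 2)     # binom(n,2)^{-1/2}

stats = np.empty(reps)
for r in range(reps):
    xi = np.sqrt(0.5) * rng.standard_normal((n, p))   # centered scores, covariance Sigma*
    eps = rng.standard_normal(n)                      # independent errors, sigma^2 = 1
    W = (eps[:, None] * eps[None, :]) * (xi @ xi.T)   # cross products over all (i, j)
    stats[r] = scale * (W.sum() - np.trace(W)) / 2.0  # keep only the pairs j < i

var_theory = np.trace(Sigma @ Sigma)  # sigma^4 * tr(Sigma*^2) = 5.0 for these choices
print(stats.mean(), stats.var(), var_theory)
```

Equal eigenvalues keep $\operatorname{tr}(\Sigma^{*4})$ small relative to $\operatorname{tr}^2(\Sigma^{*2})$, in line with the condition used below for asymptotic normality; only the first two moments are checked here.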
Note that
$$v_{ni} = \frac{\sigma^2}{\binom{n}{2}}\Big[\sum_{j=1}^{i-1}\varepsilon_j^2(\xi_j-\mu(U_j))^T\Sigma^*(\xi_j-\mu(U_j)) + 2\sum_{k<l<i}\varepsilon_k\varepsilon_l(\xi_k-\mu(U_k))^T\Sigma^*(\xi_l-\mu(U_l))\Big].$$
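A brief justification of this expression for $v_{ni}$ (a sketch for the reduced statistic in terms of the scores $\xi_i$, assuming as in the text that $E[(\xi_i-\mu(U_i))(\xi_i-\mu(U_i))^T] = \Sigma^*$ and that $\varepsilon_i$ has variance $\sigma^2$ and is independent of $\mathcal F_{i-1}$): with $b = \sum_{j=1}^{i-1}(\xi_j-\mu(U_j))\varepsilon_j$,

```latex
v_{ni} = E\big[Z_{ni}^2 \,\big|\, \mathcal{F}_{i-1}\big]
       = \binom{n}{2}^{-1} E\big[\varepsilon_i^2\, b^T(\xi_i-\mu(U_i))(\xi_i-\mu(U_i))^T b \,\big|\, \mathcal{F}_{i-1}\big]
       = \binom{n}{2}^{-1}\sigma^2\, b^T \Sigma^* b,
```

and expanding the quadratic form $b^T\Sigma^* b$ produces the diagonal sum over $j$ and the cross sum over $k<l$ displayed above.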
Then we define
$$\frac{V_n}{\operatorname{var}(\check T_{np})} \triangleq C_{n1} + C_{n2},$$
where
$$C_{n1} = \frac{1}{\binom{n}{2}\sigma^2\operatorname{tr}(\Sigma^{*2})}\sum_{j<i}\varepsilon_j^2(\xi_j-\mu(U_j))^T\Sigma^*(\xi_j-\mu(U_j)), \qquad C_{n2} = \frac{2}{\binom{n}{2}\sigma^2\operatorname{tr}(\Sigma^{*2})}\sum_{k<l<i}\varepsilon_k\varepsilon_l(\xi_k-\mu(U_k))^T\Sigma^*(\xi_l-\mu(U_l)).$$
It is easy to check that $E[C_{n1}] = 1$, and
$$\begin{aligned}
\operatorname{var}(C_{n1}) &= E[C_{n1}^2] - 1\\
&= \frac{1}{\binom{n}{2}^2\sigma^4\operatorname{tr}^2(\Sigma^{*2})}\sum_{i=2}^n\Big\{(i-1)\,E\big[(\xi_1-\mu(U_1))^T\Sigma^*(\xi_1-\mu(U_1))\,(\xi_1-\mu(U_1))^T\Sigma^*(\xi_1-\mu(U_1))\,\varepsilon_1^4\big]\\
&\qquad + (i-1)(i-2)\,E\big[(\xi_1-\mu(U_1))^T\Sigma^*(\xi_1-\mu(U_1))\,(\xi_2-\mu(U_2))^T\Sigma^*(\xi_2-\mu(U_2))\,\varepsilon_1^2\varepsilon_2^2\big]\Big\}\\
&\quad + \frac{1}{\binom{n}{2}^2\sigma^4\operatorname{tr}^2(\Sigma^{*2})}\sum_{i=2}^n\sum_{j\neq i}\Big\{\big((i-1)\wedge(j-1)\big)\,E\big[(\xi_1-\mu(U_1))^T\Sigma^*(\xi_1-\mu(U_1))\,(\xi_1-\mu(U_1))^T\Sigma^*(\xi_1-\mu(U_1))\,\varepsilon_1^4\big]\\
&\qquad + \big((i-1)(j-1) - (i-1)\wedge(j-1)\big)\,E\big[(\xi_1-\mu(U_1))^T\Sigma^*(\xi_1-\mu(U_1))\,(\xi_2-\mu(U_2))^T\Sigma^*(\xi_2-\mu(U_2))\,\varepsilon_1^2\varepsilon_2^2\big]\Big\} - 1\\
&= \frac{E[\varepsilon_1^4]}{n\sigma^4\operatorname{tr}^2(\Sigma^{*2})}\,O\big(E\big[(\xi_1-\mu(U_1))^T\Sigma^*(\xi_1-\mu(U_1))\,(\xi_1-\mu(U_1))^T\Sigma^*(\xi_1-\mu(U_1))\big]\big)\\
&= \frac{E[\varepsilon_1^4]}{n\sigma^4\operatorname{tr}^2(\Sigma^{*2})}\,O\big(E[\operatorname{tr}(\Sigma^*(U_1)\Sigma^*\Sigma^*(U_1)\Sigma^*)] + \operatorname{tr}^2(\Sigma^*(U_1)\Sigma^*) + \Delta\operatorname{tr}\big(\operatorname{diag}(\Gamma^T(U_1)\Sigma^*\Gamma(U_1))\Gamma^T(U_1)\Sigma^*\Gamma(U_1)\big)\big)\\
&= \frac{E[\varepsilon_1^4]}{n\sigma^4\operatorname{tr}^2(\Sigma^{*2})}\,O\big(\operatorname{tr}(\Sigma^{*4}) + \operatorname{tr}^2(\Sigma^{*2})\big).
\end{aligned}$$
By (C2), we have $C_{n1}\xrightarrow{P}1$. Similarly, we can obtain $E[C_{n2}] = 0$, and
$$\begin{aligned}
\operatorname{var}(C_{n2}) = E[C_{n2}^2] &= O\bigg(\frac{2}{\binom{n}{2}^2\operatorname{tr}^2(\Sigma^{*2})}\Big[\sum_{i=2}^n(i-1)(i-2)\,E\big[(\xi_1-\mu(U_1))^T\Sigma^*(\xi_2-\mu(U_2))\,(\xi_2-\mu(U_2))^T\Sigma^*(\xi_1-\mu(U_1))\big]\\
&\qquad + \sum_{i=2}^n\sum_{j\neq i}\big((i-1)\wedge(j-1)\big)\big((i-1)\wedge(j-1) - 1\big)\,E\big[(\xi_1-\mu(U_1))^T\Sigma^*(\xi_2-\mu(U_2))\,(\xi_2-\mu(U_2))^T\Sigma^*(\xi_1-\mu(U_1))\big]\Big]\bigg)\\
&= O\bigg(\frac{\operatorname{tr}(\Sigma^{*4})}{\operatorname{tr}^2(\Sigma^{*2})}\bigg).
\end{aligned}$$
Combined with $\operatorname{tr}(\Sigma^{*4}) = o\big(\operatorname{tr}^2(\Sigma^{*2})\big)$, we then have $C_{n2}\xrightarrow{P}0$. Thus, equation (A8) holds. Finally, we only need to prove (A9). Using the law of large numbers and the fact that $E\big[Z_{ni}^2 I\big(|Z_{ni}| > \eta\sigma^2\sqrt{\operatorname{tr}(\Sigma^{*2})}\big)\,\big|\,\mathcal F_{i-1}\big] \le E\big(Z_{ni}^4\,\big|\,\mathcal F_{i-1}\big)\big/\big(\eta^2\sigma^4\operatorname{tr}(\Sigma^{*2})\big)$, it suffices to show that $\sum_{i=2}^n E(Z_{ni}^4) = o\big(\operatorname{tr}^2(\Sigma^{*2})\big)$. By simple calculation, we can get
$$\begin{aligned}
\sum_{i=2}^n E[Z_{ni}^4] &= \frac{1}{\binom{n}{2}^2}\sum_{j<i} E\big[\big((\xi_i-\mu(U_i))^T(\xi_j-\mu(U_j))\big)^4\varepsilon_i^4\varepsilon_j^4\big]\\
&\quad + \frac{3}{\binom{n}{2}^2}\sum_{j<i}\sum_{k<i,\,k\neq j} E\big[\big((\xi_i-\mu(U_i))^T(\xi_j-\mu(U_j))\big)^2\big((\xi_i-\mu(U_i))^T(\xi_k-\mu(U_k))\big)^2\varepsilon_i^4\varepsilon_j^2\varepsilon_k^2\big]\\
&= \frac{E^2(\varepsilon_1^4)}{\binom{n}{2}^2}\sum_{i=2}^n(i-1)\,E\big[\big((\xi_1-\mu(U_1))^T(\xi_2-\mu(U_2))\big)^4\big]\\
&\quad + \frac{3\sigma^4 E(\varepsilon_1^4)}{\binom{n}{2}^2}\sum_{i=2}^n(i-1)(i-2)\,E\big[\big((\xi_1-\mu(U_1))^T(\xi_2-\mu(U_2))\big)^2\big((\xi_1-\mu(U_1))^T(\xi_3-\mu(U_3))\big)^2\big]\\
&= O\bigg(\frac{E\big[\big((\xi_1-\mu(U_1))^T(\xi_2-\mu(U_2))\big)^4\big]}{n^2}\bigg) + O\bigg(\frac{E\big[\big((\xi_1-\mu(U_1))^T\Sigma^*(\xi_1-\mu(U_1))\big)^2\big]}{n}\bigg).
\end{aligned}$$
Combining (C2) and Lemma 2, equation (A9) holds. Thus, the proof of Theorem 2 is completed. □

References

  1. Crainiceanu, C.M.; Staicu, A.M.; Di, C.Z. Generalized multilevel functional regression. Journal of the American Statistical Association 2009, 104, 1550–1561.
  2. Leng, X.; Müller, H.G. Time ordering of gene co-expression. Biostatistics 2006, 7, 569–584.
  3. Kokoszka, P.; Miao, H.; Zhang, X. Functional dynamic factor model for intraday price curves. Journal of Financial Econometrics 2014, 13, 456–477.
  4. Ignaccolo, R.; Ghigo, S.; Giovenali, E. Analysis of air quality monitoring networks by functional clustering. Environmetrics 2008, 19, 672–686.
  5. Yao, F.; Müller, H.G. Functional quadratic regression. Biometrika 2010, 97, 49–64.
  6. Lian, H. Functional partial linear model. Journal of Nonparametric Statistics 2011, 23, 115–128.
  7. Zhou, J.; Chen, M. Spline estimators for semi-functional linear model. Statistics & Probability Letters 2012, 82, 505–513.
  8. Tang, Q. Estimation for semi-functional linear regression. Statistics 2015, 49, 1262–1278.
  9. Cardot, H.; Ferraty, F.; Mas, A.; Sarda, P. Testing hypotheses in the functional linear model. Scandinavian Journal of Statistics 2003, 30, 241–255.
  10. Kokoszka, P.; Maslova, I.; Sojka, J.; Zhu, L. Testing for lack of dependence in the functional linear model. Canadian Journal of Statistics 2008, 36, 207–222.
  11. James, G.M.; Wang, J.; Zhu, J. Functional linear regression that's interpretable. The Annals of Statistics 2009, 37, 2083–2108.
  12. Shin, H. Partial functional linear regression. Journal of Statistical Planning and Inference 2009, 139, 3405–3418.
  13. Yu, P.; Zhang, Z.; Du, J. A test of linearity in partial functional linear regression. Metrika 2016, 79, 953–969.
  14. Zhu, H.; Zhang, R.; Yu, Z.; Lian, H.; Liu, Y. Estimation and testing for partially functional linear errors-in-variables models. Journal of Multivariate Analysis 2019, 170, 296–314.
  15. Cardot, H.; Goia, A.; Sarda, P. Testing for no effect in functional linear regression models, some computational approaches. Communications in Statistics - Simulation and Computation 2004, 33, 179–199.
  16. Zhu, H.; Zhang, R.; Li, H. Estimation on semi-functional linear errors-in-variables models. Communications in Statistics - Theory and Methods 2019, 48, 4380–4393.
  17. Zhou, J.; Peng, Q. Estimation for functional partial linear models with missing responses. Statistics & Probability Letters 2020, 156, 108598.
  18. Zhao, F.; Zhang, B. Testing linearity in functional partially linear models. Acta Mathematicae Applicatae Sinica, English Series 2024, 40, 875–886.
  19. Hu, W.; Lin, N.; Zhang, B. Nonparametric testing of lack of dependence in functional linear models. PLoS ONE 2020, 15, e0234094.
  20. Zhao, F.; Lin, N.; Hu, W.; Zhang, B. A faster U-statistic for testing independence in the functional linear models. Journal of Statistical Planning and Inference 2022, 217, 188–203.
  21. Zhao, F.; Lin, N.; Zhang, B. A new test for high-dimensional regression coefficients in partially linear models. Canadian Journal of Statistics 2023, 51, 5–18.
  22. Cui, H.; Guo, W.; Zhong, W. Test for high-dimensional regression coefficients using refitted cross-validation variance estimation. The Annals of Statistics 2018, 46, 958–988.
  23. Zhong, P.; Chen, S. Tests for high-dimensional regression coefficients with factorial designs. Journal of the American Statistical Association 2011, 106, 260–274.
  24. Chen, S.; Zhang, L.; Zhong, P. Tests for high-dimensional covariance matrices. Journal of the American Statistical Association 2010, 105, 810–819.
  25. Ferraty, F.; Vieu, P. Nonparametric Functional Data Analysis: Theory and Practice; Springer: New York, USA, 2006.
  26. Shang, H.L. Bayesian bandwidth estimation for a semi-functional partial linear regression model with unknown error density. Computational Statistics 2014, 29, 829–848.
  27. Yu, P.; Zhang, Z.; Du, J. Estimation in functional partial linear composite quantile regression model. Chinese Journal of Applied Probability and Statistics 2017, 33, 170–190.
Figure 1. The null distributions and the q-q plots of our proposed test when $g(u) = 2u$.
Figure 2. The null distributions and the q-q plots of our proposed test when $g(u) = 2 + \sin(2\pi u)$.
Figure 3. Empirical power functions of our proposed test when $g(u) = 2u$.
Figure 4. Empirical power functions of our proposed test when $g(u) = 2 + \sin(2\pi u)$.
Figure 5. (a) The estimator of the slope function in model (14); (b) the estimator of the slope function in model (15).
Table 1. Empirical size and power when $g(u) = 2u$ for two tests.

(n,p)      c      N(0,1)         t(3)           Γ(1,1)         lnorm(0,1)
                  T_n     T_np   T_n     T_np   T_n     T_np   T_n     T_np
(50,11)    0      0.069   0.060  0.074   0.077  0.071   0.071  0.084   0.053
           0.05   0.097   0.101  0.133   0.102  0.135   0.107  0.148   0.111
           0.1    0.250   0.211  0.292   0.244  0.269   0.225  0.337   0.283
           0.15   0.474   0.383  0.555   0.448  0.494   0.415  0.576   0.470
           0.2    0.713   0.584  0.761   0.631  0.738   0.602  0.755   0.656
(100,11)   0      0.049   0.052  0.055   0.052  0.058   0.059  0.049   0.052
           0.05   0.233   0.195  0.275   0.225  0.217   0.195  0.312   0.268
           0.1    0.689   0.603  0.746   0.660  0.715   0.618  0.743   0.652
           0.15   0.961   0.877  0.956   0.913  0.963   0.899  0.931   0.884
           0.2    0.998   0.984  0.986   0.975  0.995   0.975  0.978   0.965
(100,49)   0      0.057   0.060  0.050   0.061  0.051   0.049  0.055   0.047
           0.05   0.224   0.203  0.282   0.255  0.236   0.225  0.305   0.288
           0.1    0.741   0.607  0.757   0.665  0.718   0.615  0.747   0.659
           0.15   0.962   0.900  0.947   0.871  0.950   0.884  0.938   0.886
           0.2    0.998   0.981  0.987   0.977  0.997   0.978  0.988   0.969
Table 2. Empirical size and power when $g(u) = 2 + \sin(2\pi u)$ for two tests.

(n,p)      c      N(0,1)         t(3)           Γ(1,1)         lnorm(0,1)
                  T_n     T_np   T_n     T_np   T_n     T_np   T_n     T_np
(50,11)    0      0.062   0.064  0.072   0.069  0.071   0.072  0.060   0.054
           0.05   0.087   0.091  0.109   0.096  0.112   0.103  0.117   0.106
           0.1    0.235   0.197  0.250   0.219  0.241   0.208  0.293   0.249
           0.15   0.420   0.359  0.502   0.419  0.449   0.389  0.506   0.449
           0.2    0.658   0.545  0.724   0.605  0.694   0.573  0.735   0.638
(100,11)   0      0.062   0.062  0.050   0.046  0.063   0.060  0.054   0.060
           0.05   0.217   0.205  0.239   0.219  0.235   0.208  0.255   0.240
           0.1    0.668   0.568  0.714   0.617  0.696   0.605  0.748   0.644
           0.15   0.946   0.874  0.947   0.883  0.946   0.852  0.927   0.873
           0.2    0.996   0.980  0.993   0.972  1.000   0.979  0.987   0.964
(100,49)   0      0.050   0.062  0.056   0.063  0.047   0.067  0.060   0.056
           0.05   0.226   0.216  0.249   0.202  0.209   0.199  0.277   0.256
           0.1    0.674   0.562  0.734   0.611  0.692   0.582  0.733   0.639
           0.15   0.943   0.873  0.943   0.890  0.941   0.855  0.925   0.889
           0.2    0.997   0.976  0.994   0.981  0.998   0.980  0.982   0.955
Table 3. Empirical size and power when $g(u) = 2u$ for two tests.

(n,p)      c      N(0,1)         t(3)           Γ(1,1)         lnorm(0,1)
                  T_n     T_np   T_n     T_np   T_n     T_np   T_n     T_np
(50,11)    0      0.097   0.066  0.088   0.076  0.096   0.074  0.104   0.059
           0.25   0.378   0.454  0.442   0.518  0.389   0.443  0.487   0.572
           0.5    0.589   0.695  0.669   0.749  0.610   0.702  0.725   0.783
           0.75   0.765   0.832  0.811   0.861  0.769   0.847  0.832   0.882
           1      0.865   0.917  0.883   0.925  0.867   0.912  0.889   0.922
(100,11)   0      0.060   0.067  0.086   0.064  0.072   0.058  0.062   0.049
           0.25   0.511   0.711  0.582   0.738  0.569   0.739  0.618   0.760
           0.5    0.820   0.932  0.857   0.932  0.846   0.922  0.841   0.924
           0.75   0.949   0.984  0.943   0.975  0.935   0.973  0.917   0.967
           1      0.982   0.998  0.971   0.986  0.971   0.989  0.958   0.984
(100,49)   0      0.245   0.064  0.224   0.058  0.236   0.058  0.202   0.048
           0.25   0.541   0.498  0.570   0.543  0.534   0.501  0.563   0.564
           0.5    0.754   0.771  0.804   0.807  0.767   0.776  0.769   0.798
           0.75   0.884   0.910  0.899   0.936  0.899   0.904  0.879   0.887
           1      0.957   0.966  0.949   0.968  0.956   0.964  0.928   0.934
Table 4. Empirical size and power when $g(u) = 2 + \sin(2\pi u)$ for two tests.

(n,p)      c      N(0,1)         t(3)           Γ(1,1)         lnorm(0,1)
                  T_n     T_np   T_n     T_np   T_n     T_np   T_n     T_np
(50,11)    0      0.099   0.069  0.087   0.075  0.086   0.063  0.101   0.064
           0.25   0.353   0.420  0.423   0.484  0.383   0.434  0.482   0.538
           0.5    0.574   0.661  0.640   0.714  0.602   0.677  0.695   0.758
           0.75   0.751   0.806  0.783   0.849  0.750   0.814  0.804   0.860
           1      0.841   0.893  0.869   0.913  0.842   0.897  0.874   0.913
(100,11)   0      0.067   0.058  0.060   0.054  0.065   0.055  0.072   0.059
           0.25   0.486   0.662  0.545   0.713  0.537   0.697  0.591   0.742
           0.5    0.783   0.901  0.819   0.906  0.814   0.907  0.832   0.916
           0.75   0.930   0.983  0.918   0.966  0.930   0.980  0.915   0.955
           1      0.976   0.996  0.959   0.983  0.977   0.994  0.954   0.972
(100,49)   0      0.236   0.066  0.212   0.066  0.218   0.061  0.243   0.065
           0.25   0.543   0.475  0.540   0.506  0.526   0.480  0.658   0.612
           0.5    0.744   0.745  0.781   0.791  0.747   0.750  0.835   0.815
           0.75   0.885   0.892  0.890   0.911  0.875   0.891  0.913   0.913
           1      0.948   0.964  0.937   0.956  0.938   0.955  0.947   0.956