Preprint
Article

Concavity of the Conditional MLE for Logit Panel Data Models with Imputed Covariates


A peer-reviewed article of this preprint also exists.

This version is not peer-reviewed

Submitted: 18 July 2023
Posted: 20 July 2023

Abstract
In estimating logistic regression models, convergence of the maximization algorithm is critical; however, it may fail. Numerous bias correction methods for maximum likelihood estimates of the parameters have been developed for complete data sets and for longitudinal models. For the binary response fixed effects panel data model, the conditional logit estimator is consistent for balanced data. When faced with missing covariates, researchers opt for various imputation techniques to complete the data, and, without loss of generality, consistent estimates still suffice asymptotically. For maximum likelihood estimation of logistic regression parameters with imputed covariates, the optimal choice of an imputation technique, one that yields the best estimates with minimum variance, remains elusive. The main aim of this paper is to examine the behaviour of the Hessian matrix at optimal values of the imputed covariate vector that make the Newton-Raphson algorithm converge faster through a reduced absolute value of the product of the score function and the inverse Fisher information component. We focus on a method that modifies the conditional likelihood function through partitioning of the covariate matrix. We also confirm, from the moduli of the Hessian matrices, that the log likelihood of a panel data logistic model attains a global maximum at the parameter estimates. Simulation results reveal that model-based imputation performs better than classical imputation techniques, yielding estimates with smaller bias and higher precision for the conditional maximum likelihood estimation of nonlinear panel models with a single fixed effect.
Keywords: 
Subject: Computer Science and Mathematics  -   Probability and Statistics

1. Introduction

Parameter estimation is a key goal of inferential statistics, and most researchers attempt to fit data to models that produce the best possible parameter estimates. The motivation behind parameter estimation is to make inferences about a study population using sample information, which calls for clearly specified procedures that ensure unbiased and precise estimates in every estimation technique applied. During sample data collection, researchers encounter missing values in the study variables, a problem that complicates statistical analyses through inaccurate estimates and may eventually lead to incorrect inferences and policy actions.
Specifically, when the response variable is binary, problems of missing covariates are further compounded by the nonlinearity of the model specification. Studies on missingness and parameter estimation have shown that frequentist imputation techniques result in biased estimates with significant loss of power [1,2,3]. This problem cuts across all models, the logit model for binary choice response variables being no exception, and several studies have attempted to develop reliable imputation techniques for missing observations so as to reduce the bias of the estimates.
Logistic regression methods are often applied to describe the relationship between a dichotomous outcome variable and other model covariates. The general method of estimating the logistic regression parameters is maximum likelihood, which, in a very general sense, yields values of the unknown parameters that maximize the probability of the observed set of data. The maximum likelihood (ML) method is occasionally prone to convergence problems, which occur when the maximum likelihood estimates (MLE) do not exist. Assessing the behaviour of the MLE for the logistic regression model is important, as applications of the logistic model stretch far and wide across research disciplines. Numerous works discuss the convergence problem in the logistic regression model, for example Cox and Hinkley [4], or bias reduction, for example Firth and Anderson and Richardson [5,6]. Other studies outline the distributional assumptions of the ML estimators resulting from bias reduction techniques and the impact of varying sample size on the MLE [7,8].
S. Lee [9] notes that statistical inference based on the logistic regression model is highly dependent on the asymptotic properties of the maximum likelihood estimator. This means that in large samples the sampling distribution of the ML estimators of the logistic regression coefficients is asymptotically unbiased and normal. Conversely, in small samples the asymptotic properties of the ML estimators may not hold, owing to biased estimates [9,10]. Firth's method has been introduced as a penalization technique to correct or reduce the small sample bias of the ML estimators of the logistic regression model [10,11]. S. Lee compared the performance of the logistic regression model based on maximum likelihood and on Firth's penalized maximum likelihood estimation. The results showed that, compared with penalized maximum likelihood, the logistic regression model based on asymptotic ML worked slightly better in terms of statistical power, although the difference in performance was not practically important [9]. The present paper evaluates the susceptibility of the Hessian matrix to different imputation techniques by comparing the magnitudes of the determinants obtained from the Hessian of the log-likelihood function with the imputed covariate vector.
This section gives the background of the study by introducing the concept of panel data econometrics and outlines various approaches to the estimation of panel data models, among which conditional maximum likelihood estimation is presented as a solution to the incidental parameter problem in the logistic panel data model.
Following the background introduction above, the rest of the paper is organized as follows. Section 2 focuses on the specification of nonlinear binary choice panel data models. Section 3 highlights the incidental parameter problem in estimating the logistic panel data model and shows how the conditional maximum likelihood approach circumvents it. In Section 4 we discuss parameter estimation for a logit panel data model in which the covariate vector is partitioned into sample-present values and missing or imputed values; this discerns the impact of missingness on the Hessian of the conditional maximum likelihood estimator of the nonlinear binary choice logistic model. Monte Carlo simulation results are also given and discussed in that section to assess the impact of missingness on the determinant of the Hessian matrix and on the parameter estimates. Finally, Section 5 gives the summary and conclusions from the study and provides recommendations for further study based on the key findings of this work.

2. Model Specification

2.1. Panel Data

Observing experimental subjects or units over a repeated number of times collates a set of data referred to as panel data which provides two kinds of information – cross-sectional and time series. This unique characteristic allows panel data to account for individual differences that are time-invariant by applying regression techniques which allow us to adequately take advantage of the different types of information.
The logit panel data model then develops from logistic regression to model binary choice response variables, which have wide applications in almost all research fields that conduct pretest and posttest studies aimed at discerning the impact of a treatment. As such, for N units each observed T times, we have a total of N × T observations.

2.2. The Logit Panel Data Model

Consider a population unit $i$ observed at time $t$ for a binary response variable $y$ against $K$ explanatory variables, for which we specify a general fixed effects panel data model as a special case of the generalized longitudinal model in the form
$$y_{it} = x_{it}'\beta + c_i + u_{it} \qquad (1)$$
In model (1), $\beta$ is a $1 \times k$ vector of parameters of the $k \times 1$ column vector $x_{it}$. The parameter $c_i$ captures all individual-specific, time-invariant characteristics of the study population unit. For fixed effects (FE) models the parameter $c_i$ is assumed to be correlated with $x_{it}$, leaving only $u_{it} \sim N(0, \sigma_y^2)$ as the stochastic part of the model. Compounding $c_i$ and $u_{it}$ into a single stochastic entity gives a random effects (RE) model. Let $y_{it} \in \{0,1\}$ for all $i$ and $t$, where the values 0 and 1 respectively represent failure and success of the event described in the research experiment. We then have $y_{it}$ as a binary choice variable following a binomial distribution with probability of success for individual $i$ at time $t$ given by $p_{it} = \Pr(y_{it}=1 \mid x_{it}, c_i) = E(y_{it} \mid x_{it}, c_i) = F(x_{it}'\beta + c_i)$, so that $E(y_{it} \mid x_{it}, c_i) = 1 \times p_{it} + 0 \times (1 - p_{it}) = p_{it}$. Since $u_{it} \sim N(0, \sigma_y^2)$ by assumption, strict exogeneity holds. The link function $F$ relates the binary outcome to the functional forms of the explanatory variables.
Under random sampling, $\Pr(y_{it}=1 \mid x_{it}, c_i) = E(y_{it} \mid x_{it}, c_i; \beta)$, and if the binary response model above is correctly specified, we have the linear probability model (LPM) specified as
$$\Pr(y_{it}=1 \mid x_{it}, c_i) = x_{it}'\beta + c_i \qquad (2)$$
Adopting a linear probability model poses the absurdity of predicting "probabilities" of the response variable that are either less than zero or greater than one. This shortfall is addressed by specifying a monotonically increasing function $F$ such that $F(\cdot): \mathbb{R} \rightarrow [0,1]$ and
$$\Pr(y_{it}=1 \mid x_{it}, c_i) = F(x_{it}'\beta + c_i) \qquad (3)$$
This study adopts the logistic distribution as a nonlinear functional form of F as
$$F(x_{it}'\beta + c_i) = \frac{e^{x_{it}'\beta + c_i}}{1 + e^{x_{it}'\beta + c_i}} \qquad (4)$$
which lies between zero and one for all values of $x_{it}'\beta + c_i$. This is the cumulative distribution function (CDF) of a logistic variable whose parameters can be estimated. Equation (4) represents the (cumulative) logistic distribution function, which mitigates the limitations of the LPM and has a total of $K + N$ parameters to be estimated as $\hat{\beta}$ and $\hat{c}_i$. It is verifiable that as $x_{it}'\beta + c_i$ ranges from $-\infty$ to $+\infty$, $F$ ranges between 0 and 1, and that $F$ is nonlinearly related to $x_{it}'\beta + c_i$, thus satisfying the requirements of a probability function. $F$ is nonlinear not only in $x$ but also in the parameter vector $\beta$, as is clear from (4), which implies that we cannot use the familiar OLS procedure to estimate the parameters. Through linearization, however, this problem becomes more apparent than real. Linearization is achieved by obtaining the odds ratio $\Pr(y_{it}=1 \mid x_{it}, c_i)/\Pr(y_{it}=0 \mid x_{it}, c_i)$ in favour of success, that is, the ratio of the probability that $y_{it}=1$ to the probability that $y_{it}=0$. The logarithm of this odds ratio is not only linear in $x$ but also, from the estimation viewpoint, linear in the parameters.
$$\ln\left[\frac{\Pr(y_{it}=1 \mid x_{it}, c_i)}{\Pr(y_{it}=0 \mid x_{it}, c_i)}\right] = x_{it}'\beta + c_i \qquad (5)$$
This log of the odds ratio (5) is called the logit, and hence the name logit model.
Among the approaches explored to estimate fixed effects models include: (a) Demeaning variables (b) Unconditional maximum likelihood or least squares dummy variables (LSDV) and (c) Conditional maximum likelihood estimation which is the most preferred method for logistic regressions.
The approaches used so far in estimating panel data models with fixed effects and continuous dependent variable y i t aim at controlling for these effects by eliminating their presence from the model and estimating the coefficients of the regressors. For categorical dependent variables, where specific nonlinear functions that preserve the structure of the dependent variable are considered, the conditional maximum likelihood estimation partials or ‘conditions’ the fixed effects out of the likelihood function by conditioning the probability of the regressand on the total number of events observed for each subject [12].
When panel data sets are unbalanced because of missing covariates, the estimation methods become computationally complicated and produce inefficient parameter estimates [13]. Various causes of missingness have been mentioned in the literature, among them delayed entry, early exit, or intermittent non-response from a study unit. As such, approaches suggested in the literature for handling missing observations become valid in such cases. In this study we inspect the impact of missing data on the conditional maximum likelihood estimation procedure in nonlinear panel data models, in an attempt to establish which imputation technique performs best.

3. Incidental Parameter Problem and MLE

3.1. Incidental Parameter Problem

As specified in model (1), the presence of the individual effects $c_i$ greatly complicates the computation of parameter estimates; hence, to obtain consistent estimates of the parameters in static linear models, we simply difference out the fixed effects. The number of parameters $c_i$ increases with the sample size, a notion referred to as the incidental parameter problem, first noted by Neyman and Scott [14] and later reviewed by Lancaster [15]. For example, in model (1), $c_i$ and $\beta$ are unknown parameters to be estimated, and as $N \to \infty$ for fixed $T$, the number of parameters $c_i$ increases with $N$. As such, for nonlinear panel data models, $c_i$ cannot be estimated consistently for fixed $T$ [16].
For the linear panel data regression model with fixed $T$, only $\beta$ can be estimated consistently, by first getting rid of $c_i$ using the within transformation. This is possible in the linear case because the MLEs of $\beta$ and $c_i$ are asymptotically independent [17]. For the qualitative binary choice model with fixed $T$, this is not possible, as demonstrated by Chamberlain [12].
Hsiao [17] illustrates how the inconsistency of the ML estimate of $c_i$ is transmitted into inconsistency of $\hat{\beta}_{MLE}$. This is done in the context of a logistic model with one regressor $x_{it}$ observed over two periods, with $x_{i1}=0$ and $x_{i2}=1$, where as $N \to \infty$ with $T=2$, $\mathrm{plim}\,\hat{\beta}_{MLE} = 2\beta$. Greene [18] shows that, despite the large number of incidental parameters, one can still force maximum likelihood estimation of the fixed effects model by including a large number of dummy variables.
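As a quick illustration of this two-period example, the minimal sketch below (not taken from the paper) simulates the design with $x_{i1}=0$, $x_{i2}=1$ and logistic errors. The conditional logit estimate has the closed form $\ln(n_{01}/n_{10})$ over the switching units and, following Hsiao's derivation as summarized above, the dummy-variable (unconditional) MLE in this design equals twice that quantity, so it converges to $2\beta$.

```python
# Illustrative sketch of Hsiao's T = 2 example: x_i1 = 0, x_i2 = 1 for every unit,
# fixed effects c_i, logistic disturbances. Not the paper's code.
import numpy as np

rng = np.random.default_rng(0)
N, beta_true = 100_000, 1.0

c = rng.normal(size=N)                      # individual-specific effects
eps = rng.logistic(size=(N, 2))             # logistic disturbances
x = np.tile([0.0, 1.0], (N, 1))             # x_i1 = 0, x_i2 = 1
y = (beta_true * x + c[:, None] + eps > 0).astype(int)

# Only "switchers" (y_i1 + y_i2 = 1) carry information about beta.
n01 = np.sum((y[:, 0] == 0) & (y[:, 1] == 1))
n10 = np.sum((y[:, 0] == 1) & (y[:, 1] == 0))

beta_cond = np.log(n01 / n10)     # conditional logit estimate, consistent for beta
beta_fe_mle = 2 * beta_cond       # dummy-variable MLE in this design per Hsiao's derivation
print(round(beta_cond, 3), round(beta_fe_mle, 3))   # roughly 1.0 and 2.0
```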

3.2. The Unconditional Likelihood Function

The logistic model is estimated by means of maximum likelihood (ML), a technique that seeks the particular vector $\hat{\beta}_{ML}$ that gives the greatest likelihood of observing the sample outcomes $y_1, y_2, \ldots$, conditional on the explanatory variables $x$.
By assumption, $p_{it} = \Pr(y_{it}=1 \mid x_{it}, c_i) = F(x_{it}'\beta + c_i)$ and $1 - p_{it} = \Pr(y_{it}=0 \mid x_{it}, c_i) = 1 - F(x_{it}'\beta + c_i)$. It then follows that the probability of observing the entire sample is
$$L = \prod_{i=1}^{N}\prod_{t=1}^{T}\left[F(x_{it}'\beta + c_i)\right]^{y_{it}}\left[1 - F(x_{it}'\beta + c_i)\right]^{1-y_{it}} \qquad (6)$$
The log likelihood function for the sample is
$$\ln L = \sum_{i=1}^{N}\sum_{t=1}^{T}\left\{ y_{it}\ln F(x_{it}'\beta + c_i) + (1-y_{it})\ln\left[1 - F(x_{it}'\beta + c_i)\right]\right\} \qquad (7)$$
The MLE $\hat{\beta}_{ML}$ maximizes this log likelihood function (7).

3.3. Conditional Likelihood Function for Logistic Panel Data Model

If F is the logistic CDF then (7) gives the logistic log likelihood as:
$$\ln L = \sum_{i=1}^{N}\sum_{t=1}^{T}\left\{ y_{it}\left(x_{it}'\beta + c_i\right) - \ln\left[1 + e^{x_{it}'\beta + c_i}\right]\right\} \qquad (8)$$
The logistic model is preferred over the alternative probit model because it yields a consistent estimator of $\beta$ without making any assumptions about how $c_i$ is related to $x_{it}$, apart from strict exogeneity. This is possible because the logistic functional form enables us to eliminate $c_i$ from the estimating equation once we condition on the minimal sufficient statistic for $c_i$. As such, we obtain the conditional likelihood function whose parameters are estimated.
Considering the conditional probabilities when T = 2 , we have that:
Preprints 79891 i009
$$\Pr\!\left(y_{i1}=0,\, y_{i2}=1 \mid y_{i1}+y_{i2}=1\right) = \frac{e^{(x_{i2}-x_{i1})'\beta}}{1 + e^{(x_{i2}-x_{i1})'\beta}} \qquad (10)$$
and
$$\Pr\!\left(y_{i1}=1,\, y_{i2}=0 \mid y_{i1}+y_{i2}=1\right) = \frac{1}{1 + e^{(x_{i2}-x_{i1})'\beta}} \qquad (11)$$
Note that conditioning on $y_{i1}+y_{i2}=1$, for which $y_{it}$ changes between the two time periods, ensures that the $c_i$'s are eliminated; therefore $\sum_t y_{it}$ is a sufficient statistic for the fixed effects.
Probabilities (10) and (11) are conditional on $y_{i1}+y_{i2}=1$ and are independent of $c_i$.
The probability distribution function is expressed as
$$\Pr\!\left(y_{i1}, y_{i2} \mid y_{i1}+y_{i2}=1\right) = \left[\frac{e^{(x_{i2}-x_{i1})'\beta}}{1 + e^{(x_{i2}-x_{i1})'\beta}}\right]^{d_{01i}}\left[\frac{1}{1 + e^{(x_{i2}-x_{i1})'\beta}}\right]^{d_{10i}} \qquad (12)$$
The conditional log likelihood function from (8) is then given as
$$\ln L_c = \sum_{i=1}^{N}\left\{ d_{01i}\,\ln\!\left[\frac{e^{(x_{i2}-x_{i1})'\beta}}{1 + e^{(x_{i2}-x_{i1})'\beta}}\right] + d_{10i}\,\ln\!\left[\frac{1}{1 + e^{(x_{i2}-x_{i1})'\beta}}\right]\right\} \qquad (13)$$
where $d_{01i}$ selects the individuals for which the dependent variable changed from 0 to 1, while $d_{10i}$ selects the cases for which the dependent variable changed from 1 to 0.
Hence, by maximizing the conditional log likelihood function (13), we obtain consistent estimates of $\beta$ regardless of whether $c_i$ and $x_{it}$ are correlated. Generally, from an assessment of bias and root mean square errors, the conditional logit estimator performs better than the unconditional logit estimator when various imputation techniques are performed, more so when the sample size is large [19].
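For concreteness, a minimal sketch of maximizing the conditional log likelihood (13) with SciPy is given below. It assumes arrays x of shape (N, 2, K) and y of shape (N, 2); the function names are illustrative and are not taken from the paper.

```python
# Sketch: maximizing the T = 2 conditional log likelihood (13).
import numpy as np
from scipy.optimize import minimize

def conditional_loglik(beta, dx, d01):
    # Eq. (13): sum over switchers of d01*eta - log(1 + exp(eta)), where eta = dx'beta.
    eta = dx @ beta
    return np.sum(d01 * eta - np.logaddexp(0.0, eta))

def fit_conditional_logit(x, y):
    switchers = y.sum(axis=1) == 1                  # keep units with y_i1 + y_i2 = 1
    dx = x[switchers, 1, :] - x[switchers, 0, :]    # within-unit covariate differences
    d01 = (y[switchers, 1] == 1).astype(float)      # 1 if the unit moved from 0 to 1
    res = minimize(lambda b: -conditional_loglik(b, dx, d01),
                   x0=np.zeros(x.shape[2]), method="BFGS")
    return res.x
```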
The trick is thus to condition the likelihood on the outcome series $(y_{i1}, \ldots, y_{iT})$ through its sum, also in the more general case. For example, if $T=3$, we can condition on $\sum_t y_{it} = 1$, with possible sequences (1,0,0), (0,1,0), (0,0,1), or on $\sum_t y_{it} = 2$, with possible sequences (1,1,0), (0,1,1), (1,0,1). The general conditional probability of the response series $(y_{i1}, y_{i2}, \ldots, y_{iT})$ given $\sum_t y_{it}$ is
$$\Pr\!\left(y_{i1}, y_{i2}, \ldots, y_{iT} \,\middle|\, \sum_t y_{it}\right) = \frac{\exp\!\left(\sum_t y_{it}\, x_{it}'\beta\right)}{\sum_{d \in B_i} \exp\!\left(\sum_t d_{it}\, x_{it}'\beta\right)} \qquad (14)$$
where $B_i = \left\{(d_{i1}, d_{i2}, \ldots, d_{iT}) \mid d_{it} \in \{0,1\} \text{ and } \sum_t d_{it} = \sum_t y_{it}\right\}$.
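A small sketch of (14) for general $T$ is given below; it enumerates the set $B_i$ directly with itertools, which is feasible only for small $T$, and the names are illustrative rather than the paper's.

```python
# Sketch of the general-T conditional probability (14): the denominator sums over all
# 0/1 sequences with the same total as y_i.
from itertools import product
import numpy as np
from scipy.special import logsumexp

def cond_logprob(y_i, x_i, beta):
    """log Pr(y_i1,...,y_iT | sum_t y_it) for one unit; x_i has shape (T, K)."""
    T = len(y_i)
    eta = x_i @ beta                                        # x_it' beta for each t
    same_sum = [d for d in product((0, 1), repeat=T) if sum(d) == int(np.sum(y_i))]
    denom = logsumexp([np.dot(d, eta) for d in same_sum])   # log of the sum over B_i
    return float(np.dot(y_i, eta) - denom)
```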

4. Parameter Estimation with Imputed Covariate Sub-Matrix

4.1. Partitioned Covariate Matrix

In the presence of missing observations in the covariate vector $x_{it}$, we express it as the sum of two vectors, $x_{it}^s$ and $x_{it}^I$, containing the sample-present covariate values and the missing covariate values respectively. Therefore, we have the conditional probabilities (10) and (11) as
$$\Pr\!\left(y_{i1}=0,\, y_{i2}=1 \mid y_{i1}+y_{i2}=1\right) = \frac{e^{(\Delta x_i^s + \Delta x_i^I)'\beta}}{1 + e^{(\Delta x_i^s + \Delta x_i^I)'\beta}} \qquad (15)$$
and
$$\Pr\!\left(y_{i1}=1,\, y_{i2}=0 \mid y_{i1}+y_{i2}=1\right) = \frac{1}{1 + e^{(\Delta x_i^s + \Delta x_i^I)'\beta}} \qquad (16)$$
respectively, where $\Delta x_i^I = x_{i2}^I - x_{i1}^I$ and $\Delta x_i^s = x_{i2}^s - x_{i1}^s$.
The conditional log likelihood function is then obtained using equations (15) and (16) as
$$\ln L_c = \sum_{i=1}^{N}\left\{ d_{01i}\,\ln\!\left[\frac{e^{(\Delta x_i^s + \Delta x_i^I)'\beta}}{1 + e^{(\Delta x_i^s + \Delta x_i^I)'\beta}}\right] + d_{10i}\,\ln\!\left[\frac{1}{1 + e^{(\Delta x_i^s + \Delta x_i^I)'\beta}}\right]\right\} \qquad (17)$$
where, as before, $d_{01i}$ selects the individuals for which the dependent variable changed from 0 to 1 and $d_{10i}$ selects the cases for which it changed from 1 to 0.
Consistent estimates of the parameters of equation (17) are obtained iteratively using the Newton-Raphson algorithm.

4.2. The Newton-Raphson Algorithm and the Hessian Matrix in the Optimization of the Log-Likelihood Function

The Newton-Raphson method is an iterative procedure for calculating the roots of a function $f$. In this method, we approximate a root of the function by computing
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \qquad (18)$$
where $x_{n+1}$ is the $(n+1)$th iterate. The goal of the method is to make the approximation as close as possible to the exact result (that is, a root of the function). If $f$ is defined as the gradient function (score vector), then the first derivative of $f$ gives the Hessian matrix, the matrix of second-order derivatives of the likelihood function.
Starting from an initial estimate $\beta^{(0)}$, the algorithm iterates the estimate at step $h$ as
$$\beta^{(h)} = \beta^{(h-1)} + \left[J\!\left(\beta^{(h-1)}\right)\right]^{-1} s\!\left(\beta^{(h-1)}\right) \qquad (19)$$
where $s(\beta) = \partial \ln L / \partial \beta$ is the score vector and $J(\beta) = -\,\partial^2 \ln L / \partial \beta\, \partial \beta'$ is the observed information matrix, obtained by computing and negating the Hessian matrix.
The score vector and observed Hessian matrix from the log likelihood function are respectively,
Preprints 79891 i020
Preprints 79891 i021
where $D = \dfrac{\Delta x_i^s\, e^{\Delta x_i^s \beta} + \Delta x_i^I\, e^{\Delta x_i^I \beta}}{e^{\Delta x_i^s \beta} + e^{\Delta x_i^I \beta}}$.
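A minimal sketch of the Newton-Raphson iteration (19) for the $T=2$ conditional logit with partitioned covariate differences is shown below. The analytic score and information used here are the standard logit expressions written in terms of $\Delta x_i = \Delta x_i^s + \Delta x_i^I$; they are not a transcription of equations (20) and (21), and the names are illustrative.

```python
# Sketch of the Newton-Raphson update (19) for the T = 2 conditional logit,
# with the covariate difference split as dx = dx_s + dx_I (sample-present + imputed).
import numpy as np

def newton_raphson(dx_s, dx_I, d01, tol=1e-8, max_iter=50):
    dx = dx_s + dx_I                                   # recombine the partitioned covariates
    beta = np.zeros(dx.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(dx @ beta)))         # Lambda(dx'beta)
        score = dx.T @ (d01 - p)                       # s(beta)
        info = dx.T @ (dx * (p * (1.0 - p))[:, None])  # J(beta) = -Hessian
        step = np.linalg.solve(info, score)
        beta = beta + step                             # beta_h = beta_{h-1} + J^{-1} s
        if np.max(np.abs(step)) < tol:
            break
    return beta, -info                                 # estimate and Hessian at the optimum
```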
For well-defined parameter estimates of the log likelihood function, it is sufficient that (a) the log likelihood function is concave, indicating that the model is identified, and (b) the Hessian matrix is negative semi-definite, yielding negative curvature of the log likelihood surface. The determinant of the Hessian matrix, evaluated at a critical point of the function, equals the Gaussian curvature of the function considered as a manifold. Concavity of the log-likelihood function is easily established when all eigenvalues of its Hessian are negative, that is, when the Hessian is negative definite, in which case its determinant is nonzero; we therefore work with the modulus of the Hessian determinant as a summary of the curvature at the estimates.
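In practice this check can be carried out numerically at the estimate, for example as in the illustrative sketch below (it assumes the Hessian has already been computed, e.g. by the routine above):

```python
# Sketch of the concavity check: all Hessian eigenvalues negative at the estimate implies
# the log likelihood is strictly concave in a neighbourhood of the estimate; the modulus of
# the determinant is the quantity compared across imputation methods.
import numpy as np

def check_concavity(hessian):
    eigvals = np.linalg.eigvalsh(hessian)        # Hessian is symmetric
    negative_definite = bool(np.all(eigvals < 0))
    det_modulus = abs(np.linalg.det(hessian))
    return negative_definite, det_modulus
```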
In this study we confirm that the conditional log likelihood function of the logit panel data model preserves its concavity even when different imputation techniques are applied to the covariate matrix $X$ with missing values. Establishing the concavity or convexity of the log-likelihood function is necessary in order to know whether the solutions, or parameter estimates, are local or global optima. For the nonlinear logit panel data model, the maximum likelihood estimates are obtained when the Hessian matrix is negative semi-definite, resulting from a strictly concave log-likelihood function.
We use simulations to assess the relationship between the Hessian modulus and the properties of the parameter estimates for the conditional MLE of logit panel data model with various imputation techniques for missing covariates.

4.3. Monte Carlo Simulation

In this section, we present the results of a Monte Carlo simulation that investigates the concavity of the log likelihood function through the behaviour of the Hessian matrix when different imputation techniques are used to fill in the missing covariates. Focusing on the conditional ML estimator for the logistic model given by the maximization of (17), the simulation compares the properties of the Hessian matrices of the conditional log likelihood function obtained from the new data sets produced after imputation.
The simulation compares different sets of panel data generated by imputing covariates under imposed missingness patterns. This is achieved by imputing the missing observations and substituting the imputed vector $x_{it}^I$ into the conditional log likelihood function (17), where item-based and model-based imputation methods are used to fill in the missing covariates. We consider a binary response variable specified by the relation model:
Preprints 79891 i022
where $x_{it}$ is a vector of five explanatory variables drawn from uniform, binomial and normal distributions (see Table 4.1) and the error term $v_{it}$ has a logistic distribution given by $v_{it} = \ln\!\left(\frac{u_{it}}{1+u_{it}}\right)$ with $u_{it} \sim N(0,1)$. The coefficients $\beta_1$ to $\beta_5$ were fixed as $\beta_1=1$, $\beta_2=-1$, $\beta_3=1$, $\beta_4=1$ and $\beta_5=1$. The fixed effects $c_i$ are obtained as functions of $x_{(1)}$ and $T$ through the relation $c_i = \frac{T x_{(1)}}{n} + a_i$ with $a_i \sim N(0,1)$.
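A sketch of a comparable data-generating process is given below for illustration only. It draws the covariates from uniform, binomial and normal distributions as described and uses the stated coefficient values, but it draws the logistic error directly with numpy rather than through the transform above, and the threshold-crossing form and the fixed-effect rule tying $c_i$ to $x_{(1)}$ are assumptions made for the sketch.

```python
# Illustrative data-generating process (not the paper's exact design).
import numpy as np

rng = np.random.default_rng(42)

def simulate_panel(N, T=2, beta=np.array([1.0, -1.0, 1.0, 1.0, 1.0])):
    K = beta.size
    x = np.empty((N, T, K))
    x[:, :, 0] = rng.uniform(0.0, 1.0, size=(N, T))     # x1 ~ Uniform
    x[:, :, 1] = rng.binomial(1, 0.5, size=(N, T))      # x2 ~ Binomial
    x[:, :, 2:] = rng.normal(size=(N, T, K - 2))        # x3..x5 ~ Normal
    c = x[:, :, 0].mean(axis=1) + rng.normal(size=N)    # fixed effect tied to x1 (assumed rule)
    v = rng.logistic(size=(N, T))                       # logistic disturbances drawn directly
    y = (np.einsum("ntk,k->nt", x, beta) + c[:, None] + v > 0).astype(int)
    return x, y
```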
Table 4.1. Description of variables.
Preprints 79891 i023
To establish the sample sizes, we imposed an expected probability of success $\Pr(y_{it}=1 \mid x_{it}, c_i) = 0.5$ and plausible coefficients of variation (CoV) of 0.2, 0.14 and 0.09 respectively in the relation $N \geq \frac{1-\Pr(y)}{\Pr(y)\,\mathrm{CoV}^2}$. In order to cover small, medium and large sample sizes, three values of $N$ ($N = 50$, $N = 100$ and $N = 250$) were used for all sets of data fitted to the models, to enable in-depth comparisons and to assess the impact of sample size on the determinant of the Hessian matrix of the log-likelihood function. To evaluate the impact of the proportion of missingness, we use two proportions, 10% and 30%, obtained by randomly deleting the desired proportion of observations from the data set and imputing them back for each sample size.
For each data set specified, we find the determinant of the Hessian matrix and plot it against the corresponding data code for ease of comparison across sample sizes. We use the determinant of the Hessian matrix as a generalization of the second derivative test for single-variable functions: if the determinant of the Hessian is nonzero at the critical point we have an extreme value, which is a maximum if the matrix is negative (semi-)definite, showing that the log likelihood is a concave function. The imputation techniques used herein are mean imputation, median imputation, last value carried forward, multiple imputation with chained equations (MICE) and Bayesian imputation.
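The classical imputation steps can be sketched as below. Pandas is used for mean, median and last-value-carried-forward imputation, and scikit-learn's IterativeImputer stands in for MICE; this is illustrative only and does not reproduce the paper's Bayesian imputation. The (id, t) MultiIndex is an assumption of the sketch.

```python
# Sketch of the imputation steps; the DataFrame is assumed to carry a ("id", "t") MultiIndex.
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

def impute(df: pd.DataFrame, method: str) -> pd.DataFrame:
    if method == "mean":
        return df.fillna(df.mean())
    if method == "median":
        return df.fillna(df.median())
    if method == "lvcf":                                    # last value carried forward per unit
        return df.groupby(level="id").ffill()
    if method == "mice":
        filled = IterativeImputer(random_state=0).fit_transform(df)
        return pd.DataFrame(filled, index=df.index, columns=df.columns)
    raise ValueError(f"unknown method: {method}")
```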
Table 2. Parameter Estimates by Conditional MLE for Complete Unimputed Panel Data Set.
Preprints 79891 i024
Table 3. Parameter Estimates by Conditional MLE for Mean Imputed Panel Data Set.
Preprints 79891 i025
Table 4. Parameter Estimates by Conditional MLE for LVCF Imputed Panel Data Set.
Preprints 79891 i026
Table 5. Parameter Estimates by Conditional MLE for Median Imputed Panel Data Set.
Preprints 79891 i027
Table 6. Parameter Estimates by Conditional MLE for Bayesian (MICE) Imputed Panel Data Set.
Preprints 79891 i028
Table 7. Comparative Parameter Biases and Determinants of the Hessian Matrices across Different Imputed Panel Data Sets with Varying sample sizes.
Preprints 79891 i029a
Preprints 79891 i029b

5. Discussion, Conclusion and Recommendations

Through variation of the simulated sample sizes, the study confirms the asymptotic properties of parameter bias. The results show that the parameter estimates improve with increasing sample size; the precision of the estimates increases asymptotically, making them more statistically significant.
The key objectives of this study were to focus on a method that modifies the conditional likelihood function through partitioning of the covariate matrix, in a bid to curb the incidental parameter problem, and to assess the susceptibility of the Hessian matrix of the log likelihood function to the imputation technique employed in completing a panel data set with missing covariates.
Undeniably, of the classical imputation techniques, mean and median imputation do not introduce much undue bias into the data set and therefore perform relatively better than the last value carried forward technique and mode imputation. However, a model-based imputation technique such as MICE yields even better estimates, with further reduced bias and higher precision [20]. Figure 1 shows the varying and decreasing trends of the parameter estimates across sample sizes and across the imputation methods used in this study.
The value of $x_i^I$ impacts inversely on the elements of the Hessian matrix and consequently on its determinant. The study reveals that the smaller the determinant, the larger the parameter estimates, which signifies increased bias for smaller sample sizes. This indicates that by increasing the determinant of the Hessian matrix through a reduction in the $x_i^I$ values, we drive the product $\left[J(\beta^{(h-1)})\right]^{-1} s(\beta^{(h-1)})$ towards zero.
From the Newton-Raphson algorithm (19), therefore, the inverse of the observed information $J$ serves to reduce the product $\left[J(\beta^{(h-1)})\right]^{-1} s(\beta^{(h-1)})$ and thereby yield convergence of the iterations of $\beta^{(h)}$. An increasing Hessian modulus therefore ensures faster convergence of the parameter estimates with greater precision, as seen from Table 7 and Figure 2. The positive moduli of the Hessians for the conditional MLEs are consistent with concavity of the log likelihood function, which gives the optimum estimates of the parameters.
A key importance of deriving the estimators is to increase theoretical understanding of the estimators and to reduce computational complexity when estimating logit panel models. As observed from the Monte Carlo results, unbalancedness in a data set biases the parameter estimates, and the different imputation techniques employed in this study affect the curvature of the log likelihood, as captured by the Hessian, differently; hence the bias and efficiency of the estimates are affected too.
Although studies show that the within estimator performs relatively well in many standard situations, this study has demonstrated the advantage of working with the conditional likelihood function rather than the unconditional likelihood, as a working trick that eliminates the fixed effects from the estimation process and thereby allows us to concentrate on the parameter estimates alone.
As a recommendation, this study can be extended by considering multiple fixed effects and observed time periods greater than $T=2$.

References

  1. Janssen KJ, Donders ART, Harrell FE Jr., Vergouwe Y, Chen Q, Grobbee DE, Moons KG. Missing covariate data in medical research: to impute is better than to ignore. Journal of Clinical Epidemiology 2010; 63(7):721–727. [CrossRef]
  2. Donders ART, van der Heijden GJ, Stijnen T, Moons KG. Review: a gentle introduction to imputation of missing values. Journal of Clinical Epidemiology 2006; 59(10):1087–1091. [CrossRef]
  3. Knol MJ, Janssen KJ, Donders ART, Egberts AC, Heerdink ER, Grobbee DE, Moons KG, Geerlings MI. Unpredictable bias when using the missing indicator method or complete case analysis for missing confounder values: an empirical example. Journal of Clinical Epidemiology 2010; 63(7):728–736. [CrossRef]
  4. Cox, D.R. and Hinkley, D.V. (1974) Theoretical Statistics. Chapman and Hall, London.
  5. Firth, D. (1993) Bias Reduction of Maximum Likelihood Estimates. Biometrika, 80, 27-38. [CrossRef]
  6. Anderson, J.A. and Richardson, C. (1979) Logistic Discrimination and Bias Correction in Maximum Likelihood Estimation. Technometrics, 21, 71-78.
  7. McCullagh, P. (1986) The Conditional Distribution of Goodness-of-Fit Statistics for Discrete Data. Journal of the American Statistical Association, 81, 104-107. [CrossRef]
  8. Shenton, L.R. and Bowman, K.O. (1977) Maximum Likelihood Estimation in Small Samples. Griffin’s Statistical Monograph No. 38, London. [CrossRef]
  9. Lee, S. (2017) Detecting Differential Item Functioning Using the Logistic Regression Procedure in Small Samples. Applied Psychological Measurement, 41(1), 30-43.
  10. Puhr, R., Heinze, G., Nold, M. et al. (2017) Firth's Logistic Regression with Rare Events: Accurate Effect Estimates and Predictions? Statistics in Medicine, 36(14), 2302-2317.
  11. Firth, D. (1993) Bias Reduction of Maximum Likelihood Estimates. Biometrika, 80(1), 27-38. [CrossRef]
  12. Chamberlain, G., (1980). Analysis of Covariance with Qualitative Data, Review of Economic Studies 47, 225-238. [CrossRef]
  13. Matyas, L. and Lovrics, L. (1991). Missing observations and panel data- A Monte-Carlo analysis, Economics Letters 37(1), 39-44. [CrossRef]
  14. Neyman, J. and Scott, E.L., (1948). Consistent estimation from partially consistent observations. Econometrica 16, 1-32.
  15. Lancaster, T., (2000). The incidental parameter problem since 1948, Journal of Econometrics, 95, 391–413. [CrossRef]
  16. Baltagi, B.H. (2001). Econometric Analysis of Panel Data, 2nd edition, New York, John Wiley.
  17. Hsiao, C., (2003). Analysis of Panel Data, 2nd edition, New York, Cambridge University Press.
  18. Greene, W., (2004a). The behaviour of the maximum likelihood estimator of limited dependent variable models in the presence of fixed effects. Econometrics Journal 7(1): 98. [CrossRef]
  19. Opeyo P.O., Olubusoye O.E. and Odongo L.O. (2014), Conditional Maximum Likelihood Estimation for Logit Panel Models with Non-Responses, International Journal of Science and Research, 3(7), 2242-2254.
  20. Opeyo, P. , Cheng, W. and Xu, Z. (2023) Superiority of Bayesian Imputation to Mice in Logit Panel Data Models. Open Journal of Statistics, 13, 316-358. [CrossRef]
Figure 1. Comparative Parameter Estimates by Conditional MLE for Different Imputed Panel Data Sets with Varying sample sizes and Proportions of Missingness.
Preprints 79891 g001
Figure 2. Comparative Determinants of the Hessian Matrices across Different Imputed Panel Data Sets with Varying sample sizes.
Preprints 79891 g002
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.