Preprint
Article

Bayesian Joint Modeling Analysis of Longitudinal Proportion and Survival Data

Altmetrics

Downloads

121

Views

30

Comments

0

A peer-reviewed article of this preprint also exists.

This version is not peer-reviewed

Submitted:

06 July 2023

Posted:

10 July 2023

You are already at the latest version

Alerts
Abstract
This paper focuses on a joint model to analyse longitudinal proportion and survival data. After transforming the longitudinal proportional data by a logit function, we adopt a partially linear mixed-effect model for it, in which nonlinear function of time is fitted via using B-splines technique and a centered Dirichlet Process Mixture Model (CDPMM) is specified for a general distribution of random effects. The survival data is assumed to a Cox proportional hazard model, the sharing random effects joint model is developed for the two types of data. Combining the Gibbs sampler and the Metropolis-Hastings algorithm, we propose a Bayesian Lasso (BLasso) method to simultaneously estimate unknown parameters and select important covariates. Simulation studies are conducted to investigate the finite sample performance of the proposed methods. An example from the MA.5 research experiment is used to illustrate the proposed methodologies.
Keywords: 
Subject: Computer Science and Mathematics  -   Probability and Statistics

1. Introduction

Joint analysis of longitudinal and survival data have been widely applied to cancer and HIV/AIDS clinical studies where time to event outcomes, such as disease free and overall survivals, are usually the primary endpoints. In particular, after Faucett and Thomas [1] and Wulfsohn and Tsiatis [2] introduced what is the standard joint model, the joint model has been extensively explored. The advantages of joint models have been discussed by researchers [3,4,5,6,7,8]. However, in clinical trials, some patients with poor quality of life may drop out of the study due to disease recurrence or death. In this case, the absence of quality of life measures resulting from the withdrawal of these patients is informative. In order to provide strong evidence for the tradeoff between intensive treatment and the associated poor quality of life for patients, we conducted joint modeling of longitudinal life measures and survival data to investigate the relationship between them. For longitudinal quality of life and survival data, Henderson et al. [9] and Zeng and Cai [10] considered the use of shared normal distribution random effects to jointly analyze the relationship between longitudinal quality of life and survival time. Tang et al. [11] considered a novel semiparametric joint model for multivariate longitudinal and survival data to analyze data from the International Breast Cancer Study. Longitudinal life quality measurement data can be linearly converted into longitudinal proportion data whose value range is in the unit interval (0,1) [12]. For the longitudinal component, there are two methods to deal with it. The first method applied the classic linear mixed model to the longitudinal scale data after logit transformation, and the second method directly used the simplex distribution to model the longitudinal scale data. The models established by the two methods both used the EM algorithm and the Laplace approximation to estimate the unknown parameters. In order to be more flexible and practical, this paper will use a partial linear mixed effect model to the logit transformed longitudinal scale data, and use the B-splines method to model the nonlinear function in the model. Meanwhile, to promote the feasibility of our proposed model, we use CDPMM method to model random effects.
In addition, variables selection in the joint model is also considered. In traditional regression models, methods of variable selection include the forward selection method, backward elimination method, stepwise selection method or some information criterion such as the Akaike information criterion. But these approaches are computationally expensive and unstable for the complicated models with a large number of covariates. Therefore, some penalized likelihood methods are gradually proposed, especially Lasso of Tibshirani [13] estimating linear regression coefficients by L 1 -constrained least squares. Tibshirani [13] proposed that Lasso estimates can be interpreted as posterior norm estimates when the regression parameters have independent and identical Laplacian priors (ie, double exponential priors). Motivated by this connection, under the Bayesian framework, Park and Casella [14] proposed the BLasso variable selection method by imposing the double exponential prior on the regression coefficients and the gamma distribution on the shrinkage parameter. The BLasso method has been applied to linear regression model [15], semi-parametric structural equation model [16] and joint models of longitudinal and survival data [17]. Therefore, this paper extended that BLasso variable selection method to the joint model of longitudinal proportion data and survival data and a BLasso approach is developed to simultaneously estimate unknown parameters and identify the significant effect of the important covariates.
The rest of this paper is organized as follows. In Section 2, the joint model of longitudinal proportion and survival data is introduced. In Section 3, the Bayesian estimations of joint model are proposed. In Section 4, three numerical simulations are presented to evaluate the performance of the proposed methods. In Section 5, we apply the proposed approach to the MA.5 research experiment data and some concluding remarks are given in Section 6. Technical details are given in Appendix A.

2. Model and Notation

Consider a data set from n individuals. Let y i j be a longitudinal proportional measurement for the i-th individual ( i = 1 , 2 , , n ) at observation time point t i j for j = 1 , 2 , , n i , and y i j ( 0 , 1 ) . We assume y i j * is the logit-transformation of y i j , and y i j * = logit y i j = log 1 y i j y i j . Further, T i * and C i are the true survival time and censoring time, respectively. Let T i = min ( T i * , C i ) denotes the corresponding observed event time, and δ i = 1 ( T i * C i ) denotes the failure indicator, where 1 ( · ) is a indicator function.
Denote y * = y 1 * , y 2 * , , y n * , y i * = y i 1 * , y i 2 * , , y i n i * , T = T 1 , T 2 , , T n , Δ = δ 1 , δ 2 , , δ n . Let b = { b 1 , b 2 , , b n } be time-independent random effects underlying both the longitudinal and survival processes for the i-th individual. Given the random effects b i , assume that y i j * follows partially linear mixed-effect model
y i j * | b i = X i j β + g ( t i j ) + Z i j b i + ε i j ,
where X i j and Z i j represent the time-independent design vectors of fixed and random effects associated with y i j * , respectively. And β is a p 1 × 1 fixed effects regression parameters, b i is a q × 1 random effects vector, g ( t ) is a twice continuous differentiable unknown function, and ε i j is a white noise process with variance σ 2 . Also, we assume that ε i j ’s are independent of b i . To facilitate the feasibility of our proposed model, instead of the traditional normality assumption, which may be violated in some applications [18], it is specified the random effects by Dirichlet process (DP) mixture of normals.
For event time T i , given random effects b i , we assume that T i follows the hazard model
λ i ( t | b i ) = λ 0 ( t ) exp ( W i γ + ϕ b i ) ,
where W i is the known fixed effects design matrix linking the unknown p 2 × 1 parameter vector γ to λ i ( t | b i ) , ϕ is an unknown q × 1 parameter vector linking b i to λ i ( t | b i ) , and λ 0 t is an unknown basic hazard function.
From the above discussion, it is suggested to link models (1) and (2) through shared random effects, called a shared random effects joint model (JMSRE). Parameter ϕ in model JMSRE reflects the correlation between transformed longitudinal proportional data and survival data, given random effects. When ϕ = 0 q × 1 , it means that the longitudinal index is not necessarily related to the event time, that is, longitudinal proportional data and survival data can be modeled separately. So in this case joint modeling is not necessary and longitudinal indicators can be ignored for modeling survival data.
Further, to make Bayesian inference on β based on model (1), we approximate g ( t ) through a B-splines method
g t B 1 t φ 1 + B 2 t φ 2 + + B L t φ L = B t φ ,
where L = d + K + 1 , d is the degree of B-splines, K is the number of knots, φ = φ 1 , φ 2 , , φ L is a L × 1 unknown coefficient vector, and B ( t ) = B 1 ( t ) , B 2 ( t ) , , B L ( t ) .
Denote θ y = β , φ , σ 2 be unknown parameters associated with model (1), and θ T = γ , ϕ be unknown parameters associated with model (2). Thus, given ( θ y , θ T , b ) , a joint likelihood function of ( y * , T , Δ ) can be written as
p ( y * , T , Δ | θ y , θ T , b ) = i = 1 n p y i * | b i ; θ y p T , Δ | b i ; θ T ,
where
p y i * | b i ; θ y = j = 1 n i 1 2 π σ 2 exp y i j * X i j β B ( t i j ) φ Z i j b i 2 2 σ 2 ,
p T , Δ | b i ; θ T = i = 1 n exp ( W i γ + ϕ b i ) j R i exp ( W j γ + ϕ b j ) δ i , R i = j : T j T i .

3. Bayesian estimation of joint model

3.1. Prior specification

To develop Bayesian inference on the considered models, we need specifying the prior distributions for σ 2 , φ , β , γ and ϕ . For conjugation, we consider the following priors for σ 2 , φ :
σ 2 Γ a 0 , b 0 , φ N L 0 , H φ 0 ,
where a 0 , b 0 , and H φ 0 are pre-given hyperparameters, Γ a 0 , b 0 denotes the Gamma distribution with parameter a 0 and the shape parameter b 0 .
Following Tang et al. [17], we specify the DP mixture of normals for the random effects b i , that is, b i i . i . d . g = 1 π g N q ( μ g , Ω g ) with ( μ g , Ω g ) P , where P is a random probability with an unknown form. one common approach is to specify DP prior to approximate P , i.e., P DP ( τ F 0 ) in which F 0 is a base distribution that serves as a starting-point for constructing the nonparametric distribution, and τ is a weight that indicates the researcher’s certainty of F 0 as the distribution of P . To gain the form of F , motivated by Ohlssen et al. [18] and Yang et al. [19], we consider the following truncated approximate centered Dirichlet Process Mixture Model (CDPMM), which can specify the prior distribution of b i as follows:
b i i . i . d . g = 1 G π g N q ( μ g , Ω g ) with μ g = μ g * g = 1 G π g μ g * and ( μ g * , Ω g ) i . i . d . F 0 ,
where 1 G < , and π g is a random probability weight chosen to be independent of ( μ g * , Ω g ) such that 0 π g 1 and g = 1 π g = 1 . In order to ease the computational intensity, according to take moderate value proposed by Ishwaran and Zarepour (2000) [20], we consider G = 25 , and π g is given by the following stick-breaking procedure:
π 1 = ϑ 1 and π g = ϑ g ι = 1 g 1 ( 1 ϑ ι ) for g = 2 , , G ,
where ϑ g i . i . d . Beta ( 1 , τ ) for g = 1 , 2 , , G 1 , and ϑ G = 1 so that g = 1 G π g = 1 . The prior distribution for the unknown parameter τ is given by τ Γ ( a 1 , a 2 ) with prespecified hyperparameters a 1 and a 2 [21]. Here, we took the hyperparameters a 1 and a 2 to be 25 and 5, respectively.
Sampling from posterior distributions of b i with the above specified DP prior is quite difficult and inefficient. An efficient and flexible method for solving the above question is to represent b i in terms of a latent variable L i { 1 , 2 , , G } , which records each b i ’s cluster membership and conveys its parametric value to the distribution of b i . Let L = { L 1 , L n , , L n } , π = { π 1 , π 2 , , π G } , μ * = { μ 1 * , μ 2 * , , μ G * } and Ω = { Ω 1 , Ω 2 , , Ω G } in which Ω g = diag ( ω g 1 , ω g 2 , , ω g q ) , and they can be reformulated by
L i | π i . i . d g = 1 G π g δ g ( · ) and ( π , μ * , Ω ) f 1 ( π ) f 2 ( μ * ) f 3 ( Ω ) ,
where δ g ( · ) denotes a discrete probability measure concentrated at g, f 1 ( π ) is specified by the stick-breaking prior as given in Equation (5), f 2 ( μ * ) = g = 1 G f 2 ( μ g * ) and f 3 ( Ω ) = g = 1 G j = 1 q f 3 ( ω g j ) . Here, μ g * and ω g j can be specified by
μ g * | ξ , Ψ i . i . d N q ( ξ , Ψ ) , ξ | ξ 0 , Ψ 0 N q ( ξ 0 , Ψ 0 ) , ψ j 1 | c 1 , c 2 Γ ( c 1 , c 2 ) for j = 1 , 2 , , q ,
ω g j 1 | ω j a , ϖ j Γ ( ω j a , ϖ j ) and ϖ j | ϖ j a , ϖ j b Γ ( ϖ j a , ϖ j b ) ,
respectively, where Ψ = diag ( ψ 1 , ψ 2 , , ψ q ) , Γ ( c 1 , c 2 ) denotes the Gamma distribution with parameters c 1 and c 2 , and ξ 0 , Ψ 0 , c 1 , c 2 , ω j a , ϖ j a and ϖ j b are prespecified hyperparameters [18]. We consider ξ 0 = 0 q × 1 , Ψ 0 = I q , c 1 = 11 , c 2 = 2.5 , ω j a = 3 , ϖ j a = n and ϖ j b = 10 in the paper. Given the values of L i , μ * and Ω , we can sample b i from N q ( μ L i , Ω L i ) with μ L i = μ L i * Σ g = 1 G μ g * .
Hereafter we mainly introduce the variable selection principle of BLasso method [14,16] for the proposed joint model JMSRE. We need to identify not only the important variables in models (1) and (2). In particular, we also need to identify whether the parameter ϕ is 0 q × 1 , and our proposed BLasso method does just that. In general, the prior distribution of the regression parameters is set to a multivariate normal distribution. Following the idea of Bayesian Lasso inference [14], we consider the following hierarchical priors for β , γ and ϕ :
β | H β N p 1 ( 0 , H β ) , with H β = d i a g ( h β 1 2 , h β 2 2 , , h β p 1 2 ) , f ( h β 1 2 , h β 2 2 , , h β p 1 2 ) = j = 1 p 1 ϑ β j 2 2 exp ϑ β j 2 2 h β j 2 ,
γ | H γ N p 2 ( 0 , H γ ) , with H γ = d i a g ( h γ 1 2 , h γ 2 2 , , h γ p 2 2 ) , f ( h γ 1 2 , h γ 2 2 , , h γ p 2 2 ) = j = 1 p 2 ϑ γ j 2 2 exp ϑ γ j 2 2 h γ j 2 ,
ϕ | H ϕ N q ( 0 , H ϕ ) , with H ϕ = d i a g ( h ϕ 1 2 , h ϕ 2 2 , , h ϕ q 2 ) , f ( h ϕ 1 2 , h ϕ 2 2 , , h ϕ q 2 ) = j = 1 q ϑ ϕ j 2 2 exp ϑ ϕ j 2 2 h ϕ j 2 ,
where ϑ β = ϑ β 1 , ϑ β 2 , , ϑ β p 1 , ϑ γ = ϑ γ 1 , ϑ γ 2 , , ϑ γ p 2 and ϑ ϕ = ϑ ϕ 1 , ϑ ϕ 2 , , ϑ ϕ q are the regularization parameters that control the tail decay. In particular, in order to better control the effect of tail decay, this paper sets different regularization parameters for different components of the same parameter. Inspired by Park and Casella [14], we further consider the following super-priorities for these tuning parameters:
ϑ β j 2 Γ a ϑ β , b ϑ β , j = 1 , 2 , , p 1 ,
ϑ γ j 2 Γ a ϑ γ , b ϑ γ , j = 1 , 2 , , p 2 ,
ϑ ϕ j 2 Γ a ϑ ϕ , b ϑ ϕ , j = 1 , 2 , , q .

3.2. Bayesian analysis of joint model

To obtain Bayesian estimates of unknown parameters β , φ , σ 2 , b , γ and ϕ , a hybrid algorithm combining the block Gibbs sampler and the Metropolis-Hastings algorithm is employed to iteratively draw { β , φ , σ 2 , b , γ , ϕ } as follows.
(A) Conditional distribution of β
It follows from Equation (3) and Equation (7) that the conditional posterior distribution p ( β | φ , σ 2 , b , y * ) is expressed as
p ( β | φ , σ 2 , b , y * ) exp 1 2 i = 1 n j = 1 n i 1 σ 2 y i j * X i j β B ( t i j ) φ Z i j b i + β H β 1 β ,
which yields
β | φ , σ 2 , b , y * N p 1 A β , V β ,
where V β = i = 1 n j = 1 n i σ 2 X i j X i j + Σ β 1 , A β = V β i = 1 n j = 1 n i X i j y i j * Z i j b i B ( t i j ) φ σ 2 .
(B) Conditional distribution of φ According to Equation (3) and the prior of φ in Equation (4), the conditional distribution p ( φ | β , σ 2 , b , y * ) is given by
p ( φ | β , σ 2 , b , y * ) exp 1 2 i = 1 n j = 1 n i 1 σ 2 y i j * X i j β B ( t i j ) φ Z i j b i + φ ( H φ 0 ) 1 φ ,
which yields
φ | β , σ 2 , b , y * N L A φ , V φ ,
where V φ = i = 1 n j = 1 n i σ 2 B ( t i j ) B ( t i j ) + Σ φ 1 , A φ = V φ i = 1 n j = 1 n i B ( t i j ) y i j * X i j β Z i j b i σ 2 .
(C) Conditional distribution of σ 2
It follows from Equation (3) and the prior of σ 2 in Equation (4) that the conditional distribution p ( σ 2 | β , φ , b , y * ) is given by
p ( σ 2 | β , φ , b , y * ) exp 1 2 i = 1 n j = 1 n i y i j * X i j β B ( t i j ) φ Z i j b i + b 0 σ 2 × ( σ 2 ) 1 2 i = 1 n n i + a 0 1 ,
which yields
σ 2 | β , φ , b , y * Γ 1 2 i = 1 n n i + a 0 , 1 2 y i j * X i j β B ( t i j ) φ Z i j b i + b 0 .
(D) Conditional distribution of b i
For reasons of space, the sampling of b i , i = 1 , 2 , , n follows the steps in Appendix A, which can also be seen in Tang et al. [24]
(E) Conditional distribution of γ
It follows from Equation (3) and Equation (8) that the conditional distribution p ( γ | b , ϕ , T , Δ ) is proportional to
i = 1 n exp ( W i γ + ϕ b i ) j R i exp ( W j γ + ϕ b j ) δ i e x p 1 2 γ T H γ 1 γ ,
which is not a familiar distribution. Therefore, the well-known Metropolis-Hastings (MH) algorithm is adopted to simulate observations from the above given conditional distribution, which is implemented as follows. Given the current value γ ( m ) , new candidates γ are generated from N p 2 ( γ ( m ) , σ γ 2 Σ γ ) , where Σ γ = 2 ( ln ( p ( γ | b , ϕ , T , Δ ) ) / γ γ | γ = γ ( m ) 1 . The new γ ( m ) is accepted with probability
m i n 1 , p ( γ | b , ϕ , T , Δ ) p ( γ ( m ) | b , ϕ , T , Δ ) ,
where
Σ γ = i = 1 n j R i exp ( · ) W j W j j R i exp ( · ) + j R i exp ( · ) W j j R i exp ( · ) W j j R i exp ( · ) 2 + H γ 1 1 .
with exp ( · ) = exp ( W j γ ( m ) + ϕ b j ) . The variance coefficient σ γ 2 can be chosen such that the average acceptance rate are approximately 0.25 or more.
(F) Conditional distribution of ϕ
From Equation (3) and Equation (9), the conditional distribution p ( ϕ | γ , b , T , Δ ) is proportional to
i = 1 n exp ( W i γ + ϕ b i ) j R i exp ( W j γ + ϕ b j ) δ i e x p 1 2 γ T H ϕ 1 γ ,
which is not a familiar distribution. Similar to above (E), given the current value ϕ ( m ) , new candidates ϕ are generated from N q ( ϕ ( m ) , σ ϕ 2 Σ ϕ ) , where Σ ϕ = 2 ( ln ( p ( ϕ | b , γ , T , Δ ) ) / ϕ ϕ | ϕ = ϕ ( m ) 1 . The new ϕ ( m ) is accepted with probability
m i n 1 , p ( ϕ | γ , b , T , Δ ) p ( ϕ ( m ) | γ , b , T , Δ ) ,
where
Σ ϕ = i = 1 n j R i exp ( · ) b j b j j R i exp ( · ) + j R i exp ( · ) b j j R i exp ( · ) b j j R i exp ( · ) 2 + H ϕ 1 1 ,
and exp ( · ) = exp ( W j γ + ϕ ( m ) b j ) . The variance coefficient σ ϕ 2 can be chosen such that the average acceptance rate are approximately 0.25 or more.
By the above iterative process, we can obtain a series of sample { ( β ( m ) , φ ( m ) , σ 2 ( m ) , b i ( m ) , γ ( m ) , ϕ ( m ) ) : m = 1 , 2 , , M } . Then, Bayesian estimates of β , φ , σ 2 , b i , γ and ϕ can be obtained by
β ^ = 1 M m = 1 M β ( m ) , φ ^ = 1 M m = 1 M φ ( m ) , σ 2 ^ = 1 M m = 1 M σ 2 ( m ) ,
b ^ i = 1 M m = 1 M b i ( m ) γ ^ = 1 M m = 1 M γ ( m ) , ϕ ^ = 1 M m = 1 M ϕ ( m ) .
Similarly, the consistent estimates of the posterior covariance matrices of var ( β | y * , X , Z ) , var ( φ | y * , X , Z ) , var ( σ 2 | y * , X , Z ) , var ( γ | W , T , Δ ) , and var ( ϕ | y * , X , Z , W , T , Δ ) can be obtained via the sample covariance matrices. For example
var ^ ( β | y * , X , Z ) = 1 M 1 m = 1 M ( β ( m ) β ^ ) ( β ( m ) β ^ ) .
Therefore, the variance of the corresponding parameter can be obtained through the diagonal elements of the sample covariance matrix of the random sample sequence.

4. Simulation Studies

In this section, we conducted three simulation studies to investigate the finite performance of the above proposed methods.
We considered the model defined in models (1) and (2) with 200 individuals. Here, the specific model is as follows:
y i j * | b i = X 1 i j β 1 + X 2 i j β 2 + X 3 i j β 3 + X 4 i j β 4 + X 5 i j β 5 + X 6 i j β 6 + sin 2 π t i j + b i + ε i j ,
λ i t | b i = λ 0 t exp W 1 i γ 1 + W 2 i γ 2 + W 3 i γ 3 + W 4 i γ 4 + ϕ b i .
The data were generated as follows: observation time t i j was generated randomly between 0 and 3, the covariates X 1 i j and X 6 i j followed the Bernoulli distribution with the success probability 0.5 and 0.3, respectively. The covariates X 2 i j , X 3 i j , X 4 i j and X 5 i j were generated form the multivariate normal distribution N 4 0 , Σ with mean vector 0 and covariance matrix Σ . And the covariance matrix Σ is a symmetric positive definite matrix whose diagonal elements are 1 and the rest are 0.5 . The random error ε i j is generated from the normal distribution with mean 0 and variance σ 2 = 0 . 6 2 . And W i = ( W i 1 , W i 2 , W i 3 , W i 4 ) = ( X 3 i 1 , X 4 i 1 , X 5 i 1 , X 6 i 1 ) . The baseline hazard function λ 0 t = 0.7 and ϕ = 0.6 . The censoring time C i was generated from the Uniform distribution U 0 , 3 , and T i * was generated from the Exponential distribution with mean 1 / λ i t | b i , T i = min ( T i * , C i ) . Our main purpose is to use the proposed approaches to identify the unimportant covariates and estimate nonzero coefficients. Bayesian results were obtained from 200 replications.
To show that our proposed method above can both accurately estimate parameters of interest and identify unimportant variables, but also flexibly capture the feature of nonlinear function g t and random effects b i , we consider three different simulation studies as follows.
Simulation I.
β = β 1 , , β 6 = 1 , 0 , 0 , 0.5 , 0.5 , 1 , γ = γ 1 , , γ 4 = 0 , 1 , 0.5 , 0 ,
g ( t ) = sin 3 4 π t , b i i . i . d 0.6 N ( 0.8 , 0 . 1 2 ) + 0.4 N ( 1.2 , 0 . 5 2 ) ,
Simulation II.
β = β 1 , , β 6 = 1 , 0 , 0 , 0.5 , 0.5 , 1 , γ = γ 1 , , γ 4 = 0 , 1 , 0.5 , 0 ,
g ( t ) = t , b i i . i . d 0.4 N ( 0 , 0 . 3 2 ) + 0.3 N ( 1.5 , 0 . 1 2 ) + 0.3 N ( 1.5 , 0 . 1 2 ) ,
Simulation III.
β = β 1 , , β 6 = 1 , 0.5 , 0.5 , 0.5 , 0.5 , 1 , γ = γ 1 , , γ 4 = 0.5 , 1 , 0.5 , 1 ,
g ( t ) = t 2 , b i i . i . d N ( 0 , 0 . 8 2 ) .
which the mean censoring rates of the survival times in the three simulation studies are 44%, 45%, and 37%, respectively.
For each of the above generated data sets, the proposed semiparametric Bayesian procedure was used to simultaneously evaluate Bayesian estimates of unknown parameters and select the important covariates. The prior hyperparameters a ϑ β = a ϑ γ = a ϑ ϕ = 1 , b ϑ γ = b ϑ β = b ϑ ϕ = 0.1 , corresponding to the hyperparameters of the hyper priors for the adjustment coefficients in Equations (10)-(). a 0 = 1 , b 0 = 1 and H φ 0 = 100 I 4 , corresponding to the prior parameters of σ 2 , φ . We set the degree of B-splines d = 3 , the number of konts K = 4 , and G = 25 . To investigate the convergence of the proposed algorithm, we calculated the estimated potential scale reduction (EPSR) values of parameters. In addition, we also need to test whether the nonlinear function fitted by the B-splines method is convergent. It can be seen from Figure 1 that the EPSR values were less than 1.2 after about 3000 iterations in three simulation studies. Thus, 3000 ( M = 3000 ) observations were collected to calculated Bayesian estimates of parameters after 3000 iterations in producing Bayesian results for each of 200 replications. Results obtained under three simulation studies were reported in Table 1, where “Bias” is the difference between the true value and the mean of the estimates based on 200 replications, “RMS” is the root mean square between the estimates based on 200 replications and its true value, and “F0” is the proportion that parameter is identified to be zero in 200 replications in terms of the criterion that a parameter is identified to be 0 if its 95 % confidence interval contains 0.
Results of Table 1 indicated that (i) Bayesian estimates of parameters are reasonably accurate because their absolute biases are less than 0.10 and the root mean square of their RMS values are less than 0.20; (ii) BLasso could identify the correct models in most cases regardless of prior inputs of parameters because the F0 values corresponding to the important covariates were less than 10 % , but the F0 values corresponding to unimportant covariates were more than 90 % . The performance of the proposed approach to recover the nonlinear function g ( t ) can be measured by the root mean square error
RASE g ( r ) = 1 300 l = 1 300 g ( u l ) g ^ ( r ) ( u l ) , r = 1 , 2 , , 200 ,
where g ^ ( r ) t = B t φ ^ ( r ) , φ ^ ( r ) represents the Bayesian estimated value of the parameter vector φ in the r-th replication. Similar to the root mean square error of nonlinear function, we also calculate the root mean square error of random effects. Figure 2 plots the estimated curve and estimated density of the nonlinear function g ^ ( t ) and the random effects b i of the replication in which the mean of the RASE of the nonlinear function and the random effects is in the middle of the 200 replications, and against true curves and true density in three simulation studies, respectively. Inspection of Figure 2 showed that (i) the Bayesian B-splines method proposed in this paper can be flexible enough fit the true curve of the nonlinear function g ( t ) ; (ii) the CDPMM method proposed is flexible enough to capture the general shapes of our considered three distribution assumptions for b i ; (iii) the estimated 95% confidence region of the nonlinear function can cover its true curve with a fairly narrow area. From Table 2, based on the results of 200 replications in three simulation studies, the estimated means and standard deviation (SD) of random effects b i are enough near to their corresponding true values, and 25%, 50% and 75% quantile of the mean of the RASE of the nonlinear function and the random effects are enough small and close to illustrate that it is robust to apply CDPMM to estimate random effects. All these findings indicated that (i) our proposed Bayesian B-splines can be flexible enough fit the true curve of the nonlinear; (ii) Bayesian procedure could well capture the true information of b i regardless of their true distributions and forms; (iii) BLasso could identify the true model with a high probability.

5. An Example

In this section, we apply the method proposed in the previous sections to the MA.5 research experiment conducted by the Clinical Trial Group of the National Cancer Institute of Canada. The data contains 716 women with early-stage breast cancer before menopause. 356 patients were randomly selected to receive cyclophosphamide, epirubicin, and fluorouracil (CEF) adjuvant chemotherapy as the experimental group. The remaining 360 patients received cyclophosphamide, methotrexate, and fluorouracil (CMF) adjuvant chemotherapy as the control group of the trial. In clinical trials, visits were made before the start of treatment, each of the 6 treatment cycles, and every three months after treatment. At each visit, medical history and physical examination are accepted, and BCQ (Breast Cancer Questionnaire) is used to assess the patient’s quality of life. The dataset has a total of 7807 observations. By the end of the study, a total of 366 patients had died, and the censorship rate was about 49 % . For a detailed study of this data, please refer to the literature of Song et al. [22] and Levine et al. [24]. We linearly convert the evaluated BCQ score into a unit interval ( 0 , 1 ) , and the longitudinal data constrained on the interval ( 0 , 1 ) is the longitudinal proportion data that we are interested in. The survival time of interest in the trial is the recurrence-free survival time (RFS), which is defined as the time from randomization to disease recurrence. Different treatment options, age, and the number of tumor-positive lymph nodes may directly affect RFS and the quality of life of patients. Similar to Tang et al. [17], we fitted the MA.5 research experiment data set to the following model:
y i j * | b i = β 1 EM i + β 2 NODE _ POS i + β 3 AGE i + g t i j + b i + ε i j ,
λ i t | b i = λ 0 t exp γ 1 EM i + γ 2 NODE _ POS i + γ 3 AGE i + ϕ b i ,
where y i j * represents the BCQ score after logit function transformation, EM i is a two-class treatment index, EM i = 1 indicates that the i-th patient underwent CEF treatment, EM i = 0 indicates that the i-th patient underwent CMF treatment. Age and the number of lymph node metastases are binary variables. Patients who are less than or equal to 40 years old belong to the younger group, that is, AGE = 1 . Patients who are older than 40 years old belong to the elderly group, that is, AGE = 0 . When the number of lymph node metastases is 0-3, NODE _ POS = 0 , otherwise it is 1. The nonlinear g ( t ) is estimated by using cubic B-splines function, and the domain of cubic B-splines function is [ min ( t i j ) , max ( t i j ) ] . The prior distributions and the values of all hyperparameters in the case study are the same as those set in the simulation study above. Based on the above settings, we calculated EPSR values for all parameters and are shown in the left panel in Figure 3, which shows that after about 3000 iterations, all EPSR values are less than 1.2. Therefore, we use the 3000 iterations after the 3000th time to calculate the Bayesian estimation. The example analysis results are shown in Table 3.
Inspection of Figure 3 showed that (i) the estimated density of random effects is approximately normally distributed. The distribution settings for random effects in our Simulation III correspond to. This shows that our CDPMM method is effective and reasonable; (ii) the estimated nonlinear function first decreases, then increases and finally stabilizes with time t. Although there are some fluctuations in the tail, this is reasonable and can be ignored. From Table 3, we can see that (i) the risk ratio of randomly receiving CEF and CMF treatment is HR = exp γ 1 = 71.605 % , which means patients who randomly receive CEF chemotherapy have a lower risk; (ii) the credible interval ( 0.171 , 0.321 ) of β 1 does not contain 0, indicating that different adjuvant chemotherapy regimens have a significant impact on the QOL of patients and CEF chemotherapy is more toxic than CMF chemotherapy; (iii) the risk ratio of the number of lymph node metastases greater than or equal to 4 to less than 4 is HR = exp γ 2 = 205.238 % , which means that the number of lymph node metastases is greater than or equal to 4. The risk of breast cancer recurrence is greater in patients, and the RFS is shorter; (iv)the regression coefficient β 2 of the patient’s lymph node metastasis number greater than or equal to 4 is 0.309 , and its credible interval does not contain 0, which is highly significant. This suggests that patients with more lymph nodes have a lower quality of life, which is also consistent with clinical experience; (v) the risk ratio between the young group and the old group is HR = exp ( γ 3 ) = 182.212 % , which means the risk of breast cancer recurrence is higher and the RFS is shorter in the young group; (vii) the credible interval ( 0.168 , 0.344 ) of β 3 = 0.263 does not contain 0, indicating that different ages have a significant impact on the QOL of patients. It indicates that the quality of life of the olderly group is better than that of the young group. (viii) ϕ = 0.259 and the credible interval ( 0.003 , 0.542 ) of ϕ does not contain 0, which means that ϕ is significantly different from 0. This shows that there is a significant correlation between the longitudinal proportional data and survival data, and the JMSRE model proposed in this paper is applicable and reasonable for the analysis of the MA.5 research experiment data.

6. Concluding Remarks

In this paper, a semiparametric joint model for longitudinal proportional data and survival data is proposed by relaxing the normality assumption of random effects and without specifying a nonlinear function that affects longitudinal responses. The advantages of the proposed model include: (i) the proposed method improves the flexibility of jointly modeling longitudinal proportional data and survival data; (ii) the proposed B-splines method can flexibly capture different nonlinear functions; (iii) the proposed CDPMM method can flexibly capture the unimodal, bimodal and multimodal features of random effects; (iv) the computational burden is not heavy, e.g. it takes about 4 minutes to run the replication in the above simulation study and about 2 hours to run the breast cancer dataset.
Our simulation studies and example analysis show that the Bayesian estimation proposed based on the joint model are quite accurate and relatively robust. The nonlinear function curve estimated by Bayesian B-splines is more flexible and can capture the true characteristics of the nonlinear function. And CDPMM method could also well capture the true information of b i . BLasso method could identify the true model with a high probability. Compared with the method proposed by Song et al. [22] to jointly model longitudinal proportional data and survival data, the joint model proposed in this paper is more flexible.
The joint model of longitudinal proportional data and survival data proposed in this paper still has many unsolved problems and we need to solve the following problems in the future: (i) does not make any constraints on the form of the basic hazard function; (ii) the distribution of random effects does not depend on the normality assumption and still has robust inferences; (iii) consider more complex spline models, for example nodes can be automatically selected to improve the performance of the proposed model; (iv) consider a joint model for the variable longitudinal proportional outcome and the multivariate survival outcome.

Appendix A

Conditional distribution of b i
Let θ b i denotes all unknown parameters associated with distribution of b i , for i = 1 , 2 , , n . θ b i can be iteratively draw by using the following steps:
Step (a) Conditional distribution of ξ given ( μ * , Ψ , b ) is given by
ξ | μ * , Ψ , b N q ( A , B ) ,
where B = ( G Ψ 1 + ( Ψ 0 ) 1 ) 1 and A = B ( ( Ψ 0 ) 1 ξ 0 + Ψ 1 g = 1 G μ g * ) .
Step (b) For j = 1 , 2 , , q , the diagonal elements of Ψ is conditionally distributed as
ψ j 1 | μ * , ξ Γ ( c 1 + G 2 , c 2 + 1 2 g = 1 G ( μ g j * ξ j ) 2 ) ,
where μ g j * is the jth element of μ g * and ξ j is the jth element of ξ .
Step (c) For j = 1 , 2 , , q , ϖ j | Ω is conditionally distributed as
ϖ j | Ω Γ ( ϖ j a , ϖ j b + g = 1 G ω g j 1 ) ,
where ω g j is the j th diagonal element of Ω g .
Step (d) Following Ishwaran and Zarepour [20], given π , the conditional distribution of τ can be given by
τ | π Γ ( a 1 + G 1 , a 2 g = 1 G 1 log ( 1 ν g * ) ) ,
where ν g * is a random weight sampled from the the following beta distribution.
Step (e) Given L and τ , the conditional distribution of π can be obtained by following generalized Dirichlet distribution:
π | L , τ Dir ( a 1 * , b 1 * , a 2 * , b 2 * , , a G 1 * , b G 1 * ) .
where a g * = 1 + d g , b g * = τ + ι = g + 1 G d ι for g = 1 , 2 , , G 1 , and d g is the number of L i s whose value equals to g. ν g * is independently generated from a Beta distribution ( a g * , b g * ) . Then, π 1 , π 2 , , π G are obtained from the following formula:
π 1 = ν 1 * , π G = 1 g = 1 G 1 π g , and π g = ι = 1 g 1 ( 1 ν ι * ) ν g * , for g 1 or G .
Step (f) Let L 1 * , L 2 * , , L d * be the d unique values of { L 1 , L 2 , , L n } (i.e., unique number of “clusters”), for g = 1 , 2 , G , μ g * is conditionally distributed as follows:
μ g * | ξ , Ψ N q ( ξ , Ψ ) for g { L 1 * , L 2 * , , L d * } ,
μ g * | ξ , Ψ , Ω , L , b N q ( E g , F g ) for g { L 1 * , L 2 * , , L d * } ,
where F g = ( Ψ 1 + Σ { i : L i = g } Ω i 1 ) 1 and E g = F g ( Ψ 1 ξ + Σ { i : L i = g } Ω i 1 b i ) for g { L 1 * , L 2 * , , L d * } . Given μ g * , μ g = μ g * Σ g = 1 G π g μ g * , μ * = { μ 1 * , μ 2 , , μ G * } and μ = { μ 1 , μ 2 , , μ G } .
Step (g) Given g, for j = 1 , 2 , , q , the jth diagonal element of Ω g is conditionally distributed as
ω g j Γ ( ω j a , ϖ j ) for g { L 1 * , L 2 * , , L d * } ,
ω g j Γ ( d g 2 + ω j a , ϖ j + { i : L i = g } 1 2 ( b i j μ g j ) 2 ) for g { L 1 * , L 2 * , , L d * } ,
where b i j is the jth element of b i and μ g j is the jth element of μ g . Given ω g j , Ω g = diag ( ω g 1 , ω g 2 , , ω g q ) and Ω = { Ω 1 , Ω 2 , , Ω G } .
Step (h) Given π , μ , Ω , b , the conditional distribution of L i is obtained by
L i | π , μ , Ω , b i . i . d Multinomial ( π i g * , g = 1 , 2 , , G ) ,
where π i g * is proportional to ( π g p ( b i | μ g , Ω g ) with b i | μ g , Ω g N q ( μ g , Ω g ) , and π g ( g = 1 , 2 , , G ) are sampled from step (e). Given L i , μ and Ω , the prior of b i is distributed as N q ( μ L i , Ω L i ) with μ L i and Ω L i being the L i elements of sets μ and Ω , respectively.
Step (i) The conditional distribution p ( b i | β , φ , σ 2 , γ , ϕ , Y * , T , Δ , W ) is non-standard and cannot be derived directly via Gibbs sampling for i = 1 , 2 , , n . Specially,
p ( b i | β , φ , σ 2 , γ , ϕ , Y * , T , Δ ) p ( b i | μ L i , Ω L i ) p Y i * | b i ; θ Y p T , Δ | b ; θ T .
The Metropolis-Hastings algorithm used to sample b i is implemented as follows. At the mth iteration with a current value b i ( m ) , a new candidate b i is drawn from the normal distribution N q ( b i ( m ) , σ b 2 Σ b i ) , where Σ b i = ( Ω L i 1 + Ξ i ) 1 and Ξ i = 2 ( ln ( p Y i * | b i ; θ Y p T , Δ | b ; θ T ) / b i b i | b i = b i ( m ) . The new b i is accepted with probability
min 1 , p ( b i | μ L i , Ω L i ) p Y i * | b i ; θ Y p T , Δ | b ; θ T p ( b i ( m ) | μ L i , Ω L i ) p Y i * | b i ( m ) ; θ Y p T , Δ | b i ( m ) , b i ; θ T ,
where b i represents the remaining random effects except the random effects of the ith individual. The variance σ b 2 can be chosen such that the average acceptance rate is approximately 0.25 or more.

References

  1. Faucett, C.L.; Thomas, D.C. Simultaneously modelling censored survival data and repeatedly measured covariates: a gibbs sampling approach. Stat. Med. 1996, 15, 1663–1685. [Google Scholar] [CrossRef]
  2. Wulfsohn, M.S.; Tsiatis, A.A. A Joint model for survival and longitudinal data measured with error. Biometrics 1997, 53, 330–339. [Google Scholar] [CrossRef] [PubMed]
  3. Schluchter, M.D. Methods for the analysis of informatively censored longitudinal data. Stat. Med. 1992, 11, 1861–1870. [Google Scholar] [CrossRef] [PubMed]
  4. Tsiatis, A.A.; Degruttola, V.; Wulfsohn, M.S. Modeling the relationship of survival to longitudinal data measured with error: applications to survival and CD4 counts in patients with AIDS. J. Am. Stat. Assoc. 1995, 90, 27–37. [Google Scholar] [CrossRef]
  5. Tsiatis, A.A.; Davidian, M. Joint modeling of longitudinal and time-to-event data: an overview. Stat. Sin. 2004, 14, 809–834. [Google Scholar] [CrossRef]
  6. Yu, M.; Law, N.J.; Taylor, J.M.G.; Sandler, H.M. Joint longitudinal-survival-cure models and their application to prostate cancer. Stat. Sin. 2004, 14, 835–862. [Google Scholar] [CrossRef]
  7. Diggle, P.J.; Sousa, I.; Chetwynd, A.G. Joint modelling of repeated measurements and time-to-event outcomes: the fourth armitage lecture. Stat. Med. 2008, 27, 2981–2998. [Google Scholar] [CrossRef] [PubMed]
  8. Lawrence, A.; Gould, *!!! REPLACE !!!*; et al. Joint modeling of survival and longitudinal non-survival data: current methods and issues. Stat. Med. 2015, 34, 2181–2195. [Google Scholar] [CrossRef]
  9. Henderson, R.; Diggle, P.; Dobson, A. Joint modelling of longitudinal measurements and event time data. Biostatistics 2000, 1, 465–480. [Google Scholar] [CrossRef]
  10. Zeng, D.; Cai, J. Simultaneous modelling of survival and longitudinal data with an application to repeated quality of life measures. Lifetime Data Analysis 2005, 11, 151–174. [Google Scholar] [CrossRef]
  11. Tang, A.; Zhao, X.; Tang, N. Bayesian variable selection and estimation in semiparametric joint models of multivariate longitudinal and survival data. Biom. J. 2017, 59, 57–78. [Google Scholar] [CrossRef] [PubMed]
  12. Song, P.X.; Tan, M. Marginal models for longitudinal continuous proportional data. Biometrics 2000, 56, 496–502. [Google Scholar] [CrossRef] [PubMed]
  13. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. B. 1996, 58, 267–288. [Google Scholar] [CrossRef]
  14. Park, T.; Casella, G. The Bayesian Lasso. J. Am. Stat. Assoc. 2008, 103, 681–686. [Google Scholar] [CrossRef]
  15. Hans, C. Bayesian lasso regression. Biometrika 2009, 96, 835–845. [Google Scholar] [CrossRef]
  16. Guo, R.; Zhu, H.; Chow, S.M.; Ibrahim, J.G. Bayesian Lasso for semiparametric sructural equation models. Biometrics 2012, 68, 567–577. [Google Scholar] [CrossRef]
  17. Tang, A.; Zhao, X.; Tang, N. Bayesian variable selection and estimation in semiparametric joint models of multivariate longitudinal and survival data. Biom. J. 2017, 59, 57–78. [Google Scholar] [CrossRef]
  18. Ohlssen, D.I.; Sharples, L.D.; Spiegelhalter, D.J. Flexible random effects models using Bayesian semiparametric models: applications to institutional comparisons. Stat. Med. 2007, 26, 2088–2112. [Google Scholar] [CrossRef]
  19. Yang, M.; Dunson, D.B.; Baird, D. Semiparametric bayes hierarchical models with mean and variance constraints. Comput. Stat. Data. Anal. 2010, 54, 2172–2186. [Google Scholar] [CrossRef]
  20. Ishwaran, H.; Zarepour, M. Markov chain monte carlo in approximate dirichlet and beta two-parameter process hierarchical models. Biometrika 2000, 87, 371–390. [Google Scholar] [CrossRef]
  21. Chow, S.; Tang, N.; Yuan, Y.; Song, X.; Zhu, H. Bayesian estimation of semiparametric nonlinear dynamic factor analysis models using the Dirichlet process prior. Br. J. Math. Stat. Psychol. 2011, 64, 69–106. [Google Scholar] [CrossRef] [PubMed]
  22. Song, H.; Peng, Y.; Tu, D. Jointly modeling longitudinal proportional data and survival times with an application to the quality of life data in a breast cancer trial. Lifetime Data Analysis 2017, 23, 183–206. [Google Scholar] [CrossRef] [PubMed]
  23. Levine, M.; Bramwell, V.; Pritchard, K.; et al. Randomized trial of intensive cyclophosphamide, epirubicin, and fluorouracil chemotherapy compared with cyclophosphamide, methotrexate, and fluorouracil in premenopausal women with node-positive breast cancer. J. Clini. Oncol. 1998, 16, 2651–2658. [Google Scholar] [CrossRef] [PubMed]
  24. Tang, A.; Duan, X.; Zhao, Y. Bayesian variable selection and estimation in semiparametric simplex mixed-effects models with longitudinal proportional data. Entropy 2022, 24, 1466. [Google Scholar] [CrossRef]
Figure 1. EPSR values of all parameters against iteration numbers for a randomly selected replication in Simulation I (left panel), Simulation II (middel panel) and Simulation III (right panel).
Figure 1. EPSR values of all parameters against iteration numbers for a randomly selected replication in Simulation I (left panel), Simulation II (middel panel) and Simulation III (right panel).
Preprints 78810 g001
Figure 2. Estimated versus trues of nonlinear function g ( t ) (upper panels,) and estimated versus true denisties for random effects b i (lower panels) in Simulation I (left panel), Simulation II (middel panel) and Simulation III (right panel).
Figure 2. Estimated versus trues of nonlinear function g ( t ) (upper panels,) and estimated versus true denisties for random effects b i (lower panels) in Simulation I (left panel), Simulation II (middel panel) and Simulation III (right panel).
Preprints 78810 g002
Figure 3. EPSR values of all parameters (left panel), estimated curve for nonlinear function g t (middle panel) and estimated density for random effects b i (right panel) in the the MA.5 research experiment study.
Figure 3. EPSR values of all parameters (left panel), estimated curve for nonlinear function g t (middle panel) and estimated density for random effects b i (right panel) in the the MA.5 research experiment study.
Preprints 78810 g003
Table 1. Bayesian estimates of parameters in the simulation studies.
Table 1. Bayesian estimates of parameters in the simulation studies.
Pra. True Simulation I Simulation II Simulation III
Bias RMS F0(%) Bias RMS F0(%) True Bias RMS F0(%)
β 1  1.00 -0.005 0.058  0.0 -0.080 0.112  0.0  1.00 -0.015 0.104  0.0
β 2  0.00  0.002 0.036 99.0  0.001 0.049 98.0  0.50 -0.009 0.077  0.0
β 3  0.00 -0.002 0.033 99.0  0.005 0.042 99.5 -0.50  0.011 0.085  0.0
β 4 -0.50  0.005 0.041  0.0  0.003 0.056  0.0 -0.50  0.001 0.079  0.0
β 5  0.50 -0.001 0.037  0.0 -0.009 0.063  0.0  0.50  0.000 0.083  0.0
β 6 -1.00  0.013 0.066  0.0  0.006 0.091  0.0 -1.00  0.028 0.136  0.0
γ 1  0.00  0.010 0.120 95.5 -0.004 0.105 98.0 -0.50  0.002 0.125  0.5
γ 2  1.00  0.001 0.148  0.0 -0.052 0.144  0.0  1.00 -0.008 0.141  0.0
γ 3 -0.50  0.018 0.138  5.0  0.039 0.136  6.5 -0.50  0.008 0.127  2.5
γ 4  0.00 -0.002 0.197 97.0  0.010 0.178 97.5  1.00 -0.007 0.191  0.0
ϕ  0.60  0.006 0.106  0.0 -0.019 0.088  0.0  0.60 -0.002 0.145  1.0
σ 2  0.36 -0.002 0.018 -0.001 0.019  0.36  0.005 0.020
Table 2. Estimated mean and standard deviation for random effects, and quantiles of mean RASE for nonlinear functions and random effects.
Table 2. Estimated mean and standard deviation for random effects, and quantiles of mean RASE for nonlinear functions and random effects.
Est of random effects Quantile of RASE
Mean Est mean SD Est SD 25% 50% 75%
Simulation I -0.011 -0.007 1.004 0.961 0.091 0.115 0.138
Simulation II -0.023  0.014 1.268 1.202 0.135 0.181 0.239
Simulation III -0.001 -0.001 0.879 0.797 0.109 0.152 0.227
Table 3. Bayesian estimatations of parameters in the MA.5 research experiment study.
Table 3. Bayesian estimatations of parameters in the MA.5 research experiment study.
Par. Est SD CI
β 1  0.241 0.038 ( 0.171, 0.321)
β 2  0.309 0.050 ( 0.214, 0.414)
β 3  0.263 0.046 ( 0.168, 0.344)
γ 1 -0.334 0.142 (-0.608,-0.057)
γ 2 0.719 0.143 ( 0.452, 0.999)
γ 3 0.600 0.145 ( 0.315, 0.880)
ϕ 0.259 0.135 ( 0.003, 0.542)
σ 2 0.180 0.003 ( 0.174, 0.186)
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Prerpints.org logo

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

Subscribe

© 2024 MDPI (Basel, Switzerland) unless otherwise stated