2.2.2. Experimental Data
The experimental data used to evaluate the strategy correspond to signals and measurements from cardiorespiratory variables recorded in an observational and longitudinal study. This was based on an incremental, submaximal, and multistate cardiopulmonary exercise test performed using a programmable cycle ergometer under controlled environmental conditions. The duration of the test was 45 minutes, divided into five stages: rest (2 minutes), warm-up (4 minutes), exercise (25 minutes), recovery (9 minutes), and final rest (5 minutes). Five consecutive steps with load increments of 25 W, each five minutes, and at a constant pedaling speed of 60 RPM were carried out during the exercise stage.
The recorded information was divided into two databases to analyze the usefulness of the dynamic fitting strategy from a population and personalized perspective. The first database (DB1) corresponds to the longitudinal record of twenty-seven male adult volunteers in three different instants with more than six months of difference between them. The second database (DB2) corresponds to the records of three male adult volunteers matched by the degree of consanguinity, medical history, and lifestyle but with different ages, which is proposed as an approximation of the longitudinal record of the same subject at six different stages of his life, and which is consistent with the approach of stratified medicine [
39,
40,
41,
42]. These subjects are male members of the same family, corresponding to the father and his two sons, who live together and have similar habits. The first two records correspond to the youngest subject, the next three to his brother, and the last one to his father.
All participants were considered healthy, with a physical fitness that included both sedentary and accustomed to physical training subjects, non-smokers, no history or symptoms of cardiovascular, pulmonary, metabolic, or neurological diseases, and without pacemakers or other implanted electrical stimulators. The records are composed of signals of oxygen uptake (), carbon dioxide output (), total minute ventilation (), inspiratory time (TI), tidal volume (VT), breathing frequency (BF), heart rate (HR), alveolar oxygen partial pressure () and alveolar carbon dioxide partial pressure (); measurements of systolic blood pressure (PS), diastolic blood pressure (PD), and mean arterial blood pressure (PM). Records comprised of characteristics of the environment, such as the atmospheric pressure (), inspired fraction of dry () and inspired fraction of dry ().
Quantifiable information related to factors other than age, reported as significantly influencing the cardiorespiratory system, was also recorded. These data account for habits such as diet [
33], physical activity [
35], and quality of sleep [
34,
43], related to measures of the body mass index, average hours of weekly physical activity, and average hours of daily sleep, respectively. Other relevant factors, such as accidents or illnesses, habits such as smoking, or family history of illnesses, are not reported considering that they were considered as exclusion criteria. The use of medication was only reported by the oldest subject of the DB2 and is related to blood pressure control.
Table 1 presents the information recorded for each database related to environmental conditions, characteristics of volunteers, and factors mentioned above. The data presented for DB1 correspond to the mean and standard deviation of the measurements.
All procedures were approved by the Human Research Ethical Committee of the Department of University Research (SIU) of the University of Antioquia (approval certificates 16-59-711, 17-59-711, and 18-59-711). They are conformed to the Declaration of Helsinki. The written consent of all volunteers was obtained after giving them complete information about the experimental protocol and the related risks.
2.2.3. Computational Implementation and Data Analysis
The experimental values of
and
were used as model inputs to simulate the levels of the stimulus of exercise. This implementation has been applied to simulate aerobic exercise in physiological models [
18,
44,
45,
46]. It is justified because the physiological response under this stimulus is directly related to metabolic gas rates of
consumption and
production, which is directly affected by the workload during exercises (the physiological model does not consider the workload as a model input).
The average of each measurement during the last minute for each subject (except for PS, PM, and PD) was used as a representative sample of each phase of the exercise test at steady-state conditions. All experimental data were restricted using as a reference, starting at 0.3 l/min, considered the state of rest, coinciding with the common minimum value for records of all subjects, and ending at the respective calculated value of AT (Anaerobic threshold) because the model is defined only for aerobic exercise.
The parameters values used as reference in each parameter temporal fit of the model corresponded to the immediately preceding optimization result, except for the first fit of each database, in which the nominal values reported in [
47] were used (traditional single-fit approach).
The simulation time for each stimulus level was 2000 s, which was enough to guarantee the reported settling time measure of all variables evaluated [
47]. The model responses in steady-state corresponded to the mean values of each variable at the last minute of each simulated stimulus level.
The model simulations and data processing were implemented in SIMULINK/MATLAB®. The computational characteristics for the simulation were the same as reported for the model validation, corresponding to the numerical solver ODE23 (Bogacki and Shampine BS23 algorithm for solution of ordinary non-rigid differential equations) and variable step size between s and s. The simulations were run once due to all the model equations are determinists. The specific implementation characteristics for each procedure are described below.
Dynamic Model Fitting
Standardization of simulation conditions
The parameters of the physiological model that were assigned according to experimental data corresponded to
,
,
, basal respiratory tidal volume (
), anaerobic threshold (
), total blood volume (
) and constant values of unstressed blood volumes.
,
and
account for the environmental conditions, were considered constant in all records and equal to 640 mmHg, 0.0421 %, and 21.0379 %, respectively. Values of
,
,
and the unstressed blood volumes are related to specific characteristics of the subjects. The value of
was determined from VT measurements;
was calculated using the noninvasive technique v-slope [
20,
48].
in ml was also estimated for each record as a function of the body surface area (BSA) in
according to the Equations (1) and (2) [
49,
50]. Finally, the parameters related to the unstressed blood volumes were calculated using the proportion of their reported nominal values regarding
[
47].
The values of
,
and
used for the standardization of each fit are presented in
Table 2. The data presented for DB1 correspond to the mean and standard deviation, except for
whose value corresponds to the median and interquartile range considering the data distribution.
Classification and Selection of Parameters
Steady-state simulation results for variations of stimulus level and parameter values were used for classification techniques. The number, order, and values of stimulus levels and parameter variations were the same as reported for the static fitting strategy previously applied to the same model [
37]. Three simulated exercise stimulus levels (using
and
as model inputs) were evaluated as incremental steps, corresponding to states of rest, an intermediate level of exercise, and
. Five uniformly distributed percentage variations in a range of ± 5% of the reference value were evaluated for the selected parameters.
Regarding the selection, the fitting approach related to the stimulus was not applied because its result does not represent a significant decrease in the prediction error, and the exercise mechanisms had already been fitted for the model [
37]. The selection criteria and the number of parameters defined were also the same as those reported for the model according to the static fitting strategy.
Parameters Optimization
The optimization algorithm implemented corresponds to the evolutionary strategy with covariance matrix and adaptation (CMA-ES). It is a stochastic global optimization algorithm based on adaptive and evolutionary strategies [
51]. It was selected considering the reported results of convergence speed, precision, and accuracy for the fitting of the case study model [
37] and other related ones [
26]. The parameter values used for implementing CMA-ES algorithm are the same as those reported for the model according to the static fitting strategy [
37].
The evaluation ranges for the parameter optimizations were in accordance with the static fitting strategy [
37]. The parameter values reported as nominal for the model were used as variation reference [
47]. It was established as a general evaluation range of ± 30% regarding the nominal value, expanded to ± 50% only for weighting parameters that are unrelated to direct physiological measures. These ranges were modified according to the information reported about optimization results or physiological sense (values with physiological justification).
The evaluated model predictions corresponded to the simulation results in steady-state under eight consecutive, equidistant, and incremental steps of the values of
and
between rest and AT. The experimental data for comparison were the same as the parameter’s classification procedure. The cost function implemented corresponds to Equation (3), the same RMSE-based metric used for the reported fit of the case study model [
37].
Where denotes de cost function for the model; and represent the experimental and simulated values of the variable, respectively; and K denote the number of variables and stimuli levels; the subscripts i and k indicate each variable and stimulus level.
Dynamic Parameter Modeling
The parameter values corresponding to five records were predicted for the case study. For DB1 the optimized parameter values of the record three were predicted, and for DB2 the optimized parameter values of the records from three to six were predicted. Only the parameters that presented variations in their value due to the previous fit procedures were modeled (dynamic parameters). The others remain with their same value (nominal value or standardization result of the simulation conditions).
Deterministic and parametric MISO models were implemented for each dynamic parameter. The implemented model structure corresponded to an ARX model (autoregressive with exogenous input), selected to consider the effect of the past values of the outputs and inputs on the current value [
31]. Equation (4) corresponds to the representation of the model structure.
Where represents the value of the parameter at each instant of time; is the th input, which is associated with the factors evaluated; A and B are polynomials expressed in the time-shift operator and are related to the parameter and the inputs, respectively; is the total number of inputs; is the input delay; and is considered random noise.
As inputs of each model, the time between records, the average weekly physical activity, the average daily sleep hours (
Table 1), and AT (
Table 2) were used. The time was considered as the accumulated difference between the ages in each record, assigning the value zero to the first fit of the model for each database. The AT values were used because they account for the metabolic response of the subject to cardiorespiratory changes [
48,
52]. Except for time, the values of the inputs of the previous record were considered as the equivalent values in the record to be predicted.
Different model orders associated with the polynomials were evaluated for each parameter, and the model with the best fit was selected. The past values of the parameter and the inputs were interpolated, considering that the design of the models requires constant sampling times. Sampling times between 0.1 and 0.4 years were evaluated in this work considering the temporal difference between records for each fit. The orders of polynomials A and B were evaluated from zero to the maximum number of samples available.
was considered zero regarding all inputs due to the limited number of available records. The coefficients of the polynomials for each model were estimated using the least-squares method. The best-order selection was based on applying Akaike’s Information Criterion (AIC) according to Equation (5).
Where is the loss function (normalized sum of squared prediction errors), is the total number of parameters in the structure in question, and is the number of data points used for the estimation.
Criteria based on the goodness of fit of the models and on the variation of the predicted value regarding a bounded range were applied to evaluate the physiological justification of the predictions. Model with goodness of fit less than 60%, with the sign change, or variations greater than 50% regarding the range of recorded values were discarded, and the values corresponding to the immediately previous record were used instead.
Validation
The validation consisted of comparing the simulation results of the case study model using the optimized parameter values for the records (from the dynamic fitting of the physiological model), the predicted values (from the dynamic parameter modeling), and the nominal values. The experimental values of variables for the respective record were used as reference. These comparisons were aimed at demonstrating the usefulness of the dynamic fitting strategy from the perspective of predicting the future state of the modeled system against the traditional approach of using population fit results in a single instant (single-fit).
The model performances were evaluated by the measure of the prediction error (PE) regarding the steady-state experimental measures of the cardiorespiratory variables. The metric used to calculate the PE corresponds to Equation (6), which is a modification of the Mean Absolute Error (MAE) that allows an interpretation of the differences as a proportion of the experimental data. It consider the error for each subject, variable and level of stimulus; and it is the same reported in the validation work of the case study model [
37].
Where is the experimental variable value; is the simulation prediction value; denotes the number of variables; the subscripts i, j and k are indexes that represent each subject, variable, and stimulus level, respectively.