Introduction
Candidal sepsis is a major problem in pediatric intensive care units (PICUs): it causes substantial morbidity, prolongs hospital stays, and raises costs [1, 2]. Candida spp. are a leading cause of bloodstream infection in pediatric hospital settings, accounting for 10%-15% of these infections in PICUs in the US and Europe [3]. Mellinghoff et al. report that Candida bloodstream infections are associated with mortality rates of 20%-49% and account for about 15% of all PICU deaths [4]. A 207% increase in cases of fungal sepsis from 1979 to 2000 underscores the growing threat that this infection poses [5]. Pfaller and Ostrosky-Zeichner et al. have also documented a notable rise in such infections over the last decade [6, 7]. Candida bloodstream infections remain a significant concern in pediatric settings: Piqueras et al. recorded an incidence of 1.97 per 1,000 patient-days [8], while Hsu et al. (2018) found that candidemia in pediatric patients typically occurs at a rate of 0.21 to 10.5 cases per 1,000 admissions [9].
In pediatric intensive care units (PICUs), candidiasis is associated with numerous risk factors and becomes more severe after seven to ten days in the ICU. The primary factors are immunosuppression, steroid use, dialysis, total parenteral nutrition (TPN), central venous catheterization, recent major surgery, and prior antibiotic use [4, 10]. Factors specific to children include prolonged neutropenia, exposure to high-dose steroids, and, more recently, advancing age [11]. Prematurity, urinary catheter use, antacids, H2 blockers, malnutrition, immunosuppressive and cytotoxic treatments, immunodeficiency, burns, and diabetes mellitus are all known to increase the risk of colonization and candidiasis [12]. Furthermore, Candida bloodstream infections (CBSIs) are more common in specific PICU populations, including newborns, children younger than one year, and children with hematologic malignancies or critical illness [8]. Aplastic anemia, graft-versus-host disease, steroid use, and unrelated bone marrow or cord blood transplants are additional risk factors in hematological patients. Candidemia can develop after three days of treatment with vancomycin or an antibiotic active against anaerobic bacteria, in the presence of malignant disease, or with use of a central venous catheter [13]. As Hsu et al. point out, the growing number of immunosuppressed or critically ill patients, together with the widespread use of broad-spectrum antibiotics, may be driving an increase in Candida infections [9].
Despite improvements in infection control and antimicrobial stewardship, Candida bloodstream infections (CBSIs) remain a major cause of morbidity and mortality. In addition, the low sensitivity of blood cultures and the non-specific presentation of the disease can delay diagnosis. Variable incidence rates, ranging from 0.21 to 10.5 cases per 1,000 admissions, further hinder the development of standardized treatment protocols [8, 9]. A notable gap in data from PICUs in developing countries exacerbates these challenges and indicates a critical need for global insight into this issue [2]. Although the potential benefits of machine learning in healthcare are significant [14, 15], there is currently a lack of predictive models designed specifically for the early prediction of pediatric fungemia-associated mortality. Developing such models is crucial, as they can enhance early detection, enable personalized medicine [16, 17], optimize resource allocation, and provide insight into disease progression. These models could improve patient outcomes by allowing interventions to be tailored to real-time risk assessments, and could improve the efficiency of clinical trials. However, their success hinges on the availability of high-quality, diverse datasets and on the integration of machine learning predictions into clinical practice. Addressing these challenges, along with ethical considerations such as patient privacy and data security, is essential for the successful implementation and acceptance of predictive models in medical settings.
The Candida Score may help predict mortality due to fungemia in pediatric intensive care units (PICUs). The score is useful for predicting candidiasis in adult intensive care units [18], but it does not transfer directly to children, whose immune systems and pathophysiology differ. Current models, such as the EQUAL Candida score, are designed for adults and do not consider factors that are particularly relevant in children, such as the duration of immunosuppressive medication or mechanical ventilation [19, 20]. Given these limitations, the purpose of our research is to adapt the Candida Score into a pediatric-specific predictive model by adding relevant pediatric characteristics. This adaptation is expected to address two needs in fungemia management: better mortality risk prediction and a tool suited to pediatric intensive care.
Methods
Study Design and Setting
This retrospective cohort study was conducted in pediatric intensive care units (PICUs) at King Abdulaziz University Hospital, Jeddah, Saudi Arabia. The study included pediatric patients admitted between 2016 and 2020.
Study Population
The study cohort comprised 85 pediatric patients diagnosed with fungemia. Inclusion criteria were patients aged 1 month to 18 years with a positive blood culture for Candida species. Exclusion criteria included patients with incomplete medical records and those who were not treated in the PICU. For the PICU fungemia cohort study, inclusion criteria encompassed pediatric patients aged 0 to 18 years who were admitted to the PICU within the study timeframe and had a diagnosis of fungemia confirmed by blood culture. Only patients whose medical records provided the necessary clinical data for the duration of the study were considered. Patients were excluded if they were older than 18 years, developed fungemia after transfer out of the PICU, or had received antifungal treatment before PICU admission. Also excluded were patients with incomplete data, patients with co-existing conditions that influence mortality risk independently of fungemia (such as terminal illnesses), and patients with PICU stays shorter than 24 hours, to ensure adequate data collection for the study's predictive modeling objectives. These criteria were intended to yield a homogeneous study population for accurate analysis of fungemia risk and outcomes in a pediatric intensive care context.
Data Collection
Data were gathered retrospectively in the pediatric intensive care unit setting to inform a multivariate predictive model for fungemia. Data collection encompassed patient demographics, clinical parameters, and microbiological findings. The Candida Score was calculated for each patient to assess the risk of invasive candidiasis, incorporating total parenteral nutrition, surgical history, multifocal Candida colonization, and the presence of severe sepsis. Age and weight at PICU admission were recorded, along with the Pediatric Risk of Mortality score as a gauge of illness severity. The duration of mechanical ventilation, the length of PICU stay, and the duration of antibiotic treatment preceding fungal infection were documented. The presence of fungus was identified from blood cultures. The overall length of hospitalization before the onset of fungal infection was recorded. Gender was categorized as male or female. Outcomes were classified as survival or mortality to correlate clinical data with patient prognosis, and the specific fungal species isolated was noted to discern patterns in fungal prevalence.
Diagnosis of Fungemia
We used the BacT/Alert automated system (Organon Teknika, USA) to conduct blood cultures. For every sample, five milliliters of blood were added to a pediatric culture bottle. The culture bottles were placed in the system and incubated until the BacT/Alert system flagged microbial growth or until five days had passed without growth. Once growth was detected, the contents of the bottles were Gram stained for preliminary identification. Bottles that showed yeast cells were then subcultured onto Sabouraud dextrose agar (SDA; Saudi Prepared Media Laboratories, Riyadh, KSA). When there was sufficient growth on the SDA, yeasts were identified the same day using VITEK MS. Antifungal susceptibility testing and species identification of Candida were carried out using the VITEK®2 system (bioMérieux, Inc., France). This approach allowed fungemia cases to be identified reliably and treated with targeted antifungal medication.
Candida Score
The Candida Score is a clinical tool designed to assess the risk of invasive candidiasis in critically ill patients, particularly those in intensive care units. It helps clinicians decide when to start antifungal therapy in high-risk patients who do not have confirmed candidiasis. The score includes four factors: the administration of total parenteral nutrition (TPN), a history of recent surgery (especially abdominal surgery), multifocal Candida colonization (such as colonization in the urine, respiratory tract, or wounds), and the presence of severe sepsis. Each factor contributes to the score as follows: 0.908 for TPN, 0.997 for surgery, 1.112 for multifocal colonization, and 1.112 for severe sepsis. The contributions are summed for each patient, and a total score of 3 or higher indicates a high risk of invasive candidiasis. The score was entered into the model as a risk factor for mortality.
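As an illustration of the scoring rule described above, the following Python sketch sums the listed weights and applies the stated cut-off. The function and variable names are hypothetical, and the weights and the threshold of 3 are taken as written in the description above, not from any published software; they are exposed as parameters so that other values can be substituted.

```python
# Weights and cut-off exactly as listed in the text; names are illustrative.
WEIGHTS = {
    "tpn": 0.908,
    "surgery": 0.997,
    "multifocal_colonization": 1.112,
    "severe_sepsis": 1.112,
}
HIGH_RISK_THRESHOLD = 3.0


def candida_score(tpn, surgery, multifocal_colonization, severe_sepsis,
                  weights=WEIGHTS):
    """Sum the weights of the risk factors that are present (True)."""
    present = {
        "tpn": tpn,
        "surgery": surgery,
        "multifocal_colonization": multifocal_colonization,
        "severe_sepsis": severe_sepsis,
    }
    return sum(weights[name] for name, flag in present.items() if flag)


def is_high_risk(score, threshold=HIGH_RISK_THRESHOLD):
    """Apply the cut-off: a score at or above the threshold is high risk."""
    return score >= threshold


# Hypothetical patient: TPN, recent surgery, multifocal colonization, no sepsis.
score = candida_score(True, True, True, False)
print(round(score, 3), is_high_risk(score))
```

Keeping the weights in a single dictionary makes it straightforward to re-run the calculation if a different weighting or cut-off is preferred.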
Outcome Measures
The primary outcome was mortality within 30 days of the diagnosis of fungemia.
Statistical Analysis
Descriptive Statistics
Descriptive statistics were used to summarize the demographic and clinical features of the pediatric patients in the study. Depending on the data distribution, continuous variables were reported either as medians with interquartile ranges or as means with standard deviations. Categorical variables were reported as frequencies and percentages. The data were tabulated using the tableone R package, stratified by outcome (survival versus death).
Creating the Model
Logistic regression analysis was the first step in developing a model to predict the likelihood of death. The purpose of this analysis was to determine which of the clinical and laboratory factors were significant predictors of death. Variables that showed a significant association with mortality in the univariate analysis (p-value <0.05) were then entered into the multivariate logistic regression model. This procedure was intended to ensure that only the most important predictors were included in the final model. To enhance our predictive capabilities, we compared three modeling approaches: Random Forest, Gradient Boosting, and Logistic Regression.
Assessment of the Model
Area under the receiver operating characteristic curve (AUC) was the primary metric used to assess the prediction models' discriminative capacity. To find out which model had the greatest predictive performance, we computed and compared the AUC values for Logistic Regression, Random Forest, and Gradient Boosting. A model with good discrimination has an AUC value closer to 1, but a value closer to 0.5 implies about the same performance as chance.
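The AUC has a direct probabilistic reading: it is the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The sketch below computes it that way in plain Python. The function name and the toy data are illustrative assumptions and do not come from the study dataset.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the event (death), 0 otherwise; scores: predicted risks.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 7 of the 9 positive-negative pairs are ordered correctly.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.6, 0.2, 0.4, 0.5]
print(roc_auc(labels, scores))  # 7/9, about 0.778
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of this size; rank-based implementations are preferable for large datasets.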
Statistical Programs
Statistical analyses were conducted using appropriate statistical software, including R, which provided the computational power and flexibility needed for the analyses in this study. A p-value less than 0.05, with 95% confidence intervals, was taken to indicate a statistically significant association between a factor and the outcome of interest. Figures were created with ChatGPT 4 using Python 3.8 and appropriate libraries.
Verifying the Model
The prediction models were subjected to internal validation using bootstrapping to assess their robustness and reliability. To accomplish this, we created several bootstrap samples by resampling the initial dataset with replacement. The models were then fitted to each sample, and the AUC was determined for every one of them. After adjusting for optimism, the average of these AUC values gave an estimate of how well the models would perform on previously unseen data.
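One common way to carry out an optimism adjustment of this kind is sketched below: a model is refitted on each bootstrap sample, its AUC on that sample is compared with its AUC on the original data, and the average difference (the optimism) is subtracted from the apparent AUC. This is an assumption about the general technique and may differ in detail from the procedure used in the study. The `fit` argument, the toy one-feature model, and all names are hypothetical stand-ins for a real modeling routine.

```python
import random


def roc_auc(labels, scores):
    """Pairwise AUC; ties between a positive and a negative count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both classes")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_mean_difference(xs, ys):
    """Toy stand-in for a real model: orient x so higher means higher risk."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    sign = 1.0 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1.0
    return lambda x: sign * x


def optimism_corrected_auc(xs, ys, fit, n_boot=200, seed=0):
    """Return (apparent AUC, optimism-corrected AUC)."""
    rng = random.Random(seed)
    n = len(xs)
    model = fit(xs, ys)
    apparent = roc_auc(ys, [model(x) for x in xs])
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(by)) < 2:
            continue  # skip resamples that contain only one outcome class
        boot_model = fit(bx, by)
        auc_boot = roc_auc(by, [boot_model(x) for x in bx])
        auc_orig = roc_auc(ys, [boot_model(x) for x in xs])
        optimism.append(auc_boot - auc_orig)
    return apparent, apparent - sum(optimism) / len(optimism)


# Toy data (not from the study): one predictor, binary outcome.
xs = [0.2, 0.5, 0.9, 1.1, 1.4, 0.7, 1.8, 2.0, 1.0, 2.4]
ys = [0, 0, 0, 1, 0, 1, 1, 1, 0, 1]
print(optimism_corrected_auc(xs, ys, fit_mean_difference))
```

Fixing the random seed makes the resampling reproducible, which matters when a corrected AUC is reported to several decimal places.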
Model Validation
Model validation is crucial to ensure that the logistic regression model performs well on unseen data and is not overfitting the training data. Cross-validation is a robust method for assessing model performance. Variance Inflation Factor (VIF) and Hosmer-Lemeshow goodness-of-fit test were done for model validation.
Ethical Considerations
This study was conducted in strict accordance with the ethical standards outlined in the Declaration of Helsinki (1964 and its subsequent amendments). Ethical approval was obtained from the Institutional Review Board (IRB) of King Abdulaziz University Hospital, Jeddah, Saudi Arabia, under Reference No. 566-23. Due to the retrospective nature of our analysis, the IRB waived the requirement for informed consent. To ensure rigorous protection of patient confidentiality and anonymity, several data anonymization and security measures were implemented. Firstly, all patient identifiers were removed from the dataset prior to analysis and replaced with unique anonymization codes. This process ensured that the data could not be traced back to individual patients. Additionally, all electronic data were encrypted and stored on secure, password-protected servers accessible only to authorized research personnel. Physical copies of data, if any, were kept in locked cabinets in secure facilities. These precautions were taken to adhere to the highest standards of privacy and ethical integrity, minimizing any risk of data breaches while maintaining the utility of the data for research purposes. This approach not only protects individual patient privacy but also ensures the integrity of the research process.
Verifying the Model
To determine whether the regression model had multicollinearity among the predictor variables, the study used the Variance Inflation Factor (VIF). VIF values below 5 are usually considered acceptable, while larger values may suggest multicollinearity. The VIF values for Age and Wt were 5.86 and 6.02, respectively, which is slightly above 5 and indicates mild multicollinearity. All of the other predictors had VIF values below 5, so there were no major problems with multicollinearity. These results support the validity of the model. Additionally, the Hosmer-Lemeshow goodness-of-fit test yielded a chi-squared statistic of 11.818 with 8 degrees of freedom and a p-value of 0.1595. The non-significant p-value suggests that the logistic regression model fits the data adequately. Moreover, to assess the robustness of our logistic regression model, we used five-fold cross-validation. In this method, the dataset is partitioned into five equal parts. The model is trained on four parts (80% of the data) and tested on the fifth (20% of the data), and the process is repeated so that each part serves as the test set once. In this way, every observation is used for both training and testing. Recall, accuracy, precision, and the area under the receiver operating characteristic curve (ROC-AUC) were used to assess the model's performance. The logistic regression model attained an average accuracy of 71.76%, meaning that it classified about 72% of cases correctly as true positives or true negatives. Its precision was 67.49%, meaning that about 67% of the patients it predicted to die did die, which matters in clinical settings where patients at high risk of death must be identified. In addition, the model had a recall of 74.64%, meaning that it correctly identified about 75% of all positive cases. This is important for clinical applications because it allows the majority of at-risk patients to be caught. Finally, the ROC-AUC of approximately 79.90% indicates that the model distinguishes well between survival and non-survival groups across multiple threshold settings.
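The five-fold procedure and the threshold-based metrics described above can be sketched in a few lines of Python. The sketch assumes a simple shuffled partition (not a stratified one) and uses hypothetical function names; with 85 patients, each fold holds 17 patients for testing and 68 for training. It illustrates the definitions only and does not reproduce the study's analysis.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Yield (train, test) index lists; each index is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal parts
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def classification_metrics(y_true, y_pred):
    """Accuracy, precision and recall for binary labels (1 = death)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),          # all correct / all cases
        "precision": tp / (tp + fp) if tp + fp else 0.0,  # predicted deaths that occurred
        "recall": tp / (tp + fn) if tp + fn else 0.0,     # deaths that were predicted
    }


# Fold sizes for a cohort of 85 patients.
for train, test in k_fold_indices(85, k=5):
    print(len(train), len(test))

# Toy labels (not study data): 3 true positives, 1 false positive, 1 false negative.
print(classification_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 1, 0, 1, 0]))
```

In practice the metrics would be computed on each test fold and then averaged across the five folds.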
Discussion
High Mortality Rates in Pediatric Fungemia
Our study reports a concerning 45.9% mortality rate (39 out of 85 patients) in pediatric intensive care units due to fungemia, which exceeds most previously reported figures. Earlier studies show a wide range: Zeng et al. (2020) reported the lowest mortality rate, 8.1% [21], Rajendran et al. (2016) reported 20.9% [22], and a second report by Zeng et al. (2020) gave 20.4% [23]. Other studies found higher rates: Karacaer et al. (2014) noted 27% [24], Chakrabarti et al. (2020) 29.4% [25], Keighley et al. (2019) 31% [26], and Zaoutis et al. (2010) 44% [27]. Ghrenassia et al. (2019) reported 60% among immunocompromised patients [28], and Dimopoulos et al. (2008) observed the highest rate, 66% [29]. These varied figures underline the severe challenges in treating children with fungemia and the substantial differences in mortality rates across studies.
Understanding Pediatric Fungemia Risk Factors
Our research examines pediatric fungemia in intensive care settings, together with its risk factors and outcomes. Although many studies have identified contributing factors, pinpointing the exact drivers of the severity and mortality of candidiasis remains difficult. Our findings confirm established risk factors such as immunosuppression, steroid use, dialysis, and total parenteral nutrition (TPN), consistent with both adult and pediatric studies [4, 10]. Specific to pediatrics, extended neutropenia and high-dose steroid exposure have been highlighted [11]. Our logistic regression analysis explored the impact of demographic factors on mortality and found that age and weight did not significantly influence outcomes, a finding corroborated by several studies [3, 25, 27, 30, 31], although Hegazi et al. (2014) suggest that age may still be relevant in certain Candida infections [32].
Limitations of the PRISM Score in Predicting Fungemia Mortality
The PRISM (Pediatric Risk of Mortality) score is a widely recognized tool for assessing mortality risk in pediatric intensive care units (PICUs) and is typically calculated upon admission [33]. While its utility is well established for initial clinical assessment, our study raises concerns about its effectiveness in predicting mortality from fungemia, an infection that usually develops during the ICU stay, beyond the acute phase that the PRISM score assesses. Our analysis showed no significant relationship between the PRISM score and fungemia-related mortality, with an odds ratio (OR) of 1.000 (95% confidence interval [CI] 0.938 to 1.065; p = 0.989), suggesting that the PRISM score may not capture the risk of nosocomial infections that occur after admission [34]. This limitation highlights the need for additional predictive tools tailored to the evolving conditions within the ICU. The Candida Score, for instance, showed significant predictive value for mortality in our cohort, underscoring its potential utility in this context [35]. Integrating the Candida Score or similar ICU-specific tools could improve the accuracy of mortality predictions and enable more precise clinical interventions for pediatric patients with ICU-acquired infections, potentially improving prognosis and outcomes (Vincent et al. [36]). Adapting existing models or developing new ones to better suit the dynamic ICU environment could bridge the gap in current predictive capabilities and keep assessments relevant throughout the patient's ICU stay [37]. There is therefore a pressing need to reevaluate the applicability of the PRISM score to late-onset complications such as fungemia and to consider supplementary models that address these limitations [38].
Impact of Mechanical Ventilation on Mortality Risk
The study highlights two pivotal determinants of mortality in pediatric intensive care patients with fungemia: the duration of mechanical ventilation and the Candida Score. Our analysis indicates that each additional day of mechanical ventilation is associated with an 11.5% increase in the odds of mortality (odds ratio [OR] 1.115; 95% confidence interval [CI] 1.025-1.212; p = 0.011), suggesting that prolonged mechanical ventilation is a marker of severe underlying conditions or complications that raise the risk of adverse outcomes. Xiao et al. (2019) noted a significant difference in the duration of mechanical ventilation between survivors (5 days) and non-survivors (11.5 days) [46]. Similarly, Paiva et al. (2016) identified mechanical ventilation as an independent predictor of mortality, with a notably high OR of 8.86 [47]. Conversely, Singh et al. (2016) contended that the duration and complexity of mechanical ventilation do not directly affect fungal disease outcomes [48], in contrast to Peres-Bota et al. (2004), who confirmed mechanical ventilation as an independent predictor of mortality in Candida infections [49]. Supporting this, Morrell et al. (2005) suggested that longer durations of mechanical ventilation correlate with more severe Candida bloodstream infections and thus with higher mortality [50].
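To make the per-day odds ratio concrete, the sketch below shows how an odds ratio of 1.115 per ventilation day compounds over several days, and how the reported confidence interval can be used to back-calculate an approximate standard error and Wald p-value. It assumes that the effect is linear on the log-odds scale, as a logistic regression model does, and that the interval is a symmetric Wald interval with z = 1.96. The function names are illustrative, and this is not the analysis code used in the study.

```python
import math

OR_PER_DAY = 1.115             # odds ratio per ventilation day, from the text
CI_LOW, CI_HIGH = 1.025, 1.212  # reported 95% confidence interval


def odds_ratio_for_days(or_per_day, days):
    """Odds ratio for `days` extra days, assuming a log-linear effect."""
    return or_per_day ** days


def log_or_standard_error(ci_low, ci_high, z=1.96):
    """Approximate SE of ln(OR), assuming a symmetric Wald interval."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * z)


def wald_p_value(odds_ratio, se):
    """Two-sided p-value from the normal approximation to ln(OR) / SE."""
    z_stat = abs(math.log(odds_ratio)) / se
    return math.erfc(z_stat / math.sqrt(2))


se = log_or_standard_error(CI_LOW, CI_HIGH)
print(round(odds_ratio_for_days(OR_PER_DAY, 7), 2))  # about 2.14 for one extra week
print(round(wald_p_value(OR_PER_DAY, se), 3))        # close to the reported p = 0.011
```

Under these assumptions, one additional week of ventilation corresponds to roughly a doubling of the odds of death, and the back-calculated p-value agrees with the reported one, which suggests that the odds ratio, interval, and p-value are mutually consistent.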
Candida Score as a Predictor of Mortality
Our study demonstrates that a higher Candida Score significantly increases the odds of mortality, with an odds ratio of 2.205 and a confidence interval ranging from 1.359 to 3.578. This underscores the predictive utility of the Candida Score for patient outcomes in pediatric ICU settings and suggests that a higher score, which often reflects severe clinical scenarios such as multifocal colonization and severe sepsis, substantially elevates mortality risk. Unlike previous research, our study introduces a predictive model that incorporates the Candida Score along with other relevant predictors specifically within a pediatric ICU context. In comparison, Juneja et al. (2022) documented a more pronounced association in an adult medical ICU, where a Candida Score greater than 3 was linked to an approximately 13.2-fold increase in mortality risk, with a wide but statistically significant confidence interval ranging from 1.3 to 125 [51]. This points to the importance of the Candida Score in predicting outcomes, although prior models, including that of Juneja et al. (2022), have not been adapted for pediatric ICU populations, a significant gap that our study aims to fill.
Mortality Risks: Candida Albicans vs. Non-Albicans Candida
In the context of candidemia in pediatric intensive care units, our study identified a 2.73-fold increased risk of mortality associated with Candida albicans infections, though this did not reach statistical significance (95% CI: 0.857-8.703, p = 0.089). This finding contrasts with several other studies, which report higher mortality and morbidity rates associated with non-albicans Candida (NAC) infections. Behera et al. (2020) observed that all recorded deaths in their study were linked to NAC infections, suggesting a potentially higher virulence or resistance profile for these species [52]. Similarly, Caggiano et al. (2017) found that infants with NAC infections had significantly longer stays in neonatal intensive care units than those infected by C. albicans, indicating more severe outcomes or more complicated clinical courses [53]. Furthermore, Dutta et al. (2012) reported that NAC species are often linked with higher morbidity because of their resistance to common antifungal treatments such as fluconazole [54]. However, Rajeshwari et al. (2022) found no significant difference in survival rates between infections caused by Candida albicans and non-albicans species [55], suggesting that the impact of Candida species on mortality may vary across study settings and populations. Tsai et al. (2017) and Filioti et al. (2007) both highlight the significant challenges that NAC species pose in managing candidemia, owing to their distinct resistance profiles and their association with higher morbidity compared with C. albicans [56, 57]. The variation in findings across these studies, including ours, underscores the complexity of candidemia in critically ill pediatric populations and the need for a nuanced understanding of how different Candida species affect clinical outcomes. While Candida albicans remains a significant pathogen, the increasing prevalence and complexity of NAC infections call for tailored therapeutic strategies and for integrating Candida species identification into risk models to improve the management and prognosis of pediatric patients in intensive care settings. Our study contributes to this body of knowledge by exploring the differential impact of Candida species on mortality, and it underscores the need for ongoing surveillance and adaptive treatment approaches as fungal epidemiology evolves.
Comparative Evaluation of Predictive Models
This study demonstrated the superior performance of the Random Forest model over the Logistic Regression and Gradient Boosting models in predicting mortality among children hospitalized with fungemia. Our Random Forest model achieved an AUC of 0.8386, indicating good discrimination as well as robustness in handling the complex interactions and heterogeneity inherent in clinical data [58, 59]. The Logistic Regression model, with an AUC of 0.7085, is useful for its interpretability and simplicity but may have underperformed because of its linear nature and its inability to capture more complex patterns in the data without extensive feature engineering [60]. Gradient Boosting showed moderate performance, with an AUC of 0.7759. This is in line with literature suggesting that it handles varied data types and distributions well but can fall short because of overfitting in smaller or more imbalanced datasets [61]. The findings align with recent studies emphasizing the effectiveness of ensemble methods such as Random Forest in clinical prediction, which benefit from their capacity to model non-linear interactions and their robustness against overfitting, especially in datasets with many variables and complex underlying structures [62, 63]. Additionally, the precision-recall curves and confusion matrices suggest that Random Forest not only predicts more accurately but also maintains a balance between sensitivity and specificity, which is crucial for clinical decision-making in pediatric settings [64, 65]. These results support the integration of Random Forest models into decision-support tools in pediatric intensive care, where they may offer reliable and potentially lifesaving insights. Future research should focus on validating these models in multicenter settings and across broader demographic groups to enhance generalizability and applicability [66, 67].
Predictive Model Validation
Our study's logistic regression model appears robust and adequately fitted to predict clinical outcomes, as shown by several statistical tests and validation techniques. The Variance Inflation Factor (VIF), used to assess multicollinearity among predictors, was slightly elevated for Age and Wt (5.86 and 6.02, respectively). Although these values exceed the commonly accepted threshold of 5, suggesting mild multicollinearity, they do not indicate severe problems that could substantially undermine the model's reliability. Research suggests that VIF values up to 10 may not distort the regression estimates significantly, though they should be interpreted with caution [68]. Furthermore, the Hosmer-Lemeshow goodness-of-fit test yielded a chi-squared statistic of 11.818 with 8 degrees of freedom and a non-significant p-value (p = 0.1595), supporting the model's fit [60]. This indicates that the model's predictions do not deviate significantly from the observed outcomes and that the model is appropriate for the data. The five-fold cross-validation enhances the model's credibility by using every data segment for both training and testing, which supports the generalizability and robustness of its predictive accuracy [69]. This form of validation is crucial in clinical predictive modeling, as it helps detect overfitting and checks that the model performs well across different subsets of data. Performance metrics such as recall, precision, accuracy, and ROC-AUC are crucial for evaluating the clinical utility of a predictive model. Our model achieved an average accuracy of 71.76% and a precision of approximately 67.49%, indicating a good predictive balance [70]. The recall of 74.64% is particularly noteworthy for clinical applications, as it implies that the model can identify most positive cases, which is essential for early intervention [71]. Lastly, the ROC-AUC of approximately 79.90% indicates that the model distinguishes well between survival and non-survival across threshold settings [6]. This AUC value supports the model's effectiveness for the binary classification problems typical of medical outcome prediction. Although the model shows slight multicollinearity for some predictors, its overall statistical validation and k-fold cross-validation support its robustness and reliability for clinical use. Future studies might explore the implications of multicollinearity in more depth and refine the model to improve its predictive accuracy and precision.
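The reported Hosmer-Lemeshow p-value can be checked directly, because the upper tail of the chi-squared distribution has a closed form when the degrees of freedom are even: for df = 2k, P(X > x) = exp(-x/2) multiplied by the sum of (x/2)^i / i! for i from 0 to k - 1. The short Python sketch below applies this formula to the statistic of 11.818 with 8 degrees of freedom. The function name is illustrative, and the check covers only the p-value arithmetic, not the grouping step of the test itself.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


# Statistic and degrees of freedom as reported in the text.
p_value = chi2_sf_even_df(11.818, 8)
print(round(p_value, 4))  # agrees with the reported p-value of 0.1595
```

A p-value above 0.05 means that the test does not reject the hypothesis of adequate fit; it does not by itself demonstrate good calibration, particularly in a small cohort.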
Implications for Clinical Practice, Challenges and Future Directions
The study introduces a model that combines the Candida Score with key predictors to assess mortality risk in pediatric patients with fungemia, marking a significant improvement in pediatric intensive care. This approach aids in making precise predictions that can guide more effective treatments. However, integrating such predictive models into daily medical practice is challenging. The accuracy of predictions depends heavily on the quality of data, which varies significantly across different regions, especially in developing countries. Ensuring the model's seamless integration into clinical workflows without disrupting existing practices is another hurdle. Moreover, the ethical implications concerning patient privacy and data security must be addressed to prevent misuse of sensitive health information.
Clinical Implication and Future Challenges
This study develops an innovative model to enhance mortality risk assessments in pediatric patients with fungemia in intensive care units by integrating the Candida Score with crucial clinical predictors. This model promises significant clinical utility by potentially reducing fungemia-related mortality through more precise risk stratification. Early identification of high-risk patients could lead to earlier and more tailored interventions, possibly improving patient outcomes. Moreover, this approach demonstrates the feasibility of applying machine learning tools in pediatric healthcare, potentially leading to broader applications for similar predictive modeling in other complex health conditions. The future application of this model faces several hurdles that need addressing. Firstly, external validation of the model across varied healthcare settings is crucial to confirm its effectiveness and reliability beyond the study's initial environment. Secondly, integrating such predictive models into routine clinical workflows could be challenging, requiring adjustments to current practices and training for healthcare providers. Additionally, ethical concerns regarding the management of patient data, including privacy and security, are paramount as predictive modeling typically involves handling sensitive information. Lastly, ongoing adaptations of the model may be necessary to keep pace with evolving pathogens and resistance patterns, ensuring sustained relevance and accuracy.
Strengths
The study's primary strength lies in its innovative integration of the Candida Score with specific clinical predictors to refine mortality risk assessments for pediatric fungemia patients. This approach leverages the predictive power of machine learning, demonstrated by the superior performance of the Random Forest model, which notably outperforms traditional statistical methods. The rigorous methodological framework and comprehensive data analysis strengthen the study's findings, providing a robust foundation for potential clinical application. The research also fills a significant gap by addressing the need for specialized predictive tools in pediatric intensive care settings.
Figure 1.
Fungal Pathogen Distribution in Pediatric Intensive Care Fungemia Cases. The donut chart illustrates the proportion of different fungal pathogens isolated from a cohort of 85 pediatric patients diagnosed with fungemia in the intensive care unit between 2016 and 2020, presented as the frequency and percentage of each type of fungus.
Figure 2.
Odds Ratios and Confidence Intervals for Clinical Predictors of Outcomes. This figure illustrates the odds ratios and their corresponding 95% confidence intervals for various clinical and demographic predictors influencing the outcome, as determined by logistic regression analysis. Each horizontal line represents the confidence interval for a predictor, with the point denoting the odds ratio. Labels on the extreme left display the p-value, the odds ratio, and the confidence interval range for each predictor. The vertical dashed line at odds ratio 1 indicates no effect.
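As a rough illustration of the quantities plotted in Figure 2, an odds ratio and its 95% confidence interval can be derived from a logistic regression coefficient and its standard error. The sketch below assumes a Wald interval, which the study does not state; the function name `odds_ratio_ci` and the example coefficient are hypothetical and are not values from this study.

```python
import math


def odds_ratio_ci(beta, se, z=1.959964):
    """Return (odds ratio, lower bound, upper bound) for one coefficient.

    beta: log-odds coefficient from a logistic regression.
    se:   standard error of that coefficient.
    z:    normal quantile for a 95% Wald interval (an assumption; the
          study does not say how its intervals were computed).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical coefficient and standard error, for illustration only.
or_, lower, upper = odds_ratio_ci(beta=0.9, se=0.3)
print(f"OR = {or_:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

An interval that excludes 1 (the vertical dashed line in the figure) corresponds to p < 0.05 under the Wald test.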
Figure 3.
Diagnostic and Predictive Performance of Machine Learning Models in PICU Fungemia Risk Assessment. ROC curves, Precision-Recall curves, and Confusion Matrices for Logistic Regression, Gradient Boosting, and Random Forest are displayed. Random Forest shows superior performance with an AUC of 0.86 compared to Logistic Regression (AUC = 0.72) and Gradient Boosting (AUC = 0.79). The model demonstrates high sensitivity and specificity, with favorable precision-recall balance, making it a more effective tool for mortality risk prediction in pediatric intensive care. Confusion Matrices highlight Random Forest's accuracy and lower misclassification rate, supporting its use as a robust decision support system in improving patient management and outcomes.
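The metrics named in this caption can be computed directly from predicted scores and labels. The following is a minimal sketch using only the Python standard library; the function names and the toy labels and scores are hypothetical and do not reproduce the study's data or its AUC values.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and precision from binary predictions.

    Assumes both classes are present and at least one positive prediction.
    """
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }


# Toy data for illustration only (1 = non-survivor, 0 = survivor).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cut-off is an assumption

print(roc_auc(y_true, scores))
print(confusion_metrics(y_true, y_pred))
```

The pairwise definition of AUC is quadratic in the number of cases, which is acceptable for a cohort of this size (85 patients).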
Figure 4.
Normalized Feature Importance Across Mortality Prediction Models in PICU. This figure compares feature importance in the Logistic Regression (LR), Gradient Boosting (GB), and Random Forest (RF) models. Key predictors include the Candida Score (CScore), which is most prominent in RF (importance > 0.8), and the PRISM score and length of PICU stay (LOPICUS), whose importance varies across models. Lesser predictors such as age and gender in LR, and moderate predictors such as weight (Wt) and C. albicans in GB and RF, are also shown. This visualization highlights the relative impact of each predictor on mortality risk assessment in pediatric intensive care, with RF drawing on a broader set of clinical parameters.
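Comparing importances across different model families requires putting them on a common scale. The study does not state which normalization was used; the sketch below shows one common choice, scaling the absolute importances within each model so that they sum to 1. The function name `normalize_importance` and the raw values are hypothetical.

```python
def normalize_importance(raw):
    """Scale absolute importances so they sum to 1 within one model.

    raw: mapping of feature name to raw importance (for example,
    logistic regression coefficients or tree-based importances).
    """
    total = sum(abs(v) for v in raw.values())
    if total == 0:
        raise ValueError("all importances are zero")
    return {name: abs(v) / total for name, v in raw.items()}


# Hypothetical logistic regression coefficients, for illustration only.
lr_coefs = {"CScore": 1.2, "PRISM": 0.3, "LOPICUS": -0.1}
for name, value in normalize_importance(lr_coefs).items():
    print(f"{name}: {value:.2f}")
```

Taking absolute values discards the direction of each effect, so a normalized plot of this kind shows magnitude only.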
Table 1.
Clinical and Demographic Characteristics by Outcome in Pediatric ICU Patients. This table compares medians and interquartile ranges for age, weight (Wt), Pediatric Risk of Mortality (PRISM) scores, duration of mechanical ventilation (DOMV), PICU stay length (LOPICUS), pre-fungal infection antibiotic duration (DOAbBFI), Candida Score (Cscore), Candida albicans presence (Calbicans), and hospital stay length before fungal infection (LHSBFI) between survivors (Outcome=0) and non-survivors. Non-parametric tests assess statistical significance, with p-values highlighting key outcome determinants in pediatric critical care.
Variable | Survivors | Non-survivors | P value
N | 46 | 39 |
Age (median [IQR]) | 7.00 [2.00, 20.25] | 4.00 [2.00, 11.00] | 0.362
Gender = Male (%) | 29 (63.0) | 24 (61.5) | 1.000
Wt (median [IQR]) | 5.00 [3.41, 8.00] | 4.80 [3.60, 6.80] | 0.958
PRISM (median [IQR]) | 9.50 [5.00, 13.00] | 10.00 [6.50, 15.50] | 0.422
DOMV (median [IQR]) | 11.00 [1.50, 24.00] | 20.00 [11.50, 30.00] | 0.011**
LOPICUS (median [IQR]) | 25.50 [14.00, 35.75] | 29.00 [18.50, 38.00] | 0.514
DOAbBFI (median [IQR]) | 13.00 [10.00, 16.00] | 18.00 [10.00, 21.50] | 0.024*
Cscore (median [IQR]) | 2.00 [1.00, 2.00] | 4.00 [2.00, 4.00] | <0.001***
Candida albicans (%) | 14 (30.4) | 20 (51.3) | 0.083
Non-Albicans (%) | 32 (69.6) | 19 (48.7) |
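The p-values for the categorical rows of Table 1 can be checked from the reported counts alone. The sketch below assumes a Pearson chi-square test with Yates continuity correction; the table caption does not name the test used for categorical variables, so this is an assumption, and the function name `chi2_yates_2x2` is hypothetical. It uses only the standard library and assumes every row and column total is non-zero.

```python
import math


def chi2_yates_2x2(a, b, c, d):
    """Chi-square test with Yates correction for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            # Yates correction: shrink |O - E| by 0.5, never below zero.
            diff = max(0.0, abs(observed[i][j] - expected) - 0.5)
            stat += diff ** 2 / expected
    # Chi-square survival function for 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


# Counts from Table 1: each row is (survivors, non-survivors) split by category.
# Candida albicans vs non-albicans: 14/32 among survivors, 20/19 among non-survivors.
stat, p = chi2_yates_2x2(14, 32, 20, 19)
print(f"C. albicans: chi2 = {stat:.3f}, p = {p:.3f}")

# Male vs female: 29/17 among survivors, 24/15 among non-survivors.
stat, p = chi2_yates_2x2(29, 17, 24, 15)
print(f"Gender: chi2 = {stat:.3f}, p = {p:.3f}")
```

Under these assumptions the sketch gives p of about 0.083 for Candida albicans and 1.000 for gender, which is consistent with the values reported in the table.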