Development and Validation of a Logistic Regression Model to Predict Post-Operative Mortality in Emergency Cardiac Surgeries: A Comprehensive Analysis of Pre-Operative Factors and Model Performance

Preprint

Dhruv Patel^*,Jonathan Hofmann,Andrew Bouras

This version is not peer-reviewed

Submitted:

18 July 2024

Posted:

25 July 2024

Abstract

Objective: The primary objective of this study was to develop a logistic regression model to predict post-operative in-hospital mortality rates for patients undergoing emergency cardiac surgeries, with the aim of improving predictive accuracy over traditional risk assessment tools and enhancing patient outcomes and clinical decision-making.

Methods: Data were collected from 4,855 patients who underwent emergency cardiac surgeries at a tertiary hospital between 2008 and 2017. The analysis incorporated demographic, anthropometric, and clinical factors, including the ASA classification, emergency status, and various preoperative laboratory values. A logistic regression model was developed, and the Elixhauser Comorbidity Index was calculated using standard ICD-10 codes for its comprehensive assessment of comorbidities. Model performance was evaluated using metrics such as AUROC, AUPRC, accuracy, precision, recall, and F1 score.

Results: The logistic regression model demonstrated strong predictive performance, with an AUROC of 0.939 and an AUPRC of 0.350. Key pre-operative factors identified included emergency operation status, ASA classification, and preoperative prothrombin time. The model significantly outperformed the traditional ASA classification system, which showed an AUROC of 0.524 and an AUPRC of 0.010. These findings suggest a substantial improvement in predicting post-operative mortality.

Conclusion: The logistic regression model significantly improves the prediction of post-operative mortality in emergency cardiac surgeries compared to the ASA classification system. These findings highlight the potential of incorporating comprehensive pre-operative factors into predictive models to enhance clinical decision-making and patient outcomes. Implementing such models in routine clinical practice could lead to more accurate risk assessments, better resource allocation, and improved patient care.

Subject: Medicine and Pharmacology - Cardiac and Cardiovascular Systems

1. Introduction

1.1. Background

Cardiac surgeries, especially in emergency situations, carry a significant risk of post-operative mortality. Accurate prediction of these outcomes is essential for improving patient care and resource allocation. Traditional risk assessment tools, such as the ASA classification system, often fall short in predictive accuracy due to their limited scope and reliance on subjective parameters. Recent advancements in machine learning have introduced models that can analyze large datasets and identify complex interactions among variables, providing more accurate predictions of surgical outcomes (Fan et al., 2021).

1.2. Objective

The primary objective of this study is to develop a logistic regression model to predict post-operative in-hospital mortality rates for patients undergoing emergency cardiac surgeries. By integrating a comprehensive set of pre-operative factors, the study aims to enhance predictive accuracy compared to traditional risk assessment tools like the ASA classification. Logistic regression was chosen for its interpretability and effectiveness in handling binary outcomes, while also exploring its performance relative to more complex machine learning models.

1.3. Significance

Identifying significant pre-operative factors that can predict patient outcomes is crucial in cardiac surgery. Early and accurate prediction of post-operative mortality can inform clinical decision-making, guide pre-operative preparations, and improve patient stratification. This can lead to better resource allocation, timely interventions, and improved survival rates for high-risk patients. Previous studies, such as Fernandes et al. (2020) and Fritz et al. (2019), have shown the potential of machine learning models in providing superior predictive performance. This study builds on these findings by focusing on the pre-operative phase, aiming to develop a reliable and interpretable logistic regression model for clinical settings. Successful implementation of this model could revolutionize pre-operative risk assessments, enabling more precise and personalized patient care, improving outcomes, and optimizing resource use in healthcare systems.

2. Methods

2.1. Data Collection

The dataset for this study was derived from patients who underwent emergency cardiac surgeries at a tertiary hospital between 2008 and 2017. Initially, 5,443 patients were identified. Inclusion required complete pre-operative records for the key variables, so the 588 patients with incomplete records or missing data were excluded, resulting in a final sample size of 4,855 patients. Patient data were anonymized to maintain confidentiality and comply with ethical standards.
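The complete-case exclusion described above can be sketched in a few lines. This is a minimal illustration and not the authors’ code: the three toy records, the `REQUIRED` list, and the `is_complete` helper are hypothetical, and only the variable names (`asa`, `emop`, `preop_ptinr`, `preop_hb`) are taken from the study.

```python
# Toy records; all values are invented for illustration only.
REQUIRED = ["age", "sex", "asa", "emop", "preop_ptinr", "preop_hb"]

records = [
    {"age": 64, "sex": "F", "asa": 3, "emop": 1, "preop_ptinr": 1.1, "preop_hb": 12.4},
    {"age": 71, "sex": "M", "asa": 4, "emop": 1, "preop_ptinr": None, "preop_hb": 10.9},
    {"age": 58, "sex": "M", "asa": 2, "emop": 0, "preop_ptinr": 1.0, "preop_hb": 14.2},
]

def is_complete(record, required=REQUIRED):
    """Return True when every required field is present and not None."""
    return all(record.get(field) is not None for field in required)

complete_cases = [r for r in records if is_complete(r)]
n_excluded = len(records) - len(complete_cases)
print(f"kept {len(complete_cases)} of {len(records)} records, excluded {n_excluded}")
```

Applied to the counts reported in this section, the same subtraction gives 5,443 - 4,855 = 588 excluded patients.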

2.2. Variables

The analysis included demographic, anthropometric, and clinical factors. Demographic factors were age and sex. Anthropometric measurements were weight and height. Clinical factors comprised the ASA classification and emergency status (emop). Preoperative laboratory values included prothrombin time/international normalized ratio (preop_ptinr), creatinine levels (preop_creatinine), potassium levels (preop_potassium), and hemoglobin levels (preop_hb). These variables were selected for their clinical relevance and established predictive value (Fan et al., 2021).

Pre-existing conditions were assessed using the Elixhauser Comorbidity Index, which was calculated from standard ICD-10 codes and chosen for its comprehensive assessment of 30 comorbidities. The ICD-10 codes used in this study were selected for their relevance to cardiac procedures, so that the model captures the complexity and variety of cardiac-related conditions in this population. This approach aligns with established medical coding practices and supports a robust assessment of patient comorbidities and surgical risk factors.

2.3. Statistical Analysis

The primary analytical approach was a logistic regression model that predicted post-operative mortality from the pre-operative variables, including the Elixhauser Comorbidity Index. Several statistical tests were used to check the robustness of the findings: the Wald test assessed the significance of individual predictors, the likelihood ratio test compared the goodness-of-fit of nested models, and the Hosmer-Lemeshow test assessed the model’s calibration. Statistical significance was assessed using P values. Logistic regression was chosen for its interpretability and proven effectiveness in binary outcome prediction (Fernandes et al., 2020).
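The Wald and likelihood ratio tests named above can be illustrated on synthetic data. The sketch below is an assumption-laden teaching example, not the authors’ analysis: it fits the model with a hand-written Newton-Raphson routine in NumPy rather than statsmodels, the data are simulated, and the names `fit_logit`, `x1`, and `x2` are hypothetical. The Hosmer-Lemeshow test is not covered.

```python
import math
import numpy as np

def fit_logit(X, y, n_iter=50, tol=1e-10):
    """Newton-Raphson fit; returns (coefficients, covariance matrix, log-likelihood)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        info = X.T @ (X * (p * (1.0 - p))[:, None])    # Fisher information matrix
        step = np.linalg.solve(info, X.T @ (y - p))    # Newton update
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    cov = np.linalg.inv(X.T @ (X * (p * (1.0 - p))[:, None]))
    loglik = float(np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))
    return beta, cov, loglik

# Synthetic data (assumption): x1 truly affects the outcome, x2 is pure noise.
rng = np.random.default_rng(0)
n = 2000
x1, x2 = rng.normal(size=n), rng.normal(size=n)
prob = 1.0 / (1.0 + np.exp(-(-2.0 + 1.2 * x1)))
y = (rng.random(n) < prob).astype(float)

X_full = np.column_stack([np.ones(n), x1, x2])
X_reduced = np.column_stack([np.ones(n), x2])          # same model without x1

beta, cov, ll_full = fit_logit(X_full, y)
_, _, ll_reduced = fit_logit(X_reduced, y)

# Wald test: z = estimate / standard error, with a two-sided normal p-value.
z = beta / np.sqrt(np.diag(cov))
wald_p = [math.erfc(abs(v) / math.sqrt(2.0)) for v in z]

# Likelihood ratio test for dropping x1 (1 degree of freedom).
# With 1 df, the chi-square survival function equals erfc(sqrt(stat / 2)).
lr_stat = 2.0 * (ll_full - ll_reduced)
lr_p = math.erfc(math.sqrt(lr_stat / 2.0))

for name, b, p_val in zip(["const", "x1", "x2"], beta, wald_p):
    print(f"{name}: coef={b:.3f}, Wald p={p_val:.4g}")
print(f"LR statistic={lr_stat:.2f}, p={lr_p:.4g}")
```

In this simulation the informative predictor `x1` should yield a very small P value on both tests, while the noise predictor `x2` should generally not.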

2.4. Tools and Techniques

Statistical analysis and model development were conducted in Python (version 3.8). The statsmodels library was used to develop the logistic regression model, the scikit-learn library was used to implement machine learning algorithms and calculate performance metrics, and the SHapley Additive exPlanations (SHAP) library was used to interpret feature contributions. These tools were selected for their robust analytical capabilities and acceptance in the scientific community (Kilic et al., 2019).
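For a linear model such as logistic regression, SHAP-style contributions on the log-odds scale have a simple closed form when features are treated as independent: each feature contributes its coefficient times the difference between the patient’s value and the background mean. The sketch below shows that calculation directly and does not call the SHAP library. The intercept, coefficients, background means, and patient values are hypothetical and are not estimates from this study.

```python
# Hypothetical coefficients and background means; NOT estimates from the study.
intercept = -5.0
coefs = {"emop": 1.5, "asa": 0.6, "preop_ptinr": 0.9, "preop_hb": -0.2}
background_means = {"emop": 0.3, "asa": 2.5, "preop_ptinr": 1.1, "preop_hb": 12.5}

def linear_shap(x):
    """Return (base value, per-feature contributions) on the log-odds scale."""
    base = intercept + sum(coefs[k] * background_means[k] for k in coefs)
    contributions = {k: coefs[k] * (x[k] - background_means[k]) for k in coefs}
    return base, contributions

# Hypothetical patient used only to exercise the function.
patient = {"emop": 1, "asa": 4, "preop_ptinr": 1.8, "preop_hb": 9.5}
base, contributions = linear_shap(patient)

# Additivity: base value plus contributions equals the model's log-odds for this patient.
log_odds = intercept + sum(coefs[k] * patient[k] for k in coefs)
assert abs(base + sum(contributions.values()) - log_odds) < 1e-9

for name, value in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {value:+.3f}")
```

The additivity check is the property that makes these contributions interpretable: they sum exactly to the gap between the patient’s predicted log-odds and the average prediction.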

3. Results

3.1. Feature Importance

The logistic regression model identified several key pre-operative factors as significant predictors of post-operative mortality in emergency cardiac surgeries. These factors included emergency operation status (emop), ASA classification (asa), preoperative prothrombin time/international normalized ratio (preop_ptinr), Elixhauser Comorbidity Index, preoperative creatinine levels (preop_creatinine), preoperative potassium levels (preop_potassium), and preoperative hemoglobin levels (preop_hb). The ’emop’ variable was the most influential, highlighting the impact of emergency surgeries on patient outcomes. The importance of these features was quantified by their absolute coefficient values in the logistic regression model (Figure 1).
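Ranking predictors by absolute coefficient value can be sketched as follows. The coefficient values here are hypothetical placeholders chosen for illustration, not the fitted values behind Figure 1, and the key `elixhauser` is an assumed column name. The sketch also assumes the predictors were standardized, since raw coefficients on different scales are not directly comparable.

```python
import math

# Hypothetical standardized coefficients (log-odds per 1 SD); NOT the paper's estimates.
coefficients = {
    "emop": 1.40,
    "asa": 0.85,
    "preop_ptinr": 0.60,
    "elixhauser": 0.45,        # assumed column name for the comorbidity index
    "preop_creatinine": 0.30,
    "preop_potassium": 0.15,
    "preop_hb": -0.50,         # a negative sign still counts toward importance
}

# Sort by magnitude so that protective and harmful factors are ranked together.
ranked = sorted(coefficients.items(), key=lambda kv: abs(kv[1]), reverse=True)
for name, coef in ranked:
    print(f"{name:18s} |coef|={abs(coef):.2f}  odds ratio={math.exp(coef):.2f}")
```

Exponentiating a coefficient gives the odds ratio, which is often easier to communicate clinically than the raw log-odds value.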

3.2. Model Performance

The logistic regression model demonstrated strong discrimination in predicting post-operative mortality, with an AUROC of 0.939 and an AUPRC of 0.350. At the classification threshold used, the model had an accuracy of 0.990, a precision of 1.000, a recall of 0.125, and an F1 score of 0.222. These metrics show that the model produced no false positives, but the low recall means that most deaths were not flagged, and the high accuracy largely reflects the rarity of the outcome. In contrast, the ASA classification system, a traditional risk assessment tool, performed substantially worse, with an AUROC of 0.524 and an AUPRC of 0.010. These results underscore the superior predictive accuracy of the logistic regression model over the ASA classification.

3.3. ROC Curve

The ROC curve for the logistic regression model illustrates its discriminative performance (Figure 2). The curve lies well above the diagonal line, which represents a random classifier, and closely approaches the top left corner, indicating that high sensitivity and high specificity can be reached together at suitable thresholds. The AUROC value of 0.939 further confirms the model’s effectiveness.
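The AUROC has a direct probabilistic reading: it is the chance that a randomly chosen positive case receives a higher score than a randomly chosen negative case. A minimal pure-Python sketch of that definition follows. The four-point example is a toy dataset, not study data, and the function name `auroc` is hypothetical.

```python
def auroc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy example: three of the four positive-negative pairs are ordered correctly.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
print(auroc(labels, scores))  # 0.75
```

Under this reading, the reported AUROC of 0.939 means the model ranks a patient who died above a patient who survived in roughly 94% of such pairs.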

3.4. Confusion Matrix

The confusion matrix (Figure 3) provides a detailed breakdown of the logistic regression model’s predictions against actual outcomes. Of 1,457 test cases, the model produced 1,441 true negatives, 2 true positives, no false positives, and 14 false negatives. All 1,441 survivors were therefore classified correctly, but only 2 of the 16 deaths (12.5%) were identified. The low number of true positives and the 14 false negatives highlight the need to improve the model’s sensitivity. Enhancing the model’s ability to capture true positive cases could improve its utility in clinical decision-making, ensuring that high-risk patients receive appropriate care.
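The reported classification metrics follow directly from the four confusion matrix counts. The short check below recomputes them. The `majority_baseline` line is an added comparison, not a figure from the study: it is the accuracy of a rule that always predicts survival.

```python
tp, fp, fn, tn = 2, 0, 14, 1441   # counts reported for the test cases

total = tp + fp + fn + tn
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)            # also called sensitivity
specificity = tn / (tn + fp)
f1 = 2 * precision * recall / (precision + recall)
majority_baseline = (tn + fp) / total   # accuracy of always predicting survival

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} "
      f"f1={f1:.3f} specificity={specificity:.3f} baseline={majority_baseline:.3f}")
# accuracy=0.990 precision=1.000 recall=0.125 f1=0.222 specificity=1.000 baseline=0.989
```

The near-identical accuracy and baseline values show why accuracy alone is a weak summary when the outcome is this rare, and why recall and the precision-recall curve carry more information here.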

3.5. Precision-Recall Curve

The precision-recall curve (Figure 4) illustrates the trade-off between precision and recall for the logistic regression model, summarized by an AUPRC of 0.350. Because deaths were rare (16 of 1,457 test cases, about 1.1%), a classifier with no discriminative ability would be expected to achieve an AUPRC close to that prevalence, which is in line with the ASA classification’s AUPRC of 0.010. The logistic regression model’s AUPRC is therefore substantially better, although precision still declines as recall increases. In clinical practice, this balance is crucial for ensuring that high-risk patients are accurately identified and treated, while avoiding the over-treatment of low-risk patients. The precision-recall curve reinforces the model’s potential as a valuable tool for improving patient outcomes in emergency cardiac surgeries.
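To put an AUPRC in context, it helps to know what an uninformative score would achieve. The sketch below implements average precision, a common step-wise estimate of the AUPRC, in pure Python and shows that a constant score yields a value equal to the outcome prevalence. The function name `average_precision` is hypothetical, the class balance (16 deaths, 1,441 survivors) is taken from the reported confusion matrix, and the study may have computed its AUPRC with a different estimator.

```python
def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve: sum of (R_n - R_{n-1}) * P_n."""
    n_pos = sum(labels)
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    ap, tp, prev_recall, i = 0.0, 0, 0.0, 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:  # group tied scores
            tp += ranked[j][1]
            j += 1
        precision, recall = tp / j, tp / n_pos   # j cases are predicted positive so far
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap

# An uninformative score (same value for everyone) gives AP equal to the prevalence.
labels = [1] * 16 + [0] * 1441          # class balance of the reported test set
constant = [0.5] * len(labels)
print(round(average_precision(labels, constant), 4))   # 0.011

# Toy check on four cases.
toy = average_precision([0, 0, 1, 1], [0.10, 0.40, 0.35, 0.80])
print(round(toy, 4))                                    # 0.8333
```

Against a chance-level value of about 0.011, an AUPRC of 0.350 is a large relative gain, while an AUPRC of 0.010 is indistinguishable from chance.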

4. Discussion

4.1. Interpretation of Results

The results indicate that the logistic regression model significantly outperformed the traditional ASA classification system in predicting post-operative mortality in emergency cardiac surgeries. The logistic regression model’s AUROC of 0.939 and AUPRC of 0.350 demonstrate its strong discriminative power. Key features such as emergency operation status, ASA classification, preoperative prothrombin time/international normalized ratio, Elixhauser Comorbidity Index, preoperative creatinine levels, preoperative potassium levels, and preoperative hemoglobin levels were significant predictors. These findings align with previous research, such as Fan et al. (2021), which highlighted the superiority of machine learning models in mortality prediction. The enhanced performance of the logistic regression model underscores its potential utility in clinical settings, providing clinicians with a more reliable tool for risk assessment.

The comparison with the ASA classification system underscores the limitations of traditional risk assessment tools. The ASA classification’s lower AUROC of 0.524 and AUPRC of 0.010 reflect its reduced effectiveness. This disparity emphasizes the need for predictive models that can integrate a broader range of variables and handle complex interactions. The superior performance of the logistic regression model is consistent with Fernandes et al. (2020) and suggests that incorporating diverse preoperative factors can enhance predictive accuracy. This is crucial for clinical practice, as it enables more precise risk stratification and better-informed surgical decisions.

4.2. Strengths and Limitations

This study has several strengths, including the use of a comprehensive dataset from a tertiary hospital, providing a rich source of preoperative variables. The logistic regression model’s strong performance metrics, such as high AUROC and AUPRC, indicate the robustness of the analytical approach. The inclusion of the Elixhauser Comorbidity Index significantly enhanced the model’s predictive accuracy. These strengths contribute to the study’s reliability and relevance, offering valuable insights that can improve clinical outcomes in emergency cardiac surgeries.

However, the study has limitations. The retrospective nature of the data collection may introduce biases related to missing or incomplete records. Additionally, the study is based on data from a single institution, which may limit the generalizability of the findings. Future studies could benefit from multicenter datasets to validate the model’s applicability across different clinical environments. Prospective data collection methods could help mitigate biases associated with retrospective studies. While the logistic regression model demonstrated strong performance, there is room for improvement in capturing true positive cases, as indicated by the model’s recall rate. Enhancing data completeness and incorporating additional predictive factors could further refine the model’s accuracy.

4.3. Future Work

Future research should focus on expanding the dataset to include multiple institutions, enhancing the generalizability of the findings. Additionally, exploring more complex machine learning models, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), could improve predictive accuracy. Studies by Fritz et al. (2019) and Yu et al. (2021) have shown the potential of deep learning models in predicting post-operative outcomes, suggesting that such approaches could be beneficial in emergency cardiac surgeries. These models can capture complex, non-linear relationships among variables and may offer superior performance.

Incorporating real-time intraoperative data into predictive models could provide dynamic risk assessments, enabling clinicians to make more informed decisions during surgeries. Techniques such as real-time data streaming and integration with electronic health records (EHR) systems could be employed. Studies by Kilic et al. (2019) and Tseng et al. (2020) have highlighted the importance of intraoperative data in improving predictive accuracy. Finally, further investigation into the specific contributions of individual preoperative and intraoperative factors could help refine and optimize predictive models. Methods such as feature selection algorithms and sensitivity analysis could identify the most impactful predictors, ensuring the models are both accurate and clinically useful.

5. Conclusion

5.1. Summary

This study developed and validated a logistic regression model to predict post-operative in-hospital mortality rates for patients undergoing emergency cardiac surgeries. Using a dataset from a tertiary hospital, the model incorporated a comprehensive set of pre-operative factors, including emergency operation status, ASA classification, and various preoperative laboratory values. The logistic regression model demonstrated superior performance with an AUROC of 0.939 and an AUPRC of 0.350, significantly outperforming the traditional ASA classification system, which had an AUROC of 0.524 and an AUPRC of 0.010. These findings underscore the importance of integrating diverse pre-operative factors to enhance predictive accuracy.

5.2. Implications

The findings of this study have important clinical implications. The superior predictive performance of the logistic regression model suggests that it can be a valuable tool in clinical settings, helping healthcare providers make more informed decisions about patient management and resource allocation. By incorporating a comprehensive set of pre-operative factors, the model provides a more accurate risk assessment, enabling timely and targeted interventions for high-risk patients. This approach can potentially improve patient outcomes, reduce post-operative complications, and enhance overall healthcare efficiency. Implementing such predictive models in routine clinical practice could transform pre-operative evaluations, leading to better-prepared surgical teams and optimized patient care pathways. However, potential barriers to implementation, such as the need for comprehensive data collection and integration into existing clinical workflows, must be addressed. Training healthcare providers on the use of these models and ensuring the availability of necessary technological infrastructure will be crucial for successful adoption. Doctors can use our calculator to estimate a patient’s risk of in-hospital death after an emergency cardiac procedure.

References

Benedetto, U., Sinha, S., Lyon, M., Dimagli, A., Gaunt, T. R., Angelini, G., & Sterne, J. (2020). Can machine learning improve mortality prediction following cardiac surgery? European Journal of Cardio-Thoracic Surgery: Official Journal of the European Association for Cardio-thoracic Surgery.
Brobbey, A., Bruning, J. W., & Cimini, M. (2018). Ensemble-based classification models for predicting post-operative mortality risk in coronary artery disease. Journal of Biomedical Informatics, 78, 39-49.
Choi, B., Oh, A., Lee, S.-H., Lee, D. Y., Lee, J.-H., Yang, K., Kim, H. Y., Park, R. W., & Park, J. (2022). Prediction model for 30-day mortality after non-cardiac surgery using machine-learning techniques based on preoperative evaluation of electronic medical records. Journal of Clinical Medicine, 11, 6487.
Fan, Y., Dong, J., Wu, Y., Shen, M., Zhu, S., He, X., Jiang, S., Shao, J., & Song, C. (2021). Development of machine learning models for mortality risk prediction after cardiac surgery. Cardiovascular Diagnosis and Therapy, 12(1), 12-23.
Fernandes, M., Armengol de la Hoz, M. Á., Rangasamy, V., & Subramaniam, B. (2020). Machine learning models with preoperative risk factors and intraoperative hypotension parameters predict mortality after cardiac surgery. Journal of Cardiothoracic and Vascular Anesthesia.
Fritz, B., Cui, Z., Zhang, M., He, Y., Chen, Y., Kronzer, A., Ben Abdallah, A., King, C., & Avidan, M. (2019). Deep-learning model for predicting 30-day postoperative mortality. British Journal of Anaesthesia.
Kilic, A., Goyal, A., Miller, J. K., Gjekmarkaj, E., Tam, W., Gleason, T., Sultan, I., & Dubrawski, A. (2019). Predictive utility of a machine learning algorithm in estimating mortality risk in cardiac surgery. The Annals of Thoracic Surgery.
Rahmanian, P. B., Adams, D. H., & Castillo, J. G. (2010). Predicting hospital mortality and analysis of long-term survival after major noncardiac complications in cardiac surgery patients. The Annals of Thoracic Surgery, 90(3), 902-909.
Widyastuti, Y., Khusen, M., Prihartono, J., & Soesastro, Y. (2012). Preoperative and intraoperative prediction of risk of cardiac dysfunction following open heart surgery. The Journal of Thoracic and Cardiovascular Surgery, 143(1), 178-186.
Yu, Y., Peng, C., Zhang, Z., Shen, K., Zhang, Y., Xiao, J., Xi, W., Wang, P.-Y., Jin, Z., & Wang, Z. (2021). Machine learning methods for predicting long-term mortality in patients after cardiac surgery. Frontiers in Cardiovascular Medicine, 9.
Zeng, J., Zhang, D., Lin, S., Su, X., Wang, P., Zhao, Y., & Zheng, Z. (2023). Comparative analysis of machine learning versus traditional modeling approaches for predicting in-hospital mortality after cardiac surgery: assessment from temporal and spatial external validation based on a nationwide cardiac surgery registry. European Heart Journal - Quality of Care & Clinical Outcomes.

Figure 1. Importance of pre-operative factors in predicting post-operative mortality, quantified by absolute coefficient values in the logistic regression model.

Figure 2. Receiver Operating Characteristic (ROC) curve for the logistic regression model, demonstrating high discriminative power with an AUROC of 0.939.

Figure 3. Confusion matrix showing the logistic regression model’s predictions against actual outcomes for post-operative mortality.

Figure 4. Precision-Recall curve for the logistic regression model, with an AUPRC of 0.350.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
