Preprint Article Version 2 Preserved in Portico This version is not peer-reviewed

Integrating Climate Data and Advanced Machine Learning for Precision Dengue Outbreak Prediction: A Study in Ba Ria-Vung Tau province, Vietnam

Version 1 : Received: 18 September 2024 / Approved: 19 September 2024 / Online: 19 September 2024 (13:18:24 CEST)
Version 2 : Received: 23 September 2024 / Approved: 23 September 2024 / Online: 23 September 2024 (14:31:09 CEST)
Version 3 : Received: 23 September 2024 / Approved: 23 September 2024 / Online: 23 September 2024 (14:39:40 CEST)

How to cite: Anh Tuan, D.; Dang, T. N. Integrating Climate Data and Advanced Machine Learning for Precision Dengue Outbreak Prediction: A Study in Ba Ria-Vung Tau province, Vietnam. Preprints 2024, 2024091535. https://doi.org/10.20944/preprints202409.1535.v2

Abstract

Dengue fever is a persistent public health issue in tropical regions, including Vietnam, where climate variability significantly influences transmission dynamics. This study aims to develop machine learning models to forecast dengue outbreaks in Ba Ria-Vung Tau province, Vietnam, by leveraging meteorological data from 2003 to 2022. Four models were utilized: Negative Binomial Regression (NBR), Seasonal AutoRegressive Integrated Moving Average with Exogenous Regressors (SARIMAX), Extreme Gradient Boosting (XGBoost), and Long Short-Term Memory (LSTM) networks. Key climate variables incorporated in the models include daily maximum and minimum temperature (ranging from 26.57°C to 29.63°C), temperature range (1.86°C to 7.26°C), relative humidity (58.3% to 89.1%), precipitation (up to 42.53 mm/day), surface pressure (100.08 to 101.14 kPa), wind speed (1.90 to 10.23 m/s), wind direction, and sea surface temperature (24.7°C to 29.8°C). Lagged variables from 2 to 20 weeks were included to account for delayed climatic effects on dengue transmission. The NBR model demonstrated the highest predictive accuracy with the lowest Mean Absolute Error (MAE) of 21.41. SARIMAX and LSTM models effectively captured seasonal trends but struggled with short-term outbreak prediction, achieving MAEs of 20.31 and 28.86, respectively. XGBoost exhibited moderate predictive performance (MAE: 24.45) but was prone to overfitting without fine-tuning. These findings highlight the value of climate-based machine learning models, particularly NBR, in forecasting dengue outbreaks in Ba Ria-Vung Tau. However, enhancing short-term outbreak prediction remains challenging, underscoring the need for model refinement and integration into early warning systems for more effective public health responses.
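The lagged-variable construction and the MAE metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the helper names `add_lags` and `mean_absolute_error`, the rainfall series, and the case counts are all hypothetical and chosen only to show the mechanics of building 2 to 20 week lags and scoring a forecast.

```python
from typing import Dict, List, Sequence


def add_lags(series: Sequence[float], lags: Sequence[int]) -> List[Dict[str, float]]:
    """Build one row per week for which every requested lag is available.

    Each row stores the week index and, for every lag, the value the
    series took that many weeks earlier.
    """
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(series)):
        row = {"week": t}
        for lag in lags:
            row[f"lag_{lag}"] = series[t - lag]
        rows.append(row)
    return rows


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


# Hypothetical weekly rainfall values (mm/day), for illustration only.
rainfall = [float(w) for w in range(30)]

# Lags of 2 to 20 weeks, matching the range stated in the abstract.
lagged = add_lags(rainfall, lags=range(2, 21))

# Made-up observed and predicted weekly case counts.
observed = [10, 25, 40, 30]
predicted = [12, 20, 45, 30]
mae = mean_absolute_error(observed, predicted)
```

In a setup like this, the first 20 weeks are dropped because their longest lag is not yet observed, so a longer maximum lag trades training rows for a wider window of delayed climate effects.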

Keywords

Dengue fever; machine learning; climate forecasting; negative binomial regression; SARIMAX; XGBoost; LSTM; Ba Ria Vung Tau; Vietnam

Subject

Public Health and Healthcare, Health Policy and Services
