Preprint | Article | Version 1 | This version is not peer-reviewed
Precipitation Return Period Estimation using Random Forest: A Comparative Analysis with Probability Density Functions using Outdated Weather Station Data
Version 1: Received: 31 October 2024 / Approved: 1 November 2024 / Online: 7 November 2024 (07:13:00 CET)
How to cite:
Anco-Valdivia, J.; Valencia-Félix, S.; Espinoza Vigil, A. J.; Anco, G.; Booker, J.; Juarez-Quispe, J.; Rojas-Chura, E. Precipitation Return Period Estimation using Random Forest: A Comparative Analysis with Probability Density Functions using Outdated Weather Station Data. Preprints 2024, 2024110492. https://doi.org/10.20944/preprints202411.0492.v1
APA Style
Anco-Valdivia, J., Valencia-Félix, S., Espinoza Vigil, A. J., Anco, G., Booker, J., Juarez-Quispe, J., & Rojas-Chura, E. (2024). Precipitation Return Period Estimation using Random Forest: A Comparative Analysis with Probability Density Functions using Outdated Weather Station Data. Preprints. https://doi.org/10.20944/preprints202411.0492.v1
Chicago/Turabian Style
Anco-Valdivia, J., Julio Juarez-Quispe and Erick Rojas-Chura. 2024 "Precipitation Return Period Estimation using Random Forest: A Comparative Analysis with Probability Density Functions using Outdated Weather Station Data" Preprints. https://doi.org/10.20944/preprints202411.0492.v1
Abstract
The precipitation associated with a specific return period plays an important role in the design of hydraulic infrastructure. The traditional approach involves collecting annual maximum precipitation data from a station, fitting several statistical probability distributions (PDFs) to it, and selecting the distribution with the lowest goodness-of-fit test statistic (e.g., Kolmogorov-Smirnov). However, this methodology assumes that the station records are current, so its suitability for outdated records is uncertain. The aim of this study is to compare probability density functions (e.g., Normal, Log Normal, Pearson III) with the machine learning algorithm known as Random Forest (RF) for calculating precipitation at different return periods. The study area is the province of Arequipa in Peru, represented by five stations located in different parts of the province. Both methods were evaluated with the RMSE metric. RF achieved a lower RMSE than the PDFs in most cases when calculating precipitation for return periods of 2, 5, 10, 20, 50 and 100 years at the studied stations.
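The traditional workflow summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The annual maxima below are hypothetical values, the distributions are fitted by the method of moments (an assumption, since the abstract does not state the fitting method), and only Normal and Log Normal are included because the Python standard library has no Pearson III. The helper names `ks_statistic`, `fit_candidates` and `return_levels` are invented for this example. The return level for a return period of T years is taken as the quantile at non-exceedance probability 1 - 1/T.

```python
import math
from statistics import NormalDist, mean, stdev

RETURN_PERIODS = (2, 5, 10, 20, 50, 100)  # years, as listed in the abstract


def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov statistic: largest gap between empirical and fitted CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def fit_candidates(sample):
    """Fit Normal and Log Normal by moments; return name -> (cdf, quantile)."""
    normal = NormalDist(mean(sample), stdev(sample))
    logs = [math.log(x) for x in sample]  # requires strictly positive data
    log_normal = NormalDist(mean(logs), stdev(logs))
    return {
        "Normal": (normal.cdf, normal.inv_cdf),
        "Log Normal": (
            lambda x: log_normal.cdf(math.log(x)),
            lambda p: math.exp(log_normal.inv_cdf(p)),
        ),
    }


def return_levels(sample, periods=RETURN_PERIODS):
    """Pick the candidate with the lowest K-S statistic and compute return levels."""
    candidates = fit_candidates(sample)
    scores = {name: ks_statistic(sample, cdf) for name, (cdf, _) in candidates.items()}
    best = min(scores, key=scores.get)
    quantile = candidates[best][1]
    # A T-year event is exceeded with probability 1 / T in any given year.
    levels = {T: quantile(1.0 - 1.0 / T) for T in periods}
    return best, scores, levels


# Hypothetical annual maximum precipitation (mm); not data from the study.
annual_max_mm = [12.4, 18.9, 9.7, 25.3, 14.1, 31.8, 11.2, 16.5, 22.0, 13.6,
                 19.4, 27.1, 10.8, 15.9, 20.7]

best, scores, levels = return_levels(annual_max_mm)
print("Selected distribution:", best)
for T, value in levels.items():
    print(f"T = {T:>3} years: {value:.1f} mm")
```

A Random Forest estimate for the same return periods would then be scored against these values with RMSE. That step needs a machine learning library and the study's own feature design, so it is left out of this sketch.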
Keywords
probability distributions; return period; random forest; supervised learning; annual maximum rainfall; goodness of fit test
Subject
Engineering, Civil Engineering
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.