TY - JOUR
T1 - A Methodology Based on Random Forest to Estimate Precipitation Return Periods
T2 - A Comparative Analysis with Probability Density Functions in Arequipa, Peru
AU - Anco-Valdivia, Johan
AU - Valencia-Félix, Sebastián
AU - Espinoza Vigil, Alain Jorge
AU - Anco, Guido
AU - Booker, Julian
AU - Juarez-Quispe, Julio
AU - Rojas-Chura, Erick
N1 - Publisher Copyright: © 2025 by the authors.
PY - 2025/1
Y1 - 2025/1
N2 - Precipitation estimates for specific return periods play a crucial role in the design of hydraulic infrastructure for water management. The traditional analytical approach collects annual maximum precipitation data from a station, fits statistical probability distributions to them, and selects the best-fit distribution using goodness-of-fit tests (e.g., Kolmogorov-Smirnov). However, this methodology relies on current data, raising concerns about its suitability when the available data are outdated. This study compares Probability Density Functions (PDFs) with the Random Forest (RF) machine learning algorithm for estimating precipitation at different return periods. The performance of both methods was evaluated on data from twenty-six stations located across the Arequipa department in Peru, using the mean squared error (MSE), root mean squared error (RMSE), coefficient of determination (R²), and mean absolute error (MAE). The results show that RF outperforms PDFs in most cases, yielding more precise precipitation estimates under these metrics at return periods of 2, 5, 10, 20, 50, and 100 years for the studied stations.
AB - Precipitation estimates for specific return periods play a crucial role in the design of hydraulic infrastructure for water management. The traditional analytical approach collects annual maximum precipitation data from a station, fits statistical probability distributions to them, and selects the best-fit distribution using goodness-of-fit tests (e.g., Kolmogorov-Smirnov). However, this methodology relies on current data, raising concerns about its suitability when the available data are outdated. This study compares Probability Density Functions (PDFs) with the Random Forest (RF) machine learning algorithm for estimating precipitation at different return periods. The performance of both methods was evaluated on data from twenty-six stations located across the Arequipa department in Peru, using the mean squared error (MSE), root mean squared error (RMSE), coefficient of determination (R²), and mean absolute error (MAE). The results show that RF outperforms PDFs in most cases, yielding more precise precipitation estimates under these metrics at return periods of 2, 5, 10, 20, 50, and 100 years for the studied stations.
KW - algorithms
KW - annual maximum rainfall
KW - artificial intelligence
KW - mathematical methods
KW - probability distributions
KW - random forest
KW - return period
UR - http://www.scopus.com/inward/record.url?scp=85214486696&partnerID=8YFLogxK
U2 - 10.3390/w17010128
DO - 10.3390/w17010128
M3 - Article
AN - SCOPUS:85214486696
SN - 2073-4441
VL - 17
JO - Water (Switzerland)
JF - Water (Switzerland)
IS - 1
M1 - 128
ER -