This study evaluates flood susceptibility mapping in the Gorganrud watershed, Golestan Province, Iran, using Random Forest (RF) and Maximum Entropy (ME) machine learning models. Unlike previous research that relied on single random splits of flood data, this study systematically assessed both predictive accuracy and model robustness across three distinct training/validation splits (S1, S2, S3).
A flood inventory of 127 occurrence points was combined with 19 environmental factors, including distance to stream, drainage density, lithology, rainfall, slope, and land use. A multicollinearity check found no problematic collinearity among the factors.
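The summary does not say which multicollinearity diagnostic was used. One common choice is the variance inflation factor (VIF), where values above roughly 5 to 10 are usually read as a warning sign. The sketch below is only an illustration under that assumption: the function name `variance_inflation_factors` and the synthetic columns are hypothetical and are not taken from the study's data.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (rows = samples, columns = factors)."""
    X = np.asarray(X, dtype=float)
    # Pairwise correlation matrix of the factors (columns are variables).
    corr = np.corrcoef(X, rowvar=False)
    # VIF_j = 1 / (1 - R_j^2), which equals the j-th diagonal entry
    # of the inverse correlation matrix.
    return np.diag(np.linalg.inv(corr))

# Synthetic stand-in data (NOT the study's factors): two independent
# columns plus a near-duplicate of the first to trigger a high VIF.
rng = np.random.default_rng(0)
slope = rng.normal(size=500)
rainfall = rng.normal(size=500)
slope_copy = slope + 0.01 * rng.normal(size=500)

X = np.column_stack([slope, rainfall, slope_copy])
for name, vif in zip(["slope", "rainfall", "slope_copy"], variance_inflation_factors(X)):
    print(f"{name}: VIF = {vif:.2f}")
```

In this toy setup the independent column has a VIF close to 1, while the two nearly identical columns have very large values. A "negligible" result would correspond to all factors staying below the chosen threshold.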
Both models showed excellent predictive performance. RF achieved validation AUC values of 0.95, 0.97, and 0.98 for S1, S2, and S3 respectively, while ME achieved 0.936, 0.955, and 0.935. For robustness, measured by AUC variability (lower values indicate greater stability), RF scored 0.001 compared to ME's 0.008, indicating that RF predictions were more stable across the training/validation splits.
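As a rough illustration of the two quantities discussed here, the sketch below computes a validation AUC from labels and predicted scores, and then summarizes how much AUC varies across splits. The exact variability statistic behind the robustness scores is not given in this summary, so the range and standard deviation used here are assumed stand-ins. The helper names `auc_score` and `auc_variability` and all input values are hypothetical.

```python
import numpy as np

def auc_score(labels, scores):
    """AUC as the probability that a flood point outranks a non-flood point."""
    labels = np.asarray(labels).astype(bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one flood and one non-flood point")
    # Compare every flood score with every non-flood score; ties count half.
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def auc_variability(aucs):
    """Summarize the spread of AUC across splits (assumed robustness proxies)."""
    aucs = np.asarray(aucs, dtype=float)
    return {"range": float(aucs.max() - aucs.min()), "std": float(aucs.std())}

# Toy validation set: 1 = flood point, 0 = non-flood point.
print(auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75

# Hypothetical AUCs for three splits (illustrative values, not the study's).
print(auc_variability([0.90, 0.92, 0.94]))
```

A smaller spread means the model's validation performance depends less on which points end up in the training and validation sets.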
The final flood susceptibility map (averaged across model runs) classified approximately 15% (RF) to 22.7% (ME) of the watershed as high or very high susceptibility, concentrated in low-slope areas near rivers. The most influential factors were distance to stream (49.4%), drainage density (15.2%), and lithology (10.8%). Flood probability increased sharply when distance to stream fell below 500 m and drainage density exceeded 2.5 km/km². Quaternary alluvial formations (Qsw) were identified as the most susceptible lithology.
The study concludes that while both models are highly effective (AUC > 0.93), RF is superior due to its higher validation AUC (up to 0.98) and greater robustness (0.001 vs. 0.008). The ensemble map provides a reliable decision-support tool for sustainable land-use planning and flood risk mitigation in Golestan Province. Future work should incorporate hydrological time-series data and climate change scenarios.