AAI_2025_Capstone_Chronicles_Combined


and its computational efficiency, the XGBoost model was selected for intensive optimization and fine-tuning to improve performance on critical categories and to address the severe class imbalance inherent in the dataset. The baseline XGBoost model demonstrated strong performance on high-frequency categories, with F1-scores above 0.70 for "related" (74% prevalence), "weather-related" (27% prevalence), and "direct report" (17% prevalence), as can be seen in Table 2. However, critical humanitarian categories showed substantial variation: food (F1=0.781) and water (F1=0.738) approached the 0.80 target threshold, while medical help (F1=0.429) and search and rescue (F1=0.343) fell significantly short. The low recall values for these life-critical categories (0.315 and 0.221, respectively) indicated the model was missing 68-78% of genuine emergencies, driven by the model's learned preference for precision over recall under extreme class imbalance, where conservative predictions maximize overall accuracy.
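The accuracy-versus-recall tension described above can be illustrated with a toy example (a hypothetical 95/5 label split, not the report's dataset): a classifier that always predicts the majority class scores high on accuracy while recalling none of the rare positives.

```python
from sklearn.metrics import accuracy_score, f1_score, recall_score

# Toy labels: 95% negative, 5% positive (extreme class imbalance).
y_true = [0] * 95 + [1] * 5
# A "conservative" model that always predicts the majority class.
y_pred = [0] * 100

# High accuracy despite capturing no genuine positives.
print(accuracy_score(y_true, y_pred))                # 0.95
print(recall_score(y_true, y_pred, zero_division=0)) # 0.0
print(f1_score(y_true, y_pred, zero_division=0))     # 0.0
```

This is why per-category F1 and recall, rather than overall accuracy, are the relevant metrics for rare categories such as medical help and search and rescue.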

Table 2. Performance metrics before optimization, showing strong results on high-frequency categories but poor recall on critical rare categories such as medical help (recall=0.31) and search and rescue (recall=0.17).

Systematic hyperparameter optimization was performed using randomized search with 2-fold cross-validation, exploring learning rates (0.1, 0.2, 0.3), tree depths (3, 5, 8), and numbers of estimators (100, 200) across 5 sampled parameter combinations. The optimal configuration used a learning rate of 0.2, a maximum depth of 5, and 200 estimators. This balanced gradient step size, model
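A minimal sketch of this tuning setup, assuming scikit-learn's RandomizedSearchCV over the same grid. GradientBoostingClassifier stands in for XGBoost here so the sketch runs without the xgboost package (xgboost.XGBClassifier accepts the same learning_rate, max_depth, and n_estimators parameters), and the synthetic imbalanced dataset is illustrative only:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic imbalanced binary data standing in for a rare category.
X, y = make_classification(
    n_samples=300, n_features=20, weights=[0.9, 0.1], random_state=0
)

# The grid from the report: learning rates, tree depths, estimator counts.
param_distributions = {
    "learning_rate": [0.1, 0.2, 0.3],
    "max_depth": [3, 5, 8],
    "n_estimators": [100, 200],
}

search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,      # 5 sampled parameter combinations
    cv=2,          # 2-fold cross-validation
    scoring="f1",  # optimize F1 rather than accuracy under imbalance
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

Scoring on F1 rather than accuracy is the design choice that matters here: under imbalance, accuracy would reward exactly the conservative behavior the optimization is trying to correct.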

