M.S. Applied Data Science - Capstone Chronicles 2025
Table 4
Cross-Validation Performance Metrics for Each Model

Model                  F1-score (CV)   Rank
Random forest          0.9215          1
Decision tree          0.8986          2
MLPClassifier          0.8810          3
XGBoost                0.8781          4
Logistic regression    0.6894          5
Note. F1-scores represent the weighted average across all classes during 5-fold cross-validation.
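The weighted F1-scores reported above can be obtained with scikit-learn's cross-validation utilities. The sketch below is illustrative only: the synthetic dataset and the single random forest model stand in for the study's actual data and five-model pipeline.

```python
# Illustrative sketch: 5-fold cross-validation scored with weighted F1.
# The dataset here is synthetic; the paper's real features and labels differ.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_classes=3,
                           n_informative=5, random_state=0)
clf = RandomForestClassifier(random_state=0)

# scoring="f1_weighted" averages per-class F1 weighted by class support,
# matching the "weighted average across all classes" described in the note.
scores = cross_val_score(clf, X, y, cv=5, scoring="f1_weighted")
mean_f1 = scores.mean()
```

Averaging the five fold-level scores yields a single weighted F1 per model, which is the quantity ranked in Table 4.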
Table 5
Test Set Performance Metrics for Each Model

Model                  Accuracy   Precision   Recall   F1-score
Random forest          0.9324     0.9302      0.9324   0.9308
Decision tree          0.9106     0.9122      0.9106   0.9113
MLPClassifier          0.8799     0.8951      0.8799   0.8855
XGBoost                0.8759     0.8790      0.8759   0.8772
Logistic regression    0.6562     0.7711      0.6562   0.6931
Note. All metrics represent weighted averages across all classes. Evaluation of model precision revealed systematic differences among the five machine learning approaches in their ability to minimize false positive predictions. The tree-based ensemble method emerged as particularly effective at correctly classifying positive instances. Table 5 shows that random forest achieved the highest precision score (0.9302), with decision tree following closely (0.9122), demonstrating these models' superior ability to minimize false positive classifications. MLPClassifier and XGBoost demonstrated comparable precision metrics (0.8951 and 0.8790, respectively), positioning them as moderately effective classifiers regarding
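A detail worth noting in Table 5 is that the Accuracy and Recall columns are identical for every model. This is not a coincidence: support-weighted recall sums each class's true positives over the total sample count, which is exactly the overall accuracy. The small example below (with made-up labels, not the study's data) demonstrates the equality using scikit-learn's weighted averaging.

```python
# Illustrative check: weighted recall equals overall accuracy.
# y_true / y_pred are made-up labels, not the study's predictions.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

acc = accuracy_score(y_true, y_pred)

# average="weighted" weights each class's metric by its support,
# the same averaging used throughout Tables 4 and 5.
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0)
```

Because weighted recall collapses to accuracy, precision and F1 are the columns that actually distinguish how the models trade off false positives against false negatives.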