M.S. Applied Data Science - Capstone Chronicles 2025
Given the importance of high retention rates in the organization, where overall accuracy and Stay detection are important for operational stability, XGBoost is the best-performing model for this application. Future work will focus on optimizing the XGBoost model to improve minority-class recall without compromising overall accuracy.

6 Discussion

This study explored federal employee survey responses to predict employee attrition using machine learning models. The results indicated that the XGBoost classifier outperformed both the decision tree classifier and the logistic regression model, particularly on F1-score, precision, and recall. Specifically, the model performed very well at identifying the majority class: employees who are more likely to stay in their organization. This model highlighted
(0.16) is lower than that of the other models (0.61-0.62), the precision for leave (0.50) is higher than that of the other models (0.33-0.34). These results suggest that the model’s leave predictions are more likely to be correct when they are made.

Table 5
Performance Metrics for Leave Class

Model                Precision  Recall  F1-Score
Logistic Regression  0.34       0.61    0.43
Decision Tree        0.33       0.62    0.43
XGBoost              0.50       0.16    0.24
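The F1-scores in Table 5 can be checked against the reported precision and recall, since F1 is the harmonic mean of the two. The sketch below is a minimal illustration, not the study's own code: the helper name `f1_score` and the dictionary `table_5` are hypothetical, and the inputs are the two-decimal values from the table, so small rounding differences from the reported F1 are expected.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for the Leave class, copied from Table 5.
# The dictionary itself is an illustrative structure, not part of the study.
table_5 = {
    "Logistic Regression": (0.34, 0.61, 0.43),
    "Decision Tree": (0.33, 0.62, 0.43),
    "XGBoost": (0.50, 0.16, 0.24),
}

# Recompute F1 from the rounded precision and recall values.
recomputed = {model: f1_score(p, r) for model, (p, r, _) in table_5.items()}

for model, (p, r, reported) in table_5.items():
    print(f"{model}: recomputed F1 = {recomputed[model]:.3f}, reported = {reported:.2f}")
```

The recomputed values agree with the reported ones to within 0.01, which is consistent with rounding of the inputs. The comparison also shows why XGBoost's higher Leave precision does not translate into a higher F1: the harmonic mean is pulled toward the smaller of the two values, here the 0.16 recall.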
Figure 9 ROC Curve Comparison