M.S. Applied Data Science - Capstone Chronicles 2025


key features associated with employee retention, providing insights from which the organization can develop targeted strategies to maintain a high retention rate and minimize employee turnover. The most influential features related to employee retention included gender and specific survey questions relating to themes such as (a) employee confidence, (b) alignment with job goals, and (c) employee recognition. The survey questions with high feature importance were Q6, Q36, and Q34; they can be found in the appendix under Table 2. These results suggest employees were more likely to stay in the organization if they felt acknowledged for their contributions, understood their responsibilities, and felt their work contributed to the greater good of the organization. These factors are closely related to employee engagement and commitment to work. Additionally, the emergence of gender as a key factor in retention suggests that employee experience and retention may differ across demographic groups.

6.1 Limitations

One notable limitation of this study was the computational cost of training and running the machine learning models, particularly the decision tree and XGBoost classifiers. Because of the extended runtime of these models, extensive hyperparameter tuning was not conducted. Hyperparameter tuning is a critical step in model optimization, and its absence may have limited the accuracy of the models.

Another notable limitation is that the XGBoost model performed significantly better at predicting the majority class (Stay Class) than the minority class (Leave Class). This gap indicates the model may be biased towards the majority class in the dataset. Future model improvements should explore other class balancing techniques to improve classification performance for employees at risk of leaving. Doing so may also highlight different key features related to why employees are likely to leave the organization.

Finding employee research was also difficult because of the sensitive nature of the personal information involved.

6.2 Recommended Next Steps/Future Studies

Given the 7-week timeline to complete the project, the scope and findings of this study were limited. In this timeframe, we focused on exploring survey data, developing models, and identifying key features associated with employee turnover and retention. While these findings offer some insight into steps HR can take to increase employee retention and identify at-risk employees, more comprehensive research must be conducted to better understand the underlying factors influencing employee turnover. Next steps should include model refinement to optimize the classification models. A more in-depth feature exploration could also uncover other key features and patterns missed in this initial study.

Acknowledgements

We would like to acknowledge Dr. Ebrahim Tarshizi for his guidance and invaluable feedback throughout the capstone project course. We also acknowledge the use of ChatGPT (OpenAI, 2025) for support with grammar, editing, and writing refinement during the preparation of this report.
