M.S. Applied Data Science - Capstone Chronicles 2025


While decision trees are highly interpretable, they are susceptible to overfitting, especially when grown to full depth. To mitigate this, hyperparameter tuning was employed, focusing on parameters such as max_depth, min_samples_split, and min_samples_leaf. Proper tuning of these parameters balances the model’s complexity and generalization capabilities, optimizing performance without compromising interpretability (Pagliarini & Sciavicco, 2021). The decision tree classifier was implemented using scikit-learn’s DecisionTreeClassifier. Hyperparameter tuning was performed using grid search, evaluating combinations of max_depth, min_samples_split, and min_samples_leaf. The model’s performance was assessed using stratified k-fold cross-validation, ensuring balanced representation of classes in each fold. To address class imbalance, SMOTE was applied to the training data. The optimal configuration used all features, with max_depth set to None, min_samples_split of 2, and min_samples_leaf of 1, indicating that a deeper, more complex tree yielded the best cross-validation performance.

4.5.7 Evaluation Metrics

Given the presence of class imbalance, multiple metrics were used to evaluate model performance. Accuracy alone is not a reliable metric on imbalanced datasets, so precision, recall, and F1-score were prioritized. Precision measures how many predicted positive cases were actually positive, while recall evaluates the model’s ability to capture all relevant cases. The F1-score, the harmonic mean of precision and recall, was used to balance the two (C. Larose & D. Larose, 2019).

Additionally, for the random forest model, the receiver operating characteristic – area under the curve (ROC-AUC) metric was used to evaluate the discriminatory power of the model. This metric is particularly valuable for imbalanced datasets, as it measures how well the model distinguishes between classes (Analytics Vidhya, 2017). By analyzing these metrics, we ensured that the model performed well across all classes and did not disproportionately favor the majority class. The distribution of scores shows that random forest performs best, achieving an F1-score above 0.92. Decision tree, MLP, and XGBoost follow closely with solid performances between 0.87 and 0.90. In contrast, logistic regression significantly underperforms with an F1-score of approximately 0.69. These results suggest that for this classification task, tree-based ensemble models such as random forest and XGBoost, as well as more complex models like MLP, are better suited than simpler linear models. The strong performance of the decision tree model indicates that the underlying decision boundaries in the data may be effectively captured by tree-based methods.

4.5.8 Model Deployment on the Web

Our web application serves as a tool for predicting FDA recall classifications through an intuitive and accessible interface. Leveraging the machine learning models described above, the platform enables regulatory professionals and quality assurance specialists to anticipate potential recall classifications based on product and incident characteristics, thereby facilitating proactive compliance strategies and risk management protocols. The application features three sections designed to enhance user experience and
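The metrics discussed in Section 4.5.7 can be computed directly with scikit-learn. The labels and probabilities below are illustrative placeholders, not results from the study; the point is to show how precision, recall, F1 (the harmonic mean of the first two), and ROC-AUC are obtained from predictions.

```python
from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

# Illustrative binary predictions (not the project's data).
y_true = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.1, 0.2, 0.6, 0.3, 0.8, 0.9, 0.4, 0.2, 0.7, 0.1]

p = precision_score(y_true, y_pred)    # TP / (TP + FP) -> 0.75
r = recall_score(y_true, y_pred)       # TP / (TP + FN) -> 0.75
f1 = f1_score(y_true, y_pred)          # 2*p*r / (p + r) -> 0.75
auc = roc_auc_score(y_true, y_prob)    # needs scores, not hard labels
print(p, r, f1, auc)
```

Note that ROC-AUC is computed from predicted probabilities (or decision scores), not from thresholded class labels, which is why `y_prob` is passed instead of `y_pred`.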

