M.S. Applied Data Science - Capstone Chronicles 2025
4.5.6 Selection of Modeling Techniques - Decision Trees

Decision trees were selected for this study due to their inherent interpretability and suitability for multiclass classification tasks. Unlike models that require transformation strategies to handle multiple classes, decision trees natively support multiclass targets, enabling straightforward modeling without additional complexity (Pagliarini & Sciavicco, 2021). Their hierarchical structure can model complex decision boundaries, making them effective for datasets with intricate patterns.

Incorporating decision trees into the modeling framework ensures representation of rule-based, non-parametric learning paradigms. This contrasts with linear models, which assume linear relationships, and neural networks, which rely on layered transformations. By including decision trees, the study encompasses a diverse set of algorithms, enhancing the robustness of comparative analyses (Pagliarini & Sciavicco, 2021).

Decision trees also contribute valuable contrast to the other models in the suite. While MLP and ensemble models can capture complex patterns with higher performance, they typically lack transparency. In contrast, decision trees offer unmatched interpretability by presenting decision rules stated directly in terms of the original feature values, a property especially useful for stakeholder communication and diagnostic analysis. Furthermore, decision trees serve as the foundational unit of ensemble models such as random forest and XGBoost, which makes their inclusion helpful for understanding the performance improvements achieved through ensembling (Z.-H. Zhou, 2012).
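The two properties emphasized above, native multiclass support and directly readable decision rules, can be illustrated with a minimal scikit-learn sketch. The synthetic dataset and hyperparameters here are illustrative only, not the study's actual configuration.

```python
# Minimal sketch: a multiclass decision tree whose fitted rules can be
# printed in terms of the original features. Synthetic data for illustration.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4,
    n_classes=3, n_clusters_per_class=1, random_state=42,
)

# Decision trees handle multiclass targets natively -- no one-vs-rest wrapper.
tree = DecisionTreeClassifier(max_depth=3, random_state=42)
tree.fit(X, y)

# The fitted tree exposes its decision rules as threshold tests on the
# untransformed feature values, which supports stakeholder communication.
rules = export_text(tree, feature_names=[f"f{i}" for i in range(6)])
print(rules)
```

Deeper trees trade this readability for flexibility, which is one reason ensembles of many trees (random forest, XGBoost) gain accuracy while losing the single-tree transparency discussed above.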
Whereas ensemble methods build complexity by aggregating multiple trees, MLPs do so through learned weights and activations, providing a different form of complexity and generalization. A key trade-off with MLPs is between their superior performance on non-linear problems and their limited interpretability. Unlike linear models or decision trees, MLPs function as black-box models because their behavior is abstracted across weights and layers. This limitation can be partially mitigated through model-explanation techniques such as permutation importance and SHAP values (Molnar, 2022). Despite these interpretability limitations, MLPs often outperform simpler models when the dataset exhibits complex patterns, provided they are carefully regularized to prevent overfitting (Zhang et al., 2021). Techniques such as early stopping, dropout, and L2 weight regularization can enhance generalization; in this study, early stopping and weight regularization (via alpha) were applied.

The MLP classifier was implemented using MLPClassifier from Scikit-Learn. Several hyperparameters were tuned, including hidden layer configurations (e.g., one or two layers of 50–100 neurons), activation functions (relu, tanh), learning rate strategies (constant vs. adaptive), and regularization strength (alpha). Early stopping was enabled to prevent overfitting, and the Adam optimizer was used to adapt the learning rate dynamically. Feature scaling was performed using StandardScaler, which is essential for gradient-based learning (Pedregosa et al., 2011). Feature selection employed SelectKBest based on ANOVA F-values (f_classif), and the full pipeline was evaluated using 5-fold stratified cross-validation with SMOTE oversampling and nested grid search for hyperparameter tuning.
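A condensed sketch of the described pipeline (scaling, ANOVA feature selection, MLP, nested grid search) is shown below. The synthetic dataset and reduced hyperparameter grid are illustrative assumptions chosen to keep the example fast; the study's SMOTE oversampling step (from the imbalanced-learn library) is omitted here for brevity.

```python
# Illustrative sketch of the MLP pipeline: StandardScaler -> SelectKBest
# (ANOVA F-values) -> MLPClassifier, tuned with a nested grid search.
# Synthetic data and a reduced grid; SMOTE omitted for brevity.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=200, n_features=10, n_informative=5,
                           n_classes=3, n_clusters_per_class=1, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),          # essential for gradient-based learning
    ("select", SelectKBest(f_classif)),   # ANOVA F-value feature selection
    ("mlp", MLPClassifier(solver="adam", early_stopping=True,
                          max_iter=150, random_state=0)),
])

# Reduced grid for illustration; the study also varied layer counts/sizes
# and learning-rate strategies.
param_grid = {
    "select__k": [5, 8],
    "mlp__activation": ["relu", "tanh"],
    "mlp__alpha": [1e-4, 1e-2],
}

# Nested CV: inner grid search tunes hyperparameters; an outer stratified
# 5-fold loop scores the tuned pipeline on held-out folds.
inner = GridSearchCV(pipe, param_grid, cv=3)
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(inner, X, y, cv=outer)
print(f"mean accuracy: {scores.mean():.3f}")
```

Because scaling and feature selection sit inside the Pipeline, they are refit within each cross-validation fold, avoiding leakage from test folds into preprocessing; inserting SMOTE would require imbalanced-learn's Pipeline for the same reason.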