M.S. Applied Data Science - Capstone Chronicles 2025


Multinomial logistic regression, a linear model that learns class boundaries through logistic functions, was implemented with the SAGA solver, L2 regularization, a maximum of 1,000 iterations, and "RobustScaler" preprocessing for the numerical features. Initial cross-validation on the balanced training set revealed fundamental challenges: the model achieved a mean accuracy of 36.18% (±5.69%) and a macro F1-score of 29.92% (±7.23%). The relatively large standard deviations indicated unstable learning across folds and suggested that the linear decision surface was poorly matched to the underlying feature space. Hyperparameter tuning was performed with "RandomizedSearchCV", exploring regularization strength (C values from 0.001 to 100 on a logarithmic scale) and the maximum number of iterations (500–2,000). The optimal configuration employed very strong regularization (C = 0.0036) with 2,000 iterations, slightly improving the cross-validation macro F1-score to 30.47%. This gain remained marginal relative to the ensemble models, and logistic regression was therefore retained as a linear baseline rather than a primary deployment candidate. The tuned model was nonetheless carried forward to the evaluation phase reported in Section 5 for completeness.

4.4.2.8 Linear Support Vector Machine (SVM)

A linear support vector machine (SVM), which learns maximum-margin hyperplanes to separate classes, was implemented using a one-vs-rest multiclass strategy with a fixed regularization strength, a maximum of 5,000 iterations, and "RobustScaler" preprocessing for the numerical features. Cross-validation on the balanced training set yielded a mean accuracy of 61.22% (±0.14%) and a macro F1-score of 61.11% (±0.09%), representing a substantial improvement over logistic regression but still falling short of the tree-based ensemble methods. Hyperparameter optimization was carried out with "RandomizedSearchCV", exploring regularization strength (C values from 0.001 to 100), the maximum number of iterations (3,000–10,000), and convergence tolerance (0.0001–0.01). The optimal configuration used moderate regularization (C = 2.15), a maximum of 10,000 iterations, and a tight tolerance of 0.0001, achieving a cross-validation macro F1-score of 61.15%. This tuned linear SVM was retained as the strongest linear baseline and was subsequently evaluated on the validation and test sets, as reported in Section 5.

4.4.3 Model comparison and selection

Comprehensive evaluation across all seven models revealed clear performance tiers. Table 1 summarizes test-set performance for all models, ranked by macro F1-score. The top tier consisted of four ensemble methods with macro F1-scores ranging from 62.86% to 63.29%: LightGBM (63.29%), XGBoost (63.01%), CatBoost (62.97%), and Random Forest (62.86%). These models achieved very similar accuracy (77.78%–77.99%) and weighted F1-scores (79.47%–79.63%), suggesting that they captured comparable decision boundaries despite their different algorithmic designs. The middle tier contained only the linear SVM, which achieved a macro F1-score of 52.78% and an accuracy of 70.54%. Although this represents a substantial drop from the top-tier ensembles, it is still respectable for a linear model given the complexity of the task. The lowest tier comprised AdaBoost (48.13% macro F1-score, 68.20% accuracy) and logistic regression (28.23% macro F1-score, 55.20% accuracy), both of which demonstrated fundamental limitations for large-scale network intrusion detection.
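The two linear baselines described in Sections 4.4.2.7 and 4.4.2.8 can be sketched in scikit-learn as follows. This is a minimal illustration, not the report's actual code: the synthetic dataset, variable names, and the reduced search budget (`n_iter=5`, 3-fold CV) are assumptions made for brevity, while the solver, penalty, scaler, and search ranges follow the text.

```python
# Illustrative sketch of the two linear baselines (logistic regression and
# linear SVM). The dataset below is a synthetic placeholder; the report used
# a balanced network-intrusion training set not shown here.
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import RobustScaler
from sklearn.svm import LinearSVC

# Hypothetical stand-in for the balanced training set.
X, y = make_classification(n_samples=600, n_features=10, n_informative=6,
                           n_classes=3, random_state=42)

# Multinomial logistic regression: SAGA solver, L2 penalty, RobustScaler.
logreg = Pipeline([
    ("scale", RobustScaler()),
    ("clf", LogisticRegression(solver="saga", penalty="l2", max_iter=1000)),
])
logreg_search = RandomizedSearchCV(
    logreg,
    param_distributions={
        "clf__C": loguniform(1e-3, 1e2),   # 0.001–100 on a log scale
        "clf__max_iter": [500, 1000, 2000],
    },
    n_iter=5, scoring="f1_macro", cv=3, random_state=0,
)
logreg_search.fit(X, y)

# Linear SVM: one-vs-rest is LinearSVC's default multiclass strategy.
svm = Pipeline([
    ("scale", RobustScaler()),
    ("clf", LinearSVC(max_iter=5000)),
])
svm_search = RandomizedSearchCV(
    svm,
    param_distributions={
        "clf__C": loguniform(1e-3, 1e2),
        "clf__max_iter": [3000, 5000, 10000],
        "clf__tol": [1e-4, 1e-3, 1e-2],
    },
    n_iter=5, scoring="f1_macro", cv=3, random_state=0,
)
svm_search.fit(X, y)

print(logreg_search.best_params_, round(logreg_search.best_score_, 4))
print(svm_search.best_params_, round(svm_search.best_score_, 4))
```

Because `scoring="f1_macro"` is used in both searches, the `best_score_` values are directly comparable to the cross-validation macro F1-scores quoted in the text.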
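The tier structure described above can be made explicit by tabulating the quoted test-set macro F1-scores. The figures below are copied from the text, not recomputed; per-model accuracy is only reported for the middle and bottom tiers, so it is omitted.

```python
# Test-set macro F1-scores (%) as quoted in Section 4.4.3.
macro_f1 = {
    "LightGBM": 63.29,
    "XGBoost": 63.01,
    "CatBoost": 62.97,
    "Random Forest": 62.86,
    "Linear SVM": 52.78,
    "AdaBoost": 48.13,
    "Logistic Regression": 28.23,
}

# Rank models from best to worst, mirroring the ordering of Table 1.
ranking = sorted(macro_f1, key=macro_f1.get, reverse=True)
for name in ranking:
    print(f"{name:<20} {macro_f1[name]:6.2f}")
```

The roughly 10-point gap between Random Forest (62.86%) and the linear SVM (52.78%), and the further 24-point gap down to logistic regression, are what separate the three tiers.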

