ADS Capstone Chronicles Revised


preprocessed data (x_train), transformed data (x_train_transformed), scaled data (x_train_scaled), transformed and scaled data (x_train_trans_scaled), PCA-transformed data (x_train_pca), and scaled data with PCA (x_train_scaled_pca). The Logistic regression model trained on PCA-transformed and scaled data achieved a validation accuracy of approximately 82% and a mean accuracy of 73% after cross-validation (see Table 1). This performance indicated that PCA effectively captured the data’s underlying structure and reduced overfitting, while scaling ensured that all features contributed equally to the model. The use of PCA also improved computational efficiency by reducing the number of features, which decreased training time and resource consumption.

The choice of PCA-transformed and scaled data for the Logistic regression model was based on several factors: dimensionality reduction through PCA led to a more efficient model by eliminating redundant features; feature scaling ensured that all features had an equal impact on the model’s performance; and the simplification of the feature set reduced the risk of overfitting. Additionally, the reduced number of features contributed to faster and more resource-efficient model training. Consequently, the Logistic regression model with PCA-transformed and scaled data proved to be the most effective of the data variants tested, achieving a validation accuracy of approximately 82%, and was thus selected as the baseline model for comparison with more complex models.
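The baseline described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn, uses a synthetic dataset from make_classification in place of the real features, and picks a 95% explained-variance threshold for PCA and 5-fold cross-validation, none of which are stated in the text.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the project's data (assumption: the real dataset is not shown here).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=6, n_redundant=8, random_state=0
)
x_train, x_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale first so every feature has unit variance, then project onto principal
# components, then fit the classifier. n_components=0.95 keeps enough components
# to explain 95% of the variance (an assumed threshold, not taken from the text).
baseline = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.95),
    LogisticRegression(max_iter=1000),
)
baseline.fit(x_train, y_train)

# Hold-out validation accuracy and mean cross-validated accuracy on the training split.
val_accuracy = baseline.score(x_val, y_val)
cv_scores = cross_val_score(baseline, x_train, y_train, cv=5, scoring="accuracy")
n_kept = baseline.named_steps["pca"].n_components_

print(f"components kept: {n_kept} of {X.shape[1]}")
print(f"validation accuracy: {val_accuracy:.3f}")
print(f"mean 5-fold CV accuracy: {cv_scores.mean():.3f}")
```

Wrapping the scaler and PCA in a single pipeline means both are refit on each cross-validation training fold, so the validation folds do not leak into the transformation step. The numbers printed for the synthetic data are not comparable to the 82% and 73% figures reported above.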

Table 1 Baseline Model Performance Metrics Across Different Data Transformations

4.4.2 Selection of Modeling Techniques. Following the evaluation of the baseline Logistic regression model, a range of more advanced models was tested to enhance performance. These included Ridge classifier, Lasso classifier, XGBoost, Bagging Classifier, Neural Network, AdaBoost, SVM with Kernel Trick, Stochastic Gradient Descent, and Quadratic Discriminant Analysis. Ridge and Lasso classifier models were employed as regularized variants to address issues related to multicollinearity and outliers. Ridge Regression incorporates L2 regularization, which helps manage
