ADS Capstone Chronicles Revised


K-NN

0.7843

0.8612 0.8605 0.8642

In Figure 4, the recall of each model is evaluated on both the validation and test sets. Recall measures the proportion of true positives out of all actual positives, indicating how well a model identifies the relevant label. All of the models did well during the validation phase but dropped in performance when applied to the test set. The baseline logistic regression model had a minimal drop in performance, while the RF, K-NN, and XGBoost models saw a significant decrease. This implies that these models made few false positive predictions but failed to identify many of the actual positive instances.

Figure 4 Recall Score of Validation and Test Set
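The pattern described for recall, few false positives alongside many missed positives, can be made concrete with a small sketch. The confusion counts below are hypothetical and are not taken from the study's results; the function names are illustrative only.

```python
def precision(tp: int, fp: int) -> float:
    """Proportion of positive predictions that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Proportion of actual positives that the model identified."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical test-set counts for a conservative classifier:
# it never raises a false alarm, but it misses most positives.
tp, fp, fn = 20, 0, 80

conservative_precision = precision(tp, fp)
conservative_recall = recall(tp, fn)

print(f"precision={conservative_precision:.2f}, recall={conservative_recall:.2f}")
```

Under these assumed counts, precision is perfect because there are no false positives, while recall is low because 80 of the 100 actual positives go undetected.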

K-NN - grid search 0.7876

XGBoost Presidio

0.8088 0.1104

0.2582

5.1.3 Test Performance. The models were applied to a validation set and a test set to observe how their performance metrics changed, as seen in Figures 3 through 6. The evaluation metrics include accuracy, precision, recall, and F1 score, providing a comprehensive assessment of each model's performance. Figure 3 illustrates precision, the proportion of true positive labels out of all positive predictions made. The K-NN model performed best during the validation stage but dropped significantly in performance when applied to the test set. The same occurred with the XGBoost and Presidio models. In contrast, both the logistic regression and RF models produced zero false positives on the test set, improving in precision from the validation set to the test set.

Figure 3 Precision Score of Validation and Test Set
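As a minimal sketch of how the four evaluation metrics relate to one another, the function below computes accuracy, precision, recall, and F1 from binary labels. It assumes a binary task with the positive class encoded as 1; the function name and the toy labels are hypothetical, and this is not the study's evaluation code.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only (not data from the study).
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 0, 1, 0, 1, 0, 0, 0]
toy_scores = binary_metrics(toy_true, toy_pred)

for name, value in toy_scores.items():
    print(f"{name}: {value:.4f}")
```

Calling the same function once on validation predictions and once on test predictions would give the kind of side-by-side comparison shown in the figures.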

The F1 score, the harmonic mean of precision and recall, is illustrated for each model in Figure 5. This metric gives equal weight to precision and recall and takes the class distribution into account. The baseline logistic regression model was the only model that performed better on the test set. It was followed by the RF models, while the K-NN and XGBoost models performed similarly to each other. This score might have been impacted
