AAI_2025_Capstone_Chronicles_Combined

Cinema Analytics and Prediction System

For the classification task in particular, the model worked best with 300 trees, a maximum tree depth of 4, and a relatively high learning rate of 0.5. Both the binary and multi-class classification models exhibited stable training and validation loss curves (Figures 13 and 14). To avoid overfitting, we added regularization by setting the parameters that control tree complexity and the minimum amount of data required per split. During training, model performance was evaluated using log loss for the binary model and multi-class log loss for the multi-class model.
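The configuration described above can be sketched as follows. This is a minimal illustration, not the project's actual training code. The three reported values (300 trees, depth 4, learning rate 0.5) come from the text. The mapping of "tree complexity" to `gamma` and "minimum data per split" to `min_child_weight`, along with their numeric values, is an assumption, since the text does not name these settings. The helper names `build_params` and `log_loss` are hypothetical.

```python
import math

# Hyperparameters reported in the text.
REPORTED = {"n_estimators": 300, "max_depth": 4, "learning_rate": 0.5}

# Assumption: the text does not name the regularization settings or give
# their values. gamma and min_child_weight are XGBoost parameters that play
# these roles; the numbers below are placeholders, not the authors' values.
ASSUMED_REGULARIZATION = {"gamma": 1.0, "min_child_weight": 5}


def build_params(num_classes):
    """Return keyword arguments in the style of xgboost.XGBClassifier."""
    if num_classes < 2:
        raise ValueError("a classifier needs at least two classes")
    params = {**REPORTED, **ASSUMED_REGULARIZATION}
    if num_classes == 2:
        # Binary task: evaluated with log loss.
        params.update(objective="binary:logistic", eval_metric="logloss")
    else:
        # Multi-class task: evaluated with multi-class log loss.
        params.update(objective="multi:softprob", eval_metric="mlogloss")
    return params


def log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-likelihood of the true class.

    probs[i][k] is the predicted probability of class k for sample i, so the
    same function covers the binary case (two columns) and the multi-class case.
    """
    total = 0.0
    for label, row in zip(y_true, probs):
        p = min(max(row[label], eps), 1.0 - eps)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(y_true)
```

Assuming a recent XGBoost release is installed, the dictionary could be passed along the lines of `XGBClassifier(**build_params(2))`. The `log_loss` function shows what the `logloss` and `mlogloss` metrics measure: lower values mean the model assigns higher probability to the correct class.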

Figure 13: Binary XGBClassifier training

Figure 14: Multi-class XGBClassifier training

Results & Conclusion

Although developed as independent modules, the models share a common preprocessing and feature engineering pipeline, enabling them to function cohesively within a single decision-support platform. The following sections summarize the results for each model.

Movie Recommendation Model

The recommendation model aimed to suggest movies based on content, using several approaches that each capture different aspects of textual and contextual similarity. TF-IDF, while effective in some cases, lacked semantic depth and struggled with thematically similar films. For example, as Figure 15 shows, the thematically related movies Superman and Superman Returns received a low similarity score because of differing surface-level vocabulary.
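The vocabulary effect can be reproduced with a small, dependency-free sketch of TF-IDF cosine similarity. The two synopses below are invented for illustration and are not taken from the project's dataset or from Figure 15. The function names `tfidf_vectors` and `cosine`, the whitespace tokenizer, and the smoothed IDF formula are likewise assumptions rather than the authors' implementation.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (term -> weight) for a list of strings."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical synopses: the same theme described with almost no shared words.
doc_a = "an alien orphan raised on a farm becomes a hero who protects the city"
doc_b = "the man of steel returns after years away to face his old enemy"

vectors = tfidf_vectors([doc_a, doc_b])
score = cosine(vectors[0], vectors[1])
print(f"TF-IDF cosine similarity: {score:.3f}")
```

Because the two texts share only the word "the", the score stays close to zero even though a human reader would recognize the same character. An embedding-based representation would be one way to capture that thematic overlap.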
