AAI_2025_Capstone_Chronicles_Combined
Cinema Analytics and Prediction System
For the classification task in particular, the model performed best with 300 trees, a maximum
tree depth of 4, and a relatively high learning rate of 0.5. Both the binary and multi-class
classification models exhibited stable training and validation loss curves (Figures 13 and 14).
To limit overfitting, we added regularization by setting parameters that control tree complexity
and the minimum amount of data required per split. Model performance during training was
evaluated using log loss for the binary classifier and multi-class log loss for the multi-class
classifier.
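As a reference for how the two training metrics are computed, the following is a minimal pure-Python sketch of binary and multi-class log loss. It is an illustration only, not the project's training code: the function names `binary_log_loss` and `multiclass_log_loss` and the clipping constant `eps` are assumptions introduced here. In XGBoost, the corresponding built-in evaluation metrics are named `logloss` and `mlogloss`.

```python
import math


def binary_log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood for 0/1 labels and predicted P(y = 1)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip so log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def multiclass_log_loss(y_true, proba, eps=1e-15):
    """Mean negative log of the probability assigned to the true class.

    y_true holds integer class indices; proba holds one probability
    row per sample, indexed by class.
    """
    total = 0.0
    for y, row in zip(y_true, proba):
        p = min(max(row[y], eps), 1.0)  # clip so log(0) never occurs
        total += -math.log(p)
    return total / len(y_true)
```

A classifier that outputs 0.5 for every binary sample scores ln 2, and one that spreads probability evenly over K classes scores ln K. Values below those baselines indicate the model is learning something, which is why flat or diverging validation curves on these metrics are a useful overfitting signal.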
Figure 13: Binary XGBClassifier training
Figure 14: Multi-class XGBClassifier training
Results & Conclusion
Although developed as independent modules, the models share a common
preprocessing and feature engineering pipeline, enabling them to function cohesively within a
single decision-support platform. The following sections summarize the results for each model.
Movie Recommendation model
The recommendation model aimed to suggest movies based on content, using several approaches,
each capturing different aspects of textual and contextual similarity. TF-IDF, while effective
in some cases, lacked semantic depth and struggled with thematically similar films. For example
(Figure 15), two thematically related movies such as Superman and Superman Returns received a
low similarity score because of differing surface-level vocabulary:
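The vocabulary-mismatch effect described above can be reproduced with a small pure-Python sketch of TF-IDF and cosine similarity. This is not the project's recommendation code. The three one-line plot summaries are invented for illustration, and the helper names `tfidf_vectors` and `cosine` are assumptions. The smoothed IDF formula matches scikit-learn's default, but no stop-word removal or normalization beyond lowercasing is applied.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} TF-IDF vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Invented summaries: film_a and film_b describe the same kind of story in
# different words; film_a and film_c share several exact terms.
plots = [
    "an alien orphan raised on a farm becomes a caped hero protecting a city",
    "the man of steel returns after years away to save metropolis from a villain",
    "a caped hero protecting a city battles a masked criminal",
]
film_a, film_b, film_c = tfidf_vectors(plots)

sim_related_different_words = cosine(film_a, film_b)
sim_shared_words = cosine(film_a, film_c)
print(f"same theme, different wording: {sim_related_different_words:.3f}")
print(f"overlapping wording:           {sim_shared_words:.3f}")
```

Because TF-IDF only credits exact token overlap, the first pair scores far lower than the second even though both pairs are thematically close. Embedding-based representations are the usual way to close this gap.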