AAI_2025_Capstone_Chronicles_Combined
Cinema Analytics and Prediction System
popularity, revenue, and vote_average revealed some moderate correlations that informed
feature selection for future predictive tasks.
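
As an illustration of this kind of correlation screen, the following minimal sketch computes pairwise Pearson correlations among popularity, revenue, and vote_average. The sample values are hypothetical placeholders rather than figures from the project dataset, and the helper names `pearson` and `correlation_pairs` are our own; the project itself may have used a library routine instead.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)


def correlation_pairs(columns):
    """Correlation for every unordered pair of named columns."""
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


# Hypothetical sample rows (assumption); real values come from the movie metadata.
movies = {
    "popularity": [12.5, 48.1, 7.3, 30.2, 21.9],
    "revenue": [50e6, 410e6, 8e6, 120e6, 95e6],
    "vote_average": [6.1, 7.8, 5.4, 7.0, 6.6],
}

pairs = correlation_pairs(movies)
for (a, b), r in pairs.items():
    print(f"{a} vs {b}: r = {r:+.2f}")
```

Pairs whose coefficient is moderate in magnitude are candidates for closer inspection before being used together as predictors; where to draw that line is a modeling choice rather than something fixed by the method.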
Overall, this project’s movie recommendation model required significant cleaning and
transformation of data, particularly to convert raw text and metadata into model-ready
formats. This thorough feature engineering and model experimentation provides a solid
foundation for a movie recommendation system.
The semantic richness captured by BERT embeddings currently leads to more accurate
recommendations than the TF-IDF and LSTM models. The LSTM approach, while more complex
and dependent on supervised labels to improve the accuracy of its recommendations,
nonetheless offered an alternative view into sequence-based learning.
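
For reference, the TF-IDF baseline mentioned above can be sketched as a content-based recommender that ranks films by the cosine similarity of their overview vectors. This is a minimal standard-library illustration and not the project's implementation: the titles and overviews are invented, the whitespace tokenizer and smoothed IDF weighting are simplifying assumptions, and the function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(titles, overviews, query_title, top_n=2):
    """Titles of the top_n films most similar to query_title, best first."""
    vectors = tfidf_vectors(overviews)
    q = titles.index(query_title)
    scored = [
        (cosine(vectors[q], vectors[i]), titles[i])
        for i in range(len(titles))
        if i != q
    ]
    return [title for _, title in sorted(scored, reverse=True)[:top_n]]


# Invented example data (assumption), standing in for real movie overviews.
titles = ["Space Quest", "Star Voyage", "Love in Paris", "Galaxy War"]
overviews = [
    "astronauts explore a distant galaxy from their space station",
    "a crew of astronauts travels through space to a new star",
    "two strangers fall in love in paris",
    "a war between empires across the galaxy",
]

print(recommend(titles, overviews, "Space Quest"))
```

Because TF-IDF only matches shared surface terms, two overviews that describe similar plots in different words score low. That limitation is consistent with the advantage reported for BERT embeddings, which compare meaning rather than exact vocabulary.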
Revenue Prediction and Success Classification
For our revenue prediction and success classification tasks, we aimed to model how
various features influenced a film’s financial success. The primary variables for these tasks were
budget and revenue, both of which initially contained numerous missing or implausible values,
either because the data were not recorded properly or because the information was unavailable.
To address this, we used the TMDB API to supplement and correct these figures. We also
applied an inflation correction (US Consumer Price Index and Inflation (CPI)) to standardize all
budget and revenue values to a consistent dollar basis across different years. Besides budget