AAI_2025_Capstone_Chronicles_Combined

Cinema Analytics and Prediction System


popularity, revenue, and vote_average revealed some moderate correlations that informed

feature selection for future predictive tasks.
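As a rough illustration of this kind of correlation check, the sketch below computes pairwise Pearson correlations for popularity, revenue, and vote_average using only the Python standard library. The helper names (`pearson`, `correlation_matrix`) and the toy values are assumptions made for this example; they are not the project's code or data, and a data-frame library would normally do the same job in one call.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict of column name -> values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Toy values, invented for illustration only (not the project's data).
movies = {
    "popularity": [10.0, 25.0, 40.0, 5.0, 60.0],
    "revenue": [1.0e6, 5.0e7, 2.0e8, 5.0e5, 3.5e8],
    "vote_average": [6.1, 7.0, 6.5, 5.2, 7.8],
}

matrix = correlation_matrix(movies)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

Coefficients near zero suggest a feature adds little linear signal, while moderate values like those described above can justify keeping a feature as a candidate predictor.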

Overall, the movie recommendation model required significant data cleaning and transformation, particularly to convert raw text and metadata into model-ready formats. The semantic richness captured by BERT embeddings currently leads to more accurate recommendations than the TF-IDF and LSTM models. The LSTM approach, although more complex and dependent on supervised labels to improve the accuracy of its recommendations, offered an alternative view into sequence-based learning. Together, this feature engineering and model experimentation provides a solid foundation for the movie recommendation system.
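To make the step from raw text to a model-ready representation more concrete, here is a minimal sketch of a TF-IDF baseline that ranks movies by the cosine similarity of their overview text. It uses only the standard library; the function names, the smoothed IDF formula, and the four toy overviews are assumptions for illustration and do not reproduce the project's actual pipeline.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, keep purely alphabetic tokens."""
    return [tok for tok in text.lower().split() if tok.isalpha()]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed inverse document frequency (an assumed, commonly used variant).
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: c * idf[t] for t, c in Counter(toks).items()} for toks in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(titles, overviews, query_title, top_k=2):
    """Rank the other movies by overview similarity to the query movie."""
    vectors = tfidf_vectors(overviews)
    q = titles.index(query_title)
    scored = [
        (cosine(vectors[q], vectors[i]), titles[i])
        for i in range(len(titles))
        if i != q
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:top_k]]


# Hypothetical titles and overviews, invented for this example.
titles = ["Space Rescue", "Star Voyage", "Kitchen Romance", "Love in Paris"]
overviews = [
    "astronauts attempt a daring rescue in deep space",
    "a crew begins a long voyage through deep space",
    "two chefs find love in a busy kitchen",
    "a painter finds love in paris",
]

print(recommend(titles, overviews, "Space Rescue", top_k=1))
```

The same cosine ranking could in principle be applied to dense vectors such as BERT embeddings by replacing `tfidf_vectors` with an embedding step; only the representation changes, not the ranking logic.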

Revenue Prediction and Success Classification

For the revenue prediction and success classification tasks, we aimed to model how various features influence a film’s financial success. The primary variables for these tasks were budget and revenue, both of which initially contained numerous missing or implausible values, either because the data were not recorded properly or because the information was unavailable. To address this, we used the TMDB API to supplement and correct these figures. We also applied an inflation correction (US Consumer Price Index and Inflation (CPI)) to standardize all budget and revenue values to a consistent dollar basis across different years. Besides budget
