AAI_2025_Capstone_Chronicles_Combined
Cinema Analytics and Prediction System
and revenue, we included features like vote count, popularity, and runtime, as these factors can
influence revenue outcomes.
Figure 5: Budget distribution over budget category
Figure 6: Revenue distribution over budget category
As part of our feature engineering for revenue prediction, we created a budget binning
variable that grouped movies into five budget categories (from very low to very high budget).
Figure 5 shows a boxplot of the budget across different budget categories. Figure 6 presents a
boxplot of revenue within these same categories, revealing that although higher-budget films
generally achieved higher revenues, there was considerable variability and numerous outliers
across all groups. Another engineered feature was the release season, derived from the release
month, to capture potential seasonal effects on revenue, since certain periods (such as summer or
the holiday season) often correspond to higher box office performance.
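The two engineered features described above can be sketched as follows. This is a minimal illustration rather than our exact implementation: the column names (`budget`, `release_date`), the use of quantile-based bins via `pd.qcut`, and the month-to-season mapping (Northern Hemisphere meteorological seasons) are all assumptions made for the example, and the data are toy values.

```python
import pandas as pd

# Five ordered budget categories, from very low to very high.
BUDGET_LABELS = ["very low", "low", "medium", "high", "very high"]


def month_to_season(month: int) -> str:
    """Map a calendar month (1-12) to a season (assumed mapping)."""
    if month in (12, 1, 2):
        return "winter"
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    return "fall"


def add_engineered_features(movies: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `movies` with budget_category and release_season."""
    out = movies.copy()
    # Assumption: equal-frequency (quantile) bins; fixed cut points would
    # use pd.cut with explicit edges instead.
    out["budget_category"] = pd.qcut(out["budget"], q=5, labels=BUDGET_LABELS)
    # Derive the season from the release month.
    release_month = pd.to_datetime(out["release_date"]).dt.month
    out["release_season"] = release_month.map(month_to_season)
    return out


# Toy data for illustration only (not taken from the project dataset).
movies = pd.DataFrame(
    {
        "budget": [1e6, 5e6, 2e7, 8e7, 2e8],
        "release_date": [
            "2019-01-15",
            "2019-04-20",
            "2019-07-04",
            "2019-10-31",
            "2019-12-25",
        ],
    }
)
features = add_engineered_features(movies)
print(features[["budget", "budget_category", "release_season"]])
```

Quantile bins keep the five groups roughly equal in size, which makes the boxplots easier to compare; fixed dollar thresholds would instead keep the category boundaries interpretable across datasets.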
For the movie classification task, we initially defined a movie as a hit if its revenue was
more than twice its budget; otherwise, it was labeled a flop. To explore finer granularity, we