AAI_2025_Capstone_Chronicles_Combined

Cinema Analytics and Prediction System


and revenue, we included features such as vote count, popularity, and runtime, as these factors can influence revenue outcomes.

Figure 5: Budget distribution over budget category

Figure 6: Revenue distribution over budget category

As part of our feature engineering for revenue prediction, we created a budget-binning variable that groups movies into five budget categories, from very low to very high. Figure 5 shows a boxplot of budget across these categories. Figure 6 presents a boxplot of revenue within the same categories, revealing that although higher-budget films generally achieved higher revenues, there was considerable variability and numerous outliers in every group. Another engineered feature was the release season, derived from the release month, to capture potential seasonal effects on revenue, since certain periods (such as the summer and holiday seasons) often correspond to higher box office performance.
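
The two engineered features described above can be illustrated with a minimal pandas sketch. The text does not state how the five budget bins were delimited or which months map to which season, so the quantile-based binning (`pd.qcut`), the meteorological Northern Hemisphere season mapping, the column names (`budget`, `release_date`), and the toy data are all assumptions made for illustration rather than the project's actual implementation.

```python
import pandas as pd

# Hypothetical toy data; the column names are assumptions, not the project's schema.
movies = pd.DataFrame({
    "budget": [1e5, 2e6, 1.5e7, 4e7, 9e7, 2e8, 5e5, 6e7, 2.5e7, 1.2e8],
    "release_date": pd.to_datetime([
        "2019-01-11", "2018-03-02", "2017-07-21", "2016-12-16", "2015-06-12",
        "2019-04-26", "2014-10-03", "2013-11-22", "2012-08-17", "2011-05-20",
    ]),
})

BUDGET_LABELS = ["very low", "low", "medium", "high", "very high"]


def add_budget_category(df: pd.DataFrame, column: str = "budget") -> pd.DataFrame:
    """Bin budgets into five ordered categories.

    Assumption: equal-frequency (quantile) bins. Fixed dollar thresholds
    passed to pd.cut would be an equally plausible reading of the text.
    """
    out = df.copy()
    out["budget_category"] = pd.qcut(out[column], q=5, labels=BUDGET_LABELS)
    return out


def month_to_season(month: int) -> str:
    """Map a calendar month (1-12) to a season.

    Assumption: meteorological seasons for the Northern Hemisphere.
    """
    if month in (12, 1, 2):
        return "winter"
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    return "fall"


def add_release_season(df: pd.DataFrame, column: str = "release_date") -> pd.DataFrame:
    """Derive the release season from the month of the release date."""
    out = df.copy()
    out["release_season"] = out[column].dt.month.map(month_to_season)
    return out


movies = add_release_season(add_budget_category(movies))
print(movies[["budget", "budget_category", "release_date", "release_season"]])
```

Quantile bins keep the five groups roughly equal in size, which suits skewed budget data. Fixed thresholds would make the categories easier to interpret in dollar terms but could leave some groups sparsely populated.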

For the movie classification task, we initially defined a movie as a hit if its revenue was more than twice its budget; otherwise, it was labeled a flop. To explore finer granularity, we
