AAI_2025_Capstone_Chronicles_Combined

Cinema Analytics and Prediction System

extended this to a three-class scheme (flop, hit, and superhit) based on revenue thresholds
relative to budget.
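
The three-class labeling described above can be sketched as follows. This is a minimal illustration, not the thresholds used in the project: the function name `classify_success` and the cut-off ratios `hit_ratio=1.0` and `superhit_ratio=3.0` are assumptions chosen only to show the mechanics of thresholding revenue relative to budget.

```python
def classify_success(budget: float, revenue: float,
                     hit_ratio: float = 1.0,
                     superhit_ratio: float = 3.0) -> str:
    """Label a movie as flop, hit, or superhit from its revenue-to-budget ratio.

    The default thresholds are hypothetical placeholders, not values
    taken from the project.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    if hit_ratio >= superhit_ratio:
        raise ValueError("hit_ratio must be smaller than superhit_ratio")

    ratio = revenue / budget
    if ratio >= superhit_ratio:
        return "superhit"
    if ratio >= hit_ratio:
        return "hit"
    return "flop"


if __name__ == "__main__":
    # Illustrative (budget, revenue) pairs, not real movie data.
    for budget, revenue in [(100.0, 80.0), (100.0, 150.0), (100.0, 400.0)]:
        print(budget, revenue, classify_success(budget, revenue))
```

Keeping the thresholds as parameters makes it easy to test how sensitive the class balance is to the chosen cut-offs.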

System Architecture

The system architecture (Figure 7) is organized into four blocks: Data Ingestion, Processing &
Feature Engineering, Modeling & Analytics, and Operational Deployment. The Data Ingestion
block collects structured and unstructured movie data from CSV files, including attributes such
as budget, revenue, genre, keywords, and overview. It also enriches the data through external
APIs, particularly TMDB, and uses CPI data to correct budget and revenue values for inflation.
In a production setting, this block would be replaced by automated, real-time ingestion
pipelines. The Processing & Feature Engineering block performs data cleaning, standardization,
and transformation, embeds textual fields, and derives features such as budget bins and
seasonal indicators. In the Modeling & Analytics block, we implemented models for genre
classification, content-based movie recommendation, movie revenue prediction, and success
classification. The Operational Deployment block, though not implemented in this project, is
intended to support production-level features such as model endpoint hosting, inference APIs,
performance monitoring, and periodic retraining pipelines. Currently, each module is
implemented as an independent Jupyter notebook, which keeps the modules decoupled and leaves
room for future integration and deployment.
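
To make the inflation correction and the derived features more concrete, the sketch below shows one way they could be computed. It is an illustration under stated assumptions rather than the project's code: the function names, the base year, the budget bin edges, the month-to-season mapping, and the CPI index values are all hypothetical placeholders (the CPI numbers in particular are not real figures).

```python
from datetime import date

# Placeholder index values for illustration only, NOT real CPI data.
CPI_BY_YEAR = {2000: 100.0, 2010: 125.0, 2020: 150.0}


def adjust_for_inflation(amount: float, year: int,
                         cpi: dict, base_year: int) -> float:
    """Express an amount from `year` in `base_year` money using a CPI ratio."""
    return amount * cpi[base_year] / cpi[year]


def budget_bin(budget: float) -> str:
    """Bucket a budget into coarse bins (bin edges are hypothetical)."""
    if budget < 10_000_000:
        return "low"
    if budget < 50_000_000:
        return "mid"
    return "high"


def release_season(release_date: date) -> str:
    """Map a release date to a season (Northern Hemisphere mapping, assumed)."""
    month = release_date.month
    if month in (12, 1, 2):
        return "winter"
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    return "fall"


if __name__ == "__main__":
    # A made-up record standing in for one row of the movie CSV.
    movie = {"budget": 20_000_000.0, "revenue": 60_000_000.0,
             "release_date": date(2000, 7, 14)}
    year = movie["release_date"].year
    features = {
        "budget_adj": adjust_for_inflation(movie["budget"], year, CPI_BY_YEAR, 2020),
        "revenue_adj": adjust_for_inflation(movie["revenue"], year, CPI_BY_YEAR, 2020),
        "season": release_season(movie["release_date"]),
    }
    # Bin the inflation-adjusted budget so movies from different eras are comparable.
    features["budget_bin"] = budget_bin(features["budget_adj"])
    print(features)
```

Binning the adjusted budget rather than the nominal one is a design choice made for this sketch; the source does not state which value the project binned.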
