AAI_2025_Capstone_Chronicles_Combined
Cinema Analytics and Prediction System
extended this to a three-class scheme (flop, hit, and superhit) based on revenue thresholds
relative to budget.
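A minimal sketch of such a three-class labeling rule is given below. The function name and the two ratio thresholds (1.0 and 3.0) are illustrative assumptions; the thresholds actually used in the project are not stated in this section.

```python
def classify_success(budget: float, revenue: float,
                     hit_ratio: float = 1.0,
                     superhit_ratio: float = 3.0) -> str:
    """Label a movie as 'flop', 'hit', or 'superhit' from its revenue-to-budget ratio.

    hit_ratio and superhit_ratio are hypothetical thresholds chosen for
    illustration only; they are not the values used in the project.
    """
    if budget <= 0:
        raise ValueError("budget must be positive to compute a ratio")
    ratio = revenue / budget
    if ratio >= superhit_ratio:
        return "superhit"
    if ratio >= hit_ratio:
        return "hit"
    return "flop"


# Example usage with made-up figures
labels = [classify_success(100.0, r) for r in (50.0, 150.0, 400.0)]
print(labels)
```

Expressing the thresholds as multiples of budget, rather than as absolute revenue figures, keeps the labels comparable across small and large productions.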
System Architecture
The system architecture (Figure 7) for this project is organized into four key blocks: Data
Ingestion, Processing & Feature Engineering, Modeling & Analytics, and Operational
Deployment. The Data Ingestion block involves collecting structured and unstructured movie
data from CSV files, including attributes such as budget, revenue, genre, keywords, and
overview. This block also incorporates external enrichment through APIs, particularly TMDB,
and uses Consumer Price Index (CPI) data to adjust budget and revenue values for inflation. In a
production setting, this block would be replaced by automated, real-time ingestion pipelines.
The Processing & Feature Engineering block performs data cleaning, standardization,
transformations, embedding of textual fields, and derivation of features such as budget bins
and seasonal indicators. In the Modeling & Analytics block, we implemented models for genre
classification, content-based movie recommendation, movie revenue prediction, and success
classification. The Operational Deployment block, though not implemented in this project, is
intended to support production-level features such as model endpoint hosting, inference APIs,
performance monitoring, and periodic retraining pipelines. Currently, all implemented modules
are independent Jupyter notebooks, which keeps them modular and leaves room for future
integration and deployment.
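The inflation correction and two of the derived features described above (budget bins and seasonal indicators) could be sketched as follows. This is an illustration under stated assumptions, not the project's notebook code: the CPI table holds placeholder numbers rather than real index values, and the function names, bin edges, and season definitions are all hypothetical.

```python
from datetime import date

# Placeholder CPI values by year, for illustration only. These are NOT real
# index values; a real pipeline would load an official CPI series.
CPI_BY_YEAR = {2000: 100.0, 2010: 125.0, 2020: 150.0}


def adjust_for_inflation(amount, year, base_year=2020, cpi=CPI_BY_YEAR):
    """Express an amount from `year` in `base_year` currency using a CPI ratio."""
    return amount * cpi[base_year] / cpi[year]


def budget_bin(adjusted_budget):
    """Assign a coarse budget bin; the bin edges are assumed for illustration."""
    if adjusted_budget < 10_000_000:
        return "low"
    if adjusted_budget < 100_000_000:
        return "medium"
    return "high"


def release_season(release_date):
    """Map a release date to a seasonal indicator; the seasons are assumed."""
    if release_date.month in (6, 7, 8):
        return "summer"
    if release_date.month in (11, 12):
        return "holiday"
    return "other"


def engineer_features(movie):
    """Derive inflation-adjusted amounts, a budget bin, and a season label."""
    year = movie["release_date"].year
    budget_adj = adjust_for_inflation(movie["budget"], year)
    revenue_adj = adjust_for_inflation(movie["revenue"], year)
    return {
        "budget_adj": budget_adj,
        "revenue_adj": revenue_adj,
        "budget_bin": budget_bin(budget_adj),
        "season": release_season(movie["release_date"]),
    }


# Example usage with a made-up record
features = engineer_features(
    {"budget": 1_000_000, "revenue": 4_000_000, "release_date": date(2000, 7, 14)}
)
print(features)
```

Applying the CPI adjustment before binning means the budget bins compare films in the same base-year currency instead of mixing nominal values from different decades.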