AAI_2025_Capstone_Chronicles_Combined

Cinema Analytics and Prediction System


recommendation and classification tasks, while numeric features, especially popularity, vote

count, and runtime, boosted model performance when integrated in hybrid architectures.
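As a rough illustration of how text and numeric signals can share one representation, the sketch below appends min-max scaled numeric columns to a bag-of-words vector and compares movies by cosine similarity. This is a simplified stand-in, not the hybrid architecture used in this project; the function names (`min_max`, `hybrid_vectors`, `cosine`) and all sample values are hypothetical.

```python
import math
from collections import Counter


def min_max(values):
    """Scale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


def hybrid_vectors(texts, numeric_rows):
    """Concatenate bag-of-words counts with scaled numeric features per movie."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    # Scale each numeric column (e.g. popularity, vote count, runtime) separately.
    columns = [min_max(col) for col in zip(*numeric_rows)]
    scaled_rows = list(zip(*columns))
    vectors = []
    for text, nums in zip(texts, scaled_rows):
        counts = Counter(text.lower().split())
        vectors.append([float(counts[w]) for w in vocab] + list(nums))
    return vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Made-up keywords and numbers for illustration only (not dataset values).
texts = ["space war colony", "space war fleet", "ocean captain treasure"]
numeric = [(150.0, 11800, 162), (140.0, 9000, 150), (139.0, 4500, 169)]
vectors = hybrid_vectors(texts, numeric)
```

With these toy inputs, `cosine(vectors[0], vectors[1])` comes out higher than `cosine(vectors[0], vectors[2])`, since the first two movies share keywords. In practice the text block would more likely be a TF-IDF or embedding vector, and the relative weight of the numeric block would need tuning.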

To further explore the textual patterns in the dataset, we generated word clouds for the

first five movies (Figure 4). These visualizations highlighted the most frequent and prominent

words associated with each film, offering insight into recurring themes and keywords. For

instance, the word cloud for Avatar emphasized terms like “space,” “colony,” and “war,” while

Pirates of the Caribbean showcased words like “ocean,” “captain,” and “treasure.” These word

clouds provided a quick, intuitive sense of each movie’s core narrative and confirmed the

relevance of our engineered text feature in capturing meaningful information for

recommendation and classification.

Figure 4: Word Cloud for movies
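A word cloud is essentially a picture of term frequencies, so the counts behind it can be sketched with the standard library alone. The example below is a minimal illustration under stated assumptions: the stop-word list, the `top_terms` helper, and the sample sentence are hypothetical and are not taken from the project code or the dataset.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "to", "is", "for", "with", "from"}


def top_terms(text, n=3):
    """Return the n most frequent non-stop-word tokens in text, with their counts."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS and len(t) > 2)
    return counts.most_common(n)


# Made-up sample text, not an actual movie overview from the dataset.
sample = (
    "The captain sails the ocean. The captain seeks treasure, "
    "and the ocean hides the treasure from the captain."
)
terms = top_terms(sample, n=3)
```

The resulting (word, count) pairs are what a word cloud renderer would map to font size: the more frequent the term, the larger it is drawn.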

Text-based features provided strong signals for similarity modeling, while numerical

metadata added extra dimensions of comparison. Some variables, such as title, serve mainly

for indexing and querying, whereas others, such as vote average, will be used to create pseudo-labels for

supervised learning experiments with LSTM. The relationships between variables such as
