AAI_2025_Capstone_Chronicles_Combined

Cinema Analytics and Prediction System


Figure 16: Movie clusters based on content

Each cluster group includes movies that share common thematic or lexical characteristics (Figure 17):

Figure 17: Movie clusters genre and top TF-IDF

Clustering showed that TF-IDF vectors can group movies by theme without genre labels, which enables recommendations drawn from nearby “neighborhoods” in the vector space.
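To make the "neighborhood" idea concrete, the following is a minimal sketch of TF-IDF nearest-neighbor lookup, not the project's actual pipeline. It assumes a smoothed IDF, L2-normalized vectors, and cosine similarity; the titles, overview texts, stop-word list, and function names (`tfidf_vectors`, `nearest`) are hypothetical and chosen only for illustration. It uses only the Python standard library so it runs without extra dependencies.

```python
import math
from collections import Counter

# Hypothetical toy overviews for illustration; not data from the project.
MOVIES = {
    "Toy Adventure": "toys come alive and go on an adventure to find their owner",
    "Toy Rescue": "a group of toys plan a rescue adventure for a lost toy",
    "Space War": "a rebel pilot fights an empire in a space war",
    "Star Battle": "pilots battle an empire fleet in deep space",
}

# Small illustrative stop-word list (an assumption, not the project's list).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "on", "for", "their"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [tok for tok in text.lower().split() if tok not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return {title: {term: weight}} using smoothed IDF and L2 normalization."""
    tokens = {title: tokenize(text) for title, text in docs.items()}
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for toks in tokens.values() for term in set(toks))
    vectors = {}
    for title, toks in tokens.items():
        counts = Counter(toks)
        vec = {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors[title] = {term: w / norm for term, w in vec.items()}
    return vectors


def cosine(a, b):
    """Cosine similarity; inputs are unit-length, so this is a dot product."""
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def nearest(title, vectors, k=2):
    """Return the k most similar titles to `title`, highest similarity first."""
    query = vectors[title]
    scored = [
        (other, cosine(query, vec))
        for other, vec in vectors.items()
        if other != title
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


VECTORS = tfidf_vectors(MOVIES)

if __name__ == "__main__":
    for name, score in nearest("Toy Adventure", VECTORS):
        print(f"{name}: {score:.3f}")
```

No genre label is used anywhere in the sketch: the two toy-themed entries end up close together only because they share weighted terms, which is the property the clustering result relies on.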

The BERT-based model, which uses pre-trained embeddings, improved semantic understanding: it produced smoother similarity rankings and recommended contextually relevant titles even when their vocabulary differed. For a Toy Story (1995) query, it ranked Toy Story 3
