AAI_2025_Capstone_Chronicles_Combined
Cinema Analytics and Prediction System
Figure 16: Movie clusters based on content
Each cluster group includes movies that share common thematic or lexical
characteristics (Figure 17):
Figure 17: Movie clusters genre and top TF-IDF
The clustering showed that TF-IDF vectors alone can group movies by theme, without genre
labels, so recommendations can be drawn from a title's nearby “neighborhood” in the vector space.
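The neighborhood idea can be illustrated with a minimal sketch. It is not the project's implementation: the four titles and synopses are hypothetical toy data, the helper names (tfidf_vectors, cosine, recommend) are made up for illustration, and the smoothed IDF weighting with L2 normalization is an assumed, commonly used variant. The sketch builds TF-IDF vectors with the standard library only and recommends the titles with the highest cosine similarity to a query.

```python
import math
from collections import Counter

# Hypothetical toy synopses for illustration only; not data from the project.
movies = {
    "Space Rescue": "astronaut crew spaceship rescue mission planet",
    "Orbit Down": "astronaut spaceship orbit mission failure",
    "Wedding Mix-Up": "wedding bride romance comedy family",
    "Love in Paris": "romance paris wedding love comedy",
}

def tfidf_vectors(docs):
    """Return {title: {term: weight}} with smoothed IDF and L2-normalized rows."""
    tokenized = {title: text.lower().split() for title, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: number of synopses that contain each term.
    df = Counter(term for toks in tokenized.values() for term in set(toks))
    vectors = {}
    for title, toks in tokenized.items():
        tf = Counter(toks)
        vec = {
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors[title] = {term: w / norm for term, w in vec.items()}
    return vectors

def cosine(a, b):
    """Cosine similarity of two L2-normalized sparse vectors (a dot product)."""
    return sum(w * b.get(term, 0.0) for term, w in a.items())

def recommend(title, vectors, k=2):
    """Return the k titles closest to `title` in TF-IDF space."""
    query = vectors[title]
    scored = [(cosine(query, vec), other)
              for other, vec in vectors.items() if other != title]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]

vectors = tfidf_vectors(movies)
print(recommend("Space Rescue", vectors, k=1))
```

With this toy data the nearest neighbor of "Space Rescue" is "Orbit Down", because the two synopses share terms while the wedding-themed ones share none with it. No genre label is consulted at any point, which is the property the clustering result relies on.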
The BERT-based model, leveraging pre-trained embeddings, greatly enhanced semantic
understanding, producing smoother similarity rankings and recommending contextually
relevant titles despite differing vocabulary. For a Toy Story (1995) query, it ranked Toy Story 3