AAI_2025_Capstone_Chronicles_Combined
Cinema Analytics and Prediction System
(2010) and Toy Story 2 (1999) highest, with similarity scores of 0.935 and 0.928, respectively,
showing BERT’s strength in capturing deep semantic relationships (Figure 18). It also identified
thematically related films like CJ7, Rushmore, and Jimmy Neutron: Boy Genius by recognizing
shared narrative and emotional elements beyond surface language.
Figure 18: BERT similarity scores
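The ranking described above can be illustrated with a minimal sketch. It assumes the reported scores are cosine similarities between sentence embeddings, which the text does not state explicitly. The function name `top_k_similar`, the toy three-dimensional vectors, and the placeholder titles are hypothetical stand-ins for real BERT embeddings and film titles.

```python
import numpy as np

def top_k_similar(query_vec, embeddings, titles, k=3):
    """Rank titles by cosine similarity to query_vec, highest first."""
    emb = np.asarray(embeddings, dtype=float)
    q = np.asarray(query_vec, dtype=float)
    # Scale every vector to unit length so the dot product is the cosine.
    emb_unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    q_unit = q / np.linalg.norm(q)
    scores = emb_unit @ q_unit
    # Sort by descending score and keep the k best matches.
    order = np.argsort(-scores)[:k]
    return [(titles[i], float(scores[i])) for i in order]

# Toy vectors standing in for real sentence embeddings (assumption).
titles = ["Film A", "Film B", "Film C"]
embeddings = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
query = [1.0, 0.0, 0.0]

ranked = top_k_similar(query, embeddings, titles, k=2)
for title, score in ranked:
    print(f"{title}: {score:.3f}")
```

In practice the rows of `embeddings` would come from a sentence-embedding model applied to each film's description, and the query vector from the film being matched.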
To ensure consistency, we fixed the number of clusters at k = 5 across all embedding
methods (Figure 19), enabling direct comparison of their impact on thematic grouping.
Although some overlap occurred in BERT-based clusters (notably Clusters 2 and 4), we
maintained this structure for methodological uniformity. Future research could apply adaptive
clustering or silhouette analysis for refinement. BERT’s sentence embeddings, clustered with
K-Means and visualized via PCA, grouped movies by deep semantic themes rather than surface
keywords. For example (Figure 20), emotional films like Stepmom and Submarine clustered
together, while comedies, sci-fi, and thrillers formed separate groups, demonstrating BERT’s
strength in capturing contextual meaning beyond vocabulary.
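The clustering setup and the suggested silhouette refinement can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: it uses scikit-learn, synthetic 32-dimensional blobs in place of real BERT embeddings, and a hypothetical helper named `cluster_and_project`. The candidate range of k from 2 to 8 is also an assumption.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score

# Synthetic stand-in for sentence embeddings: 5 separated groups of 40 points.
rng = np.random.default_rng(0)
centers = rng.normal(scale=10.0, size=(5, 32))
X = np.vstack([c + rng.normal(size=(40, 32)) for c in centers])

def cluster_and_project(X, k=5, seed=0):
    """Cluster with K-Means, then project to 2-D with PCA for plotting."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
    coords = PCA(n_components=2, random_state=seed).fit_transform(X)
    return labels, coords

# Fixed k = 5, matching the setup described in the text.
labels, coords = cluster_and_project(X, k=5)

# Silhouette analysis: score each candidate k; higher means tighter,
# better separated clusters.
scores = {}
for k in range(2, 9):
    candidate = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, candidate)

best_k = max(scores, key=scores.get)
print("silhouette by k:", {k: round(float(s), 3) for k, s in scores.items()})
print("best k:", best_k)
```

Comparing `best_k` against the fixed k = 5 for each embedding method would show where the uniform choice costs cluster quality, for instance where two BERT-based clusters overlap.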