AAI_2025_Capstone_Chronicles_Combined

Cinema Analytics and Prediction System


embeddings, and an enhanced LSTM + GloVe using oversampling to address class imbalance. Each iteration improved the handling of data imbalance and multi-genre learning.
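
To illustrate the oversampling idea mentioned above, the following is a minimal sketch of random oversampling for multi-label data. It is not the project's actual pipeline: the function name `oversample_rare_genres`, the `target_count` parameter, and the toy samples are all hypothetical, and the report does not state which oversampling method was used.

```python
import random
from collections import Counter


def oversample_rare_genres(samples, target_count, seed=0):
    """Duplicate samples until every genre appears at least target_count times.

    samples: list of (text, set_of_genres) pairs (hypothetical layout).
    Returns a new list; the input list is not modified.
    """
    rng = random.Random(seed)
    result = list(samples)
    counts = Counter(g for _, genres in result for g in genres)
    for genre in sorted(counts):
        # Draw only from the original samples that carry this genre.
        pool = [s for s in samples if genre in s[1]]
        while counts[genre] < target_count:
            pick = rng.choice(pool)
            result.append(pick)
            # A duplicated sample raises the count of every genre it carries.
            counts.update(pick[1])
    return result


# Toy data (invented for illustration only).
samples = [
    ("a", {"Drama"}),
    ("b", {"Drama"}),
    ("c", {"Drama"}),
    ("d", {"Mystery"}),
]
balanced = oversample_rare_genres(samples, target_count=3)
mystery_count = sum(1 for _, genres in balanced if "Mystery" in genres)
```

Because a movie can carry several genres, duplicating a rare-genre sample also increases the counts of its other genres, so the result is only approximately balanced.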

In the following plot (Figure 24), the F1-score results from the BERT-based genre classification model show strong overall performance, particularly for dominant genres. The model achieved high F1 scores for genres such as Science Fiction (0.76), Drama (0.73), Action (0.73), and Adventure (0.73), indicating that it was able to capture semantic patterns in movie descriptions that are strongly associated with these categories. These genres tend to have rich and distinctive vocabulary, which BERT embeddings are well-suited to represent.

Figure 24: F1-Score per Genre
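
For reference, a per-genre F1 score of the kind plotted in Figure 24 can be computed as F1 = 2TP / (2TP + FP + FN), evaluated separately for each genre. The sketch below is a plain-Python illustration under assumed inputs: the function name `per_genre_f1` and the toy labels are hypothetical and do not reproduce the report's data or evaluation code.

```python
def per_genre_f1(y_true, y_pred, genres):
    """Compute F1 per genre for multi-label predictions.

    y_true, y_pred: lists of sets of genre names, one set per movie.
    genres: iterable of genre names to score.
    """
    scores = {}
    for genre in genres:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if genre in t and genre in p)
        fp = sum(1 for t, p in pairs if genre not in t and genre in p)
        fn = sum(1 for t, p in pairs if genre in t and genre not in p)
        denom = 2 * tp + fp + fn
        # Convention assumed here: F1 is 0.0 when the genre never occurs.
        scores[genre] = 2 * tp / denom if denom else 0.0
    return scores


# Toy labels (invented): the model always predicts the dominant genre.
y_true = [{"Drama"}, {"Drama"}, {"Mystery"}, {"Mystery", "Drama"}]
y_pred = [{"Drama"}, {"Drama"}, {"Drama"}, {"Drama"}]
scores = per_genre_f1(y_true, y_pred, ["Drama", "Mystery"])
```

In this toy case the dominant genre scores well while the rare genre scores zero, which mirrors the pattern of weaker results on underrepresented genres.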

The model showed moderate performance for common genres such as Comedy (0.62), Romance (0.60), and Thriller (0.65), despite their linguistic overlap. Performance declined sharply for underrepresented or ambiguous genres such as Mystery (0.22) and Fantasy (0.42), highlighting difficulty with rare genre cues due to limited data. Horror (0.70) performed well, benefiting from strong genre-specific vocabulary captured by BERT. Overall, the model
