AAI_2025_Capstone_Chronicles_Combined
ResolveAI
Model Architecture
The model is built using a sequential architecture that transforms tokenized text input into dense vector representations via an embedding layer, followed by bidirectional LSTM layers that capture contextual information from both past and future tokens. The first bidirectional LSTM layer has 64 units and the second has 32; this progressive reduction of units extracts hierarchical features while keeping the model efficient. Dropout regularization layers follow both the bidirectional LSTM and Dense layers to help prevent overfitting. For initial testing, the dropout rate was set to 0.5 to ensure robustness during training, but this is one of the several settings we will be optimizing during hyperparameter tuning. GlobalMaxPooling1D was applied to reduce dimensionality while still preserving critical information, which makes the model more efficient. After the bidirectional LSTM layers, a dense layer with 128 units and a final dense layer with a single unit transform the extracted features from the BiLSTM layers into a final classification decision.

The model's input consists of preprocessed and padded textual data extracted from key fields (subject, body, and answer), along with additional encoded "type" metadata, while the output is a binary label indicating ticket urgency. The bidirectional LSTM layers use tanh and sigmoid activations by default: tanh effectively captures the underlying patterns in sequential data by mapping values between -1 and 1, which allows for a more balanced representation of positive and negative relationships (Baheti, 2021), while the sigmoid activation is used in the gating mechanisms to control the flow of information, ensuring that the model retains relevant features and discards unnecessary ones.
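The layer stack described above can be sketched as follows. This is a minimal illustration assuming Keras; the vocabulary size, embedding dimension, and sequence length (VOCAB_SIZE, EMBED_DIM, MAX_LEN) are placeholder values, not the project's actual preprocessing settings.

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import (
    Input, Embedding, Bidirectional, LSTM, Dropout,
    GlobalMaxPooling1D, Dense,
)

VOCAB_SIZE = 10000  # hypothetical tokenizer vocabulary size
EMBED_DIM = 128     # hypothetical embedding dimension
MAX_LEN = 200       # hypothetical padded sequence length

model = Sequential([
    Input(shape=(MAX_LEN,)),
    # Tokenized text -> dense vector representations
    Embedding(VOCAB_SIZE, EMBED_DIM),
    # First BiLSTM (64 units); LSTM defaults to tanh activation
    # with sigmoid recurrent (gate) activation, as described
    Bidirectional(LSTM(64, return_sequences=True)),
    Dropout(0.5),
    # Second BiLSTM (32 units) for hierarchical feature extraction
    Bidirectional(LSTM(32, return_sequences=True)),
    Dropout(0.5),
    # Reduce the sequence dimension while keeping salient features
    GlobalMaxPooling1D(),
    Dense(128, activation="relu"),
    Dropout(0.5),
    # Single-unit output for the binary urgency label
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```

Note that `return_sequences=True` is required on both BiLSTM layers so that GlobalMaxPooling1D receives a full sequence of timestep outputs to pool over.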