AAI_2025_Capstone_Chronicles_Combined

ResolveAI

Model Optimization

Optimization of the model is achieved through an iterative process involving several strategies. Initially, the architecture included bidirectional LSTM layers to capture contextual information from both directions in the text, which is crucial for understanding the nuances in ticket descriptions. However, early experiments revealed challenges such as overfitting and high validation loss.

To address these issues, the training procedure was refined by increasing dropout rates and incorporating L2 regularization to impose weight decay on the network layers. Additionally, model complexity was reduced by decreasing the number of LSTM units and layers. Learning rate adjustments were also explored using scheduling strategies that reduce the learning rate when the validation loss stagnates, allowing the model to converge more gradually.

Despite these efforts, the performance of the multi-class model, tasked with distinguishing among low, medium, and high priorities, remained suboptimal. Consequently, the problem was reframed as a binary classification task by grouping medium- and high-priority tickets together. This simplification helped address data imbalance and overfitting, resulting in a more robust and generalizable model.
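The refinements described above can be sketched in Keras. The vocabulary size, layer widths, dropout rate, regularization strength, and scheduler settings below are illustrative assumptions, not the project's actual hyperparameters.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

VOCAB_SIZE = 10_000  # assumed vocabulary size

# Reduced-complexity model: a single, smaller bidirectional LSTM
# with dropout and L2 weight decay to curb overfitting.
model = tf.keras.Sequential([
    layers.Embedding(VOCAB_SIZE, 64),
    layers.Bidirectional(
        layers.LSTM(32, kernel_regularizer=regularizers.l2(1e-4))
    ),
    layers.Dropout(0.5),
    # Binary head: medium/high-priority tickets grouped vs. low priority.
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

# Halve the learning rate when validation loss stagnates,
# letting the model converge more gradually.
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss", factor=0.5, patience=2, min_lr=1e-6
)
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           callbacks=[reduce_lr])
```

The `ReduceLROnPlateau` callback implements the scheduling strategy mentioned above; the binary sigmoid head reflects the reframing from three priority classes to two.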

LLM with Retrieval Augmented Generation

To implement the chatbot portion of our support agent, we utilized a combination of LangChain, ChromaDB, and OpenAI's GPT-3.5 Turbo model. We used LangChain and ChromaDB to process the question/answer pairs in the training dataset into a vector database. To serve the most relevant examples based on a user query, a Retrieval Augmented Generation
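The retrieval step can be illustrated with a dependency-free sketch. In the actual system, LangChain and ChromaDB handle embedding and vector storage; the bag-of-words similarity and toy Q/A pairs below are stand-in assumptions used only to show the retrieve-then-prompt pattern.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in embedding: a bag-of-words term-frequency vector.
    # The real pipeline would use a learned embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Toy "vector database" built from question/answer pairs.
qa_pairs = [
    ("How do I reset my password?",
     "Use the 'Forgot password' link on the login page."),
    ("Why is my invoice missing?",
     "Invoices are generated on the 1st of each month."),
]
index = [(embed(q), q, a) for q, a in qa_pairs]

def retrieve(query: str, k: int = 1):
    # Rank stored Q/A pairs by similarity to the user query.
    qv = embed(query)
    ranked = sorted(index, key=lambda item: cosine(qv, item[0]),
                    reverse=True)
    return [(q, a) for _, q, a in ranked[:k]]

def build_prompt(query: str) -> str:
    # Retrieved examples are prepended to the user query before it
    # is sent to the LLM (the generation call is omitted here).
    context = "\n".join(f"Q: {q}\nA: {a}" for q, a in retrieve(query))
    return f"{context}\n\nUser question: {query}"

print(build_prompt("I forgot my password"))
```

In the production pipeline, `retrieve` corresponds to a similarity search against the ChromaDB collection, and `build_prompt` corresponds to the prompt template that supplies the retrieved examples to GPT-3.5 Turbo.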
