AAI_2025_Capstone_Chronicles_Combined

ResolveAI

LLM with RAG Results

The final step in building our app was evaluating the chatbot component, which is based on OpenAI's GPT-3.5 Turbo with a ChromaDB vector database for RAG. We focused on tuning one variable each for the vector database and the LLM. For the vector database, we looked at the impact of top-K selection in serving the most relevant question-and-answer pairs to the model. For the LLM, we evaluated the effect of model temperature. Both variables were evaluated through ROUGE scores comparing the generated model outputs against known correct answers in the test dataset.
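The evaluation described above can be sketched as a small parameter sweep. This is a minimal illustration, not the project's actual code: the report does not say which ROUGE implementation or parameter grid was used, so `rouge_l_f1` here is a simplified ROUGE-L (whitespace tokens, no stemming), the default grid contains only the values named in this section, and `answer_fn` is a hypothetical stand-in for the chatbot call that takes a question, a top-K value, and a temperature.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-L F1: LCS over lowercased whitespace tokens."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def sweep(
    answer_fn: Callable[[str, int, float], str],  # hypothetical chatbot call
    test_set: Sequence[Tuple[str, str]],          # (question, known answer)
    top_ks: Sequence[int] = (1, 5),               # assumed grid
    temperatures: Sequence[float] = (0.0, 0.7),   # assumed grid
) -> Dict[Tuple[int, float], float]:
    """Mean ROUGE-L F1 on the test set for each (top-K, temperature) pair."""
    if not test_set:
        raise ValueError("test_set must not be empty")
    results = {}
    for k, temp in product(top_ks, temperatures):
        scores = [rouge_l_f1(ref, answer_fn(q, k, temp)) for q, ref in test_set]
        results[(k, temp)] = sum(scores) / len(scores)
    return results
```

Passing the chatbot in as a callable keeps the scoring logic independent of the retrieval and generation backends, so the same sweep can be run against a stub during testing. Because generation at a nonzero temperature is stochastic, averaging several runs per setting would give more stable comparisons.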

When comparing the ROUGE scores, we placed more weight on ROUGE-L, since it gave an indication of how well our chatbot was matching the style of the answers in the test dataset. In general, scores tended to be highest for two sets of parameters: top-K = 1 with temperature = 0.0, and top-K = 5 with temperature = 0.7. Subjectively, the latter values tended to give more natural-sounding results, so we elected to use a top-K retrieval value of 5 and a temperature of 0.7 for our final model, which was then hosted on Hugging Face.

Observations and Optimizations

