
model performance and to address the typical need of CNN models for vast amounts of labeled data, introducing the rescale parameter (which normalizes pixel values) and training on a substantial dataset of images are recommended strategies to take into account during model development.
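As a concrete illustration of the rescale parameter mentioned above, the following is a minimal sketch using Keras' ImageDataGenerator; the dataset directory and image size are illustrative placeholders, not values specified in this paper.

    # Sketch: normalizing pixel values with the rescale parameter
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # rescale=1/255 maps raw pixel intensities from [0, 255] into [0, 1]
    datagen = ImageDataGenerator(rescale=1.0 / 255)

    train_generator = datagen.flow_from_directory(
        "asl_alphabet_train/",    # hypothetical dataset directory
        target_size=(64, 64),     # assumed input resolution
        class_mode="categorical",
    )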

Based on all the above, and given the large number of images in the ASL alphabet dataset, we judged a CNN model to be an optimal solution for our sign language interpreter.
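To make this choice concrete, below is a minimal sketch of what such a CNN classifier could look like in Keras; the layer sizes, the 64x64 input resolution, and the 29-class output are illustrative assumptions, not the architecture reported in this paper.

    # Sketch: a small CNN for sign classification (sizes are assumptions)
    from tensorflow.keras import layers, models

    model = models.Sequential([
        # Two convolution/pooling stages extract visual features
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=(64, 64, 3)),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        # A dense head maps features to one probability per sign class
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(29, activation="softmax"),  # 29 classes is an assumption
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])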

Chatbot

Regarding the chatbot, we considered transfer learning our best option. It leverages the knowledge of a previously trained model to perform a new task while requiring less new data, computational power, and time than a model built from scratch. In addition, given a new dataset, fine-tuning enables pre-trained models to extend their original tasks and functions by adjusting a portion of their weights.
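The sketch below illustrates the idea of adjusting only a portion of a pre-trained model's weights, using Hugging Face Transformers with the Flan-T5-Base checkpoint discussed next; freezing the encoder is an illustrative choice for exposition, not the authors' actual fine-tuning configuration.

    # Sketch: partial fine-tuning by freezing part of a pre-trained model
    from transformers import AutoModelForSeq2SeqLM

    model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

    # Freeze the encoder so that only the decoder (a portion of the
    # weights) is updated when fine-tuning on the new conversations data
    for param in model.encoder.parameters():
        param.requires_grad = False

    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"Trainable parameters: {trainable:,} of {total:,}")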

When building complex models like chatbots, possible pre-trained models include Flan-T5-Base (google/flan-t5-base, n.d.), Gemma-2 (google/gemma-2-9b, n.d.), GPT-2 (openai-community/gpt2, n.d.), and Llama 2 (Meta, n.d.). These are large language models, trained on huge textual datasets and recommended for application in a variety of fields, from healthcare to finance, for translation, text classification, question answering, document summarization, and more (What is a large language model (LLM)?, n.d.). The success of these models has already been demonstrated by other authors working on conversational models. For example, Rajani (n.d.) and Bhandare (n.d.) created well-performing chatbots by fine-tuning the T5 (T5, n.d.) and Flan-T5-Base pre-trained models, respectively, on the same conversations dataset selected for this project.
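As a minimal illustration of how such a pre-trained model can be loaded and queried before any fine-tuning, the sketch below uses Flan-T5-Base via Hugging Face Transformers; the prompt and generation settings are hypothetical.

    # Sketch: loading a pre-trained model and generating one reply
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
    model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

    # A hypothetical user turn; real prompts would come from the
    # conversations dataset used for fine-tuning
    prompt = "Respond to the user: How do I learn the ASL alphabet?"
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=50)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))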
