To improve model performance and address the typical need of CNN models for vast amounts of labeled data, two strategies are recommended during model development: introducing the rescale parameter, which normalizes pixel values, and training on a substantial dataset of images.
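As a brief illustration, and assuming a Keras-style image pipeline (the framework behind the rescale parameter is not named here), the normalization step could look like the following sketch; the directory path and image dimensions are placeholders:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# rescale multiplies every pixel value by 1/255, mapping [0, 255] to [0, 1]
train_generator = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
    "asl_alphabet_train/",  # hypothetical path to the labeled training images
    target_size=(64, 64),
    batch_size=32,
    class_mode="categorical",
)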
Based on the above, and given the very large number of images in the ASL alphabet dataset, we judged a CNN model to be an optimal solution for our sign language interpreter.
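A minimal sketch of such a CNN is shown below; the layer sizes are illustrative rather than the architecture ultimately implemented, and the 29 output classes assume the widely used ASL Alphabet dataset layout (A–Z plus space, delete, and nothing):

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(64, 64, 3)),            # small RGB input, illustrative size
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(29, activation="softmax"),     # one probability per sign class
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])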
Chatbot
Regarding the chatbot, we considered transfer learning our best option. It leverages the knowledge of a previously trained model to perform a new task, while requiring far less new data, computational power, and time than a model built from scratch. In addition, given a new dataset, the fine-tuning procedure enables pre-trained models to expand their original tasks and functions by adjusting a portion of their weights.
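As a minimal sketch of this idea, assuming the Hugging Face transformers API and using Flan-T5-Base (one of the candidates discussed below) as an example, most pre-trained weights can be frozen so that fine-tuning adjusts only a small portion of them:

from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

# Freeze all pre-trained weights, then unfreeze only the last decoder block
# and the language-model head, so training updates a portion of the model.
for param in model.parameters():
    param.requires_grad = False
for param in model.decoder.block[-1].parameters():
    param.requires_grad = True
for param in model.lm_head.parameters():
    param.requires_grad = True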
When building complex models like chatbots, possible pre-trained models include Flan-T5-Base (google/flan-t5-base, n.d.), Gemma-2 (google/gemma-2-9b, n.d.), GPT-2 (openai-community/gpt, n.d.), and Llama 2 (Meta, n.d.). These are large language models, trained on huge textual datasets and applied in a variety of fields, from healthcare to finance, for translation, text classification, question answering, document summarization, and more (What is a large language model (LLM)?, n.d.). The success of these models has already been demonstrated by other authors working on conversational models. For example, Rajani (n.d.) and Bhandare (n.d.) created well-performing chatbots by fine-tuning the T5 (T5, n.d.) and Flan-T5-Base pre-trained models, respectively, on the same conversation dataset selected for this project.
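A hedged sketch of such a fine-tuning run, again assuming the Hugging Face transformers API, is shown below; the column names "prompt" and "response" and all hyperparameters are hypothetical, since the schema of the conversation dataset is not given here:

from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

def tokenize(batch):
    # "prompt" and "response" are placeholder column names
    inputs = tokenizer(batch["prompt"], truncation=True, max_length=128)
    inputs["labels"] = tokenizer(batch["response"], truncation=True,
                                 max_length=128)["input_ids"]
    return inputs

# train_data = <conversation dataset>.map(tokenize, batched=True)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="flan-t5-chatbot",
                           per_device_train_batch_size=8,
                           num_train_epochs=3),
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    # train_dataset=train_data,
)
# trainer.train()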