AAI_2025_Capstone_Chronicles_Combined


Figure 5 Cluster assignment stability over time.

a large margin, aligning with the human evaluator in 76% of cases. The strongest performance came from the LLM-as-classifier model, which aligned with the human evaluator in 87% of cases, although this came at the cost of a much higher processing time than either of the other methods (see Table 2). However, the long processing time of the LLM-as-classifier approach could be significantly mitigated by parallelizing the batch labeling and final classification phases.

Clustering conversations based on the extracted ‘mood’ facet revealed distinct patterns in student engagement and sentiment. To determine the user’s sentiment, the LLM was prompted with the question "What is the overall mood of the user in the conversation?". The resulting output is a descriptive sentence such as: "The user’s overall mood in the conversation is engaged and eager to learn. They actively participate in solving problems and ask for further clarification and additional practice problems when needed."

The mood-based clustering with facets yielded four archetypes of student interaction with the tutoring chatbot. The “Confident Active Learners” cluster, comprising 35 conversations, is characterized by users who showed strong positive engagement and successful learning outcomes.
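As an illustration of the parallelization suggested above for the LLM-as-classifier approach, the following is a minimal sketch, not the authors' implementation. It assumes that each labeling request is independent of the others, so batches can be sent concurrently. The names `make_batches`, `label_batch_stub`, and `label_in_parallel` are hypothetical, and the stub stands in for a real LLM call with a trivial rule so that the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def make_batches(items: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split the conversations into consecutive batches of at most batch_size."""
    return [list(items[i:i + batch_size]) for i in range(0, len(items), batch_size)]


def label_batch_stub(batch: List[str]) -> List[str]:
    """Hypothetical stand-in for one LLM labeling request (assumption).

    A real version would send the batch to the model and parse its reply;
    here a trivial rule is used so the sketch is self-contained.
    """
    return ["question" if "?" in text else "statement" for text in batch]


def label_in_parallel(
    conversations: Sequence[str],
    label_batch: Callable[[List[str]], List[str]] = label_batch_stub,
    batch_size: int = 2,
    max_workers: int = 4,
) -> List[str]:
    """Label all batches concurrently and return the labels in input order."""
    batches = make_batches(conversations, batch_size)
    # Threads suit this workload because an LLM request is I/O-bound:
    # each worker spends most of its time waiting for a response.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the input batches,
        # regardless of which request finishes first.
        results = list(pool.map(label_batch, batches))
    return [label for batch_labels in results for label in batch_labels]


conversations = [
    "How do I factor this?",
    "Thanks, that helps.",
    "Can I get another problem?",
]
labels = label_in_parallel(conversations)
```

With a real model behind `label_batch`, the achievable speedup would depend on the provider's rate limits and on how many requests can be in flight at once, so `max_workers` is a value to tune and not a recommendation.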

