AAI_2025_Capstone_Chronicles_Combined


Figure 2, the distributions of conversational turns and word count offer a view into typical dialogue engagement. This analysis helps contextualize student-chatbot interactions, suggesting potential correlations among dialogue length, the nature of the mathematical problem, and student engagement level.

Figure 2. Conversation length distributions.

Although Math Level, Reading Grade Level, and Sentiment offer rich interpretive value, including them as features in the clustering process risked biasing the model toward predefined categorizations or groupings. These derived features were better used to interpret and explain the composition of each cluster after the clusters had been formed.

Background

A primary goal of our tutoring system is to provide educators with usage insights in a format that lets them quickly understand how students are using the application. A major challenge in presenting text data to educators is the time commitment and cognitive overhead involved in interpreting large volumes of text. Clustering offers one automated approach to grouping this unlabeled text data so that insights can be presented in a usable format. Clustering is already used extensively in educational data science, with k-means being the most widely used technique (Le Quy et al., 2023). K-means clustering involves creating k initial
