AAI_2025_Capstone_Chronicles_Combined


Existing Applications of Deep Clustering

While deep clustering isn’t nearly as common in educational data science research as k-means or hierarchical clustering, a number of projects have explored its use. Tang et al. (2025) used CMVAE, a deep clustering approach similar to DEC, to cluster student submissions for computer science education based on coding strategy. CMVAE combines multiple "views," or extracted feature representations, of the input data to more effectively capture both the syntactic structure and semantic meaning of student source code. Li (2022) applied deep clustering to a combination of demographic and clickstream data to create clusters tying user behaviors in online courses to their eventual course outcomes (passing with distinction, passing, withdrawing, or failing).

Experimental Methods

In this section, we detail the experimental methods and optimization strategies used to develop the deep learning component of our AI Math Tutor. The core function of this component is to generate actionable insights from student chatbot conversations through unsupervised clustering. We achieved this by exploring three variants of the Deep Embedded Clustering (DEC) model: the original DEC model, an LSTM-based model, and a final optimized model.

DEC Model Overview

We initially developed a direct recreation of the original DEC design presented by Xie et al. (2016). In this model, TF-IDF embeddings are generated and passed through the encoder portion of the autoencoder. The encoder consists of three fully connected hidden layers with 500, 500, and 2000 nodes, respectively, followed by a bottleneck output layer with 10 nodes. All hidden layers use a Rectified Linear Unit (ReLU) activation function, and dropout is incorporated as a regularization technique to mitigate overfitting. Each layer in the encoder is pretrained individually for 50,000 epochs with added corruption for denoising.
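The encoder described above can be sketched as follows. This is a minimal illustration in PyTorch, not our exact implementation: the layer sizes (500, 500, 2000, and a 10-node bottleneck) and ReLU activations follow Xie et al. (2016), while the dropout rate and the input-corruption mechanism used for denoising pretraining are representative assumptions.

```python
import torch
import torch.nn as nn

class DECAutoencoder(nn.Module):
    """Sketch of the DEC autoencoder: a 500-500-2000-10 encoder with a
    mirrored decoder. Input dimension matches the TF-IDF vocabulary size."""

    def __init__(self, input_dim: int, dropout: float = 0.2):
        super().__init__()
        dims = [input_dim, 500, 500, 2000, 10]

        # Encoder: ReLU + dropout on hidden layers, linear 10-d bottleneck.
        enc = []
        for i in range(len(dims) - 1):
            enc.append(nn.Linear(dims[i], dims[i + 1]))
            if i < len(dims) - 2:  # no activation/dropout on the bottleneck
                enc.append(nn.ReLU())
                enc.append(nn.Dropout(dropout))
        self.encoder = nn.Sequential(*enc)

        # Decoder mirrors the encoder back up to the input dimension.
        rdims = dims[::-1]
        dec = []
        for i in range(len(rdims) - 1):
            dec.append(nn.Linear(rdims[i], rdims[i + 1]))
            if i < len(rdims) - 2:  # linear reconstruction output
                dec.append(nn.ReLU())
        self.decoder = nn.Sequential(*dec)

    def forward(self, x: torch.Tensor, corruption: float = 0.0):
        # Denoising pretraining: randomly zero a fraction of input features,
        # while the reconstruction loss targets the uncorrupted input.
        if corruption > 0:
            x = x * (torch.rand_like(x) > corruption).float()
        z = self.encoder(x)          # 10-d embedding used for clustering
        return self.decoder(z), z
```

In practice, the TF-IDF matrix (e.g., from scikit-learn's `TfidfVectorizer`) is fed in batches, each layer is pretrained with `corruption > 0` against a mean-squared reconstruction loss, and the full network is then fine-tuned end to end with `corruption = 0`.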
This initial layer-wise pretraining is followed by an end-to-end fine-tuning phase for 100,000 epochs without corruption. Once the encoder portion of the autoencoder is trained, it is used to create encoded

