Yael Feldman-Maggor

Senior Academic

Automated Identification and Validation of the Optimal Number of Knowledge Profiles in Student Response Data

Brad Din, Tanya Nazaretsky, Yael Feldman-Maggor, Giora Alexandron

It is well known that personalized instruction can enhance student learning. AI-based education tools can be used to incorporate blended learning in the science classroom, and have been shown to enhance teachers’ ability to prescribe this personalization. We utilize cluster analysis to reveal student knowledge profiles from their response data. However, clustering algorithms typically require the number of clusters as a hyperparameter, yet there is no clear method for choosing the optimal number. Motivated by a practical instance of this foundational problem for a group-based personalization tool, this paper discusses several variations of the gap statistic to identify the optimal number of clusters in student response data. We begin with a simulation study where the ground truth is known to evaluate the quality of the identified methods. We then assess their behaviour on real student data and suggest a stability-based approach to validate our predictions. We identify an empirical threshold for the number of observations required for a prediction to be stable. We found that if a dataset had cluster structure, very small subsamples also showed cluster structure; large datasets were only required to discern the number of clusters accurately. Finally, we discuss how the method enables teachers to tailor their personalization according to their class environment or teaching goals.
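
For readers unfamiliar with the gap statistic named in the abstract, the following is a minimal sketch of its standard form (Tibshirani, Walther and Hastie), not of the specific variations the paper evaluates. It compares the log within-cluster dispersion of the data with its average over uniform reference datasets, and picks the smallest k whose gap is within one standard error of the next. The function names, the plain k-means routine, the toy blob data and all parameter defaults are illustrative assumptions and do not come from the paper.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_dispersion(points, k, rng, n_init=3, n_iter=30):
    """Lowest within-cluster sum of squares found over n_init k-means runs."""
    best = math.inf
    for _ in range(n_init):
        # k-means++ seeding: each new center is drawn with probability
        # proportional to its squared distance from the nearest chosen center.
        centers = [rng.choice(points)]
        while len(centers) < k:
            d2 = [min(sq_dist(p, c) for c in centers) for p in points]
            if sum(d2) == 0:  # every point coincides with a chosen center
                centers.append(rng.choice(points))
            else:
                centers.append(rng.choices(points, weights=d2, k=1)[0])
        for _ in range(n_iter):
            labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
            updated = []
            for j in range(k):
                members = [p for p, lab in zip(points, labels) if lab == j]
                updated.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                    if members else centers[j])
            if updated == centers:  # assignments no longer change
                break
            centers = updated
        best = min(best, sum(min(sq_dist(p, c) for c in centers)
                             for p in points))
    return best


def log_dispersion(points, k, rng):
    """log W_k, floored to avoid log(0) when clusters fit the data exactly."""
    return math.log(max(kmeans_dispersion(points, k, rng), 1e-12))


def gap_statistic(points, k_max=6, n_refs=10, seed=0):
    """Return (estimated k, list of Gap(k) for k = 1..k_max)."""
    rng = random.Random(seed)
    points = [tuple(p) for p in points]
    lows = [min(col) for col in zip(*points)]
    highs = [max(col) for col in zip(*points)]
    gaps, errs = [], []
    for k in range(1, k_max + 1):
        log_w = log_dispersion(points, k, rng)
        ref_logs = []
        for _ in range(n_refs):
            # Reference data: uniform over the bounding box of the observations.
            ref = [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
                   for _ in points]
            ref_logs.append(log_dispersion(ref, k, rng))
        mean_ref = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((v - mean_ref) ** 2 for v in ref_logs) / n_refs)
        gaps.append(mean_ref - log_w)
        errs.append(sd * math.sqrt(1 + 1 / n_refs))
    # Smallest k with Gap(k) >= Gap(k+1) - s_{k+1}.
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - errs[k]:
            return k, gaps
    return k_max, gaps


def make_blobs(centers, n_per_blob=20, sd=0.3, seed=1):
    """Hypothetical toy data: Gaussian blobs around the given centers."""
    rng = random.Random(seed)
    return [tuple(rng.gauss(c, sd) for c in center)
            for center in centers for _ in range(n_per_blob)]


if __name__ == "__main__":
    data = make_blobs([(0, 0), (5, 5), (0, 5)])
    k_hat, gaps = gap_statistic(data, k_max=4, n_refs=8)
    print(k_hat, [round(g, 2) for g in gaps])
```

Real student response data would typically be a matrix of binary or graded item scores rather than two-dimensional points, and a uniform bounding-box reference may then be a poor null model. That is one reason variations of the statistic are of interest. One rough way to probe the stability of an estimate, under the same assumptions, is to rerun gap_statistic on random subsamples of the rows and compare the resulting values of k. This is not necessarily the validation procedure used in the paper.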

Publication language: English
Pages: 458-465
Publication status: Published - 01.01.2023

Keywords

Clustering
Gap Statistic
Personalized Instruction

ASJC Scopus subject areas

Artificial Intelligence
Computer Science Applications
Human-Computer Interaction
Information Systems

Access to Document

DOI: 10.5281/zenodo.8115744