
יעל פלדמן מגור
Automated Identification and Validation of the Optimal Number of Knowledge Profiles in Student Response Data
It is well known that personalized instruction can enhance student learning. AI-based education tools can be used to incorporate blended learning in the science classroom, and have been shown to enhance teachers' ability to prescribe this personalization. We utilize cluster analysis to reveal student knowledge profiles from their response data. However, clustering algorithms typically require the number of clusters as a hyperparameter, yet there is no clear method for choosing the optimal number. Motivated by a practical instance of this foundational problem for a group-based personalization tool, this paper discusses several variations of the gap statistic to identify the optimal number of clusters in student response data. We begin with a simulation study where the ground truth is known to evaluate the quality of the identified methods. We then assess their behaviour on real student data and suggest a stability-based approach to validate our predictions. We identify an empirical threshold for the number of observations required for a prediction to be stable. We found that if a dataset had cluster structure, very small subsamples also showed cluster structure; large datasets were only required to discern the number of clusters accurately. Finally, we discuss how the method enables teachers to tailor their personalization according to their class environment or teaching goals.
| Publication language | English |
| Pages | 458-465 |
| Publication status | Published - 01.01.2023 |