
An XGBoost-Based Knowledge Tracing Model

Wei Su, Fan Jiang, Chunyan Shi, Dongqing Wu, Lei Liu, Shihua Li, Yongna Yuan, Juntai Shi
Lanzhou University, CITIC Bank Software Development Center, Duzhe Publishing Group Co. Ltd., Yizhichuan Primary School
International Journal of Computational Intelligence Systems (2023)
P13N Recommendation Memory

📝 Paper Summary

Knowledge Tracing, Educational Data Mining
The paper shows that XGBoost, combined with engineered features such as attempt counts and problem IDs, outperforms deep learning models such as DKT and AutoInt on knowledge tracing in both prediction accuracy (AUC) and training speed.
Core Problem
Existing Knowledge Tracing (KT) models, particularly deep learning approaches such as Deep Knowledge Tracing (DKT), suffer from low interpretability, slow training, and difficulty in making effective use of heterogeneous, multi-dimensional features.
Why it matters:
  • Accurate student modeling is the backbone of Intelligent Tutoring Systems (ITSs), enabling personalized feedback and curriculum sequencing.
  • Deep learning models often require massive compute and data, making them hard to deploy in real-time educational environments.
  • Standard sequence models often ignore crucial metadata (like how many times a student attempted a specific problem), limiting prediction accuracy.
Concrete Example: In the ASSISTments dataset, a student might attempt the same problem multiple times. A standard DKT (Deep Knowledge Tracing) model tracking only correct/incorrect sequences misses the context of 'attempt count', whereas the proposed XGBoost model explicitly leverages this feature to predict mastery probability more accurately.
Key Novelty
XGBoost-KT with Explicit Feature Engineering
  • Reframes knowledge tracing from a pure time-series sequence problem to a feature-rich classification problem using XGBoost (eXtreme Gradient Boosting).
  • Explicitly incorporates auxiliary features—such as 'problem_id', 'attempt_count', and teacher/school IDs—into the decision tree inputs, rather than relying solely on the latent states of a neural network.
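The reframing above can be illustrated with a minimal sketch that flattens a chronological interaction log into one feature row per interaction. It is an assumption-laden illustration, not the authors' pipeline: the record keys (`user_id`, `problem_id`, `skill_id`, `correct`) and the helper `add_attempt_counts` are hypothetical, and ASSISTments exports may already ship an attempt count column, in which case no derivation is needed.

```python
from collections import defaultdict


def add_attempt_counts(interactions):
    """Turn a chronological interaction log into flat (features, label) rows.

    Each input record is a dict with the hypothetical keys 'user_id',
    'problem_id', 'skill_id', and 'correct' (0 or 1).
    """
    seen = defaultdict(int)  # (user_id, problem_id) -> attempts so far
    rows = []
    for rec in interactions:
        key = (rec["user_id"], rec["problem_id"])
        seen[key] += 1  # this interaction counts as one more attempt
        features = {
            "user_id": rec["user_id"],
            "problem_id": rec["problem_id"],
            "skill_id": rec["skill_id"],
            "attempt_count": seen[key],
        }
        rows.append((features, rec["correct"]))
    return rows


# Toy log (made-up values): user 1 retries problem 10 once.
log = [
    {"user_id": 1, "problem_id": 10, "skill_id": 3, "correct": 0},
    {"user_id": 1, "problem_id": 10, "skill_id": 3, "correct": 1},
    {"user_id": 2, "problem_id": 10, "skill_id": 3, "correct": 1},
    {"user_id": 1, "problem_id": 11, "skill_id": 3, "correct": 1},
]
rows = add_attempt_counts(log)
attempt_counts = [features["attempt_count"] for features, _ in rows]
labels = [label for _, label in rows]
```

Each row is an independent classification example, so the rows could in principle be handed to any tabular classifier, for instance a gradient-boosted tree model such as XGBoost's `XGBClassifier`, without modelling the sequence explicitly.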
Evaluation Highlights
  • Achieves 0.9855 AUC on the ASSIST09 dataset using the full feature set, edging out the AutoInt baseline (0.9843) and far surpassing DKT (0.8583).
  • Reduces training time drastically: XGBoost trains in ~41 seconds on ASSIST09, compared to ~1.5 hours for AutoInt and ~10 minutes for DKT.
  • Identifies 'attempt_count' as a critical predictor; adding it improves AUC by roughly 0.23 compared to using only user/skill features on ASSIST09.
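For readers unfamiliar with the metric behind these numbers, AUC can be read as the probability that a randomly chosen correct response receives a higher predicted score than a randomly chosen incorrect one. The sketch below computes it by direct pairwise comparison; the function name `auc` and the toy labels and scores are illustrative assumptions, not data or code from the paper.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half.
    Quadratic in the number of examples, so suitable only for small inputs.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy predictions (made-up values): 5 of the 6 positive/negative pairs
# are ranked correctly.
toy_labels = [1, 0, 1, 1, 0]
toy_scores = [0.9, 0.3, 0.6, 0.4, 0.5]
toy_auc = auc(toy_labels, toy_scores)
```

On this reading, an AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why a gap such as 0.8583 versus 0.9855 is substantial.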
Breakthrough Assessment
5/10
Provides a strong pragmatic engineering result showing traditional ML (XGBoost) can beat Deep Learning when features are well-engineered, but does not propose a fundamental theoretical advance.