Constructing a Question-Answering Simulator through the Distillation of LLMs

Haipeng Liu, Ting Long, Jing Fu
School of Artificial Intelligence, Jilin University
arXiv (2025)
Recommendation · P13N · QA

📝 Paper Summary

Educational Recommender Systems (ERS) · Student Simulation / User Modeling · Knowledge Distillation
LDSim creates an efficient student simulator by distilling an LLM's domain knowledge into a concept graph and its reasoning capabilities into mastery labels for training a lightweight neural network.
Core Problem
Existing QA simulators are either LLM-free (fast but inaccurate due to lack of semantic understanding) or LLM-based (accurate but too slow and computationally expensive for real-time interaction with recommender systems).
Why it matters:
  • Educational Recommender Systems (ERS) need simulators so they can be trained safely, without exposing real students to poor recommendations from a not-yet-trained system
  • LLM-free methods treat concepts as isolated IDs, missing prerequisite relationships (e.g., addition helps multiplication), leading to poor simulation accuracy
  • Directly using LLMs for simulation is prohibitively expensive and slow for the large-scale trial-and-error interactions required to train an ERS
Concrete Example: When simulating a student's response to question q8, a simulator must rely on predicted history (q4-q7). If an LLM-free model fails to capture that 'addition' is a prerequisite for 'multiplication' (concept relation), it may incorrectly predict the student's mastery state based on the synthetic history, misleading the recommender system.
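The dependence on predicted history in the example above can be illustrated with a minimal sketch. This is not the paper's implementation: `rollout`, `majority_predictor`, and the `(question_id, correct)` record format are hypothetical names chosen for illustration, and the toy predictor stands in for any simulator. The point is only that each predicted answer is fed back into the history used for the next prediction, so an early mistake can propagate.

```python
from typing import Callable, List, Tuple

# A record is (question_id, answered_correctly). Hypothetical format.
Record = Tuple[str, bool]


def rollout(
    predict: Callable[[List[Record], str], bool],
    real_history: List[Record],
    recommended: List[str],
) -> List[Record]:
    """Simulate answers to recommended questions one at a time.

    Each predicted answer is appended to the history, so later
    predictions depend on earlier predictions, not on real data.
    """
    history = list(real_history)  # copy so the caller's list is untouched
    for question in recommended:
        history.append((question, predict(history, question)))
    # Return only the synthetic part of the history.
    return history[len(real_history):]


def majority_predictor(history: List[Record], question: str) -> bool:
    """Toy stand-in for a simulator: predict the majority past outcome.

    It ignores the question and any concept relations, which is the
    kind of blind spot the example above describes.
    """
    correct = sum(1 for _, ok in history if ok)
    return correct * 2 >= len(history)


real = [("q1", True), ("q2", False), ("q3", True)]
synthetic = rollout(majority_predictor, real, ["q4", "q5"])
```

Here the prediction for `q5` is computed from a history that already contains the simulated answer to `q4`, mirroring how q8 depends on the predicted q4-q7 in the example.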
Key Novelty
LLM Distillation based Simulator (LDSim)
  • Distills 'World Knowledge' by prompting an LLM to identify prerequisite relationships between concepts, constructing a rich Concept Relation Graph instead of using static expert maps
  • Distills 'Reasoning Capability' by asking an LLM to infer a student's latent concept mastery from their history, creating a labeled 'distilled dataset' (including synthetic pseudo-QA records) to train a smaller model
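The two distillation steps above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not LDSim's actual code: the function names, the `(concept, correct)` history format, and both stub functions are hypothetical, and the stubs replace what would be real LLM prompts in the paper's pipeline.

```python
from itertools import permutations
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical history format: (concept, answered_correctly).
History = List[Tuple[str, bool]]


def build_concept_graph(
    concepts: List[str],
    is_prerequisite: Callable[[str, str], bool],
) -> Dict[str, Set[str]]:
    """Map each concept to the concepts it is a prerequisite for.

    `is_prerequisite(a, b)` stands in for an LLM judgement on whether
    concept `a` should be learned before concept `b`.
    """
    graph: Dict[str, Set[str]] = {c: set() for c in concepts}
    for a, b in permutations(concepts, 2):  # every ordered pair
        if is_prerequisite(a, b):
            graph[a].add(b)
    return graph


def distill_mastery_labels(
    histories: List[History],
    infer_mastery: Callable[[History], Dict[str, bool]],
) -> List[Tuple[History, Dict[str, bool]]]:
    """Pair each history with teacher-inferred mastery labels.

    The resulting list plays the role of a distilled dataset on which
    a smaller model could be trained.
    """
    return [(h, infer_mastery(h)) for h in histories]


def stub_prerequisite(a: str, b: str) -> bool:
    """Assumption: a fixed lookup replaces the real LLM prompt."""
    return (a, b) in {("addition", "multiplication")}


def stub_mastery(history: History) -> Dict[str, bool]:
    """Assumption: a concept counts as mastered if its latest answer was correct."""
    mastery: Dict[str, bool] = {}
    for concept, correct in history:
        mastery[concept] = correct  # later records overwrite earlier ones
    return mastery


graph = build_concept_graph(
    ["addition", "multiplication", "fractions"], stub_prerequisite
)
dataset = distill_mastery_labels(
    [[("addition", False), ("addition", True), ("multiplication", False)]],
    stub_mastery,
)
```

Passing the LLM calls in as plain callables keeps the expensive model out of the simulation loop: it is queried once to build the graph and the labels, and only the lightweight student model would need to run at simulation time.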
Architecture
Figure 2
The overall architecture of LDSim, illustrating the three main modules: Knowledge Distillation (KD), Reasoning Distillation (RD), and the Simulation Module (Sim).
Breakthrough Assessment
7/10
Novel application of LLM distillation specifically for educational student simulation, effectively bridging the gap between high-performance LLMs and efficient sequential models.