
KEDRec-LM: A Knowledge-distilled Explainable Drug Recommendation Large Language Model

Kai Zhang, Rui Zhu, Shutian Ma, Jingwei Xiong, Yejin Kim, Fabricio Murai, Xiaozhong Liu
Worcester Polytechnic Institute, Yale University, University of California, Davis, The University of Texas Health Science Center at Houston
arXiv (2025)
Recommendation RAG KG Reasoning Benchmark

📝 Paper Summary

Drug Discovery / Drug Repurposing · Biomedical Natural Language Processing
KEDRec-LM improves explainable drug recommendation by distilling knowledge from a teacher model that reasons over biomedical literature retrieved for specific drug-disease pairs from a knowledge graph.
Core Problem
Identifying therapeutic drug-disease relationships is difficult because knowledge graphs are static and the biomedical literature (e.g., PubMed) is too vast to reason over manually.
Why it matters:
  • Traditional knowledge graphs lack the nuanced context required to reason about complex therapeutic mechanisms
  • Standard retrieval systems provide documents but fail to synthesize insightful reasoning or explain why a drug treats a disease
  • Automated tools that bridge structured graph data and unstructured literature for explainable decision-making are lacking
Concrete Example: Given a disease and a candidate drug, a standard model might output a score based only on graph connectivity. Without access to clinical trial text or mechanism descriptions, it cannot generate a rationale explaining *how* the drug acts on the disease pathology.
Key Novelty
Distilled RAG for Drug Recommendation
  • Constructs a focused dataset by sampling hard-negative drug candidates from a knowledge graph using GNN embeddings
  • Uses a Teacher model to generate high-quality rationales based on retrieved PubMed/Clinical Trials text
  • Distills this reasoning capability into a smaller Student LLaMA model that learns to both select the correct drug and generate the rationale
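The hard-negative sampling in the first bullet could look roughly like the sketch below. It is an illustration only: the summary does not state the similarity measure or selection rule, so cosine similarity over precomputed embeddings, the function name `sample_hard_negatives`, and the toy vectors are all assumptions rather than the authors' method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sample_hard_negatives(positive, embeddings, known_treatments, k=2):
    """Pick the k drugs closest to the positive drug in embedding space
    that are NOT known treatments for the disease (hypothetical rule)."""
    anchor = embeddings[positive]
    candidates = [d for d in embeddings if d not in known_treatments]
    # Most similar first: these are the "hard" negatives.
    candidates.sort(key=lambda d: cosine(anchor, embeddings[d]), reverse=True)
    return candidates[:k]


# Toy 2-D vectors standing in for GNN embeddings (assumed values).
toy_embeddings = {
    "drug_a": [1.0, 0.0],
    "drug_b": [0.9, 0.1],
    "drug_c": [0.0, 1.0],
    "drug_d": [0.7, 0.7],
}
negatives = sample_hard_negatives("drug_a", toy_embeddings, {"drug_a"}, k=2)
```

Choosing negatives that sit near the positive drug makes the selection task harder than random sampling would, which is the usual motivation for hard negatives.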
Architecture
Figure 1. The three-stage framework of KEDRec-LM: Sampling, Retrieval, and Distillation.
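To make the hand-off between the three stages concrete, the sketch below assembles one supervised training record for the student from the outputs of sampling, retrieval, and the teacher. The prompt wording, the field names, and the function `build_student_example` are hypothetical, and the placeholder strings stand in for real retrieved text and teacher output. Whether the student also sees retrieved passages in its prompt is an assumption here.

```python
import random


def build_student_example(disease, positive, negatives, passages,
                          teacher_rationale, seed=0):
    """Combine sampled candidates, retrieved passages, and the teacher's
    rationale into one prompt/target pair (hypothetical format)."""
    options = [positive] + list(negatives)
    # Shuffle so the correct drug is not always listed first.
    random.Random(seed).shuffle(options)
    context = "\n".join(f"- {p}" for p in passages)
    prompt = (
        f"Disease: {disease}\n"
        f"Candidate drugs: {', '.join(options)}\n"
        f"Retrieved literature:\n{context}\n"
        "Which candidate treats the disease, and why?"
    )
    # The student is trained to emit both the choice and the rationale.
    target = f"Answer: {positive}\nRationale: {teacher_rationale}"
    return {"prompt": prompt, "target": target}


record = build_student_example(
    disease="disease_x",
    positive="drug_a",
    negatives=["drug_b", "drug_d"],
    passages=["<retrieved abstract 1>", "<retrieved trial summary 2>"],
    teacher_rationale="<teacher-written rationale>",
)
```

Putting the answer and the rationale in a single target string is one simple way to train the student on both tasks at once; the source does not say how the two objectives are actually combined.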
Breakthrough Assessment
7/10
Integrates KG sampling, RAG, and distillation in a logical pipeline for a high-value domain (drug discovery). The construction of the expRxRec dataset is a significant resource contribution.