
MiLP: Personalized LLM response generation with parameterized user memory injection

(WPI, Alibaba) Kai Zhang, Lizhi Qing, Yangyang Kang, Xiaozhong Liu
Worcester Polytechnic Institute, The University of Texas Health Science Center at Houston
arXiv, 4/2024
Memory P13N QA

📝 Paper Summary

Memory internalization Personalization (P13N)
MiLP injects user history into Large Language Models as parameterized memory using multiple LoRA adapters, whose configuration is tuned via Bayesian Optimization to balance memory capacity and generation quality.
Core Problem
Existing personalization methods are either constrained by limited context windows (prompt-based) or lose fine-grained detail during retrieval (memory-based), so they struggle to incorporate complex user histories effectively.
Why it matters:
  • Context window limits prevent full utilization of long user histories in prompt-based approaches
  • Retrieval-based methods often miss fine-grained details due to the nature of similarity search
  • Generic responses in sensitive domains like healthcare can be inappropriate if patient history is ignored or fragmented
Concrete Example: In healthcare, a patient's long-term medical trajectory contains complex interactions. A standard retriever might fetch fragmented records that provide an incorrect snapshot of disease progression, leading the LLM to give generic or unsafe advice.
Key Novelty
Parameterized Memory-injected LLM Personalization (MiLP)
  • Mimics bionic memory by storing user history directly in the LLM's feed-forward layers through multiple LoRA adapters, rather than keeping it as external text
  • Treats the configuration of these adapters (which layers to inject, rank size, number of adapters) as a high-dimensional search problem solved by Bayesian Optimization
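The adapter mechanism described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the class names, the `alpha / rank` scaling, the zero-initialized `B` matrix, and the choice to sum the adapters' outputs are assumptions borrowed from the common LoRA formulation, and the tiny dimensions are purely illustrative.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRAAdapter:
    """One low-rank update: delta(x) = (alpha / rank) * B @ (A @ x)."""

    def __init__(self, d_in, d_out, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        self.scale = alpha / rank
        # Assumption: A starts small and random, B starts at zero,
        # so the adapter changes nothing until it has been trained.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def delta(self, x):
        return [self.scale * v for v in matvec(self.B, matvec(self.A, x))]


class AdaptedLinear:
    """A frozen base weight plus the summed deltas of several adapters."""

    def __init__(self, weight, adapters):
        self.weight = weight      # frozen; never updated in this sketch
        self.adapters = adapters  # hypothetical: outputs are simply added

    def forward(self, x):
        out = matvec(self.weight, x)
        for adapter in self.adapters:
            out = [o + d for o, d in zip(out, adapter.delta(x))]
        return out

    def trainable_params(self):
        # Only A and B of each adapter would receive gradients.
        return sum(
            len(a.A) * len(a.A[0]) + len(a.B) * len(a.B[0]) for a in self.adapters
        )
```

In this sketch the number of adapters and their rank directly set how many trainable parameters are available for storing user history, which is the capacity side of the trade-off that the configuration search has to balance.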
Architecture
Figure 2
The MiLP framework showing the Bayesian Optimization loop interacting with the LLM. It illustrates how the search space (layers, rank, number of LoRAs) is explored to minimize loss and maximize ROUGE-L.
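To make the search space concrete, the sketch below encodes one configuration as (layers to inject, rank, number of adapters) and runs a propose-evaluate-keep-best loop. Two caveats: plain random search is used here as a simple stand-in for Bayesian Optimization, and `toy_objective` is a hypothetical placeholder for the real step of training the adapters and scoring them by loss and ROUGE-L. All constants are illustrative assumptions.

```python
import random

NUM_LAYERS = 8                # hypothetical model depth
RANK_CHOICES = [2, 4, 8, 16]  # hypothetical candidate ranks
MAX_ADAPTERS = 4              # hypothetical upper bound


def sample_config(rng):
    """Draw one point from the search space."""
    num_layers = rng.randint(1, NUM_LAYERS)
    return {
        "layers": sorted(rng.sample(range(NUM_LAYERS), num_layers)),
        "rank": rng.choice(RANK_CHOICES),
        "num_adapters": rng.randint(1, MAX_ADAPTERS),
    }


def toy_objective(config):
    """Placeholder score (higher is better).

    Stands in for 'train the adapters, then evaluate'. It only rewards
    configurations whose total capacity is close to an arbitrary budget.
    """
    capacity = len(config["layers"]) * config["rank"] * config["num_adapters"]
    return -abs(capacity - 64)


def search(objective, trials=50, seed=0):
    """Random search: propose a configuration, score it, keep the best."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(trials):
        config = sample_config(rng)
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

A Bayesian Optimization loop keeps the same overall shape but replaces `sample_config` with a proposal step guided by a surrogate model fitted to the configurations scored so far, which matters when each evaluation requires fine-tuning an LLM.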
Evaluation Highlights
  • Outperforms Text-prompt, Memory-augmented, and User-embedding baselines across AmazonQA, Reddit, and MedicalDialogue datasets
  • Achieves higher ROUGE-L and Persona-F1 scores on LLaMA2-13B compared to prompt-based personalization
  • Demonstrates superior Win Rate in human evaluation against standard generation methods
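For readers unfamiliar with the ROUGE-L metric cited above, it scores a generated response by the longest common subsequence (LCS) it shares with a reference. The sketch below is a minimal version under stated assumptions: lowercased whitespace tokenization and the balanced F1 variant. The paper's exact evaluation settings are not given in this summary, and the function names are illustrative.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L as the F1 of LCS-based precision and recall."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of the candidate that is matched
    recall = lcs / len(ref)      # share of the reference that is covered
    return 2 * precision * recall / (precision + recall)
```

Because the LCS respects word order but allows gaps, this metric rewards responses that preserve the reference's sequence of content without requiring contiguous n-gram matches.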
Breakthrough Assessment
7/10
Novel application of Bayesian Optimization to search the architecture of adapter-based memory injection. Addresses the 'where to store memory' problem in LLMs effectively.