Model Factorization: Decomposing model weights into a shared component (base) and user-specific components (heads) to balance generalization and personalization
Reranker: A model that re-scores retrieved documents to prioritize the most useful ones for a specific query, rather than just the most semantically similar
Adapter: In this paper, a scoring model that evaluates candidate generations from the black-box LLM to select the one best aligned with user preference (effectively a reward model for rejection sampling)
LaMP: Language Model Personalization benchmark, a collection of datasets for evaluating how well LLMs adapt to user-specific writing styles and preferences
Black-box LLM: A large language model (such as GPT-3.5 or GPT-4) accessible only through API inference, so its internal weights cannot be viewed or modified
Rejection Sampling: A technique where multiple candidate outputs are generated, and a separate model selects the best one based on a scoring criterion
Hydra Head: A lightweight, user-specific neural network layer (e.g., a single linear layer) attached to a shared base model to capture individual idiosyncrasies
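
To make the relationship between Model Factorization, Hydra Head, Adapter, and Rejection Sampling more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the class `FactorizedScorer`, the function `select_best`, the two-dimensional feature vectors, and all weight values are hypothetical, and the candidate texts are hard-coded stand-ins for generations that would normally come from the black-box LLM.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def matvec(matrix: Sequence[Sequence[float]], vec: Sequence[float]) -> Vector:
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class FactorizedScorer:
    """Shared base plus one linear head per user (all weights are toy values)."""

    def __init__(self, base: Sequence[Sequence[float]], heads: Dict[str, Vector]):
        self.base = base    # shared component, used for every user
        self.heads = heads  # user id -> weights of a single linear layer

    def score(self, user_id: str, features: Sequence[float]) -> float:
        # Shared base: linear map followed by a ReLU.
        hidden = [max(0.0, h) for h in matvec(self.base, features)]
        # User-specific head: a single linear layer producing a scalar score.
        return sum(w * h for w, h in zip(self.heads[user_id], hidden))


def select_best(scorer: FactorizedScorer, user_id: str,
                candidates: Dict[str, Vector]) -> str:
    """Rejection-sampling-style selection: keep the highest-scoring candidate."""
    return max(candidates, key=lambda text: scorer.score(user_id, candidates[text]))


# Hypothetical setup: an identity base and two users with opposite tastes.
scorer = FactorizedScorer(
    base=[[1.0, 0.0], [0.0, 1.0]],
    heads={"alice": [1.0, -1.0], "bob": [-1.0, 1.0]},
)

# Stand-ins for LLM generations, each with made-up (formality, casualness) features.
candidates = {
    "formal reply": [0.9, 0.1],
    "casual reply": [0.2, 0.8],
}

for user in ("alice", "bob"):
    print(user, "->", select_best(scorer, user, candidates))
```

In this toy setup the same base and the same candidates yield different selections for the two users, because only the head differs. That is the intended effect of the factorization: the base is shared, and the per-user head carries the personalization.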