CRS: Conversational Recommender System—an interactive system that suggests items (like movies) through natural language dialogue.
RLPF: Reinforcement Learning from CRSs Performance Feedback—the paper's method for tuning the LLM using recommendation and generation metrics as rewards.
REINFORCE: A Monte Carlo policy gradient algorithm that estimates the gradient of expected reward from sampled outputs and updates model weights to maximize that reward.
Schema: A structured template defining the name, arguments, and output type of a sub-task, given to the LLM so it knows what the sub-task takes as input and what it returns.
Expert Models: Specialized, smaller models optimized for specific tasks (e.g., a collaborative filtering model for recommendation) that the LLM calls upon.
BLEU: Bilingual Evaluation Understudy—a metric for evaluating text generation quality by measuring n-gram overlap with reference text.
Distinct-n: A metric measuring the diversity of generated text by calculating the ratio of unique n-grams to total n-grams.
Demonstration-based Instruction: Providing examples (few-shot prompting) in the input to teach the LLM how to perform a task.
Inference Endpoint: The specific API or local function call used to execute an expert model.
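
As an illustration of the Distinct-n definition above, the ratio can be computed in a few lines. This is a minimal sketch, not the paper's evaluation code: the function name `distinct_n`, the whitespace tokenization, and the per-sentence scope are assumptions made here (some evaluations pool n-grams over all generated responses instead).

```python
def distinct_n(tokens, n):
    """Ratio of unique n-grams to total n-grams in one token sequence.

    Hypothetical helper for illustration; returns 0.0 when the
    sequence is too short to contain any n-gram.
    """
    total = len(tokens) - n + 1
    if total <= 0:
        return 0.0
    # Slide a window of size n over the tokens to collect every n-gram.
    ngrams = [tuple(tokens[i:i + n]) for i in range(total)]
    return len(set(ngrams)) / total


# Assumed tokenization: simple whitespace split.
tokens = "i like action movies and i like comedy movies".split()
d1 = distinct_n(tokens, 1)  # 6 unique unigrams out of 9
d2 = distinct_n(tokens, 2)  # 7 unique bigrams out of 8
```

Higher values mean less repetition: the repeated "i like" lowers both scores in this example.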