Implicit Personalization: Inferring user preferences from indirect cues (e.g., writing style, task requests) rather than explicit statements like 'I like X'
Agentic Memory: A system where an AI agent actively manages (writes, updates, deletes) a persistent memory representation of the user
GRPO: Group Relative Policy Optimization, a reinforcement learning algorithm that scores each sampled response relative to the other responses in its group; used here to fine-tune the model's reasoning capabilities
RFT: Reinforcement Fine-Tuning—using RL to adjust a pre-trained model for specific behaviors, here used for reasoning about personalization
PersonaHub: A synthetic dataset of diverse user personas used as a seed for generating the personas in this paper
Markovian assumption: The assumption that the current memory state summarizes all necessary past information, so future updates depend only on the current input and the previous memory
MCQ: Multiple-Choice Question—a format used here to rigorously evaluate whether the model picked the correct personalized option
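
To make the Agentic Memory and Markovian assumption entries concrete, the following is a minimal sketch, not the paper's implementation. It assumes memory is a flat key-value store and that the agent emits explicit write/update/delete operations for each new input; the names `MemoryOp` and `step` are hypothetical and chosen only for illustration. The point it shows is that the next memory state is computed from the previous memory and the current operations alone, with no access to earlier history.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class MemoryOp:
    """One memory edit proposed by the agent (hypothetical structure)."""
    action: str                  # "write", "update", or "delete"
    key: str
    value: Optional[str] = None  # required for write/update, unused for delete


def step(memory: Dict[str, str], ops: List[MemoryOp]) -> Dict[str, str]:
    """Markovian update: next memory depends only on (previous memory, current ops)."""
    nxt = dict(memory)  # copy so the previous state is left untouched
    for op in ops:
        if op.action in ("write", "update"):
            # In this sketch, write and update both set the value for the key.
            if op.value is None:
                raise ValueError(f"{op.action} requires a value")
            nxt[op.key] = op.value
        elif op.action == "delete":
            nxt.pop(op.key, None)  # deleting a missing key is a no-op
        else:
            raise ValueError(f"unknown action: {op.action}")
    return nxt


if __name__ == "__main__":
    m0: Dict[str, str] = {}
    m1 = step(m0, [MemoryOp("write", "tone", "concise")])
    m2 = step(m1, [MemoryOp("update", "tone", "formal"),
                   MemoryOp("write", "language", "Python")])
    m3 = step(m2, [MemoryOp("delete", "language")])
    print(m3)
```

Because `step` never looks at earlier inputs, any two interaction histories that lead to the same memory state behave identically from that point on, which is what the Markovian assumption asserts.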