Zero-Query Setting: A scenario where the system must predict user intent based on context before the user types any text.
GAUC: Group AUC, the area under the ROC curve computed separately for each user group and then averaged across groups (typically weighted by group size), measuring ranking quality within user sessions.
LTP: Last Token Pooling—using the hidden state of the final token of an LLM generation as the sequence representation.
CoT: Chain-of-Thought—prompting the LLM to generate intermediate reasoning steps (the latent query) before the final output.
Best-of-N: A sampling strategy where N candidate outputs are generated, and the best one is selected (here, by the ranking model) for training.
DLRM: Deep Learning Recommendation Model, a standard architecture family for industrial ranking built on embeddings and interaction layers.
CSAT: Customer Satisfaction—a metric measuring how satisfied users are with the service.
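ListNet: A listwise learning-to-rank loss that minimizes the cross-entropy between two top-one probability distributions over the entire list: one induced by the predicted scores and one by the ground-truth relevance labels.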
SFT: Supervised Fine-Tuning—updating a pre-trained model on a smaller, labeled dataset.
CTR: Click-Through Rate, the ratio of clicks on an item to the number of times that item was shown (impressions).
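
The GAUC entry above can be made concrete with a small sketch. It assumes binary click labels, an impression-count weight per group, and that groups containing only clicks or only non-clicks are skipped. These are common conventions, not details confirmed by the source, and the function names `auc` and `gauc` are illustrative.

```python
from collections import defaultdict


def auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a win. Returns None when a class is missing,
    because AUC is undefined in that case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gauc(group_ids, labels, scores):
    """Impression-weighted mean of per-group AUC (weighting is an assumption)."""
    groups = defaultdict(lambda: ([], []))
    for g, y, s in zip(group_ids, labels, scores):
        groups[g][0].append(y)
        groups[g][1].append(s)

    weighted_sum = 0.0
    total_weight = 0
    for ys, ss in groups.values():
        group_auc = auc(ys, ss)
        if group_auc is None:  # skip single-class groups
            continue
        weighted_sum += len(ys) * group_auc
        total_weight += len(ys)
    return weighted_sum / total_weight if total_weight else None


# Toy data: u1 is ranked perfectly, u2 is ranked backwards, u3 has no clicks.
group_ids = ["u1", "u1", "u1", "u2", "u2", "u3", "u3"]
labels = [1, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.2, 0.4, 0.3, 0.6, 0.5, 0.1]
print(gauc(group_ids, labels, scores))
```

Because each group is scored on its own, a model is not rewarded for merely separating active users from inactive ones; only the ordering of items within a group counts.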
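
The ListNet entry above can likewise be sketched in a few lines. This is a minimal illustration of the top-one variant of the loss for a single list, assuming graded relevance labels are turned into a target distribution with a softmax. The function names are illustrative and this is not the source's training code.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_top_one_loss(true_relevance, predicted_scores):
    """Cross-entropy between top-one distributions of labels and scores.

    Both inputs describe the same list of items, in the same order.
    Note: very large score gaps can underflow a probability to 0.0,
    which would make math.log fail; a production version would work
    with log-softmax instead.
    """
    target = softmax(true_relevance)
    predicted = softmax(predicted_scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


relevance = [2.0, 1.0, 0.0]
good = listnet_top_one_loss(relevance, [2.0, 1.0, 0.0])  # scores agree with labels
bad = listnet_top_one_loss(relevance, [0.0, 1.0, 2.0])   # scores reverse the order
print(good < bad)
```

The loss is lowest when the predicted distribution matches the target distribution, so it penalizes misordering anywhere in the list rather than scoring items independently.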