In-context learning: The inner loop of meta-learning where a model adapts to a task at inference time using only the context window, without weight updates
Few-shot: Providing the model with K examples (typically 10-100) in the context window before the target query
One-shot: Providing the model with exactly one example and a natural language task description
Zero-shot: Providing the model with only a natural language instruction and no examples
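The three prompting formats above differ only in how the context window is filled. A minimal sketch, using a hypothetical English-to-French translation task with made-up example pairs:

```python
# Hypothetical task and demonstration pairs, purely for illustration.
task_description = "Translate English to French."
examples = [
    ("sea otter", "loutre de mer"),
    ("cheese", "fromage"),
    ("peppermint", "menthe poivree"),
]
query = "plush giraffe =>"

def few_shot_prompt(k):
    """K demonstrations in the context window, then the target query.
    The model adapts from these examples without any weight updates."""
    demos = "\n".join(f"{en} => {fr}" for en, fr in examples[:k])
    return f"{task_description}\n{demos}\n{query}"

def one_shot_prompt():
    """Exactly one demonstration plus the task description."""
    return few_shot_prompt(1)

def zero_shot_prompt():
    """Natural language instruction only, no demonstrations."""
    return f"{task_description}\n{query}"
```

The prompts are plain strings; the only difference is the number of demonstrations placed before the query.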
SOTA: State-of-the-art—the best performance currently achieved by any known method
Cloze task: A task where the model must fill in a missing word or phrase in a sentence (e.g., fill-in-the-blank)
Perplexity: A measure of how well a probability model predicts a sample; lower values indicate better prediction
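Concretely, perplexity is the exponentiated average negative log-probability the model assigned to each observed token. A minimal sketch, assuming the per-token probabilities are already available:

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the mean negative log-probability assigned
    to each observed token. A perfect model (p=1 everywhere) scores 1."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# A model that assigns probability 0.25 to every token is as uncertain
# as a uniform choice among 4 options, so its perplexity is 4.
perplexity([0.25] * 8)  # -> 4.0
```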
BLEU: Bilingual Evaluation Understudy—a metric for evaluating machine-generated text, commonly used in translation
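A deliberately simplified BLEU-1 sketch may make the idea concrete: clipped unigram precision against a reference, scaled by a brevity penalty. (Real BLEU combines clipped 1- to 4-gram precisions and typically uses multiple references.)

```python
import math
from collections import Counter

def bleu1(candidate, reference):
    """Simplified BLEU: clipped unigram precision times a brevity
    penalty. Clipping stops a candidate from being rewarded for
    repeating a reference word more times than the reference uses it."""
    cand, ref = candidate.split(), reference.split()
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    clipped = sum(min(c, ref_counts[w]) for w, c in cand_counts.items())
    precision = clipped / len(cand)
    # Penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision
```

An exact match scores 1.0; "the the the" against "the cat" scores only 1/3 because the repeated word is clipped to its single occurrence in the reference.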
Beam search: A heuristic search algorithm that, at each step, keeps only a fixed number of the highest-scoring partial sequences (the beam width) rather than exploring every continuation
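A minimal beam search sketch over a hypothetical toy "model" whose next-token log-probabilities are fixed, just to exercise the pruning step:

```python
import math

def beam_search(start, expand, score, beam_width=2, steps=3):
    """At each step, expand every sequence in the beam, then keep only
    the beam_width highest-scoring partial sequences."""
    beam = [start]
    for _ in range(steps):
        candidates = [seq + [tok] for seq in beam for tok in expand(seq)]
        beam = sorted(candidates, key=score, reverse=True)[:beam_width]
    return beam[0]

# Hypothetical context-independent log-probabilities over a 3-token vocab.
log_probs = {"a": math.log(0.5), "b": math.log(0.3), "c": math.log(0.2)}

def expand(seq):
    return list(log_probs)

def score(seq):
    return sum(log_probs[t] for t in seq[1:])  # skip the start marker
```

With beam_width=2 the search never holds more than two partial sequences, trading completeness for tractability; width 1 would be greedy decoding.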
Winograd Schema: A challenge requiring the resolution of an ambiguous pronoun in a sentence, testing commonsense reasoning
Autoregressive: A model property where the output at the current time step depends only on outputs from previous time steps (predicting strictly left-to-right)
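The left-to-right property can be sketched with a hypothetical bigram lookup table standing in for the model: each new token is conditioned only on tokens already emitted.

```python
# Hypothetical bigram table: each token deterministically picks the
# next one, illustrating strictly left-to-right generation.
bigram_next = {"<s>": "the", "the": "cat", "cat": "sat", "sat": "<e>"}

def generate(start="<s>", max_len=10):
    """Autoregressive loop: the token appended at each step depends
    only on the sequence generated so far (here, just the last token)."""
    tokens = [start]
    while tokens[-1] != "<e>" and len(tokens) < max_len:
        tokens.append(bigram_next[tokens[-1]])
    return tokens[1:-1]  # drop the start and end markers
```

A real language model replaces the lookup table with a learned distribution over the next token given the full prefix, but the generation loop has the same shape.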