Agentic LLMs: LLMs that (1) reason, (2) act, and (3) interact, possessing a degree of autonomy to achieve goals
In-context learning: The ability of a model to learn from examples provided in the prompt at inference time without updating weights
VLA: Vision-Language-Action models, which take visual observations and language instructions as input and output robot actions
CoT: Chain-of-Thought—a prompting strategy that encourages the model to generate intermediate reasoning steps
SFT: Supervised Fine-Tuning—training a model on labeled examples to specialize it for a task
RLHF: Reinforcement Learning from Human Feedback—aligning models to human preferences using reward signals
Hallucination: When LLMs generate answers that are factually incorrect or ungrounded
LoRA: Low-Rank Adaptation, a parameter-efficient fine-tuning technique that trains small low-rank update matrices while the pretrained weights stay frozen
DPO: Direct Preference Optimization—an alignment method optimizing preferences without a separate reward model
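
To make the DPO entry concrete, the following is a minimal sketch of the per-example DPO loss for a single preference pair, written in plain Python. It assumes the summed log-probabilities of the chosen and rejected responses under the policy and under a frozen reference model have already been computed. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions, not part of any library.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss for one (chosen, rejected) response pair.

    All inputs are sequence log-probabilities (floats). `beta` scales how
    strongly the policy is pushed away from the reference model; the
    default here is an illustrative assumption.
    """
    # Implicit rewards: how much more likely each response is under the
    # policy than under the frozen reference model.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp

    logits = beta * (chosen_margin - rejected_margin)

    # loss = -log(sigmoid(logits)) = log(1 + exp(-logits)),
    # branched so that exp() never overflows.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

When the policy and reference agree on both responses, the loss equals log 2. It falls below that value as the policy shifts probability toward the chosen response relative to the reference. No separate reward model appears anywhere in the computation, which is the point of the definition above.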