RAG: Retrieval-Augmented Generation—AI systems that answer questions by first searching for relevant documents to use as context
SFT: Supervised Fine-Tuning—training a model on labeled input-output pairs to adapt it to a specific task
PEFT: Parameter-Efficient Fine-Tuning—techniques like LoRA that update only a small subset of parameters to reduce computational cost
LoRA: Low-Rank Adaptation—a PEFT technique that injects trainable rank-decomposition matrices into transformer layers while freezing pre-trained weights
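The LoRA update can be sketched in a few lines of numpy: the pre-trained weight W stays frozen, and only the low-rank factors B and A (together far smaller than W) are trained. Dimensions, initialization, and variable names below are illustrative, not any particular library's API.

```python
import numpy as np

d_out, d_in, r = 8, 8, 2              # rank r << min(d_out, d_in)
rng = np.random.default_rng(0)

W = rng.normal(size=(d_out, d_in))    # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01 # trainable low-rank factor
B = np.zeros((d_out, r))              # trainable; zero-init so W' = W at start

x = rng.normal(size=(d_in,))
y = W @ x + B @ (A @ x)               # adapted forward pass: (W + BA) x

# Because B starts at zero, the adapter is a no-op before training,
# and only r*(d_in + d_out) parameters train vs d_in*d_out for full SFT.
```

Zero-initializing B is the standard trick: the adapted model reproduces the pre-trained model exactly at step zero, so fine-tuning starts from a known-good point.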
RLHF: Reinforcement Learning from Human Feedback—a method to align models with human values using rewards derived from human preferences
PPO: Proximal Policy Optimization—an RL algorithm that stabilizes training by clipping each policy update so the new policy stays close to the previous one; commonly used as the optimizer in RLHF
DPO: Direct Preference Optimization—an alignment method that optimizes the model directly on preference data without a separate reward model
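The DPO objective for one preference pair can be written directly from the policy and reference log-probabilities; no reward model is trained. The sketch below uses illustrative variable names, with beta as the usual temperature hyperparameter.

```python
import numpy as np

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss for one pair: chosen response y_w preferred over rejected y_l.

    logp_* are the policy's sequence log-probs; ref_logp_* come from the
    frozen reference model.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -np.log(1.0 / (1.0 + np.exp(-margin)))  # -log sigmoid(margin)

# If the policy prefers the chosen answer more strongly than the reference
# does, the margin is positive and the loss drops below log(2):
loss = dpo_loss(logp_w=-5.0, logp_l=-9.0, ref_logp_w=-6.0, ref_logp_l=-8.0)
```

When policy and reference agree exactly, the margin is zero and the loss sits at log(2); training pushes the margin positive on the preference data.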
MoE: Mixture of Experts—an architecture using multiple specialized sub-networks (experts) where a learned gating mechanism routes each input to the most relevant expert(s), so only a fraction of the parameters is active per token
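Gated routing can be shown with a toy top-1 MoE layer in numpy: a softmax gate scores the experts and only the winner's weights are applied. Dimensions and random weights are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, d = 4, 3
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]  # expert weights
W_gate = rng.normal(size=(n_experts, d))                       # gating weights

def moe_forward(x):
    scores = W_gate @ x
    gate = np.exp(scores - scores.max())
    gate /= gate.sum()                  # softmax over experts
    k = int(np.argmax(gate))            # top-1 routing: only one expert runs
    return experts[k] @ x, k

y, chosen = moe_forward(rng.normal(size=d))
```

Production MoE layers typically route to the top-k experts (k = 1 or 2) and blend their outputs by the gate weights; top-1 keeps the sketch minimal.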
MoA: Mixture of Agents—a framework in which multiple LLM agents collaborate, with later agents refining and aggregating the responses produced by earlier ones
Word2Vec: A technique to represent words as dense vectors whose geometry captures semantic relationships—similar words lie close together (high cosine similarity), and analogies emerge as vector arithmetic (e.g., king − man + woman ≈ queen)
TF-IDF: Term Frequency-Inverse Document Frequency—a statistical measure used to evaluate how important a word is to a document in a collection
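TF-IDF can be computed by hand on a toy corpus. Note that several IDF variants exist; the plain log(N/df) form below is one standard choice, and the three-document corpus is illustrative.

```python
import math
from collections import Counter

docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "dogs and cats are pets".split(),
]

def tf_idf(term, doc, corpus):
    tf = Counter(doc)[term] / len(doc)              # term frequency in doc
    df = sum(1 for d in corpus if term in d)        # document frequency
    idf = math.log(len(corpus) / df)                # one common IDF variant
    return tf * idf

# "cat" appears in 2 of 3 documents while "mat" appears in only 1,
# so within the first document "mat" gets the higher TF-IDF score.
```

The intuition: a term is important to a document when it is frequent there (high TF) but rare across the collection (high IDF).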
BM25: Best Matching 25—a ranking function used by search engines to estimate the relevance of documents to a search query; it builds on TF-IDF by adding term-frequency saturation and document-length normalization
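The standard BM25 formula fits in a short function, shown here with its common default parameters (k1 in the usual 1.2–2.0 range, b = 0.75) and a toy corpus.

```python
import math

def bm25_score(query_terms, doc, corpus, k1=1.5, b=0.75):
    """BM25 score of doc for a query, using the standard smoothed IDF."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N         # average document length
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)
        if df == 0:
            continue
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)
        tf = doc.count(term)
        # k1 saturates term frequency; b normalizes by document length
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
    return score

corpus = [
    "the quick brown fox".split(),
    "the lazy dog sleeps".split(),
    "the fox jumps over the lazy dog".split(),
]
# The third document matches both query terms, so it scores highest.
scores = [bm25_score(["fox", "dog"], d, corpus) for d in corpus]
```

Unlike raw TF-IDF, repeating a term many times yields diminishing returns (the k1 saturation), and long documents are not rewarded merely for containing more words (the b normalization).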