PEFT: Parameter-Efficient Fine-Tuning—methods to adapt large models by updating only a small set of parameters
LoRA: Low-Rank Adaptation—a PEFT method that freezes pre-trained weights and injects trainable rank decomposition matrices into each layer
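The frozen-weights-plus-low-rank-update idea behind LoRA can be sketched in a few lines. This is a minimal NumPy illustration, not a real implementation: the sizes, the scaling factor alpha/r, and the zero-initialization of B follow the common convention, but all names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r, alpha = 8, 8, 2, 4       # toy sizes; rank r << d_in, d_out

W = rng.normal(size=(d_out, d_in))       # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, r))                 # trainable up-projection, zero-initialized

def lora_forward(x):
    # frozen path plus low-rank update, scaled by alpha / r
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B initialized to zero, the adapted layer starts out identical to the frozen one
assert np.allclose(lora_forward(x), W @ x)
```

Training updates only A and B (2 * r * d parameters per layer instead of d * d), and the learned update B @ A can be merged into W after training, so inference costs nothing extra.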
Adapter Tuning: Inserting small trainable neural network modules (adapters) between layers of a pre-trained model while freezing the original weights
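An adapter module is typically a small bottleneck MLP with a residual connection, inserted between frozen layers. The sketch below assumes the common zero-initialized up-projection so the adapter starts as an identity function; dimensions and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
d_model, d_bottleneck = 8, 2             # small bottleneck keeps the added parameter count low

W_down = rng.normal(size=(d_bottleneck, d_model)) * 0.1  # trainable down-projection
W_up = np.zeros((d_model, d_bottleneck))                 # trainable up-projection, zero-initialized

def adapter(h):
    # down-project, nonlinearity, up-project, then residual add;
    # the surrounding pre-trained layers stay frozen
    z = np.maximum(0.0, W_down @ h)      # ReLU
    return h + W_up @ z

h = rng.normal(size=d_model)
# zero-initialized up-projection makes the adapter an identity map before training
assert np.allclose(adapter(h), h)
```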
Prefix Tuning: Prepending a sequence of continuous, trainable vectors (prefixes) to the hidden states at every layer (in practice, to the attention keys and values) to steer the model's generation
Prompt Tuning: Learning soft prompt embeddings that are prepended to the input token embeddings—like discrete prompts, but optimized via gradient descent while the model stays frozen
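Mechanically, prompt tuning just prepends a trainable matrix to the frozen embedding lookup before the sequence enters the model. A minimal sketch with made-up sizes:

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, prompt_len, seq_len = 4, 3, 5

soft_prompt = rng.normal(size=(prompt_len, d_model))  # trainable soft prompt
token_embeds = rng.normal(size=(seq_len, d_model))    # frozen embedding-lookup output

# Prepend the learned soft prompt to the token embeddings;
# gradients flow only into soft_prompt during training
model_input = np.concatenate([soft_prompt, token_embeds], axis=0)
assert model_input.shape == (prompt_len + seq_len, d_model)
```

Unlike discrete prompts, the soft prompt vectors need not correspond to any real vocabulary tokens.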
BitFit: A PEFT method that fine-tunes only the bias terms of the model
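BitFit amounts to selecting parameters by name: everything ending in "bias" is trainable, everything else is frozen. The toy parameter dictionary below is hypothetical, standing in for a model's named parameters, to show how small the trainable fraction is.

```python
# Hypothetical named parameters standing in for a real model's state dict
params = {
    "layer0.weight": [[0.1, 0.2], [0.3, 0.4]],
    "layer0.bias":   [0.0, 0.0],
    "layer1.weight": [[0.5], [0.6]],
    "layer1.bias":   [0.0],
}

# BitFit: only bias terms are trainable; all other weights stay frozen
trainable = [name for name in params if name.endswith("bias")]

def flat_size(t):
    # count the scalar entries in a (possibly nested) list
    return sum(flat_size(x) for x in t) if isinstance(t, list) else 1

n_total = sum(flat_size(v) for v in params.values())
n_train = sum(flat_size(params[k]) for k in trainable)
print(f"trainable fraction: {n_train}/{n_total}")
```

In a framework like PyTorch the same selection is usually done by toggling each parameter's gradient flag based on its name.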
Delta Tuning: A unified framework that casts fine-tuning as learning a change in parameters ($\Delta\theta$); PEFT methods differ in how they parameterize this delta efficiently
RLHF: Reinforcement Learning from Human Feedback—fine-tuning models using rewards derived from human preferences
CoT: Chain-of-Thought—a prompting strategy that encourages the model to generate intermediate reasoning steps
MoE: Mixture of Experts—a model architecture in which a router activates only a few 'expert' sub-networks per input, increasing capacity without a proportional increase in compute
SFT: Supervised Fine-Tuning—training a model on labeled examples (instruction-output pairs)
Diffusion Models: Generative models that learn to reverse a noise-adding process to generate data (e.g., images) from noise
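The noise-adding (forward) process has a closed form that is easy to sketch: with a noise schedule $\beta_t$ and $\bar\alpha_t = \prod_{s \le t}(1-\beta_s)$, a noised sample is $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$. The linear schedule and step count below are common defaults, assumed here for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 1000
betas = np.linspace(1e-4, 0.02, T)        # linear noise schedule (an assumed default)
alpha_bar = np.cumprod(1.0 - betas)       # cumulative signal-retention factor

def add_noise(x0, t):
    # closed-form forward process: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps
    eps = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

x0 = rng.normal(size=16)                  # stand-in for a data sample, e.g. image pixels
x_early, x_late = add_noise(x0, 10), add_noise(x0, T - 1)
# by the final step, the signal is almost entirely replaced by noise
assert alpha_bar[-1] < 1e-3
```

Training then teaches a network to reverse this process, predicting the noise (or the clean sample) at each step so that generation can start from pure noise.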