Delta-tuning: A unified term for parameter-efficient fine-tuning methods that optimize only a small portion of parameters (the 'delta') while freezing the rest
LoRA: Low-Rank Adaptation—decomposes weight updates into low-rank matrices to reduce trainable parameters
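A minimal numpy sketch of the LoRA idea (hidden size, rank, and scaling values are illustrative; real implementations operate on the attention projection matrices):

```python
import numpy as np

rng = np.random.default_rng(0)

d, r, alpha = 512, 8, 16            # hidden size, LoRA rank, scaling factor (illustrative)
W = rng.standard_normal((d, d))     # frozen pretrained weight

# Trainable low-rank factors; B starts at zero so the initial delta is zero
A = rng.standard_normal((r, d)) * 0.01
B = np.zeros((d, r))

def lora_forward(x):
    """Base output plus the low-rank update: y = x W^T + (alpha/r) * x A^T B^T."""
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.standard_normal((4, d))
y = lora_forward(x)

full_params = W.size                 # 512 * 512 frozen parameters
delta_params = A.size + B.size       # 2 * 8 * 512 trainable parameters
print(y.shape, delta_params / full_params)
```

Because `B` is initialized to zero, the adapted model starts out exactly equal to the frozen one; only the two small factors are updated during training.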
Prefix-tuning: Prepending learnable continuous vectors (prefixes) to transformer layers to steer generation
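A sketch of how a learned prefix enters one attention head: trainable key/value vectors are prepended so every query can attend to them (dimensions and the single-head setup are illustrative; real prefix-tuning learns a prefix per layer):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, p = 64, 10, 5                  # head dim, sequence length, prefix length (illustrative)

# Frozen activations for one attention head
Q = rng.standard_normal((n, d))
K = rng.standard_normal((n, d))
V = rng.standard_normal((n, d))

# Trainable prefix key/value vectors (the only parameters being optimized)
P_k = rng.standard_normal((p, d)) * 0.01
P_v = rng.standard_normal((p, d)) * 0.01

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Prepend the prefix to keys and values; queries now attend over p + n positions
K_ext = np.concatenate([P_k, K], axis=0)   # (p + n, d)
V_ext = np.concatenate([P_v, V], axis=0)
attn = softmax(Q @ K_ext.T / np.sqrt(d))   # (n, p + n) attention weights
out = attn @ V_ext                         # (n, d) steered head output
print(out.shape)
```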
Adapter: Small bottleneck modules (down-projection, nonlinearity, up-projection, residual connection) inserted into transformer layers and trained while the base model stays frozen
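The bottleneck structure can be sketched in a few lines of numpy (sizes are illustrative; zero-initializing the up-projection is a common choice so the adapter starts as an identity map):

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 768, 64                      # hidden size, bottleneck size (illustrative)

# Trainable adapter weights; the surrounding transformer layer stays frozen
W_down = rng.standard_normal((m, d)) * 0.01
W_up = np.zeros((d, m))             # zero init -> adapter initially passes h through unchanged

def adapter(h):
    """Bottleneck: down-project, ReLU, up-project, then add the residual."""
    z = np.maximum(h @ W_down.T, 0.0)    # (batch, m) bottleneck activations
    return h + z @ W_up.T                # residual keeps the frozen layer's signal

h = rng.standard_normal((4, d))
out = adapter(h)
print(out.shape)
```

With roughly `2 * d * m` parameters per adapter, each module is a small fraction of the layer it augments.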
BitFit: A specification-based method that fine-tunes only the bias terms of the model
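BitFit amounts to a parameter-selection rule. A toy sketch (the parameter names and sizes are hypothetical) shows how few parameters survive the filter:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 256

# Toy parameter dict standing in for two transformer sublayers (names illustrative)
params = {
    "attn.weight": rng.standard_normal((d, d)),
    "attn.bias": np.zeros(d),
    "ffn.weight": rng.standard_normal((4 * d, d)),
    "ffn.bias": np.zeros(4 * d),
}

# BitFit: mark only the bias terms as trainable, freeze everything else
trainable = {name: p for name, p in params.items() if name.endswith(".bias")}

total = sum(p.size for p in params.values())
tuned = sum(p.size for p in trainable.values())
print(f"tuning {tuned} of {total} parameters ({100 * tuned / total:.2f}%)")
```

Even in this toy setting the biases are well under 1% of the parameters, which is the method's appeal.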
Intrinsic dimensionality: The minimum number of parameters needed to represent a solution to a task well; pretrained large models turn out to have low intrinsic dimension, which is why optimizing a small delta can match full fine-tuning
Optimal control: A mathematical framework for finding a control policy that moves a system state to a desired target; here, the 'control' is the delta update steering the PLM
Prompt-tuning: Optimizing continuous input embeddings (soft prompts) to condition the frozen model for a task
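In contrast to prefix-tuning, the learned vectors here live only at the input layer. A sketch of the mechanics (vocabulary size, embedding dimension, and prompt length are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, d, p, n = 1000, 128, 20, 6   # vocab size, embed dim, prompt length, input length

E = rng.standard_normal((vocab, d))               # frozen token embedding table
soft_prompt = rng.standard_normal((p, d)) * 0.5   # the only trainable parameters

token_ids = rng.integers(0, vocab, size=n)
x = E[token_ids]                                  # (n, d) frozen input embeddings

# Prepend the soft prompt; the frozen model processes a (p + n)-length input,
# and gradients flow only into soft_prompt
inputs = np.concatenate([soft_prompt, x], axis=0)
print(inputs.shape)
```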
T5: Text-to-Text Transfer Transformer—an encoder-decoder model that casts every NLP problem as a text-to-text task, generating the answer as a string