PEFT: Parameter-Efficient Fine-Tuning—methods to adapt models by training only a small subset of parameters
FFT: Full Fine-Tuning—training all parameters of the model
LoRA: Low-Rank Adaptation—a PEFT method that approximates weight updates as the product of two small low-rank factor matrices
RPCA: Robust Principal Component Analysis—a statistical procedure to decompose a matrix into a low-rank component and a sparse component
SVD: Singular Value Decomposition—a factorization of a matrix that reveals its intrinsic rank and principal components
SDDMM: Sampled Dense-Dense Matrix Multiplication—a kernel that computes a dense-dense matrix product only at the non-zero positions of a given sparsity pattern
Intrinsic Rank: The minimum dimension required to accurately represent the information in a matrix (e.g., a weight update)
QLoRA: Quantized LoRA—a version of LoRA applied to quantized (compressed) base weights
L0 norm: A sparsity measure (strictly a pseudo-norm, not a true norm) counting the number of non-zero elements in a vector or matrix
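Several of the terms above can be illustrated with a short NumPy sketch. All variable names and dimensions here are illustrative, not taken from the source; the SDDMM line is a dense reference computation, not an efficient sparse kernel.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 64, 32, 4  # illustrative layer dimensions and LoRA rank

# LoRA: the weight update dW is the product B @ A of two small
# factor matrices, so rank(dW) <= r << min(d, k).
A = rng.normal(size=(r, k))
B = np.zeros((d, r))            # B starts at zero so dW starts at zero
dW = B @ A

# SVD: singular values reveal the intrinsic rank of a matrix.
M = rng.normal(size=(d, r)) @ rng.normal(size=(r, k))   # rank r by construction
s = np.linalg.svd(M, compute_uv=False)
intrinsic_rank = int(np.sum(s > 1e-10 * s[0]))          # counts significant singular values

# L0 "norm": the count of non-zero entries of a sparse mask S.
S = np.where(rng.random((d, k)) < 0.05, 1.0, 0.0)
l0 = int(np.count_nonzero(S))

# SDDMM: evaluate (X @ Y) only where S is non-zero (dense emulation).
X = rng.normal(size=(d, 16))
Y = rng.normal(size=(16, k))
sddmm_out = (X @ Y) * S
```

In RPCA terms, a matrix would analogously be split into a low-rank part (like `M`) plus a sparse part (like `S`); a full RPCA solver is beyond the scope of this sketch.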