Machine Unlearning: The process of removing specific knowledge or data points from a trained machine learning model
Model Collapse: A failure mode where a generative model loses its diversity or linguistic structure, often outputting repetitive or garbage text
Negative Log-Likelihood (NLL): A loss function commonly used in training language models; minimizing it improves prediction, while maximizing it (gradient ascent) induces forgetting
KL Divergence: Kullback-Leibler divergence, a measure of how one probability distribution differs from a reference probability distribution
Extraction Likelihood (EL): A metric measuring the success rate of extracting specific training sequences from a model via generation
Memorization Accuracy (MA): A metric quantifying how accurately a model can complete a given prefix from the training data
Analogous Set: A constructed dataset containing information similar in category to the forget set but with different key concepts, used to preserve model capabilities
BLEU: Bilingual Evaluation Understudy, a metric that scores machine-translated text by its n-gram overlap with reference translations
BERTScore: A metric for text generation evaluation that computes similarity using contextual embeddings from BERT rather than exact n-gram matching
Sentence Transformer: A modification of the BERT network that uses siamese networks to derive semantically meaningful sentence embeddings
GLM: Generative Language Model, an AI model designed to generate text (for example, GPT or Llama)
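
As a rough illustration of three of the quantities defined above, the following Python sketch computes NLL, KL divergence, and a simplified token-level reading of Memorization Accuracy on toy inputs. The function names and numbers are hypothetical and chosen only for illustration; the exact metric definitions used in any given unlearning paper may differ.

```python
import math

def nll(token_probs):
    """Mean negative log-likelihood, given the probability the model
    assigned to each correct next token (toy input, not real model output)."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same support.
    Terms with p_i == 0 contribute nothing; q_i must be > 0 where p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def memorization_accuracy(predicted, target):
    """Assumed simplification: fraction of positions where the model's
    predicted token equals the training token."""
    matches = sum(a == b for a, b in zip(predicted, target))
    return matches / len(target)

# A model that assigns high probability to the forget sequence has low NLL;
# gradient ascent on NLL pushes those probabilities down, raising the loss.
confident = [0.9, 0.8, 0.95]
forgotten = [0.2, 0.1, 0.3]
print(nll(confident) < nll(forgotten))  # True

# KL divergence is zero when the two distributions are identical.
reference = [0.5, 0.3, 0.2]
print(kl_divergence(reference, reference))  # 0.0
```

In an unlearning setup, a KL term of this kind is typically computed between the unlearned model's output distribution and that of the original model on data that should be retained, so that a small value indicates preserved behavior.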