DLM: Diffusion Language Model—a generative model that produces text by iteratively denoising a masked (corrupted) sequence in parallel, rather than predicting the next token sequentially.
ARM: Autoregressive Model—standard language models (like GPT) that generate text one token at a time from left to right.
(n, p)-discoverable extraction: A metric defining a sequence as memorized if it can be generated exactly within 'n' attempts with probability at least 'p'.
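As a rough illustration (not the paper's implementation): if each independent generation attempt reproduces the target sequence with empirical probability q, then the chance of at least one exact match in n attempts is 1 − (1 − q)^n, and the sequence counts as (n, p)-discoverable when that quantity reaches p. The function name and the use of a per-attempt probability q are assumptions for this sketch:

```python
def is_np_discoverable(q, n, p):
    """Sketch of the (n, p)-discoverable criterion: given an
    empirical per-attempt exact-match probability q, require that
    the chance of at least one exact generation within n
    independent attempts is at least p."""
    return 1.0 - (1.0 - q) ** n >= p

# With q = 0.01 per attempt, 100 attempts succeed with ~63% probability:
print(is_np_discoverable(0.01, n=100, p=0.5))  # True
print(is_np_discoverable(0.01, n=100, p=0.9))  # False
```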
sampling resolution: The number of steps used in the diffusion reverse process to convert noise into text; fewer steps are faster but coarser, while more steps are slower but finer-grained.
PII: Personally Identifiable Information—sensitive data like emails or phone numbers.
mask token: A special token (e.g., [MASK]) used to replace original tokens during the forward diffusion process, which the model learns to predict.
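A minimal sketch of the forward corruption this entry describes, assuming an absorbing-state (masking) process where each token is independently replaced by the mask token with probability equal to the corruption level t; the function name and string tokens are illustrative, not from the source:

```python
import random

MASK = "[MASK]"

def forward_mask(tokens, t):
    """Absorbing-state forward process sketch: each token is
    independently replaced by MASK with probability t.
    t=0.0 leaves the sequence intact; t=1.0 masks everything."""
    return [MASK if random.random() < t else tok for tok in tokens]

random.seed(0)
print(forward_mask(["the", "cat", "sat"], t=0.5))
```

The model is then trained to predict the original tokens at the masked positions, which is what the reverse (denoising) process inverts step by step.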
Hamming distance: A metric measuring the number of positions at which the corresponding symbols in two sequences are different.
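The definition above translates directly into code; this small helper (name assumed for illustration) counts differing positions between two equal-length sequences:

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

print(hamming_distance("karolin", "kathrin"))  # 3
```

It applies equally to token-ID lists, which is how it would typically be used to compare a generated sequence against a training sequence.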