MDM: Masked Diffusion Model—a generative model that iteratively unmasks tokens to generate data.
ELBO: Evidence Lower Bound—a variational lower bound on the log-likelihood of data, used as an optimization objective.
P2: Path Planning—the proposed inference strategy that separates token selection (planning) from token prediction (denoising).
remasking: The process of taking a previously unmasked (generated) token and turning it back into a mask token to allow the model to regenerate it.
pLDDT: Predicted Local Distance Difference Test—a per-residue confidence score (0 to 100) for a predicted protein structure, commonly used as a proxy for structural quality.
Gillespie sampler: An algorithm for simulating continuous-time stochastic processes, used here to determine exact jump times for denoising events.
DPLM: Diffusion Protein Language Model—a specific baseline MDM for protein generation.
absorbing state diffusion: A diffusion process where data is corrupted by transitioning to a specific 'absorbing' state (like a mask token) and never leaving it during the forward process.
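
To make the terms "absorbing state diffusion", "remasking", and the planning/denoising split more concrete, here is a minimal toy sketch in Python. It is not the implementation of any model named above. The names `MASK`, `forward_mask`, `unmask`, `remask`, and `toy_denoiser` are hypothetical. Masking each token with probability exactly `t` is an assumed schedule, and the denoiser is a stand-in for a trained network.

```python
import random

MASK = "<mask>"  # hypothetical mask symbol; real models use a vocabulary id


def forward_mask(tokens, t, rng):
    """Absorbing-state corruption: each token becomes MASK with probability t.

    Tokens that are already MASK stay MASK, so the absorbing state is never
    left during the forward process. (Mask probability == t is an assumption.)
    """
    return [MASK if (tok == MASK or rng.random() < t) else tok for tok in tokens]


def unmask(tokens, positions, denoiser):
    """Denoising step: fill the selected masked positions with predictions.

    Choosing `positions` is the planning part; `denoiser` does the prediction.
    """
    preds = denoiser(tokens)
    chosen = set(positions)
    return [preds[i] if (i in chosen and tok == MASK) else tok
            for i, tok in enumerate(tokens)]


def remask(tokens, positions):
    """Remasking: turn already generated tokens back into MASK."""
    chosen = set(positions)
    return [MASK if i in chosen else tok for i, tok in enumerate(tokens)]


def toy_denoiser(tokens):
    """Stand-in for a trained network: predicts 'A' at every position."""
    return ["A"] * len(tokens)


rng = random.Random(0)
clean = list("MKTAYIAK")
noisy = forward_mask(clean, t=0.5, rng=rng)  # partially masked copy of `clean`

# Reverse direction: start fully masked, unmask two positions, then remask one.
step1 = unmask([MASK] * 4, positions=[0, 2], denoiser=toy_denoiser)
step2 = remask(step1, positions=[0])
```

In this sketch the caller decides which positions to unmask or remask. That choice is the role the glossary assigns to planning, kept separate from the token prediction done by the denoiser.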