TDM: Trajectory Distribution Matching—a few-step generative method that aligns student and teacher trajectories at the distributional level using deterministic sampling.
NFE: Number of Function Evaluations—the number of times the neural network is called to generate a single image; fewer is faster.
RLHF: Reinforcement Learning from Human Feedback—fine-tuning models to maximize a reward model derived from human preferences.
Surrogate Reward: A learned, differentiable reward function that approximates the true non-differentiable reward, so that reward gradients can be backpropagated to the generator.
EMA: Exponential Moving Average—a technique where model weights are updated as a moving average of past weights to stabilize training.
GenEval: A benchmark that evaluates how faithfully text-to-image models follow complex prompts, including the spatial relationships between objects.
KL divergence: Kullback-Leibler divergence—a measure of how one probability distribution differs from a second, reference probability distribution.
ODE: Ordinary Differential Equation—in this context, refers to deterministic sampling paths in diffusion models (Probability Flow ODE).
SDE: Stochastic Differential Equation—sampling paths involving random noise injection.
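
Two of the terms above, EMA and KL divergence, reduce to short formulas, and a minimal Python sketch may make them concrete. The function names `ema_update` and `kl_divergence`, the plain-list representation of weights and distributions, and the default decay value are assumptions chosen for illustration; they are not taken from any particular training codebase.

```python
import math


def ema_update(ema_weights, weights, decay=0.999):
    """One EMA step: blend the running average with the current weights.

    Hypothetical helper; weights are plain lists of floats for simplicity.
    """
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists.

    Terms with p_i == 0 contribute nothing; q_i must be positive wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

A decay close to 1 makes the averaged weights change slowly, which is the stabilizing effect the EMA entry refers to. The KL divergence is zero when the two distributions are identical and is not symmetric in its arguments.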