Drifting Models: Generative models that train a one-step generator by regressing onto a 'mean-shift' transport direction calculated from data batches
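A minimal sketch of the idea, under simplifying assumptions: the "generator" is a single scalar parameter, the drift is a Gaussian-kernel mean-shift direction computed from a data batch, and the regression target is held fixed (stop-gradient) during each update. The kernel bandwidth, learning rate, and toy data are all illustrative choices, not values from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_shift_direction(x, batch, h=0.5):
    """Kernel-weighted direction from x towards dense regions of the batch."""
    w = np.exp(-((batch - x) ** 2) / (2 * h ** 2))   # Gaussian kernel weights
    return (w @ batch) / w.sum() - x                 # weighted mean minus x

theta = -4.0                                         # toy one-step 'generator' parameter
data = rng.normal(loc=2.0, scale=0.3, size=256)      # a data batch with mode near 2.0

lr = 0.1
for _ in range(200):
    x_g = theta                                          # generated sample
    target = x_g + mean_shift_direction(x_g, data)       # target treated as a constant
    theta -= lr * 2 * (theta - target)                   # gradient step on (theta - target)^2
print(theta)  # drifts towards the data mode near 2.0
```

The generator chases a moving target: each target is one mean-shift step ahead of the current sample, so the fixed point is a mode of the kernel-smoothed data density.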
Mean-Shift: An iterative algorithm that moves points towards a mode of a kernel density estimate by following the gradient of the kernel-smoothed density
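The iteration itself is short; a sketch with a Gaussian kernel on toy bimodal data (bandwidth and cluster locations are illustrative):

```python
import numpy as np

def mean_shift(points, x0, h=1.0, steps=50):
    """Repeatedly move x to the kernel-weighted mean of the points."""
    x = x0
    for _ in range(steps):
        w = np.exp(-((points - x) ** 2) / (2 * h ** 2))  # Gaussian kernel weights
        x = (w @ points) / w.sum()                       # shift to the weighted mean
    return x

rng = np.random.default_rng(1)
points = np.concatenate([rng.normal(-3, 0.5, 200), rng.normal(3, 0.5, 200)])
m_hi = mean_shift(points, x0=2.0, h=0.8)   # converges near the mode at +3
m_lo = mean_shift(points, x0=-1.5, h=0.8)  # converges near the mode at -3
print(m_hi, m_lo)
```

Which mode a point reaches depends on its starting basin; the update is exactly an ascent step on the kernel-smoothed density.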
Score Function: The gradient of the log-density of a distribution (∇ log p(x)), pointing towards high-density regions
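For a Gaussian the score is available in closed form, which makes the definition easy to check numerically (values here are arbitrary):

```python
import numpy as np

# Score of N(mu, sigma^2): grad_x log p(x) = -(x - mu) / sigma^2
mu, sigma = 1.0, 2.0

def score(x):
    return -(x - mu) / sigma ** 2

def log_p(x):
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

x, eps = 3.0, 1e-5
numeric = (log_p(x + eps) - log_p(x - eps)) / (2 * eps)  # numerical derivative of log p
print(score(x), numeric)  # agree; negative sign points back towards the mean
```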
Tweedie's Formula: A statistical identity linking the posterior mean of a signal observed under Gaussian noise to the score of the noisy marginal distribution
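With a Gaussian prior everything is closed-form, so the identity E[x | y] = y + σ² ∇ log p(y) can be verified exactly (the prior, noise level, and observation below are arbitrary):

```python
# Tweedie: E[x | y] = y + sigma^2 * score_Y(y), where y = x + sigma * noise
# Gaussian prior x ~ N(mu0, tau^2)  =>  marginal y ~ N(mu0, tau^2 + sigma^2)
mu0, tau, sigma = 0.0, 1.0, 0.5
y = 1.3                                         # an observed noisy sample

score_y = -(y - mu0) / (tau ** 2 + sigma ** 2)  # score of the noisy marginal
tweedie = y + sigma ** 2 * score_y              # denoised estimate via Tweedie
posterior_mean = mu0 + (tau ** 2 / (tau ** 2 + sigma ** 2)) * (y - mu0)  # exact Bayes
print(tweedie, posterior_mean)                  # the two agree
```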
Forward Fisher Divergence: A discrepancy measure between distributions based on score mismatch, averaged under the *data* distribution (promotes mode coverage)
Reverse Fisher Divergence: A discrepancy measure based on score mismatch, averaged under the *model* distribution (promotes mode seeking / suppressing spurious mass)
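The two divergences share the same score-mismatch integrand and differ only in the averaging distribution. A Monte Carlo sketch with two Gaussians of different scale (parameters are illustrative) shows how the choice of average changes the value:

```python
import numpy as np

rng = np.random.default_rng(2)

def gauss_score(x, mu, sigma):
    return -(x - mu) / sigma ** 2

mu, sp, sq = 0.0, 1.0, 2.0                 # data p = N(0, 1), model q = N(0, 4)
xs_p = rng.normal(mu, sp, 100_000)         # samples from the data distribution
xs_q = rng.normal(mu, sq, 100_000)         # samples from the model distribution

mismatch = lambda x: (gauss_score(x, mu, sp) - gauss_score(x, mu, sq)) ** 2
forward = mismatch(xs_p).mean()            # averaged under p_data (mode coverage)
reverse = mismatch(xs_q).mean()            # averaged under model q (mode seeking)
print(forward, reverse)                    # differ because q spreads mass further out
```

Here the reverse divergence is larger because the model places mass in the tails where the scores disagree most, i.e. it penalizes spurious model mass.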
DMD: Distribution Matching Distillation, a method to distill diffusion models into one-step generators using a pre-trained teacher's score
Kernel Smoothing: Approximating a distribution by convolving it with a kernel function (e.g., Gaussian or Laplace)
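A Gaussian-kernel sketch: the smoothed density is an average of kernel bumps centred on the samples, which for Gaussian data equals the original density convolved with the kernel (bandwidth and data are illustrative):

```python
import numpy as np

def kde(x, samples, h=0.3):
    """Kernel-smoothed density: average of Gaussian bumps centred on the samples."""
    k = np.exp(-((x - samples) ** 2) / (2 * h ** 2)) / (h * np.sqrt(2 * np.pi))
    return k.mean()

rng = np.random.default_rng(3)
samples = rng.normal(0.0, 1.0, 50_000)
# N(0,1) convolved with a Gaussian kernel of width h gives N(0, 1 + h^2)
density0 = kde(0.0, samples)
print(density0)  # close to 1 / sqrt(2 * pi * 1.09)
```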
Stop-Gradient: An optimization technique where a target value is treated as a constant during backpropagation, preventing gradients from flowing through the target generation process
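A framework-free sketch of the effect, using finite differences in place of autodiff: when the target is detached, the chain-rule path through it contributes nothing to the gradient (the loss and numbers are illustrative):

```python
def loss(theta, target):
    return (theta - target) ** 2

theta = 3.0
target = 0.5 * theta   # computed from theta, then 'detached' (stop-gradient)

eps = 1e-6
# With stop-gradient: target is held fixed while theta varies
grad_sg = (loss(theta + eps, target) - loss(theta - eps, target)) / (2 * eps)

# Without stop-gradient: target = 0.5 * theta varies along with theta
grad_full = (loss(theta + eps, 0.5 * (theta + eps))
             - loss(theta - eps, 0.5 * (theta - eps))) / (2 * eps)
print(grad_sg, grad_full)  # 2*(theta - target) = 3.0 vs d/dtheta (0.5*theta)^2 = 1.5
```

In autodiff frameworks this corresponds to wrapping the target in the library's detach/stop-gradient operation before computing the loss.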