VAE: Variational Auto-Encoder—a generative model that learns a probabilistic encoder (data to latent) and decoder (latent to data) by maximizing a lower bound on data likelihood.
ELBO: Evidence Lower Bound, a tractable lower bound on the intractable log-likelihood (log-evidence) of the data. The gap between the two is the KL divergence between the approximate and true posteriors over z, so maximizing the ELBO pushes the model distribution closer to the data distribution.
Latent Variable: A hidden variable (z) that is not directly observed but captures the underlying structure or 'essence' of the data (x).
DDPM: Denoising Diffusion Probabilistic Model—a generative model that destroys data by adding noise incrementally (forward process) and learns to reverse this process to generate data from noise.
Reparameterization Trick: A technique to allow backpropagation through random sampling nodes by expressing a random variable z as a deterministic function of parameters and an independent noise source (z = mu + sigma * epsilon).
KL Divergence: Kullback-Leibler Divergence, a non-negative, asymmetric measure of how one probability distribution differs from a second, reference probability distribution; it is zero only when the two distributions coincide.
Langevin Dynamics: A stochastic process originally describing the motion of particles in a fluid, used here as an iterative sampling method: each step moves the sample along the score function (gradient of the log-density) of the target distribution and adds Gaussian noise.
Score Matching: A technique to learn the gradient of the log-probability density (the 'score') of data, allowing sampling without knowing the normalizing constant of the distribution.
Fokker-Planck Equation: A partial differential equation that describes how the probability density function of a particle system evolves over time under diffusion and drift forces.
SDE: Stochastic Differential Equation—a differential equation where one or more terms are stochastic processes, essentially an ODE with added noise.
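
Several of the terms above (Reparameterization Trick, KL Divergence, Langevin Dynamics) can be made concrete with a small scalar sketch. The following is only an illustration under stated assumptions: it uses the Python standard library, works in one dimension, and every function name (`reparameterize`, `kl_gaussian_to_standard_normal`, `langevin_sample`, `gaussian_score`) is hypothetical rather than taken from any library. The Langevin target is assumed to be a Gaussian whose score is known in closed form; in practice the score would be learned, for example by score matching.

```python
import math
import random


def reparameterize(mu, sigma, rng):
    """Reparameterization trick: z = mu + sigma * epsilon, epsilon ~ N(0, 1)."""
    epsilon = rng.gauss(0.0, 1.0)  # independent noise source
    return mu + sigma * epsilon    # deterministic in (mu, sigma) once epsilon is fixed


def kl_gaussian_to_standard_normal(mu, sigma):
    """Closed-form KL( N(mu, sigma^2) || N(0, 1) ) for a scalar Gaussian."""
    return 0.5 * (sigma ** 2 + mu ** 2 - 1.0) - math.log(sigma)


def gaussian_score(x, mean=2.0, std=0.5):
    """Score (gradient of the log-density) of N(mean, std^2).

    The target N(2.0, 0.5^2) is an assumption made for this example.
    """
    return -(x - mean) / std ** 2


def langevin_sample(score, x0, step_size, n_steps, rng):
    """Unadjusted Langevin dynamics.

    Each step applies: x <- x + (step_size / 2) * score(x) + sqrt(step_size) * noise
    """
    x = x0
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + 0.5 * step_size * score(x) + math.sqrt(step_size) * noise
    return x


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable

    z = reparameterize(mu=1.5, sigma=0.3, rng=rng)
    print("reparameterized sample:", z)

    print("KL to N(0, 1):", kl_gaussian_to_standard_normal(mu=1.5, sigma=0.3))

    # Start far from the target mean; the chain should drift toward it.
    x = langevin_sample(gaussian_score, x0=-3.0, step_size=0.01, n_steps=500, rng=rng)
    print("Langevin sample:", x)
```

Because the noise is drawn separately from `mu` and `sigma`, the sample in `reparameterize` is a differentiable function of those parameters, which is what lets gradients pass through the sampling step. The Langevin sampler only ever calls `score`, so it never needs the normalizing constant of the target density.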