PTQ: Post-training Quantization—compressing a model's weights and activations to lower precision (e.g., 8-bit) without full re-training, using only a small calibration set
DiT: Diffusion Transformer—a diffusion model backbone that uses Transformer blocks instead of the traditional U-Net convolutional architecture
Salient Channels: Specific channels in neural network layers that contain values with significantly higher magnitudes than others, causing large quantization errors if not handled
W8A8: Quantization setting where both Weights and Activations are represented using 8 bits
FID: Fréchet Inception Distance—a metric for evaluating the quality of generated images by comparing their distribution to real images; lower is better
Spearman's ρ: A rank correlation coefficient, used here to measure how similarly 'salience' (magnitude) is distributed across channels in activations and in weights
Re-parameterization: Mathematically transforming the weights and biases of a network offline so that complex operations (like scaling) are baked into the static parameters, avoiding runtime cost
adaLN: Adaptive Layer Normalization—a normalization layer where scale and shift parameters are dynamically predicted from condition embeddings (e.g., time, class)
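
To make the Salient Channels, W8A8, and Re-parameterization entries concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the method of any particular paper: the toy shapes, the injected salient channel, the square-root scaling factor `s`, and the helper `quantize_dequantize` are all hypothetical. It shows only that dividing activations by a per-channel factor and multiplying the matching weight rows by the same factor leaves the full-precision output unchanged, so the scaling can be computed offline.

```python
import numpy as np

def quantize_dequantize(x, n_bits=8):
    """Symmetric per-tensor fake quantization: snap x to a signed n_bits grid, then map back to floats."""
    qmax = 2 ** (n_bits - 1) - 1          # 127 for 8 bits
    scale = np.abs(x).max() / qmax        # one step size for the whole tensor
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 16))             # toy activations: (tokens, input channels)
X[:, 3] *= 50.0                           # assumption: channel 3 is made salient by hand
W = rng.normal(size=(16, 8))              # toy weights: (input channels, output channels)

# Hypothetical per-channel factor; any strictly positive vector keeps the identity below.
s = np.sqrt(np.abs(X).max(axis=0))

X_scaled = X / s                          # salient channel is shrunk
W_folded = W * s[:, None]                 # the same factor is baked into the weights offline

y_ref = X @ W
y_folded = X_scaled @ W_folded
assert np.allclose(y_ref, y_folded)       # full-precision outputs are identical

# W8A8: quantize both operands to 8 bits and compare against the full-precision output.
err_plain = np.abs(quantize_dequantize(X) @ quantize_dequantize(W) - y_ref).mean()
err_folded = np.abs(quantize_dequantize(X_scaled) @ quantize_dequantize(W_folded) - y_ref).mean()
print(f"mean abs error, plain W8A8:  {err_plain:.4f}")
print(f"mean abs error, folded W8A8: {err_folded:.4f}")
```

In a real network the division `X / s` would also have to be absorbed into the parameters of the preceding layer so that nothing extra runs at inference time; that step is left out of this sketch.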