Power-of-Two (PoT): Numbers of the form 2^n for an integer n, so that multiplying or dividing by them reduces to an efficient bitwise shift.
Re-quantization: The process of rescaling the high-precision output of a layer (e.g., 32-bit accumulator) back to lower precision (e.g., 8-bit) for the next layer's input.
LayerNorm (LN): Layer Normalization, a technique to normalize activations across the feature dimension, often a bottleneck in quantization due to outliers.
Hessian Trace: The sum of the eigenvalues (equivalently, the diagonal entries) of the Hessian, the matrix of second-order derivatives; used as a metric for a layer's sensitivity to quantization noise.
Dyadic Numbers: Rational numbers of the form A/2^B, where A and B are integers; often used to approximate floating-point values in integer-only arithmetic, since dividing by 2^B is a right shift.
Row-Stationary Dataflow: A hardware dataflow strategy where rows of data (e.g., weights or activations) remain stationary in local storage to maximize reuse and minimize data movement.
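
The PoT, re-quantization, and dyadic-number entries above fit together in practice, and a small sketch may make that concrete. The Python below approximates a floating-point scale as a dyadic number A/2^B, then uses it to rescale a wide accumulator value into the signed 8-bit range using only an integer multiply, an add, and a right shift. The function names (to_dyadic, requantize), the 31-bit mantissa width, and the round-half-up rule are assumptions chosen for illustration, not details taken from any particular framework or accelerator.

```python
import math


def to_dyadic(scale: float, bits: int = 31) -> tuple[int, int]:
    """Approximate a scale in (0, 1) as A / 2**B with integer A and B.

    Hypothetical helper: `bits` sets the mantissa precision (assumed 31).
    """
    if not 0.0 < scale < 1.0:
        raise ValueError("this sketch assumes 0 < scale < 1")
    # frexp gives scale = mantissa * 2**exponent with 0.5 <= mantissa < 1.
    mantissa, exponent = math.frexp(scale)
    a = round(mantissa * (1 << bits))
    b = bits - exponent  # exponent <= 0 here, so b >= bits
    return a, b


def requantize(acc: int, a: int, b: int, qmin: int = -128, qmax: int = 127) -> int:
    """Rescale an accumulator by A / 2**B and clamp to [qmin, qmax].

    Uses only integer operations: multiply, add half of 2**B for rounding,
    then shift right by B (Python's >> floors, like an arithmetic shift).
    """
    rounded = (acc * a + (1 << (b - 1))) >> b
    return max(qmin, min(qmax, rounded))


if __name__ == "__main__":
    # Power-of-two scaling is a plain shift.
    assert (5 << 3) == 5 * 8

    a, b = to_dyadic(0.0123)
    print(f"0.0123 is approximated by {a} / 2**{b}")
    print("requantize(1000) ->", requantize(1000, a, b))
```

When the scale is itself a power of two, A is a power of two as well and the multiply can be folded into the shift, which is the saving PoT quantization aims for.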