Hyperspherical Normalization: Forcing vectors (weights or features) to have a length (norm) of 1, effectively projecting them onto the surface of a high-dimensional sphere.
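A minimal sketch of this projection (function name `l2_normalize` is illustrative, not from the source):

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    # Scale x so its L2 norm is 1, i.e. project it onto the unit hypersphere.
    # eps guards against division by zero for the all-zeros vector.
    return x / (np.linalg.norm(x) + eps)

v = np.array([3.0, 4.0])      # norm 5
u = l2_normalize(v)           # -> [0.6, 0.8], norm 1
```

The same operation applies row-wise to weight matrices or feature vectors; only the direction of each vector survives, its magnitude is discarded.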
LERP: Linear interpolation with a learnable blend weight—the layer's output is mixed with its input as (1 − α)·x + α·f(x), where α is trained, replacing the standard additive residual connection x + f(x).
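A one-line sketch of the learnable blend (the scalar α is shown as a plain float here; in a network it would be a trained parameter):

```python
import numpy as np

def lerp_residual(x, f_x, alpha):
    # Replace the additive residual x + f(x) with a convex blend:
    # alpha = 0 passes the input through untouched, alpha = 1 keeps
    # only the layer's output.
    return (1.0 - alpha) * x + alpha * f_x

x  = np.array([1.0, 0.0])
fx = np.array([0.0, 1.0])
lerp_residual(x, fx, 0.25)    # -> [0.75, 0.25]
```

Because the blend is convex, the output stays on the line segment between x and f(x), which keeps magnitudes bounded in a way plain addition does not.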
Effective Learning Rate: The actual impact of a gradient step on the model's behavior, which decreases if weight magnitudes grow large while the learning rate stays fixed.
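A numerical illustration of the shrinkage, under one common assumption: for a scale-invariant layer (e.g. one whose output is normalized), the gradient magnitude scales as 1/‖w‖, so a fixed-lr step rotates w by roughly lr/‖w‖². The helper `angular_update` below is hypothetical, just to measure that rotation:

```python
import numpy as np

def angular_update(w, grad_direction, lr):
    # For a scale-invariant function, grad w.r.t. w shrinks as 1/||w||.
    g = grad_direction / np.linalg.norm(w)
    w_new = w - lr * g
    # The angle (radians) between w and w_new is the step's real effect.
    cos = w @ w_new / (np.linalg.norm(w) * np.linalg.norm(w_new))
    return np.arccos(np.clip(cos, -1.0, 1.0))

small = angular_update(np.array([1.0, 0.0]),  np.array([0.0, 1.0]), 0.1)
large = angular_update(np.array([10.0, 0.0]), np.array([0.0, 1.0]), 0.1)
# small ~ 0.0997 rad, large ~ 0.001 rad: same lr, ~100x weaker effect.
```

This is why growing weight norms quietly decay the effective learning rate even when the optimizer's nominal lr never changes.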
Distributional Critic: A Q-function that predicts the full probability distribution of future returns rather than just the single expected value (mean).
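One common parameterization of such a critic (the C51-style categorical form; the atom grid and names below are illustrative assumptions, not taken from the source):

```python
import numpy as np

# The critic head outputs a probability for each "atom" — a fixed grid
# of candidate return values — instead of one scalar Q(s, a).
atoms  = np.linspace(-10.0, 10.0, 51)   # support of the return distribution
logits = np.zeros(51)                   # network output for one (s, a) pair
probs  = np.exp(logits) / np.exp(logits).sum()   # softmax over atoms

# The ordinary scalar Q-value is recovered as the distribution's mean.
q_mean = (probs * atoms).sum()          # -> 0.0 for this uniform example
```

Training matches the predicted distribution to a target distribution (e.g. via cross-entropy), which gives the critic a richer learning signal than a single regressed mean.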
SAC: Soft Actor-Critic—an off-policy RL algorithm that maximizes both expected reward and the entropy (randomness) of the policy.
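The "soft" part of SAC can be seen in its value target, sketched here for a discrete policy (the function name `soft_value` is illustrative):

```python
import numpy as np

def soft_value(q_values, log_probs, alpha):
    # SAC's soft state value: expected Q minus alpha * log pi.
    # The -alpha * log_probs term rewards entropy, so the policy is
    # pushed toward high return AND toward staying stochastic.
    probs = np.exp(log_probs)
    return (probs * (q_values - alpha * log_probs)).sum()

# Two equally good actions under a uniform policy: the entropy bonus
# lifts the value above the plain expected Q of 1.0.
v = soft_value(np.array([1.0, 1.0]), np.log([0.5, 0.5]), alpha=0.1)
```

The temperature alpha trades off reward against entropy; in practice SAC often tunes it automatically toward a target entropy.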
Non-stationarity: In RL, the problem where the data distribution changes constantly as the agent learns, unlike in supervised learning where data is static.