Quasimetric: A distance function d(x, y) that satisfies the triangle inequality and d(x, x) = 0 but need not be symmetric, so d(x, y) may differ from d(y, x).
Triangle Inequality: The property that, for any points A, B, and C, the distance from A to C is never greater than the distance from A to B plus the distance from B to C: d(A, C) <= d(A, B) + d(B, C).
Optimal Value Function V*: The maximum expected return (or minimum cost) achievable by any policy from a state to a goal.
IQE: Interval Quasimetric Embeddings, a neural network architecture designed to output valid quasimetrics.
HER: Hindsight Experience Replay, an RL technique in which past experiences are replayed with their goals relabeled (typically to states that were actually reached), so that unsuccessful attempts still provide a learning signal.
CQL: Conservative Q-Learning, an offline RL algorithm that regularizes Q-values to prevent overestimation on unseen actions.
MSG: Model Standard-deviation Gradients, an ensemble-based offline RL method.
Diffuser: A trajectory-based diffusion model for planning in RL.
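
To make the Quasimetric and Triangle Inequality entries concrete, here is a minimal Python sketch that checks both properties on a small table of costs. The function names (is_quasimetric, is_symmetric) and the one-way loop example are illustrative assumptions, not part of any method listed in this glossary.

```python
from itertools import product


def is_quasimetric(d, tol=1e-9):
    """Check the quasimetric axioms on a square cost table d[i][j].

    Assumption: d is a list of lists of finite numbers.
    """
    n = len(d)
    # Identity: every point is at distance 0 from itself.
    if any(abs(d[i][i]) > tol for i in range(n)):
        return False
    # Non-negativity (implied by the other two axioms, checked for clarity).
    if any(d[i][j] < -tol for i, j in product(range(n), repeat=2)):
        return False
    # Triangle inequality: d(i, k) <= d(i, j) + d(j, k) for all triples.
    return all(
        d[i][k] <= d[i][j] + d[j][k] + tol
        for i, j, k in product(range(n), repeat=3)
    )


def is_symmetric(d, tol=1e-9):
    """Return True if d(i, j) equals d(j, i) for every pair."""
    n = len(d)
    return all(abs(d[i][j] - d[j][i]) <= tol for i in range(n) for j in range(n))


# Hypothetical example: three states on a one-way loop 0 -> 1 -> 2 -> 0,
# with unit cost per step. Going "backwards" means going the long way round.
loop_costs = [
    [0, 1, 2],
    [2, 0, 1],
    [1, 2, 0],
]

# A table that violates the triangle inequality: 0 -> 2 costs 5,
# but 0 -> 1 -> 2 costs only 2.
broken_costs = [
    [0, 1, 5],
    [1, 0, 1],
    [5, 1, 0],
]

print(is_quasimetric(loop_costs), is_symmetric(loop_costs))
print(is_quasimetric(broken_costs))
```

The loop example is a quasimetric that is not a metric: reaching state 1 from state 0 costs one step, while the return trip costs two. This kind of asymmetry is what arises when distances are costs-to-goal in an environment with irreversible or one-directional dynamics.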