Simplicity Bias: The tendency of a neural network to favor simpler (lower-frequency) functions, which tend to generalize better than functions that fit noise in the training data.
SAC: Soft Actor-Critic—an off-policy actor-critic deep RL algorithm based on the maximum entropy reinforcement learning framework.
MLP: Multilayer Perceptron—a standard feedforward neural network consisting of fully connected layers.
Fourier Analysis: A mathematical method used here to decompose the function a network computes into frequency components; a spectrum dominated by high frequencies indicates a complex function prone to overfitting, while one dominated by low frequencies indicates a simple function.
DMC: DeepMind Control Suite—a widely used benchmark for continuous control physics tasks.
PPO: Proximal Policy Optimization—an on-policy policy gradient algorithm.
TD-MPC2: Temporal Difference Learning for Model Predictive Control 2, a model-based RL algorithm.
Residual Connection: A skip connection that adds the input of a layer to its output, facilitating gradient flow and allowing the network to learn identity functions easily.
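
The Residual Connection entry can be illustrated with a minimal sketch in plain Python. It is not taken from any particular library or paper; the names `linear`, `relu`, `make_mlp_branch`, and `residual_block` are hypothetical helpers chosen for illustration, and the two-layer inner branch is an assumption. The sketch shows why the identity function is easy to represent: when the inner branch outputs zeros, the block returns its input unchanged.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def linear(weights: Matrix, bias: Vector, x: Vector) -> Vector:
    """Fully connected layer: one dot product plus bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x: Vector) -> Vector:
    """Elementwise rectified linear unit."""
    return [max(0.0, v) for v in x]


def make_mlp_branch(w1: Matrix, b1: Vector,
                    w2: Matrix, b2: Vector) -> Callable[[Vector], Vector]:
    """Build the inner branch F(x) = W2 * relu(W1 * x + b1) + b2 (assumed shape)."""
    def branch(x: Vector) -> Vector:
        return linear(w2, b2, relu(linear(w1, b1, x)))
    return branch


def residual_block(x: Vector, branch: Callable[[Vector], Vector]) -> Vector:
    """Skip connection: return x + F(x), so the input bypasses the branch."""
    fx = branch(x)
    if len(fx) != len(x):
        raise ValueError("branch output must match input size to be added")
    return [xi + fi for xi, fi in zip(x, fx)]


identity_2x2 = [[1.0, 0.0], [0.0, 1.0]]
zeros_2x2 = [[0.0, 0.0], [0.0, 0.0]]
zero_bias = [0.0, 0.0]

# With a zeroed final layer, F(x) = 0 and the block acts as the identity.
zero_branch = make_mlp_branch(identity_2x2, zero_bias, zeros_2x2, zero_bias)
# With identity weights, F(x) = relu(x), so the block returns x + relu(x).
relu_branch = make_mlp_branch(identity_2x2, zero_bias, identity_2x2, zero_bias)

x = [1.0, -2.0]
identity_out = residual_block(x, zero_branch)
shifted_out = residual_block(x, relu_branch)
```

Because the input is added back unchanged, the branch only has to learn a correction to the input rather than the full mapping.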