FSA: Finite State Automaton (plural: automata); a computational model consisting of states and transitions between them, used to recognize regular languages
CoT: Chain-of-Thought—a prompting method where the model generates intermediate reasoning steps before the final answer
State Tracking: The ability to maintain and update the status of a system (world state) as a sequence of events (inputs) occurs
MLP: Multilayer Perceptron—the feed-forward neural network block within a Transformer layer, shown here to be responsible for state updates
A5 Group: The alternating group on 5 elements (the 60 even permutations of 5 items); the smallest 'non-solvable' group, which makes tracking state over it a hard benchmark for neural networks
Solvable Group: A group that can be constructed from abelian (commutative) groups using extensions; easier for models to learn than non-solvable groups
Activation Patching: An interpretability technique where specific neuron activations are swapped between inputs to identify which components cause a model's behavior
Compression: A proposed metric measuring how similar the internal representations are for the same state reached via different input histories
Distinction: A proposed metric measuring how different the internal representations are for distinct states
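
Example (illustrative sketch, not taken from the source): the entries for FSA, State Tracking, and A5 Group can be tied together by treating each element of A5 as a state and each input as a permutation that updates it. The helper names below (`compose`, `is_even`, `track_state`) and the two example inputs `a` and `b` are assumptions chosen for illustration only.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation that applies p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    inversions = sum(
        1
        for i in range(len(p))
        for j in range(i + 1, len(p))
        if p[i] > p[j]
    )
    return inversions % 2 == 0

# A5: the even permutations of 5 elements.
A5 = [p for p in permutations(range(5)) if is_even(p)]
IDENTITY = tuple(range(5))

def track_state(inputs, start=IDENTITY):
    """Run the automaton: the state is a group element, each input updates it."""
    state = start
    trace = [state]
    for g in inputs:
        state = compose(state, g)
        trace.append(state)
    return trace

# Two 3-cycles (both even permutations), used as hypothetical inputs.
a = (1, 2, 0, 3, 4)  # 0 -> 1 -> 2 -> 0
b = (0, 1, 3, 4, 2)  # 2 -> 3 -> 4 -> 2

final_ab = track_state([a, b])[-1]
final_ba = track_state([b, a])[-1]
final_aaa = track_state([a, a, a])[-1]

print(len(A5))               # size of the state space
print(final_ab != final_ba)  # input order matters: the group is non-abelian
print(final_aaa == IDENTITY) # a different history that returns to the start state
```

The last check shows the situation the Compression entry refers to: the empty history and the history `[a, a, a]` end in the same state, so a model that tracks state well would be expected to represent them similarly.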