saliency analysis: Techniques (like computing input gradients) to determine which features of the input most strongly affect the neural network's output, used here to spot mathematical patterns
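A minimal sketch of gradient-based saliency, using finite differences in place of automatic differentiation; the toy function `f` and the `saliency` helper are hypothetical, for illustration only:

```python
import numpy as np

def saliency(f, x, eps=1e-5):
    """Rank input features by the magnitude of the gradient of f at x,
    approximated with central finite differences."""
    grads = np.zeros_like(x)
    for i in range(len(x)):
        d = np.zeros_like(x)
        d[i] = eps
        grads[i] = (f(x + d) - f(x - d)) / (2 * eps)
    return np.abs(grads)  # larger magnitude = more salient feature

# Hypothetical model output: depends strongly on feature 0, weakly on feature 2,
# and not at all on feature 1.
f = lambda x: 10 * x[0] + 0.1 * x[2]
scores = saliency(f, np.array([0.3, 0.7, 0.5]))  # feature 0 is most salient
```

In practice the gradient would come from the network's autodiff framework rather than finite differences; the ranking idea is the same.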
Kazhdan-Lusztig polynomial: A polynomial with integer coefficients associated with a pair of elements (e.g., permutations) in a Coxeter group, central to representation theory
Bruhat graph: A directed graph structure representing relations between elements of a Coxeter group (like the symmetric group)
cross entropy method: An optimization technique where samples are drawn from a distribution, the best performers are selected, and the distribution is updated to increase the likelihood of those performers
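The loop described above can be sketched as follows; the Gaussian sampling distribution, the toy objective, and all names here are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def cross_entropy_method(f, dim, iters=50, pop=100, elite_frac=0.2, seed=0):
    """Maximize f by repeatedly sampling, keeping the best performers,
    and refitting the sampling distribution to them."""
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(dim), np.ones(dim)
    n_elite = int(pop * elite_frac)
    for _ in range(iters):
        samples = rng.normal(mean, std, size=(pop, dim))   # draw candidates
        scores = np.array([f(x) for x in samples])
        elites = samples[np.argsort(scores)[-n_elite:]]    # best performers
        mean = elites.mean(axis=0)                         # update distribution
        std = elites.std(axis=0) + 1e-6                    # avoid collapse to 0
    return mean

# Toy objective: the optimum is at `target`.
target = np.array([1.0, -2.0, 0.5])
best = cross_entropy_method(lambda x: -np.sum((x - target) ** 2), dim=3)
```

After enough iterations the mean of the sampling distribution concentrates near the optimum.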
descent set: In combinatorics, the set of indices where a permutation value decreases (i.e., the indices i such that x(i) > x(i+1))
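The definition translates directly into code; this small helper (a sketch, with 1-based positions to match the x(i) > x(i+1) convention above) is illustrative:

```python
def descent_set(perm):
    """Return the 1-based positions i where perm decreases: perm[i] > perm[i+1]."""
    return {i + 1 for i in range(len(perm) - 1) if perm[i] > perm[i + 1]}

descent_set([3, 1, 4, 2])  # → {1, 3}: descents 3 > 1 and 4 > 2
```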
parity bit: A function that sums bits modulo 2; used here as an example of a noise-sensitive function that neural networks struggle to learn without sufficient density
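A quick illustration of why parity is maximally noise-sensitive (the code is a hypothetical sketch): flipping any single input bit always flips the output, so small input perturbations are never ignored.

```python
def parity(bits):
    """Sum of bits modulo 2: 1 if an odd number of bits are set, else 0."""
    return sum(bits) % 2

bits = [1, 0, 1, 1, 0]          # parity is 1 (three bits set)
# Flip each bit in turn: the output flips every time.
flipped = [parity(bits[:i] + [1 - bits[i]] + bits[i + 1:])
           for i in range(len(bits))]  # → [0, 0, 0, 0, 0]
```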
transformer: A deep learning architecture based on self-attention mechanisms, typically used for sequence-to-sequence tasks
graph neural network (GNN): A neural network architecture designed to operate on graph structures, capturing relationships between nodes and edges
Wigner matrices: Symmetric (or Hermitian) random matrices whose entries above the diagonal are independent and identically distributed (often Gaussian)
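A sketch of sampling a real symmetric Wigner matrix; the Gaussian entry distribution and the 1/sqrt(n) normalization (which keeps the eigenvalues in a bounded range as n grows) are illustrative choices, not prescribed by the definition:

```python
import numpy as np

def wigner(n, rng):
    """Sample an n x n real symmetric matrix with i.i.d. Gaussian entries,
    symmetrized and scaled so eigenvalues stay O(1)."""
    a = rng.normal(size=(n, n))
    return (a + a.T) / np.sqrt(2 * n)  # symmetrize, then normalize

m = wigner(200, np.random.default_rng(0))
# m is symmetric, so its eigenvalues are real; their histogram approaches
# the Wigner semicircle law as n grows.
```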
support vector machine (SVM): A supervised learning model that classifies data by finding an optimal hyperplane that separates classes with the maximum margin