RLVR: Reinforcement Learning with Verifiable Rewards—training LLMs using outcomes (like correct math answers) as reward signals
Reasoning Graph: A directed graph where nodes represent clusters of semantically similar sentences (steps) and edges represent transitions between them in model outputs
Pass@k: The probability that at least one correct solution is generated when sampling k independent solutions from the model
SFT: Supervised Fine-Tuning—training a model to imitate expert reasoning traces (distillation from strong models like DeepSeek-R1)
chrF: Character n-gram F-score—a metric used here to measure similarity between two reasoning trajectories for clustering
Betweenness Centrality: A measure of a node's importance based on the number of shortest paths that pass through it; high centrality implies a 'hub' or bottleneck step
Modularity: A measure of how strongly a graph divides into modules (clusters, communities); high modularity means dense connections within modules and sparse connections between them
Decay Rate: The slope β of an exponential-law fit to the rank plot of a graph metric (e.g., node frequency, degree); a higher β means activity is concentrated in fewer nodes
Graphlet: A small, connected induced subgraph, counted up to isomorphism, used to characterize local topology (e.g., cycles vs. linear paths)
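
Two of the terms above, Pass@k and Decay Rate, are numeric quantities, and a minimal Python sketch may make them concrete. It rests on assumptions that are not taken from the source: pass_at_k uses the widely used combinatorial estimator 1 - C(n-c, k) / C(n, k) for n sampled solutions of which c are correct, and decay_rate assumes the rank plot follows value(rank) ≈ A·exp(-β·rank), so β is the negated least-squares slope of log(value) against rank. The function names and the fitting procedure are illustrative, not necessarily the ones used by the authors.

```python
import math
from typing import Sequence


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate Pass@k from n sampled solutions, c of which are correct.

    Assumed estimator (not confirmed by the source): 1 - C(n-c, k) / C(n, k),
    i.e. one minus the chance that k samples drawn without replacement
    from the n are all incorrect.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # math.comb(n - c, k) is 0 when k > n - c, so the estimate is then 1.0.
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


def decay_rate(values: Sequence[float]) -> float:
    """Return beta for an assumed fit value(rank) ~ A * exp(-beta * rank).

    Values (e.g. node frequencies or degrees) are sorted in descending
    order, ranked 1, 2, 3, ..., and log(value) is regressed on rank with
    ordinary least squares. Non-positive values are dropped because
    their logarithm is undefined.
    """
    ranked = sorted((v for v in values if v > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two positive values")
    ranks = list(range(1, len(ranked) + 1))
    logs = [math.log(v) for v in ranked]
    mean_rank = sum(ranks) / len(ranks)
    mean_log = sum(logs) / len(logs)
    cov = sum((r - mean_rank) * (y - mean_log) for r, y in zip(ranks, logs))
    var = sum((r - mean_rank) ** 2 for r in ranks)
    # The fitted slope is negative for a decaying curve; beta is its negation.
    return -cov / var
```

With 10 samples of which 3 are correct, pass_at_k(10, 3, 1) is about 0.3, the plain success rate. A node-frequency list that falls off steeply with rank yields a larger β than one that falls off slowly, matching the reading that a higher β means activity is concentrated in fewer nodes.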