fail2pass: Tests that fail before the fix is applied and must pass after the fix (verifying the bug is resolved)
pass2pass: Tests that pass before and after the fix (regression tests ensuring no existing functionality is broken)
copyleft: A licensing scheme (e.g., GPL) requiring derivative works to be open source; used here to identify repos less likely to be in proprietary commercial training sets
GPL: General Public Licenseโa strong copyleft license used to filter repositories for the public set
scaffold: The software framework wrapping an LLM that allows it to interact with tools, file systems, and environments (e.g., SWE-Agent)
Pass@1: The percentage of problems where the model's single generated solution successfully passes all tests
held-out set: A portion of the dataset kept private and never released to prevent future models from training on it
agent trajectory: The sequence of actions (commands, reads, edits) an agent takes while attempting to solve a task