ReAct: Reasoning + Acting—a strategy where LLMs alternate between reasoning about a problem and executing actions (like running code) to solve it
CoT: Chain of Thought—prompting the LLM to generate intermediate reasoning steps before producing a final answer
SWE-bench Lite: A dataset of 300 real-world GitHub issues and pull requests from Python repositories used to benchmark software engineering agents
Lazy Representation: A retrieval strategy that initially returns only high-level signatures (class/function names) to save context, providing full code bodies only upon specific request
Diff: A textual record of the differences between two versions of code, listing the lines added and the lines removed
Tree-sitter: A parser generator tool and incremental parsing library used to build a concrete syntax tree for source files, enabling the agent to understand code structure
BM25: Best Matching 25—a ranking function used by search engines to estimate the relevance of documents to a given search query
RAG: Retrieval-Augmented Generation, a technique that improves LLM output by retrieving relevant material from an authoritative knowledge base outside its training data and supplying it as context
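
Example (Lazy Representation): a minimal sketch of the two-step retrieval described in that entry. It is an illustration under stated assumptions, not the method of any particular agent. Python's standard ast module stands in for Tree-sitter here to keep the example dependency-free, and the names SOURCE, list_signatures, and get_body are hypothetical.

```python
import ast

# Hypothetical sample file that the agent wants to inspect.
SOURCE = '''
class Greeter:
    def greet(self, name):
        return "Hello, " + name

def add(a, b):
    return a + b
'''


def _format_function(node, indent=""):
    """Render a function definition as a one-line signature."""
    args = ", ".join(a.arg for a in node.args.args)
    return f"{indent}def {node.name}({args})"


def list_signatures(source):
    """Step 1: return only class and function signatures (the cheap view)."""
    signatures = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef):
            signatures.append(f"class {node.name}")
            for item in node.body:
                if isinstance(item, ast.FunctionDef):
                    signatures.append(_format_function(item, indent="  "))
        elif isinstance(node, ast.FunctionDef):
            signatures.append(_format_function(node))
    return signatures


def get_body(source, name):
    """Step 2: return the full source of one named definition, on request."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)) and node.name == name:
            return ast.get_source_segment(source, node)
    return None


if __name__ == "__main__":
    print("\n".join(list_signatures(SOURCE)))
    print(get_body(SOURCE, "add"))
```

The signature listing costs a few tokens per definition, so the full body is spent on context only for the names the model actually asks about.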