
Agentic Software Issue Resolution with Large Language Models: A Survey

Zhonghao Jiang, David Lo, Zhongxin Liu
The State Key Laboratory of Blockchain and Data Security, Zhejiang University; School of Computing and Information Systems, Singapore Management University
arXiv (2025)
Agent Benchmark RL Reasoning

📝 Paper Summary

Agentic AI Automated Software Engineering
This survey systematically reviews 126 studies on agentic software issue resolution, proposing a taxonomy across benchmarks, techniques, and empirical studies while highlighting the paradigm shift toward reinforcement learning.
Core Problem
Existing surveys fragment the field into automated program repair (APR) or code generation, failing to capture the holistic, multi-step nature of issue resolution or the recent transition from prompt engineering to RL-based training.
Why it matters:
  • Issue resolution encompasses diverse activities (such as optimization and feature addition) beyond bug fixing, which traditional APR taxonomies overlook
  • Real-world resolution requires long-horizon reasoning and feedback-driven decision making, which demands agentic capabilities rather than single-step generation
  • Researchers are moving from using general-purpose LLMs to training domain-specific models via reinforcement learning, a paradigm shift that prior surveys miss
Concrete Example: Traditional APR surveys assume the existence of triggering test cases (fault localization based on coverage). In contrast, modern agentic issue resolution often starts with only a natural language issue description, requiring the agent to autonomously locate relevant files, generate reproduction tests, and iterate on fixes without pre-existing test suites.
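The loop described in this example (locate files, write a reproduction test, iterate on a fix using test feedback) can be sketched roughly as follows. This is a minimal illustration, not a method taken from the survey or from any specific system: `resolve_issue`, `Attempt`, and all `toy_*` helpers are hypothetical names, and the toy helpers stand in for LLM calls and a sandboxed test runner.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Attempt:
    patch: str
    passed: bool
    feedback: str


def resolve_issue(
    issue_text: str,
    repo_files: Dict[str, str],
    locate: Callable[[str, Dict[str, str]], List[str]],
    write_repro_test: Callable[[str, List[str]], str],
    propose_patch: Callable[[str, List[str], str], str],
    run_test: Callable[[str, str], Tuple[bool, str]],
    max_iters: int = 3,
) -> Tuple[Optional[str], List[Attempt]]:
    """Hypothetical agent loop: localize, reproduce, then patch until the test passes."""
    suspects = locate(issue_text, repo_files)            # 1. find relevant files
    repro_test = write_repro_test(issue_text, suspects)  # 2. no pre-existing test, so create one
    history: List[Attempt] = []
    feedback = ""
    for _ in range(max_iters):                           # 3. feedback-driven iteration
        patch = propose_patch(issue_text, suspects, feedback)
        passed, feedback = run_test(repro_test, patch)
        history.append(Attempt(patch, passed, feedback))
        if passed:
            return patch, history
    return None, history


# --- Toy stand-ins (assumptions for illustration; a real agent would call an LLM) ---

def toy_locate(issue_text: str, repo_files: Dict[str, str]) -> List[str]:
    # Naive keyword match between the issue text and file contents.
    keywords = [w for w in issue_text.lower().split() if len(w) > 3]
    return [p for p, src in repo_files.items() if any(k in src for k in keywords)]


def toy_write_repro_test(issue_text: str, suspects: List[str]) -> str:
    return "assert mean([]) == 0.0"


def toy_propose_patch(issue_text: str, suspects: List[str], feedback: str) -> str:
    if not feedback:
        # First attempt: leaves the bug in place.
        return "def mean(xs): return sum(xs) / len(xs)"
    # Later attempts: react to the failure message by guarding the empty case.
    return "def mean(xs): return sum(xs) / len(xs) if xs else 0.0"


def toy_run_test(repro_test: str, patch: str) -> Tuple[bool, str]:
    # exec is used only for this toy; a real system would run tests in a sandbox.
    namespace: dict = {}
    try:
        exec(patch, namespace)
        exec(repro_test, namespace)
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, ""


ISSUE = "mean crashes on empty list"
REPO = {
    "stats.py": "def mean(xs): return sum(xs) / len(xs)",
    "io.py": "def load(path): return open(path).read()",
}

patch, history = resolve_issue(
    ISSUE, REPO, toy_locate, toy_write_repro_test, toy_propose_patch, toy_run_test
)
```

In this toy run the first patch fails the generated reproduction test, and the failure message is fed back into the second proposal, which is the feedback-driven behavior that distinguishes the agentic setting from single-step generation.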
Key Novelty
Systematic Survey of Agentic Issue Resolution
  • Establishes the first comprehensive taxonomy specifically for LLM-based agentic issue resolution, covering three dimensions: benchmarks, techniques (workflow phases), and empirical studies
  • Identifies and formalizes the 'paradigm shift' in the field: the transition from scaffold-based prompt engineering to training-based methods leveraging reinforcement learning (RL) on LLMs
  • Integrates diverse software maintenance activities (bug fixing, feature addition, optimization) under a unified task definition, distinct from narrower APR or code generation scopes
Architecture
Figure 1: The typical workflow of the automated issue resolution task, as synthesized from the literature
Evaluation Highlights
  • Systematic review of 126 recent studies, filtered from an initial pool of 385 papers
  • Analysis showing that 62.7% of papers in this rapidly evolving field are currently arXiv preprints rather than peer-reviewed publications
  • Bibliometric evidence that AI venues (26.9%) currently publish more research on this task than Software Engineering venues (10.4%)
Breakthrough Assessment
9/10
An essential and timely survey of a rapidly growing field. It provides the first structured roadmap and taxonomy for agentic issue resolution, clearly distinguishing it from related fields such as APR.