School of Computer Science, Peking University,
Beijing Key Laboratory of Security and Privacy in Intelligent Transportation, Beijing Jiaotong University
International Conference on Automated Software Engineering
(2024)
Reasoning · Agent · Factuality
📝 Paper Summary
Smart Contract Security · Automated Program Repair (APR) · LLM for Code
ContractTinker repairs complex smart contract vulnerabilities by decomposing the task via Chain-of-Thought reasoning and grounding the LLM with static analysis (dependency graphs and program slicing).
Core Problem
Existing repair tools rely on predefined patterns that fail on high-level business logic bugs, while standard LLMs suffer from hallucinations and lack context when repairing complex real-world contracts.
Why it matters:
Smart contracts manage significant financial assets, making them high-value targets for attackers
Real-world vulnerabilities often involve complex business logic (e.g., price manipulation) rather than simple low-level bugs like re-entrancy
Manual repair is labor-intensive and requires deep security expertise that many developers lack
Concrete Example: A contract might have a price manipulation vulnerability where a function calculates an asset price insecurely (e.g., from state an attacker can skew within one transaction). Pattern-based tools miss this because the flaw is logic-specific, and a standard LLM might attempt a fix but hallucinate variables not present in the code or miss dependencies in other contracts.
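To make the example concrete, here is a minimal Python stand-in for a Solidity pool contract (names and numbers are illustrative, not from the paper) showing why a spot price read directly from live reserves is manipulable:

```python
# Hypothetical constant-product pool; a toy model, not real contract code.
class ToyPool:
    def __init__(self, reserve_token: float, reserve_usd: float):
        self.reserve_token = reserve_token
        self.reserve_usd = reserve_usd

    def spot_price(self) -> float:
        # VULNERABLE pattern: price derived from current, manipulable reserves.
        return self.reserve_usd / self.reserve_token

    def swap_usd_for_token(self, usd_in: float) -> float:
        # Constant-product invariant: reserve_token * reserve_usd stays fixed.
        k = self.reserve_token * self.reserve_usd
        new_usd = self.reserve_usd + usd_in
        new_token = k / new_usd
        out = self.reserve_token - new_token
        self.reserve_usd, self.reserve_token = new_usd, new_token
        return out

pool = ToyPool(reserve_token=1_000.0, reserve_usd=1_000.0)
before = pool.spot_price()        # 1.0 USD per token
pool.swap_usd_for_token(9_000.0)  # attacker's large swap in one transaction
after = pool.spot_price()         # price inflated ~100x
```

No fixed syntactic pattern distinguishes `spot_price` from a safe getter; recognizing the bug requires reasoning about who controls the reserves, which is exactly what pattern-based repair tools cannot do.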
Key Novelty
Context-Aware Chain-of-Thought for Repair
Decomposes the repair process into steps simulating a security expert: Attack Analysis -> Strategy Generation -> Code Patching
Injects static analysis results (Dependency Graphs, Program Slices) at each reasoning step to ground the LLM's logic in the actual codebase structure
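The interleaving idea can be sketched as a prompt builder that splices static-analysis artifacts into specific reasoning steps; the step labels and prompt wording below are assumptions, not the paper's exact prompts:

```python
# Hedged sketch of interleaved CoT prompting. Static-analysis outputs are
# injected at the step where they are relevant, not dumped once up front.
def build_cot_prompts(report: str, slices: str, dep_graph: str) -> list[str]:
    return [
        # Step 1: pure reasoning over the audit finding.
        f"Q1 (attack analysis): How could this finding be exploited?\n{report}",
        # Step 2: program slices ground the strategy in the actual code.
        f"Q2 (strategy): Using these program slices, propose a fix strategy.\n{slices}",
        # Step 3: the dependency graph constrains the concrete patch.
        f"Q3 (patch): Apply the strategy, respecting these dependencies.\n{dep_graph}",
    ]

prompts = build_cot_prompts("price manipulation", "<slices>", "<graph>")
```

Each prompt's answer would be fed into the next step's context, so the patch step sees both the strategy and the dependency constraints.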
Architecture
Workflow of ContractTinker: From Audit Report/Project input -> Dependency Analysis -> Vulnerability Localization -> CoT Patch Generation -> Refinement.
Evaluation Highlights
Repairs 23 out of 48 (48%) high-risk real-world vulnerabilities with valid patches
Generates patches requiring only minor modifications for an additional 10 vulnerabilities (21%)
Achieves a high success rate in generating correct mitigation strategies before the code-patching step, suggesting the reasoning decomposition itself is effective
Breakthrough Assessment
7/10
Effective integration of static analysis with LLM CoT for a hard domain (smart contracts). Sample size (48) is small but realistic for this domain due to data scarcity.
⚙️ Technical Details
Problem Definition
Setting: Given a smart contract project and a vulnerability audit report, generate a valid code patch that fixes the vulnerability.
Model or implementation: LLM (Validator) + Compilation Checker
Novel Architectural Elements
Contextual Dependency Graph (CDG) construction guided by audit report entities to prune irrelevant code before LLM context
Interleaved Static Analysis and CoT: Static analysis results (slices/graphs) are injected specifically into relevant reasoning steps (e.g., Q2, Q3) rather than just once at the start
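The CDG pruning idea can be sketched as a bounded reachability search from the entities named in the audit report; the graph shape, hop limit, and names below are illustrative assumptions, not the paper's data structures:

```python
# Sketch of entity-guided dependency pruning: keep only code units reachable
# (within a few hops) from entities mentioned in the audit report.
from collections import deque

def prune_cdg(dep_graph: dict, report_entities: set, max_hops: int = 2) -> set:
    """BFS outward from report-mentioned entities, up to max_hops edges."""
    keep = set(report_entities)
    frontier = deque((e, 0) for e in report_entities)
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for neighbor in dep_graph.get(node, []):
            if neighbor not in keep:
                keep.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return keep

# Toy dependency graph: code unit -> units/state it depends on.
graph = {
    "getPrice": ["reserve0", "reserve1"],
    "reserve0": ["sync"],
    "withdraw": ["balanceOf"],  # unrelated to the reported bug
}
context = prune_cdg(graph, {"getPrice"})
# -> {"getPrice", "reserve0", "reserve1", "sync"}; "withdraw" is pruned.
```

Pruning like this is what keeps the LLM's context window focused on code that can actually influence the vulnerability, rather than the whole project.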
Modeling
Base Model: GPT-4 and GPT-3.5
Compute: Not reported in the paper (Inference-only approach using APIs)
Comparison to Prior Work
vs. SGuard/EVMPatch: ContractTinker addresses high-level functional/business logic vulnerabilities, whereas baselines focus on low-level bugs (re-entrancy, overflow)
vs. Standard LLM (Zero-shot): ContractTinker uses CoT + Static Analysis to reduce hallucination and improve context [not cited in paper as specific baseline, but implied comparison]
vs. SCRepair: Does not rely on manually created unit tests for search guidance
Limitations
Cannot repair vulnerabilities whose business logic is extremely complex or whose audit description is too vague
Dependency on the quality of the audit report text
Static analysis (Slither) limitations may propagate to the LLM context
Dataset size (48) is relatively small due to the manual effort of collection