
4KAgent: Agentic Any Image to 4K Super-Resolution

Yushen Zuo, Qi Zheng, Mingyang Wu, Xinrui Jiang, Renjie Li, Jian Wang, Yide Zhang, Gengchen Mai, Lihong V. Wang, James Zou, Xiaoyu Wang, Mingxi Yang, Zhengzhong Tu
Texas A&M University, Stanford University, Snap Inc., University of Colorado Boulder, University of Texas at Austin, California Institute of Technology, Topaz Labs, University of California, Merced
arXiv.org (2025)
Agent MM Benchmark

📝 Paper Summary

Real-world image super-resolution, Agentic restoration frameworks, Multi-degradation image restoration
4KAgent is a generalist AI system that autonomously plans and executes restoration pipelines using vision-language models and expert tools to upscale any low-quality image to photorealistic 4K resolution.
Core Problem
Existing super-resolution models are specialists trained on synthetic data under fixed degradation assumptions, so they fail on the diverse, unknown, and compound degradations found in real-world images.
Why it matters:
  • Real-world photos suffer from unpredictable combinations of blur, noise, and compression that fixed-model approaches cannot handle simultaneously
  • Specialist models generalize poorly to out-of-distribution domains like medical imaging or satellite photography without retraining
  • Users need flexible workflows (e.g., prioritization of fidelity vs. perception) rather than rigid one-size-fits-all outputs
Concrete Example: An old, heavily degraded 256×256 photo requires denoising, deblurring, and face restoration before upscaling. A standard 4x SR model simply amplifies the artifacts, whereas 4KAgent detects the specific flaws and applies the appropriate repair tools in sequence.
Key Novelty
Agentic Generalist for Universal 4K Super-Resolution
  • Decomposes restoration into a Perception Agent (analyzes defects, creates plan) and a Restoration Agent (executes plan with reflection and rollback)
  • Introduces Quality-Driven Mixture-of-Experts (Q-MoE) to dynamically select the best output from multiple expert tools at each step based on perceptual quality metrics
  • Uses a Profile Module to allow zero-shot customization (e.g., favoring fidelity over perception) without retraining the underlying models
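The selection and rollback ideas in the bullets above can be illustrated with a toy sketch. It is not the paper's implementation: images are flat lists of floats, the "experts" are trivial filters, and `toy_score` is a made-up smoothness score standing in for the perceptual quality metrics the summary mentions. The names `q_moe_step`, `run_plan`, `box_blur`, and `amplify` are all hypothetical, and the rollback rule (reject a step whose best output scores lower than the current image) is an assumption.

```python
from typing import Callable, Dict, List, Tuple

Image = List[float]                 # toy stand-in for an image
Tool = Callable[[Image], Image]     # an expert restoration tool
Scorer = Callable[[Image], float]   # higher score = better quality


def q_moe_step(image: Image, experts: Dict[str, Tool],
               score: Scorer) -> Tuple[str, Image, float]:
    """Run every expert on the same input and keep the highest-scoring output."""
    best_name, best_out, best_score = "", image, float("-inf")
    for name, tool in experts.items():
        out = tool(image)
        s = score(out)
        if s > best_score:
            best_name, best_out, best_score = name, out, s
    return best_name, best_out, best_score


def run_plan(image: Image, plan: List[Tuple[str, Dict[str, Tool]]],
             score: Scorer) -> Tuple[Image, List[Tuple[str, str, str]]]:
    """Execute the plan step by step; roll back any step that lowers the score."""
    current, current_score = image, score(image)
    log = []
    for step_name, experts in plan:
        name, out, s = q_moe_step(current, experts, score)
        if s >= current_score:      # assumed acceptance rule
            current, current_score = out, s
            log.append((step_name, name, "accepted"))
        else:                       # keep the previous image
            log.append((step_name, name, "rolled back"))
    return current, log


# Hypothetical toy tools and quality score, for illustration only.
def box_blur(img: Image) -> Image:
    n = len(img)
    return [(img[max(i - 1, 0)] + img[i] + img[min(i + 1, n - 1)]) / 3
            for i in range(n)]


def amplify(img: Image) -> Image:
    return [2 * p for p in img]


def toy_score(img: Image) -> float:
    # Negative total variation: smoother signals score higher.
    return -sum(abs(a - b) for a, b in zip(img, img[1:]))


noisy = [0.0, 1.0, 0.0, 1.0]
plan = [
    ("denoise", {"box_blur": box_blur, "identity": lambda im: list(im)}),
    ("sharpen", {"amplify": amplify}),
]
result, log = run_plan(noisy, plan, toy_score)
print(log)
```

In this toy run the denoising step keeps the blurred output, while the second step is rolled back because its only expert lowers the score.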
Architecture
Figure 2
The overall workflow of 4KAgent, illustrating the interaction between the Profile Module, Perception Agent, and Restoration Agent.
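As a rough illustration of how a profile and a perception step might feed a restoration plan, the sketch below uses a rule-based stand-in. In the system described here the Perception Agent relies on vision-language models; the `Profile` fields, the `make_plan` function, the task names, and the 0.5 severity threshold are all hypothetical and chosen only to make the data flow concrete.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Profile:
    """Hypothetical user profile; field names are illustrative assumptions."""
    prefer: str = "perception"      # or "fidelity"
    restore_faces: bool = True
    upscale: bool = True


# Assumed mapping from a detected degradation to a restoration task.
DEGRADATION_TO_TASK = {
    "noise": "denoise",
    "blur": "deblur",
    "compression": "remove_compression_artifacts",
}


def make_plan(severity: Dict[str, float], has_face: bool, profile: Profile,
              threshold: float = 0.5) -> Dict[str, Union[str, List[str]]]:
    """Turn per-degradation severity scores into an ordered list of tasks."""
    steps = [task for deg, task in DEGRADATION_TO_TASK.items()
             if severity.get(deg, 0.0) > threshold]
    if has_face and profile.restore_faces:
        steps.append("face_restore")
    if profile.upscale:
        steps.append("super_resolve")   # upscaling comes last
    return {"objective": profile.prefer, "steps": steps}


# Example loosely mirroring the old-photo case: noisy, blurry, contains a face.
plan = make_plan({"noise": 0.8, "blur": 0.7, "compression": 0.1},
                 has_face=True, profile=Profile())
print(plan)
```

Changing only the profile (for example `Profile(prefer="fidelity", restore_faces=False)`) changes the resulting plan without touching any tool, which is the kind of customization without retraining that the Profile Module is described as providing.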
Evaluation Highlights
  • Sets a new state of the art on Real-World Super-Resolution (RealSR) benchmarks, outperforming diffusion-based methods such as StableSR and PASD
  • Achieves superior perceptual quality (lower NIQE/FID) across 11 distinct task categories including medical, satellite, and fluorescence microscopy imaging
  • Demonstrates robust zero-shot generalization to scientific domains where it was never explicitly trained, unlike traditional supervised SR models
Breakthrough Assessment
9/10
First truly universal agentic framework for super-resolution that handles arbitrary domains (medical to natural) and degradations without retraining. The execution-reflection loop significantly advances reliability in low-level vision tasks.