Evaluation Setup
Qualitative comparison of cartoonization effects on portraits, animals, landscapes, and architecture
Benchmarks:
- Self-collected test images (Image Cartoonization) [New]
Metrics:
- Visual Fidelity
- Stylization Quality
- Statistical methodology: Not explicitly reported in the paper
Key Results
Ablation studies reveal the optimal ranges for noise disturbance parameters to achieve cartoonization without degradation.

| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Parameter Sensitivity | Visual Quality (Qualitative) | Insufficient cartoonization | Optimal cartoonization | Achieved at b=300, s=300 |
| Guidance Scale | Cartoonization Degree (Qualitative) | Variable | Stable results | Range [8, 12] |
| DDIM Steps | Image Cleanliness (Qualitative) | Significant noise | Clean cartoon | N > 60 |
Main Takeaways
- Null-text guidance is not just a neutral baseline; modifying it actively shapes the generation style towards cartoons
- Back-D (Rollback Disturbance) produces stronger abstraction and suits general cartoonization, while Image-D (Image Disturbance) preserves higher-fidelity details from the input
- In the qualitative comparison, the method outperforms GAN-based baselines by generating vivid, 3D-like cartoon textures rather than flat comic styles
- Text prompts in the conditional branch can be used to inject creative diversity (e.g., changing species) while maintaining the cartoon style induced by the null-text branch
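
The first takeaway can be illustrated with the standard classifier-free guidance combination, in which the guided prediction extrapolates from the null-text prediction toward the text-conditioned one. The sketch below is a minimal NumPy illustration, not the paper's implementation: the arrays are random stand-ins for model outputs, and `delta` is a hypothetical perturbation of the null-text branch (it is not Back-D or Image-D, whose exact form is not given in this section). It shows that at a guidance scale of 10, any change to the null-text prediction reaches the guided output multiplied by (1 - scale), so the null-text branch is far from neutral.

```python
import numpy as np


def cfg_combine(eps_null, eps_text, guidance_scale):
    """Standard classifier-free guidance: start from the null-text
    prediction and extrapolate toward the text-conditioned prediction."""
    return eps_null + guidance_scale * (eps_text - eps_null)


rng = np.random.default_rng(0)

# Random stand-ins for the two noise predictions of a diffusion model.
eps_text = rng.standard_normal((4, 8, 8))
eps_null = rng.standard_normal((4, 8, 8))

# Hypothetical small perturbation of the null-text branch (assumption:
# used only for illustration, not the paper's disturbance scheme).
delta = 0.01 * rng.standard_normal(eps_null.shape)

scale = 10.0  # inside the [8, 12] range reported as stable

clean = cfg_combine(eps_null, eps_text, scale)
disturbed = cfg_combine(eps_null + delta, eps_text, scale)

# The perturbation appears in the guided output scaled by (1 - scale).
shift = disturbed - clean
amplification = np.linalg.norm(shift) / np.linalg.norm(delta)
print(f"amplification of the null-branch perturbation: {amplification:.1f}x")
```

With a scale of 1 the null-text term cancels and the guided output equals the conditional prediction; it is only at larger scales that edits to the null-text branch steer the result.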