| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Efficiency and performance comparison against FlowGRPO on specific reward maximization tasks: DiffusionNFT converges faster and reaches higher scores. | ||||
| GenEval (Object counting task) | GenEval Score (baseline: FlowGRPO) | 0.95 | 0.98 | +0.03 |
| GenEval (Object counting task) | GenEval Score (baseline: initial score before training) | 0.24 | 0.98 | +0.74 |
| General enhancement of SD3.5-Medium across multiple benchmarks using DiffusionNFT. | ||||
| PickScore | Score | Not explicitly reported | Not explicitly reported | N/A |