ImageGen-CoT: A structured chain-of-thought reasoning text that the model generates before the image, explaining the pattern or subject characteristics it has inferred
T2I-ICL: Text-to-Image In-Context Learning, i.e., generating images based on patterns learned from a few examples provided in the prompt
Unified MLLMs: Multimodal large language models capable of processing and generating both text and images within a single architecture (e.g., SEED-X, SEED-LLaMA)
Test-time scaling: Increasing computational budget during inference (e.g., by generating more samples) to improve performance
Pass@N: A metric that checks whether at least one correct output exists among N generated samples
DreamBench++: A benchmark for subject-driven image generation, evaluating the ability to generate images of a specific subject in different contexts
CoBSAT: A benchmark for T2I-ICL evaluating capabilities like style transfer, subject binding, and identifying relationships
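
To make the Pass@N and test-time scaling entries concrete, here is a minimal sketch of how such a score could be computed over a set of prompts. The function name `pass_at_n`, its parameters, and the boolean correctness flags are assumptions made for illustration; they are not taken from any benchmark's actual evaluation code, and how a sample is judged "correct" is left outside the sketch.

```python
from typing import Optional, Sequence


def pass_at_n(correct_flags: Sequence[Sequence[bool]], n: Optional[int] = None) -> float:
    """Fraction of prompts with at least one correct sample among the first n.

    correct_flags[i][j] is True if sample j for prompt i was judged correct
    (the judging step itself is assumed to happen elsewhere).
    If n is None, all available samples for each prompt are considered.
    """
    if not correct_flags:
        raise ValueError("need at least one prompt")
    if n is not None and n < 1:
        raise ValueError("n must be at least 1")
    # A prompt counts as solved if any of its first n samples is correct.
    solved = sum(1 for flags in correct_flags if any(flags[:n]))
    return solved / len(correct_flags)


# Hypothetical correctness flags: 4 prompts, 3 samples each.
example_flags = [
    [False, True, False],
    [False, False, False],
    [True, True, False],
    [False, False, True],
]

for n in (1, 2, 3):
    print(f"Pass@{n} = {pass_at_n(example_flags, n):.2f}")
```

On these made-up flags the score goes from 0.25 at N=1 to 0.50 at N=2 and 0.75 at N=3. Because a prompt solved within the first N samples stays solved for any larger N, Pass@N never decreases as N grows, which is why sampling more outputs is a simple form of test-time scaling.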