
UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalization

Junjie He, Yifeng Geng, Liefeng Bo
Institute for Intelligent Computing, Alibaba Group
arXiv.org (2024)
MM P13N

📝 Paper Summary

Identity-Preserving Generation, Multi-Concept Personalization
UniPortrait unifies single- and multi-ID image personalization by decoupling intrinsic identity from facial structure and using a routing mechanism to assign identities to specific image regions without predefined masks.
Core Problem
Existing methods struggle to balance face fidelity with editability (they often just copy the reference face), and they cannot handle multiple identities without strict prompt constraints or manual layout masks.
Why it matters:
  • Current tuning-free methods often lose spatial facial details or overfit to irrelevant reference attributes (lighting, pose), limiting creative editing.
  • Multi-ID generation typically suffers from 'identity blending' where faces mix features, or requires rigid prompt formats (one-to-one token mapping) that limit natural language description.
  • Manual masking for multi-ID generation restricts the diversity of layouts and poses, preventing the model from generating novel compositions.
Concrete Example: When generating 'a man and a woman in a cafe', standard personalization methods might blend the man's and woman's facial features onto both faces (identity blending). Alternatively, mask-based methods require the user to draw exactly where the faces should be, preventing the model from creatively positioning them.
Key Novelty
ID Embedding with Decoupling + ID Routing
  • Uses a two-branch ID embedding module: one for 'intrinsic identity' (invariant features) and one for 'face structure' (spatial details), heavily regularized to prevent overfitting to the reference image's non-identity attributes.
  • Introduces an 'ID Routing' module that dynamically assigns the best-matching identity embedding to each spatial location in the image generation process, preventing ID mixing without needing manual masks.
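The routing idea in the second bullet can be illustrated with a small sketch. This is not the paper's implementation: the summary does not say how locations are scored against identities, so the mean-token summary, the dot-product score, the hard argmax, and the function names `route_ids` and `routed_cross_attention` are all assumptions made for illustration. The point it shows is that each spatial location attends to the tokens of exactly one identity, which is what rules out mixing.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_ids(queries, id_tokens):
    """Return, for each spatial query vector, the index of the best-matching identity.

    queries:   list of query vectors, one per spatial location
    id_tokens: list of identities, each a list of token vectors
    """
    # Assumption: each identity is summarized by the mean of its tokens.
    summaries = [
        [sum(col) / len(tokens) for col in zip(*tokens)]
        for tokens in id_tokens
    ]
    # Assumption: hard argmax over dot-product scores.
    return [
        max(range(len(summaries)), key=lambda k: dot(q, summaries[k]))
        for q in queries
    ]

def routed_cross_attention(queries, id_tokens):
    """Attend each location only to the tokens of its routed identity."""
    assignment = route_ids(queries, id_tokens)
    outputs = []
    for q, k in zip(queries, assignment):
        tokens = id_tokens[k]  # keys and values are the same tokens in this sketch
        scale = math.sqrt(len(q))
        weights = softmax([dot(q, t) / scale for t in tokens])
        out = [sum(w * t[d] for w, t in zip(weights, tokens)) for d in range(len(q))]
        outputs.append(out)
    return assignment, outputs

# Toy example: three spatial locations, two identities with two tokens each.
queries = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
id_a = [[1.0, 0.0], [0.8, 0.2]]
id_b = [[0.0, 1.0], [0.1, 0.9]]
assignment, outputs = routed_cross_attention(queries, [id_a, id_b])
print(assignment)  # [0, 1, 0]
```

Because the assignment is computed from the features themselves, no user-drawn mask is involved; the layout emerges from which identity each location matches best. A trainable version would need a differentiable relaxation of the argmax, which this sketch leaves out.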
Architecture
Figure 1: Overview of the UniPortrait framework, detailing the ID Embedding Module and the ID Routing Module within the U-Net.
Evaluation Highlights
  • Achieves higher ID similarity (CS-I) and prompt consistency (CLIP-T) than InstantID and IP-Adapter-FaceID-Plus on single-ID benchmarks.
  • Outperforms FastComposer and consistent-ID-v1 in multi-ID customization, effectively preventing identity blending.
  • Demonstrates compatibility with existing control tools such as ControlNet and IP-Adapter.
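The summary does not define the metrics above. In this literature, identity similarity is commonly computed as the cosine similarity between face-recognition embeddings of the reference and generated faces, and CLIP-T as the cosine similarity between CLIP text and image embeddings. The sketch below assumes that reading; the embedding vectors are made-up placeholders, and `blending_margin` is a hypothetical helper, not a metric reported in the paper.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)

def blending_margin(generated, own_ref, other_ref):
    """Hypothetical check: how much closer a generated face is to its own
    reference than to the other identity. Values near zero suggest blending."""
    return cosine_similarity(generated, own_ref) - cosine_similarity(generated, other_ref)

# Placeholder embeddings; real ones would come from a face-recognition model.
ref_man = [1.0, 0.0, 0.0]
ref_woman = [0.0, 1.0, 0.0]
clean_face = [0.9, 0.1, 0.0]    # clearly the first identity
blended_face = [0.5, 0.5, 0.0]  # equal mix of both identities

print(blending_margin(clean_face, ref_man, ref_woman) > 0.5)   # True
print(abs(blending_margin(blended_face, ref_man, ref_woman)) < 1e-9)  # True
```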
Breakthrough Assessment
8/10
Significantly improves the flexibility of multi-ID generation by removing layout/prompt constraints while maintaining high fidelity. The routing mechanism is a clever architectural solution to identity blending.