USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning
Abstract
USO is a unified customization model that achieves state-of-the-art style similarity and subject consistency by disentangling and re-composing content and style through a disentangled learning scheme and a style reward-learning paradigm.
Existing literature typically treats style-driven and subject-driven generation as two disjoint tasks: the former prioritizes stylistic similarity, whereas the latter insists on subject consistency, resulting in an apparent antagonism. We argue that both objectives can be unified under a single framework because they ultimately concern the disentanglement and re-composition of content and style, a long-standing theme in style-driven research. To this end, we present USO, a Unified Style-Subject Optimized customization model. First, we construct a large-scale triplet dataset consisting of content images, style images, and their corresponding stylized content images. Second, we introduce a disentangled learning scheme that simultaneously aligns style features and disentangles content from style through two complementary objectives, style-alignment training and content-style disentanglement training. Third, we incorporate a style reward-learning paradigm denoted as SRL to further enhance the model's performance. Finally, we release USO-Bench, the first benchmark that jointly evaluates style similarity and subject fidelity across multiple metrics. Extensive experiments demonstrate that USO achieves state-of-the-art performance among open-source models along both dimensions of subject consistency and style similarity. Code and model: https://github.com/bytedance/USO
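For intuition, the sketch below shows one way the pieces named in the abstract (a content/style/stylized triplet dataset, the two disentangled-learning objectives, and the SRL reward term) could be wired into a single training objective. Everything here is a hypothetical illustration: the cosine style-alignment loss, the cross-covariance disentanglement penalty, and the negated-reward SRL term are assumed forms for exposition, not the paper's implementation.

```python
from dataclasses import dataclass

import torch
import torch.nn.functional as F


@dataclass
class Triplet:
    """One example from the triplet dataset described in the abstract:
    a content image, a style image, and the stylized content image."""
    content: torch.Tensor    # (C, H, W) content reference
    style: torch.Tensor      # (C, H, W) style reference
    stylized: torch.Tensor   # (C, H, W) content rendered in the style


def style_alignment_loss(style_feats: torch.Tensor,
                         target_feats: torch.Tensor) -> torch.Tensor:
    """Pull the model's style embedding toward the style reference's
    embedding (hypothetical cosine-distance form)."""
    return 1.0 - F.cosine_similarity(style_feats, target_feats, dim=-1).mean()


def disentanglement_loss(content_feats: torch.Tensor,
                         style_feats: torch.Tensor) -> torch.Tensor:
    """Penalize cross-covariance between content and style features so
    the two factors separate (hypothetical penalty form)."""
    c = content_feats - content_feats.mean(dim=0, keepdim=True)
    s = style_feats - style_feats.mean(dim=0, keepdim=True)
    cross_cov = (c.T @ s) / max(c.shape[0] - 1, 1)
    return cross_cov.pow(2).mean()


def srl_loss(reward: torch.Tensor) -> torch.Tensor:
    """Style reward learning: maximize a scalar style-similarity reward
    (here simply negated; the paper's exact objective may differ)."""
    return -reward.mean()


# Toy batch of feature vectors standing in for encoder outputs.
batch, dim = 8, 64
content_feats = torch.randn(batch, dim)
style_feats = torch.randn(batch, dim)
target_style = torch.randn(batch, dim)
reward = torch.rand(batch)  # e.g. from a style-similarity scorer

total = (style_alignment_loss(style_feats, target_style)
         + disentanglement_loss(content_feats, style_feats)
         + srl_loss(reward))
print(f"combined objective: {total.item():.4f}")
```

In an actual training loop the feature vectors would come from the model's content and style encoders and the reward from a style-similarity scorer; the toy tensors above only keep the sketch self-contained and runnable.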
Community
🔥🔥 We introduce USO, an open-source unified customization model that can freely combine any subject with any style in any scenario, delivering outputs with high subject/identity consistency and strong style fidelity while ensuring natural, non-plastic portraits.
- Code: https://github.com/bytedance/USO
- Project page: https://bytedance.github.io/USO/
- Hugging Face Space: https://huggingface.co/spaces/bytedance-research/USO
- Model checkpoint: https://huggingface.co/bytedance-research/USO
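For readers who want to try the release, here is a minimal sketch of fetching the weights with the standard huggingface_hub API. The repo id comes from the checkpoint link above; the actual inference entry points live in the GitHub repository.

```python
from huggingface_hub import snapshot_download

# Download the released USO weights from the Hub. The returned local path
# can then be passed to the inference scripts in the GitHub repository.
local_dir = snapshot_download(repo_id="bytedance-research/USO")
print(f"checkpoint downloaded to: {local_dir}")
```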
Open-source unified customization model
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- IC-Custom: Diverse Image Customization via In-Context Learning (2025)
- FreeCus: Free Lunch Subject-driven Customization in Diffusion Transformers (2025)
- StyDeco: Unsupervised Style Transfer with Distilling Priors and Semantic Decoupling (2025)
- CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models (2025)
- DreamPoster: A Unified Framework for Image-Conditioned Generative Poster Design (2025)
- A Training-Free Style-Personalization via Scale-wise Autoregressive Model (2025)
- SPG: Style-Prompting Guidance for Style-Specific Content Creation (2025)