fenfan and nielsr (HF Staff) committed
Commit 18feb32 · verified · 1 Parent(s): b745e66

Improve model card: Correct library_name to diffusers and add full abstract (#5)


- Improve model card: Correct library_name to diffusers and add full abstract (cf428c6dafa0748eb8be5114203a6d5ef71aab6d)


Co-authored-by: Niels Rogge <[email protected]>

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -3,7 +3,7 @@ base_model:
 - black-forest-labs/FLUX.1-dev
 language:
 - en
-library_name: transformers
+library_name: diffusers
 license: apache-2.0
 pipeline_tag: text-to-image
 tags:
@@ -31,8 +31,8 @@ Paper: [USO: Unified Style and Subject-Driven Generation via Disentangled and Re
 
 ![teaser of USO](./assets/teaser.webp)
 
-## 📖 Introduction
-Existing literature typically treats style-driven and subject-driven generation as two disjoint tasks: the former prioritizes stylistic similarity, whereas the latter insists on subject consistency, resulting in an apparent antagonism. We argue that both objectives can be unified under a single framework because they ultimately concern the disentanglement and re-composition of content and style”, a long-standing theme in style-driven research. To this end, we present USO, a Unified framework for Style driven and subject-driven GeneratiOn. First, we construct a large-scale triplet dataset consisting of content images, style images, and their corresponding stylized content images. Second, we introduce a disentangled learning scheme that simultaneously aligns style features and disentangles content from style through two complementary objectives, style-alignment training and contentstyle disentanglement training. Third, we incorporate a style reward-learning paradigm to further enhance the models performance.
+## Abstract
+Existing literature typically treats style-driven and subject-driven generation as two disjoint tasks: the former prioritizes stylistic similarity, whereas the latter insists on subject consistency, resulting in an apparent antagonism. We argue that both objectives can be unified under a single framework because they ultimately concern the disentanglement and re-composition of content and style, a long-standing theme in style-driven research. To this end, we present USO, a Unified Style-Subject Optimized customization model. First, we construct a large-scale triplet dataset consisting of content images, style images, and their corresponding stylized content images. Second, we introduce a disentangled learning scheme that simultaneously aligns style features and disentangles content from style through two complementary objectives, style-alignment training and content-style disentanglement training. Third, we incorporate a style reward-learning paradigm denoted as SRL to further enhance the model's performance. Finally, we release USO-Bench, the first benchmark that jointly evaluates style similarity and subject fidelity across multiple metrics. Extensive experiments demonstrate that USO achieves state-of-the-art performance among open-source models along both dimensions of subject consistency and style similarity. Code and model: this https URL
 
 ## ⚡️ Quick Start
 
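
With the metadata now declaring `library_name: diffusers`, `pipeline_tag: text-to-image`, and `base_model: black-forest-labs/FLUX.1-dev`, the checkpoint is expected to be consumed through the diffusers library. The snippet below is only a minimal sketch of loading the declared base model with the standard diffusers FLUX pipeline; the USO-specific weights and any extra setup live in the model card's Quick Start section (not shown in this diff), so anything beyond the base-model call is an assumption.

```python
# Minimal sketch: load the declared base model (black-forest-labs/FLUX.1-dev)
# with the standard diffusers FluxPipeline. The USO-specific weights and their
# loading steps are NOT covered here; see the model card's Quick Start for the
# actual procedure.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",  # base_model from the card metadata
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# pipeline_tag: text-to-image — generate an image from a text prompt.
image = pipe(
    "a watercolor-style portrait of a cat",
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("example.png")
```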