nielsr (HF Staff) committed on
Commit cf428c6 · verified · 1 Parent(s): b745e66

Improve model card: Correct library_name to diffusers and add full abstract


This PR improves the model card for USO by:

1. **Correcting `library_name`**: The `library_name` has been changed from `transformers` to `diffusers`. This is based on the `_diffusers_version` entry in the repository's `config.json` and on the model being a Diffusion Transformer built on FLUX.1-dev, both of which indicate `diffusers` compatibility. This ensures that the automated code snippets generated by the Hugging Face Hub accurately reflect the model's intended usage (an illustrative snippet is sketched at the end of this description).
2. **Adding the full paper abstract**: The existing "Introduction" section has been replaced with the complete abstract provided in the paper details, and the section has been renamed to "Abstract". This provides a more comprehensive and standard overview of the model directly at the top of the model card.

All other sections, including the custom inference examples from the GitHub README, remain unchanged to preserve the original author's provided usage instructions.
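
For reference, below is a minimal sketch of the kind of `diffusers` snippet the Hub auto-generates for a text-to-image model carrying this metadata. It is illustrative only: it loads the FLUX.1-dev base model that USO builds on rather than USO's own checkpoints, and the prompt, resolution, and sampler settings are assumed values; USO's actual inference flow is the custom code preserved in the Quick Start section.

```python
import torch
from diffusers import FluxPipeline

# Illustrative sketch only: this loads the FLUX.1-dev base model that USO
# builds on. USO's own checkpoints require the custom inference code from
# the Quick Start section of the README.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # offload idle submodules to CPU to reduce VRAM use

# Assumed example prompt and settings.
image = pipe(
    "a cat wearing a spacesuit, watercolor style",
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=28,
).images[0]
image.save("example.png")
```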

Files changed (1)
  1. README.md +3 -3
README.md CHANGED

@@ -3,7 +3,7 @@ base_model:
 - black-forest-labs/FLUX.1-dev
 language:
 - en
-library_name: transformers
+library_name: diffusers
 license: apache-2.0
 pipeline_tag: text-to-image
 tags:
@@ -31,8 +31,8 @@ Paper: [USO: Unified Style and Subject-Driven Generation via Disentangled and Re
 
 ![teaser of USO](./assets/teaser.webp)
 
-## 📖 Introduction
-Existing literature typically treats style-driven and subject-driven generation as two disjoint tasks: the former prioritizes stylistic similarity, whereas the latter insists on subject consistency, resulting in an apparent antagonism. We argue that both objectives can be unified under a single framework because they ultimately concern the disentanglement and re-composition of content and style, a long-standing theme in style-driven research. To this end, we present USO, a Unified framework for Style driven and subject-driven GeneratiOn. First, we construct a large-scale triplet dataset consisting of content images, style images, and their corresponding stylized content images. Second, we introduce a disentangled learning scheme that simultaneously aligns style features and disentangles content from style through two complementary objectives, style-alignment training and content-style disentanglement training. Third, we incorporate a style reward-learning paradigm to further enhance the model's performance.
+## Abstract
+Existing literature typically treats style-driven and subject-driven generation as two disjoint tasks: the former prioritizes stylistic similarity, whereas the latter insists on subject consistency, resulting in an apparent antagonism. We argue that both objectives can be unified under a single framework because they ultimately concern the disentanglement and re-composition of content and style, a long-standing theme in style-driven research. To this end, we present USO, a Unified Style-Subject Optimized customization model. First, we construct a large-scale triplet dataset consisting of content images, style images, and their corresponding stylized content images. Second, we introduce a disentangled learning scheme that simultaneously aligns style features and disentangles content from style through two complementary objectives, style-alignment training and content-style disentanglement training. Third, we incorporate a style reward-learning paradigm denoted as SRL to further enhance the model's performance. Finally, we release USO-Bench, the first benchmark that jointly evaluates style similarity and subject fidelity across multiple metrics. Extensive experiments demonstrate that USO achieves state-of-the-art performance among open-source models along both dimensions of subject consistency and style similarity. Code and model: this https URL
 
 ## ⚡️ Quick Start
 