Added API documentation
- docs/API.md +340 -0
- docs/images/accessory_result_01.jpg +3 -0
- docs/images/avatar_image_01.jpg +3 -0
- docs/images/avatar_image_02.jpg +3 -0
- docs/images/avatar_image_03.jpg +3 -0
- docs/images/avatar_image_04.jpg +3 -0
- docs/images/avatar_modification_result_01.jpg +3 -0
- docs/images/avatar_modification_result_02.jpg +3 -0
- docs/images/avatar_prompt_result_01.jpg +3 -0
- docs/images/avatar_prompt_result_02.jpg +3 -0
- docs/images/avatar_prompt_result_03.jpg +3 -0
- docs/images/background_image_01.jpg +3 -0
- docs/images/background_image_02.jpg +3 -0
- docs/images/background_image_03.jpg +3 -0
- docs/images/background_image_04.jpg +3 -0
- docs/images/clothing_image_01.jpg +3 -0
- docs/images/clothing_image_02.jpg +3 -0
- docs/images/clothing_image_03.jpg +3 -0
- docs/images/clothing_image_04.jpg +3 -0
- docs/images/clothing_prompt_result_01.jpg +3 -0
- docs/images/clothing_prompt_result_02.jpg +3 -0
- docs/images/image_based_background_result_01.jpg +3 -0
- docs/images/image_based_background_result_02.jpg +3 -0
- docs/images/image_based_result_01.jpg +3 -0
- docs/images/new_background_result_01.jpg +3 -0
- docs/images/same_crop_result_01.jpg +3 -0
- docs/images/same_crop_result_02.jpg +3 -0
- docs/images/txt2img_result_01.jpg +3 -0
- docs/images/txt2img_result_02.jpg +3 -0
docs/API.md
ADDED
@@ -0,0 +1,340 @@
# Virtual Try-On Diffusion API

<!-- TOC -->
* [Virtual Try-On Diffusion API](#virtual-try-on-diffusion-api)
  * [Summary](#summary)
  * [Consuming the API](#consuming-the-api)
  * [Try-On Endpoints](#try-on-endpoints)
  * [Try-On Input Parameters](#try-on-input-parameters)
    * [Clothing image](#clothing-image)
    * [Clothing prompt](#clothing-prompt)
    * [Avatar image](#avatar-image)
    * [Avatar prompt](#avatar-prompt)
    * [Background image](#background-image)
    * [Background prompt](#background-prompt)
    * [Additional notes](#additional-notes)
  * [Try-On Output](#try-on-output)
    * [Response codes](#response-codes)
    * [NSFW content](#nsfw-content)
  * [Use Cases and Recipes](#use-cases-and-recipes)
    * [Image-based virtual try-on](#image-based-virtual-try-on)
    * [Image-based virtual try-on with background](#image-based-virtual-try-on-with-background)
    * [Avatar from a text prompt](#avatar-from-a-text-prompt)
    * [Clothing from a text prompt](#clothing-from-a-text-prompt)
    * [Modifying avatar's body](#modifying-avatars-body)
    * [Txt2Img](#txt2img)
    * [Other creative possibilities](#other-creative-possibilities)
  * [Performance](#performance)
  * [Known Issues and Limitations](#known-issues-and-limitations)
<!-- TOC -->

## Summary

Virtual Try-On Diffusion [VTON-D] by [Texel.Moda](https://texelmoda.com) is a custom diffusion-based pipeline for fast
and flexible multi-modal virtual try-on. Clothing, avatar and background can be specified by reference images or text
prompts, allowing for clothing transfer, avatar replacement, fashion image generation and other virtual try-on related
tasks. Check out the [demo on HuggingFace](https://huggingface.co/spaces/texelmoda/try-on-diffusion) to try the API in
a user-friendly way.

## Consuming the API

The API is exposed through the RapidAPI Hub, which manages API subscriptions, API keys, payments and other things. Please
refer to the [RapidAPI Documentation](https://docs.rapidapi.com/docs/consumer-quick-start-guide) to get started.

Generally, to use the API you need to perform the following steps:
- Create a RapidAPI.com account.
- [Navigate to the API page](https://rapidapi.com/texelmoda-texelmoda-apis/api/try-on-diffusion) and subscribe to a
  suitable pricing plan. We also provide a free BASIC plan with 100 API requests per month.
- Use the obtained RapidAPI key to authenticate (via the _X-RapidAPI-Key_ header) and call the API from any programming
  language or tool you like.

Example API call using cURL:
```shell
# The @ prefix makes curl upload the file contents instead of sending the file name as text.
curl --request POST \
  --url https://try-on-diffusion.p.rapidapi.com/try-on-file \
  --header 'Content-Type: multipart/form-data' \
  --header 'x-rapidapi-host: try-on-diffusion.p.rapidapi.com' \
  --header 'x-rapidapi-key: <RapidAPI Key>' \
  --form clothing_image=@1.jpg \
  --form avatar_image=@2.jpg
```

For a simple Python client implementation please see the
[HuggingFace demo application source](https://huggingface.co/spaces/texelmoda/try-on-diffusion/blob/main/try_on_diffusion_client.py).
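
For readers who want to avoid extra dependencies, the cURL call above can be approximated with the Python standard library alone. The sketch below only builds the multipart request; it is not taken from the demo client. The function names `build_try_on_file_request` and `send` are hypothetical, while the URL, host and header names are copied from the cURL example.

```python
import urllib.request
import uuid

API_HOST = "try-on-diffusion.p.rapidapi.com"
API_URL = f"https://{API_HOST}/try-on-file"


def build_try_on_file_request(api_key, files, fields=None):
    """Build (but do not send) a multipart/form-data POST request.

    files:  {form field name: (file name, raw bytes, MIME type)}
    fields: {form field name: text value}, e.g. prompts or seed
    """
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text fields (prompts, avatar_sex, seed).
    for name, value in (fields or {}).items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    # File fields (clothing_image, avatar_image, background_image).
    for name, (filename, data, mime) in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f"Content-Type: {mime}\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + data + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            "x-rapidapi-host": API_HOST,
            "x-rapidapi-key": api_key,
        },
    )


def send(request, timeout=60):
    """Send the request and return (status, headers, body). Raises HTTPError on non-2xx."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, dict(response.headers), response.read()
```

A call would pass the bytes of `1.jpg` and `2.jpg` under the `clothing_image` and `avatar_image` field names and then hand the result to `send`. The 60 second timeout is an arbitrary choice for this sketch, not a documented value.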

## Try-On Endpoints

The Try-On API consists of two endpoints that differ only in the method of passing reference images:

- **POST** _/try-on-file_ - takes reference images as uploaded files in the request body (using multipart/form-data).

- **POST** _/try-on-url_ - takes reference images as image URLs in POST parameters.

All image requirements, behavior and status codes are the same for both endpoints; choose the one that best suits your
application architecture.
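
As a rough illustration of the URL-based variant, the sketch below builds a _/try-on-url_ request with the Python standard library. It assumes that "POST parameters" means a form-encoded request body, which this document does not state explicitly. The function name `build_try_on_url_request` is hypothetical; the host and header names are taken from the cURL example.

```python
import urllib.parse
import urllib.request

API_HOST = "try-on-diffusion.p.rapidapi.com"


def build_try_on_url_request(api_key, clothing_image_url=None, avatar_image_url=None, **extra):
    """Build (but do not send) a POST request for the /try-on-url endpoint."""
    params = {
        "clothing_image_url": clothing_image_url,
        "avatar_image_url": avatar_image_url,
        **extra,  # e.g. background_image_url, prompts, avatar_sex, seed
    }
    # All inputs are optional, so unset values are simply left out.
    params = {k: v for k, v in params.items() if v not in (None, "")}
    body = urllib.parse.urlencode(params).encode("ascii")
    return urllib.request.Request(
        f"https://{API_HOST}/try-on-url",
        data=body,
        method="POST",
        headers={
            # Assumption: the endpoint accepts a form-encoded body.
            "Content-Type": "application/x-www-form-urlencoded",
            "x-rapidapi-host": API_HOST,
            "x-rapidapi-key": api_key,
        },
    )
```

The resulting object can be passed to `urllib.request.urlopen`. If the endpoint turns out to expect multipart data instead, only the body encoding and the `Content-Type` header would need to change.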

## Try-On Input Parameters

All input parameters for the try-on endpoints are currently optional. Images and prompts serve as additional generation
conditions and can even be used in combination. Below is a short parameter summary with links to extended information
on certain parameters.

List of input parameters for the **POST** _/try-on-file_ endpoint:

| Parameter | Description | Required |
|---|---|---|
| [clothing_image](#clothing-image) | Clothing reference image in JPEG, PNG or WEBP format, maximum file size is 12 MB. | No |
| [clothing_prompt](#clothing-prompt) | Text prompt for clothing, can be used instead of an image. Compel weighting syntax is supported. Example: _red sleeveless mini dress_ | No |
| [avatar_image](#avatar-image) | Avatar image in JPEG, PNG or WEBP format, maximum file size is 12 MB. | No |
| avatar_sex | Avatar sex, either "male" or "female". Detected automatically if left empty or omitted; enforced if specified. | No |
| [avatar_prompt](#avatar-prompt) | Text prompt for the avatar, can be used instead of an image or together with an image to modify the avatar. Compel weighting syntax is supported. Example: _a gentleman with beard and mustache_ | No |
| [background_image](#background-image) | Optional background reference image in JPEG, PNG or WEBP format, maximum file size is 12 MB. The original avatar background is preserved if no background is specified. | No |
| [background_prompt](#background-prompt) | Optional background text prompt. The original avatar background is preserved if no background is specified. Example: _in an autumn park_ | No |
| seed | Seed for image generation. Default is -1 (random seed). The actual seed is also returned in the "X-Seed" response header. Example: _42_ | No |

List of input parameters for the **POST** _/try-on-url_ endpoint:

| Parameter | Description | Required |
|---|---|---|
| [clothing_image_url](#clothing-image) | Clothing reference image URL. Image should be in JPEG, PNG or WEBP format, maximum file size is 12 MB. | No |
| [clothing_prompt](#clothing-prompt) | Text prompt for clothing, can be used instead of an image. Compel weighting syntax is supported. Example: _red sleeveless mini dress_ | No |
| [avatar_image_url](#avatar-image) | Avatar image URL. Image should be in JPEG, PNG or WEBP format, maximum file size is 12 MB. | No |
| avatar_sex | Avatar sex, either "male" or "female". Detected automatically if left empty or omitted; enforced if specified. | No |
| [avatar_prompt](#avatar-prompt) | Text prompt for the avatar, can be used instead of an image or together with an image to modify the avatar. Compel weighting syntax is supported. Example: _a gentleman with beard and mustache_ | No |
| [background_image_url](#background-image) | Optional background reference image URL. Image should be in JPEG, PNG or WEBP format, maximum file size is 12 MB. The original avatar background is preserved if no background is specified. | No |
| [background_prompt](#background-prompt) | Optional background text prompt. The original avatar background is preserved if no background is specified. Example: _in an autumn park_ | No |
| seed | Seed for image generation. Default is -1 (random seed). The actual seed is also returned in the "X-Seed" response header. Example: _42_ | No |
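
The non-image parameters in the tables above can be collected and checked on the client side before a request is made. The following is a minimal sketch under that assumption; the helper name `build_text_params` is hypothetical, and the only rules it enforces are the ones stated in the tables (allowed `avatar_sex` values, default seed of -1).

```python
ALLOWED_AVATAR_SEX = {"male", "female"}


def build_text_params(clothing_prompt="", avatar_sex="", avatar_prompt="",
                      background_prompt="", seed=-1):
    """Return the text form fields shared by both endpoints, omitting empty ones."""
    if avatar_sex and avatar_sex not in ALLOWED_AVATAR_SEX:
        raise ValueError('avatar_sex must be "male", "female" or empty')
    if not isinstance(seed, int):
        raise ValueError("seed must be an integer")
    params = {
        "clothing_prompt": clothing_prompt,
        "avatar_sex": avatar_sex,
        "avatar_prompt": avatar_prompt,
        "background_prompt": background_prompt,
    }
    # Empty values are dropped: every parameter is optional.
    params = {name: value for name, value in params.items() if value != ""}
    # -1 is the documented default (random seed), so it does not need to be sent.
    if seed != -1:
        params["seed"] = str(seed)
    return params
```

For example, `build_text_params(clothing_prompt="red sleeveless mini dress", seed=42)` yields only the two fields that were set, which can then be merged with the image fields of either endpoint.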

### Clothing image

For best results, clothing reference images should meet a number of requirements:

- File format: **JPEG**, **PNG** or **WEBP**
- Maximum file size: **12 MB**
- Minimum image size: **256x256**
- Recommended image size: **768x1024 and above**
- Clothing should be **worn by a person**. Some flat lay clothing photos might work, but this is currently not guaranteed
- **Single person** in the image (though multiple persons might also work)
- **Frontal** photo, though some degree of rotation is fine
- **Good lighting** conditions and **high image quality**, as these directly affect the result
- **Minimal occlusion** by hair, hands or accessories

To summarize: the better the clothing image, the better the final result.
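
The file format and file size requirements can be checked locally before uploading. The sketch below does this with standard file signatures and nothing else. The function names are hypothetical, 12 MB is assumed to mean 12 * 1024 * 1024 bytes, and pixel dimensions are not checked because that would need an image library.

```python
MAX_FILE_SIZE = 12 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def detect_image_format(data: bytes):
    """Return "JPEG", "PNG" or "WEBP" based on the file signature, else None."""
    if data[:3] == b"\xff\xd8\xff":
        return "JPEG"
    if data[:8] == b"\x89PNG\r\n\x1a\n":
        return "PNG"
    # WEBP is a RIFF container: "RIFF", 4 size bytes, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "WEBP"
    return None


def check_reference_image(data: bytes):
    """Return a list of problems found; an empty list means the basic checks passed."""
    problems = []
    if len(data) > MAX_FILE_SIZE:
        problems.append("file is larger than 12 MB")
    if detect_image_format(data) is None:
        problems.append("not a JPEG, PNG or WEBP file")
    return problems
```

The same check applies to avatar and background images, since they share the format and size limits.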

Examples of good clothing images:

| <img src="images/clothing_image_01.jpg" width="240"> | <img src="images/clothing_image_02.jpg" width="240"> | <img src="images/clothing_image_03.jpg" width="240"> | <img src="images/clothing_image_04.jpg" width="240"> |
|---|---|---|---|

### Clothing prompt

Instead of a clothing image you can use a text prompt to describe the garment. Short and clear prompts work best.
Additionally, [Compel weighting syntax](https://github.com/damian0815/compel/blob/main/doc/syntax.md) is supported to
increase or decrease the weight of certain tokens. Examples:
- _a sheer blue sleeveless mini dress_
- _a beige woolen sweater and white pleated skirt_
- _a black leather jacket and dark blue slim-fit jeans_
- _a floral pattern blouse and leggings_
- _a colorful+++ t-shirt and black shorts_
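
When prompts are assembled programmatically, the weighting marks can be attached with a small helper. This is only a sketch of the `+` / `-` suffix form of the Compel syntax used in the examples above; the function name `weighted` is hypothetical.

```python
def weighted(term: str, level: int) -> str:
    """Attach Compel "+" (more weight) or "-" (less weight) marks to a term."""
    if level == 0:
        return term
    marks = ("+" if level > 0 else "-") * abs(level)
    # Multi-word terms are wrapped in parentheses so the marks cover the whole phrase.
    if " " in term.strip():
        term = f"({term.strip()})"
    return term + marks


prompt = f"a {weighted('colorful', 3)} t-shirt and black shorts"
```

Here `prompt` reproduces the last example in the list, and `weighted("plus size", 2)` gives the parenthesized form `(plus size)++`.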

### Avatar image

Avatar images should also meet some requirements:

- File format: **JPEG**, **PNG** or **WEBP**
- Maximum file size: **12 MB**
- Minimum image size: **256x256**
- Recommended image size: **768x1024 and above**
- **Single person** in the image (though multiple persons might also work)
- **Frontal** photo, though some degree of rotation is fine
- **Good lighting** conditions and **high image quality**

Examples of good avatar images:

| <img src="images/avatar_image_01.jpg" width="240"> | <img src="images/avatar_image_02.jpg" width="240"> | <img src="images/avatar_image_03.jpg" width="240"> | <img src="images/avatar_image_04.jpg" width="240"> |
|---|---|---|---|

### Avatar prompt

Instead of an avatar image you can use a text prompt to describe the person. Short and clear prompts work best.
Additionally, [Compel weighting syntax](https://github.com/damian0815/compel/blob/main/doc/syntax.md) is supported to
increase or decrease the weight of certain tokens. Examples:
- _a beautiful blond girl with long hair_
- _a cute redhead girl with freckles_
- _a (plus size)++ female model wearing sunglasses_
- _a fit man with dark beard and blue eyes_
- _a gentleman with beard and mustache_

### Background image

Background images are used to extract high-level background features only and serve as a reference (not as the exact
background). Below are the basic image requirements:

- File format: **JPEG**, **PNG** or **WEBP**
- Maximum file size: **12 MB**
- Recommended image size: **256x256 and above**

Examples of background images:

| <img src="images/background_image_01.jpg" width="240"> | <img src="images/background_image_02.jpg" width="240"> | <img src="images/background_image_03.jpg" width="240"> | <img src="images/background_image_04.jpg" width="240"> |
|---|---|---|---|

### Background prompt

Instead of a background image you can use a text prompt to describe the background. Short and clear prompts work best.
Additionally, [Compel weighting syntax](https://github.com/damian0815/compel/blob/main/doc/syntax.md) is supported to
increase or decrease the weight of certain tokens. Examples:
- _in an autumn park_
- _in front of a brick wall_
- _on an ocean beach with (palm trees)++_
- _in a shopping mall_
- _in a modern office_

### Additional notes

We use the "same-crop" approach for clothing and avatar images: both images are cropped roughly the same way (using pose
estimation), so the model doesn't have to invent too much new information (e.g. guess the lower body clothing). So, if you
provide a photo showing only upper body clothing, the result will be cropped the same way regardless of the avatar image
(and the other way around):

| Clothing Image | Avatar Image | Result Image |
|---|---|---|
| <img src="images/clothing_image_02.jpg" width="240"> | <img src="images/avatar_image_02.jpg" width="240"> | <img src="images/same_crop_result_01.jpg" width="240"> |
| <img src="images/clothing_image_03.jpg" width="240"> | <img src="images/avatar_image_03.jpg" width="240"> | <img src="images/same_crop_result_02.jpg" width="240"> |

## Try-On Output

### Response codes

The HTTP status code is used as a high-level response status. A successful API call returns HTTP code 200, and the
response body contains the resulting JPEG image with a maximum size of 768x1024 pixels. The response also has the
"X-Seed" header set to the actual seed used for image generation (for reproducibility). Any other status code
indicates an unsuccessful request; see the table below for details:

| Response Code | Content-Type | Headers | Description | Example |
|:---:|:---:|:---:|---|:---:|
| **200** | image/jpeg | X-Seed: {seed} | Successful API call. Response body contains the resulting image in JPEG format. | <img src="images/same_crop_result_01.jpg" width="160"> |
| **400** | application/json | | Bad request: at least one of the request parameters is invalid. Response body should contain additional error details in JSON format. | { "detail": "Invalid upload file type: application/x-zip-compressed" } |
| **403** | application/json | | Indicates an authentication issue (e.g. invalid API key). | |
| **422** | application/json | | Request validation error. Response body should contain error details in JSON format. | { "detail": [ { "loc": [ "string", 0], "msg": "string", "type": "string" } ] } |
| **429** | | | Too many requests. Might be triggered by the RapidAPI proxy when the maximum request rate or API call limit is reached. | |
| **500** | | | Indicates an internal server error, might not have any details. | |
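
The table above maps fairly directly onto client-side handling. The sketch below is one possible way to interpret a response, assuming the caller already has the status code, headers and raw body from whatever HTTP client is in use. `TryOnError` and `parse_try_on_response` are hypothetical names, not part of the API.

```python
import json


class TryOnError(Exception):
    """Raised for any non-200 response; detail is None when no JSON details exist."""

    def __init__(self, status, detail):
        super().__init__(f"HTTP {status}: {detail}")
        self.status = status
        self.detail = detail


def parse_try_on_response(status, headers, body):
    """Return (jpeg_bytes, seed) for a 200 response, raise TryOnError otherwise."""
    headers = {name.lower(): value for name, value in headers.items()}
    if status == 200:
        seed = headers.get("x-seed")
        return body, int(seed) if seed is not None else None
    detail = None
    # 400 and 422 responses should carry a JSON body with a "detail" entry.
    if "application/json" in headers.get("content-type", ""):
        try:
            detail = json.loads(body).get("detail")
        except (ValueError, AttributeError):
            detail = None
    raise TryOnError(status, detail)
```

Storing the returned seed next to the image makes it possible to pass the same value back in the `seed` parameter later.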

### NSFW content

We use an NSFW content checker to ensure we don't output inappropriate images. If potential NSFW content is detected in
the generated image, the API will return HTTP status code 400 with a corresponding error message in the JSON response.

## Use Cases and Recipes

Our Virtual Try-On API offers a flexible way to specify clothing, avatar and background, which makes it possible to not
only perform the classic task of virtual try-on, but also generate entirely new images or alter existing images in
interesting ways. Feel free to try and explore!

In all the examples below, all unmentioned inputs are assumed to be empty.

### Image-based virtual try-on

The most common use case is to transfer clothing from one photo (e.g. from a product page) to another photo (e.g.
user avatar) while maintaining the avatar and the background.

| Clothing Image | Avatar Image | Result Image |
|---|---|---|
| <img src="images/clothing_image_01.jpg" width="240"> | <img src="images/avatar_image_02.jpg" width="240"> | <img src="images/image_based_result_01.jpg" width="240"> |

### Image-based virtual try-on with background

Additionally, it's possible to replace the avatar background with a reference image or a text prompt.

| Clothing Image | Avatar Image | Background Image | Result Image |
|---|---|---|---|
| <img src="images/clothing_image_04.jpg" width="240"> | <img src="images/avatar_image_03.jpg" width="240"> | <img src="images/background_image_01.jpg" width="240"> | <img src="images/image_based_background_result_01.jpg" width="240"> |

And with a text prompt for the background:

| Clothing Image | Avatar Image | Background Prompt | Result Image |
|---|---|---|---|
| <img src="images/clothing_image_04.jpg" width="240"> | <img src="images/avatar_image_03.jpg" width="240"> | in front of a snowy mountain | <img src="images/image_based_background_result_02.jpg" width="240"> |

### Avatar from a text prompt

It's possible to replace the person in the clothing image with an avatar described in a text prompt. The background will
be changed as well, and will be random if not specified:

| Clothing Image | Avatar Prompt | Background Prompt | Result Image |
|---|---|---|---|
| <img src="images/clothing_image_02.jpg" width="240"> | a beautiful blond girl with long hair | | <img src="images/avatar_prompt_result_01.jpg" width="240"> |
| <img src="images/clothing_image_03.jpg" width="240"> | a gentleman with a long beard and mustache | near a fireplace | <img src="images/avatar_prompt_result_02.jpg" width="240"> |

You may also experiment with avatar prompts for more interesting results:

| Clothing Image | Avatar Prompt | Background Prompt | Result Image |
|---|---|---|---|
| <img src="images/clothing_image_03.jpg" width="240"> | (iron man mask)+++ | in the Sahara Desert | <img src="images/avatar_prompt_result_03.jpg" width="240"> |

### Clothing from a text prompt

Similarly, you can specify clothing with a text prompt while providing an avatar image:

| Clothing Prompt | Avatar Image | Result Image |
|---|---|---|
| a sheer blue sleeveless mini dress | <img src="images/avatar_image_02.jpg" width="240"> | <img src="images/clothing_prompt_result_01.jpg" width="240"> |
| a colorful t-shirt and black shorts | <img src="images/avatar_image_03.jpg" width="240"> | <img src="images/clothing_prompt_result_02.jpg" width="240"> |

### Modifying avatar's body

If you use the same image for both clothing and avatar while providing an avatar prompt, it's possible to change the
avatar's body proportions. Note that stronger changes may require additional term weighting.

| Clothing Image | Avatar Image | Avatar Prompt | Result Image |
|---|---|---|---|
| <img src="images/clothing_image_01.jpg" width="240"> | <img src="images/clothing_image_01.jpg" width="240"> | a (plus size)+ woman | <img src="images/avatar_modification_result_01.jpg" width="240"> |
| <img src="images/clothing_image_03.jpg" width="240"> | <img src="images/clothing_image_03.jpg" width="240"> | a (muscular bodybuilder)+++++ | <img src="images/avatar_modification_result_02.jpg" width="240"> |

### Txt2Img

As our diffusion model was fine-tuned to produce people wearing various clothing, it can better follow a clothing prompt
and output realistic people and garments:

| Clothing Prompt | Avatar Prompt | Background Prompt | Result Image |
|---|---|---|---|
| a paisley pattern purple shirt and beige chinos | a fit man with dark beard | plain white background | <img src="images/txt2img_result_01.jpg" width="240"> |
| a white polka dot pattern dress | a beautiful petite blond woman | on a yacht | <img src="images/txt2img_result_02.jpg" width="240"> |

### Other creative possibilities

If you specify the same image for clothing and avatar while providing a background prompt (or background image) you can
replace the background in a creative way:

| Clothing Image | Avatar Image | Background Prompt | Result Image |
|---|---|---|---|
| <img src="images/avatar_image_02.jpg" width="240"> | <img src="images/avatar_image_02.jpg" width="240"> | on a snowy mountain top | <img src="images/new_background_result_01.jpg" width="240"> |

It's also possible to use a combination of clothing image, clothing prompt, avatar image and a background to add some
accessories:

| Clothing Image | Clothing Prompt | Avatar Image | Background Image | Result Image |
|---|---|---|---|---|
| <img src="images/avatar_image_02.jpg" width="240"> | a (light brown purse)+++ | <img src="images/avatar_image_02.jpg" width="240"> | <img src="images/background_image_03.jpg" width="240"> | <img src="images/accessory_result_01.jpg" width="240"> |

## Performance

Typically, one try-on request is processed in 5-10 seconds (depending on the type of conditions), excluding network
latency. To reduce network overhead you might want to compress your images before sending them to the API (e.g. using
JPEG). Please note that in case of high demand, processing time might increase due to requests being queued, though we
constantly monitor our GPU cluster capacity and perform scaling as needed.

## Known Issues and Limitations

Like any generative model, our models are not perfect (though we constantly work on improvements):
- Prompt following might not be perfect, especially in case of long and sophisticated prompts. Prefer simpler and more
  straightforward prompts whenever possible. Also be explicit (e.g. use the word "plain" if you need something of
  solid color). Additionally, Compel weighting might be used to increase the weight of certain tokens.
- As usual, generative models struggle with hands, fingers and toes, though we try to mitigate this to a certain extent.
- Currently, we do not support trying on a single garment, only the full look.
- Hats and sunglasses are not currently transferred, but we are working on it.
- Backgrounds might lack some clarity as currently we focus more on clothing.
- When a background is specified, the hairstyle might change.
- The avatar's body shape might shift towards smaller sizes.