Added API documentation
- docs/API.md +340 -0
- docs/images/accessory_result_01.jpg +3 -0
- docs/images/avatar_image_01.jpg +3 -0
- docs/images/avatar_image_02.jpg +3 -0
- docs/images/avatar_image_03.jpg +3 -0
- docs/images/avatar_image_04.jpg +3 -0
- docs/images/avatar_modification_result_01.jpg +3 -0
- docs/images/avatar_modification_result_02.jpg +3 -0
- docs/images/avatar_prompt_result_01.jpg +3 -0
- docs/images/avatar_prompt_result_02.jpg +3 -0
- docs/images/avatar_prompt_result_03.jpg +3 -0
- docs/images/background_image_01.jpg +3 -0
- docs/images/background_image_02.jpg +3 -0
- docs/images/background_image_03.jpg +3 -0
- docs/images/background_image_04.jpg +3 -0
- docs/images/clothing_image_01.jpg +3 -0
- docs/images/clothing_image_02.jpg +3 -0
- docs/images/clothing_image_03.jpg +3 -0
- docs/images/clothing_image_04.jpg +3 -0
- docs/images/clothing_prompt_result_01.jpg +3 -0
- docs/images/clothing_prompt_result_02.jpg +3 -0
- docs/images/image_based_background_result_01.jpg +3 -0
- docs/images/image_based_background_result_02.jpg +3 -0
- docs/images/image_based_result_01.jpg +3 -0
- docs/images/new_background_result_01.jpg +3 -0
- docs/images/same_crop_result_01.jpg +3 -0
- docs/images/same_crop_result_02.jpg +3 -0
- docs/images/txt2img_result_01.jpg +3 -0
- docs/images/txt2img_result_02.jpg +3 -0
docs/API.md
ADDED
@@ -0,0 +1,340 @@
# Virtual Try-On Diffusion API

<!-- TOC -->
* [Virtual Try-On Diffusion API](#virtual-try-on-diffusion-api)
  * [Summary](#summary)
  * [Consuming the API](#consuming-the-api)
  * [Try-On Endpoints](#try-on-endpoints)
  * [Try-On Input Parameters](#try-on-input-parameters)
    * [Clothing image](#clothing-image)
    * [Clothing prompt](#clothing-prompt)
    * [Avatar image](#avatar-image)
    * [Avatar prompt](#avatar-prompt)
    * [Background image](#background-image)
    * [Background prompt](#background-prompt)
    * [Additional notes](#additional-notes)
  * [Try-On Output](#try-on-output)
    * [Response codes](#response-codes)
    * [NSFW content](#nsfw-content)
  * [Use Cases and Recipes](#use-cases-and-recipes)
    * [Image-based virtual try-on](#image-based-virtual-try-on)
    * [Image-based virtual try-on with background](#image-based-virtual-try-on-with-background)
    * [Avatar from a text prompt](#avatar-from-a-text-prompt)
    * [Clothing from a text prompt](#clothing-from-a-text-prompt)
    * [Modifying avatar's body](#modifying-avatars-body)
    * [Txt2Img](#txt2img)
    * [Other creative possibilities](#other-creative-possibilities)
  * [Performance](#performance)
  * [Known Issues and Limitations](#known-issues-and-limitations)
<!-- TOC -->

## Summary

Virtual Try-On Diffusion [VTON-D] by [Texel.Moda](https://texelmoda.com) is a custom diffusion-based pipeline for fast
and flexible multi-modal virtual try-on. Clothing, avatar and background can be specified by reference images or text
prompts allowing for clothing transfer, avatar replacement, fashion image generation and other virtual try-on related
tasks. Check out the [demo on HuggingFace](https://huggingface.co/spaces/texelmoda/try-on-diffusion) to try the API in
a user-friendly way.

## Consuming the API

The API is exposed through the RapidAPI Hub, which manages API subscriptions, API keys, payments and other things. Please
refer to the [RapidAPI Documentation](https://docs.rapidapi.com/docs/consumer-quick-start-guide) to get started.

Generally, in order to use an API you need to perform the following steps:
- Create a RapidAPI.com account.
- [Navigate to the API page](https://rapidapi.com/texelmoda-texelmoda-apis/api/try-on-diffusion) and subscribe to a
  suitable pricing plan. We also provide a free BASIC plan with 100 API requests per month.
- Use the obtained RapidAPI key to authenticate (via the _X-RapidAPI-Key_ header) and use an API from any programming
  language or tool you like.

Example API call using cURL (the `@` prefix makes cURL upload the file contents instead of sending the file name as text):
```shell
curl --request POST \
  --url https://try-on-diffusion.p.rapidapi.com/try-on-file \
  --header 'Content-Type: multipart/form-data' \
  --header 'x-rapidapi-host: try-on-diffusion.p.rapidapi.com' \
  --header 'x-rapidapi-key: <RapidAPI Key>' \
  --form clothing_image=@1.jpg \
  --form avatar_image=@2.jpg
```
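
The same upload can be sketched in Python using only the standard library. This is a minimal illustration, not the official client: the helper names `encode_multipart` and `build_try_on_file_request` are hypothetical, and the request is only built, not sent. The URL, header names and form field names are taken from the cURL example.

```python
import mimetypes
import urllib.request
import uuid

API_URL = "https://try-on-diffusion.p.rapidapi.com/try-on-file"
API_HOST = "try-on-diffusion.p.rapidapi.com"


def encode_multipart(fields, files):
    """Encode text fields and {name: (filename, bytes)} files as multipart/form-data.

    Simplification: field and file names are assumed to contain no quotes or newlines.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    for name, (filename, data) in files.items():
        mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        header = (
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f'Content-Type: {mime}\r\n\r\n'
        )
        parts.append(header.encode() + data + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)


def build_try_on_file_request(api_key, clothing, avatar):
    """Build (but do not send) the POST request; clothing and avatar are (filename, bytes)."""
    content_type, body = encode_multipart(
        {}, {"clothing_image": clothing, "avatar_image": avatar}
    )
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": content_type,
            "x-rapidapi-host": API_HOST,
            "x-rapidapi-key": api_key,
        },
    )
```

To send it, read the two image files as bytes, pass them to `build_try_on_file_request`, and hand the result to `urllib.request.urlopen` with a generous timeout. A third-party HTTP library would shorten the multipart handling considerably.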

For a simple Python client implementation please see the
[HuggingFace demo application source](https://huggingface.co/spaces/texelmoda/try-on-diffusion/blob/main/try_on_diffusion_client.py).

## Try-On Endpoints

The Try-On API consists of two endpoints that differ only in the method of passing reference images:

- **POST** _/try-on-file_ - takes reference images as uploaded files in the request body (using multipart/form-data).

- **POST** _/try-on-url_ - takes reference images as image URLs in POST parameters.

All image requirements, behavior and status codes are the same for both endpoints; choose the one that best suits your
application architecture.
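
For the URL-based endpoint, a request might be built as in the sketch below. Two things here are assumptions rather than documented behavior: that _/try-on-url_ lives on the same host as the cURL example, and that it accepts a URL-encoded form body. The helper name `build_try_on_url_request` is hypothetical, and the request is only built, not sent.

```python
import urllib.parse
import urllib.request

# Assumption: same host as /try-on-file in the cURL example.
API_URL = "https://try-on-diffusion.p.rapidapi.com/try-on-url"
API_HOST = "try-on-diffusion.p.rapidapi.com"


def build_try_on_url_request(api_key, **params):
    """Build (but do not send) a POST request for the /try-on-url endpoint."""
    # Every input is optional, so unset (None) parameters are simply left out.
    fields = {name: value for name, value in params.items() if value is not None}
    # Assumption: the endpoint accepts application/x-www-form-urlencoded bodies.
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "x-rapidapi-host": API_HOST,
            "x-rapidapi-key": api_key,
        },
    )
```

If the endpoint turns out to expect multipart form fields instead, only the body encoding and the `Content-Type` header need to change; the parameter names stay the same.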

## Try-On Input Parameters

All input parameters for the try-on endpoints are currently optional. Images and prompts serve as additional generation
conditions and can even be used in combination. Below is a short parameter summary with links to extended information
on certain parameters.

List of input parameters for the **POST** _/try-on-file_ endpoint:

| Parameter | Description | Required |
|---|---|---|
| [clothing_image](#clothing-image) | Clothing reference image in JPEG, PNG or WEBP format, maximum file size is 12 MB. | No |
| [clothing_prompt](#clothing-prompt) | Text prompt for clothing, can be used instead of an image. Compel weighting syntax is supported. Example: _red sleeveless mini dress_ | No |
| [avatar_image](#avatar-image) | Avatar image in JPEG, PNG or WEBP format, maximum file size is 12 MB. | No |
| avatar_sex | Avatar sex, either "male" or "female". Detected automatically if left empty or omitted; if specified, the given sex is enforced. | No |
| [avatar_prompt](#avatar-prompt) | Text prompt for the avatar, can be used instead of an image or together with an image to modify the avatar. Compel weighting syntax is supported. Example: _a gentleman with beard and mustache_ | No |
| [background_image](#background-image) | Optional background reference image in JPEG, PNG or WEBP format, maximum file size is 12 MB. Original avatar background is preserved if background is not specified. | No |
| [background_prompt](#background-prompt) | Optional background text prompt. Original avatar background is preserved if background is not specified. Example: _in an autumn park_ | No |
| seed | Seed for image generation. Default is -1 (random seed). The actual seed is also returned in the "X-Seed" response header. Example: _42_ | No |

List of input parameters for the **POST** _/try-on-url_ endpoint:

| Parameter | Description | Required |
|---|---|---|
| [clothing_image_url](#clothing-image) | Clothing reference image URL. Image should be in JPEG, PNG or WEBP format, maximum file size is 12 MB. | No |
| [clothing_prompt](#clothing-prompt) | Text prompt for clothing, can be used instead of an image. Compel weighting syntax is supported. Example: _red sleeveless mini dress_ | No |
| [avatar_image_url](#avatar-image) | Avatar image URL. Image should be in JPEG, PNG or WEBP format, maximum file size is 12 MB. | No |
| avatar_sex | Avatar sex, either "male" or "female". Detected automatically if left empty or omitted; if specified, the given sex is enforced. | No |
| [avatar_prompt](#avatar-prompt) | Text prompt for the avatar, can be used instead of an image or together with an image to modify the avatar. Compel weighting syntax is supported. Example: _a gentleman with beard and mustache_ | No |
| [background_image_url](#background-image) | Optional background reference image URL. Image should be in JPEG, PNG or WEBP format, maximum file size is 12 MB. Original avatar background is preserved if background is not specified. | No |
| [background_prompt](#background-prompt) | Optional background text prompt. Original avatar background is preserved if background is not specified. Example: _in an autumn park_ | No |
| seed | Seed for image generation. Default is -1 (random seed). The actual seed is also returned in the "X-Seed" response header. Example: _42_ | No |

### Clothing image

For best results, clothing reference images should meet a number of requirements:

- File format: **JPEG**, **PNG** or **WEBP**
- Maximum file size: **12 MB**
- Minimum image size: **256x256**
- Recommended image size: **768x1024 and above**
- Clothing should be **worn by a person**. Some flat lay clothing photos might work, but currently it's not guaranteed
- **Single person** in the image (though multiple persons might also work)
- **Frontal** photo, though some degree of rotation is fine
- **Good lighting** conditions and **high image quality** as it directly affects the result
- **Minimal occlusion** by hair, hands or accessories

To summarize: the better the clothing image, the better the final result.
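
The measurable requirements above (format, file size, minimum dimensions) can be checked on the client before uploading. The sketch below is a hypothetical helper, not part of the API: it judges the format by file extension only, assumes 1 MB means 1024 × 1024 bytes, and expects the caller to supply the pixel dimensions (for example, from an image library).

```python
import os

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
MAX_FILE_SIZE = 12 * 1024 * 1024  # 12 MB; assumes binary megabytes
MIN_SIDE = 256  # minimum width and height in pixels


def check_reference_image(filename, size_bytes, width, height):
    """Return a list of human-readable problems; an empty list means no hard limit is violated."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file extension: {extension or '(none)'}")
    if size_bytes > MAX_FILE_SIZE:
        problems.append(f"file is larger than 12 MB: {size_bytes} bytes")
    if width < MIN_SIDE or height < MIN_SIDE:
        problems.append(f"image is smaller than 256x256: {width}x{height}")
    return problems
```

The qualitative requirements (frontal pose, lighting, occlusion) cannot be verified this way and still need a visual check.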

Examples of good clothing images:

| <img src="images/clothing_image_01.jpg" width="240"> | <img src="images/clothing_image_02.jpg" width="240"> | <img src="images/clothing_image_03.jpg" width="240"> | <img src="images/clothing_image_04.jpg" width="240"> |
|---|---|---|---|

### Clothing prompt

Instead of a clothing image you can use a text prompt to describe the garment. Short and clear prompts work best.
Additionally, [Compel weighting syntax](https://github.com/damian0815/compel/blob/main/doc/syntax.md) is supported to
increase or decrease the weight of certain tokens. Examples:
- _a sheer blue sleeveless mini dress_
- _a beige woolen sweater and white pleated skirt_
- _a black leather jacket and dark blue slim-fit jeans_
- _a floral pattern blouse and leggings_
- _a colorful+++ t-shirt and black shorts_
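
When prompts are assembled in code, the `+`/`-` suffix form of the weighting syntax used in these examples can be produced by a small helper. The function `weighted` below is a hypothetical convenience, not part of the API or of Compel itself; it wraps multi-word terms in parentheses so the weight applies to the whole phrase.

```python
def weighted(term: str, level: int) -> str:
    """Append `level` plus signs (or minus signs if negative) to a prompt term."""
    if level == 0:
        return term
    marker = "+" if level > 0 else "-"
    # A multi-word phrase needs parentheses so the suffix covers all of its words.
    body = f"({term})" if " " in term.strip() else term
    return body + marker * abs(level)


clothing_prompt = f"a {weighted('colorful', 3)} t-shirt and black shorts"
avatar_prompt = f"a {weighted('plus size', 2)} female model wearing sunglasses"
```

Here `clothing_prompt` reproduces the last example in the list above, and `avatar_prompt` yields _a (plus size)++ female model wearing sunglasses_.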

### Avatar image

Avatar images should also meet some requirements:

- File format: **JPEG**, **PNG** or **WEBP**
- Maximum file size: **12 MB**
- Minimum image size: **256x256**
- Recommended image size: **768x1024 and above**
- **Single person** in the image (though multiple persons might also work)
- **Frontal** photo, though some degree of rotation is fine
- **Good lighting** conditions and **high image quality**

Examples of good avatar images:

| <img src="images/avatar_image_01.jpg" width="240"> | <img src="images/avatar_image_02.jpg" width="240"> | <img src="images/avatar_image_03.jpg" width="240"> | <img src="images/avatar_image_04.jpg" width="240"> |
|---|---|---|---|

### Avatar prompt

Instead of an avatar image you can use a text prompt to describe the person. Short and clear prompts work best.
Additionally, [Compel weighting syntax](https://github.com/damian0815/compel/blob/main/doc/syntax.md) is supported to
increase or decrease the weight of certain tokens. Examples:
- _a beautiful blond girl with long hair_
- _a cute redhead girl with freckles_
- _a (plus size)++ female model wearing sunglasses_
- _a fit man with dark beard and blue eyes_
- _a gentleman with beard and mustache_

### Background image

Background images are used to extract high-level background features only and serve as a reference (not as the exact
background). Below are the basic image requirements:

- File format: **JPEG**, **PNG** or **WEBP**
- Maximum file size: **12 MB**
- Recommended image size: **256x256 and above**

Examples of background images:

| <img src="images/background_image_01.jpg" width="240"> | <img src="images/background_image_02.jpg" width="240"> | <img src="images/background_image_03.jpg" width="240"> | <img src="images/background_image_04.jpg" width="240"> |
|---|---|---|---|

### Background prompt

Instead of a background image you can use a text prompt to describe the background. Short and clear prompts work best.
Additionally, [Compel weighting syntax](https://github.com/damian0815/compel/blob/main/doc/syntax.md) is supported to
increase or decrease the weight of certain tokens. Examples:
- _in an autumn park_
- _in front of a brick wall_
- _on an ocean beach with (palm trees)++_
- _in a shopping mall_
- _in a modern office_

### Additional notes

We use the "same-crop" approach for clothing and avatar images: both images are cropped roughly the same way (using pose
estimation), so we don't have to add too much new information (e.g. invent lower body clothing). So, if the clothing
photo shows only upper body clothing, the result will be cropped the same way regardless of the avatar image (and the
other way around):

| Clothing Image | Avatar Image | Result Image |
|---|---|---|
| <img src="images/clothing_image_02.jpg" width="240"> | <img src="images/avatar_image_02.jpg" width="240"> | <img src="images/same_crop_result_01.jpg" width="240"> |
| <img src="images/clothing_image_03.jpg" width="240"> | <img src="images/avatar_image_03.jpg" width="240"> | <img src="images/same_crop_result_02.jpg" width="240"> |

## Try-On Output

### Response codes

The HTTP status code is used as a high-level response status. For a successful API call, HTTP code 200 is
returned and the response body contains the resulting JPEG image with a maximum size of 768x1024 pixels. The response
also has the "X-Seed" header set, which should contain the actual seed used for image generation (for
reproducibility). Any other status code indicates an unsuccessful request; see the table below for additional
details:

| Response Code | Content-Type | Headers | Description | Example |
|:---:|:---:|:---:|---|:---:|
| **200** | image/jpeg | X-Seed: {seed} | Successful API call. Response body contains the resulting image in JPEG format. | <img src="images/same_crop_result_01.jpg" width="160"> |
| **400** | application/json | | Bad request: at least one of the request parameters is invalid. Response body should contain additional error details in JSON format. | { "detail": "Invalid upload file type: application/x-zip-compressed" } |
| **403** | application/json | | Indicates an authentication issue (e.g. invalid API key). | |
| **422** | application/json | | Request validation error. Response body should contain error details in JSON format. | { "detail": [ { "loc": [ "string", 0 ], "msg": "string", "type": "string" } ] } |
| **429** | | | Too many requests. Might be triggered by the RapidAPI proxy when the maximum request rate or API call limit is reached. | |
| **500** | | | Indicates an internal server error, might not have any details. | |
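
A client might turn this table into a single decision point, as in the sketch below. The names `TryOnError` and `interpret_response` are hypothetical. The function takes the status code, a header mapping and the raw body, so it can be used with any HTTP library; it assumes the header mapping is looked up with the exact name "X-Seed" (a plain `dict` is case-sensitive, while most HTTP libraries provide case-insensitive header objects).

```python
import json


class TryOnError(Exception):
    """Raised for any non-200 response; `detail` is None when the body carries no JSON details."""

    def __init__(self, status, detail=None):
        super().__init__(f"try-on request failed with HTTP {status}: {detail}")
        self.status = status
        self.detail = detail


def interpret_response(status, headers, body):
    """Return (jpeg_bytes, seed) on success, raise TryOnError otherwise."""
    if status == 200:
        seed = headers.get("X-Seed")
        return body, int(seed) if seed is not None else None
    detail = None
    if body:
        try:
            # 400 and 422 responses should carry a JSON object with a "detail" key.
            detail = json.loads(body).get("detail")
        except (ValueError, AttributeError):
            detail = None  # body was not JSON, or not a JSON object
    raise TryOnError(status, detail)
```

Storing the returned seed next to the input parameters makes it possible to pass it back later as the `seed` parameter when a result needs to be reproduced.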

### NSFW content

We use an NSFW content checker to ensure we don't output inappropriate images. If potential NSFW content is detected in the
generated image, the API will return HTTP status code 400 with a corresponding error message in the JSON response.

## Use Cases and Recipes

Our Virtual Try-On API offers a flexible way to specify clothing, avatar and background, which makes it possible not
only to perform the classic virtual try-on task, but also to generate entirely new images or alter existing images in
interesting ways. Feel free to try and explore!

In all the examples below, all unmentioned inputs are assumed to be empty.

### Image-based virtual try-on

The most common use case is to transfer clothing from one photo (e.g. from a product page) to another photo (e.g.
user avatar) while maintaining the avatar and the background.

| Clothing Image | Avatar Image | Result Image |
|---|---|---|
| <img src="images/clothing_image_01.jpg" width="240"> | <img src="images/avatar_image_02.jpg" width="240"> | <img src="images/image_based_result_01.jpg" width="240"> |

### Image-based virtual try-on with background

Additionally, it's possible to replace the avatar background with a reference image or a text prompt.

| Clothing Image | Avatar Image | Background Image | Result Image |
|---|---|---|---|
| <img src="images/clothing_image_04.jpg" width="240"> | <img src="images/avatar_image_03.jpg" width="240"> | <img src="images/background_image_01.jpg" width="240"> | <img src="images/image_based_background_result_01.jpg" width="240"> |

And with a text prompt for the background:

| Clothing Image | Avatar Image | Background Prompt | Result Image |
|---|---|---|---|
| <img src="images/clothing_image_04.jpg" width="240"> | <img src="images/avatar_image_03.jpg" width="240"> | in front of a snowy mountain | <img src="images/image_based_background_result_02.jpg" width="240"> |

### Avatar from a text prompt

It's possible to replace the person in the clothing image with an avatar described in a text prompt. The background
will change as well, and will be a random one if not specified:

| Clothing Image | Avatar Prompt | Background Prompt | Result Image |
|---|---|---|---|
| <img src="images/clothing_image_02.jpg" width="240"> | a beautiful blond girl with long hair | | <img src="images/avatar_prompt_result_01.jpg" width="240"> |
| <img src="images/clothing_image_03.jpg" width="240"> | a gentleman with a long beard and mustache | near a fireplace | <img src="images/avatar_prompt_result_02.jpg" width="240"> |

You may also experiment with avatar prompts for more interesting results:

| Clothing Image | Avatar Prompt | Background Prompt | Result Image |
|---|---|---|---|
| <img src="images/clothing_image_03.jpg" width="240"> | (iron man mask)+++ | in the Sahara Desert | <img src="images/avatar_prompt_result_03.jpg" width="240"> |

### Clothing from a text prompt

Similarly, you can specify clothing with a text prompt while providing an avatar image:

| Clothing Prompt | Avatar Image | Result Image |
|---|---|---|
| a sheer blue sleeveless mini dress | <img src="images/avatar_image_02.jpg" width="240"> | <img src="images/clothing_prompt_result_01.jpg" width="240"> |
| a colorful t-shirt and black shorts | <img src="images/avatar_image_03.jpg" width="240"> | <img src="images/clothing_prompt_result_02.jpg" width="240"> |

### Modifying avatar's body

If you use the same image for both clothing and avatar while providing an avatar prompt, it's possible to change the
avatar's body proportions. Note that stronger changes may require additional term weighting.

| Clothing Image | Avatar Image | Avatar Prompt | Result Image |
|---|---|---|---|
| <img src="images/clothing_image_01.jpg" width="240"> | <img src="images/clothing_image_01.jpg" width="240"> | a (plus size)+ woman | <img src="images/avatar_modification_result_01.jpg" width="240"> |
| <img src="images/clothing_image_03.jpg" width="240"> | <img src="images/clothing_image_03.jpg" width="240"> | a (muscular bodybuilder)+++++ | <img src="images/avatar_modification_result_02.jpg" width="240"> |

### Txt2Img

As our diffusion model was fine-tuned to produce people wearing various clothing, it can better follow a clothing prompt
and output realistic people and garments:

| Clothing Prompt | Avatar Prompt | Background Prompt | Result Image |
|---|---|---|---|
| a paisley pattern purple shirt and beige chinos | a fit man with dark beard | plain white background | <img src="images/txt2img_result_01.jpg" width="240"> |
| a white polka dot pattern dress | a beautiful petite blond woman | on a yacht | <img src="images/txt2img_result_02.jpg" width="240"> |

### Other creative possibilities

If you specify the same image for clothing and avatar while providing a background prompt (or background image), you can
replace the background in a creative way:

| Clothing Image | Avatar Image | Background Prompt | Result Image |
|---|---|---|---|
| <img src="images/avatar_image_02.jpg" width="240"> | <img src="images/avatar_image_02.jpg" width="240"> | on a snowy mountain top | <img src="images/new_background_result_01.jpg" width="240"> |

It's also possible to combine a clothing image, a clothing prompt, an avatar image and a background to add some
accessories:

| Clothing Image | Clothing Prompt | Avatar Image | Background Image | Result Image |
|---|---|---|---|---|
| <img src="images/avatar_image_02.jpg" width="240"> | a (light brown purse)+++ | <img src="images/avatar_image_02.jpg" width="240"> | <img src="images/background_image_03.jpg" width="240"> | <img src="images/accessory_result_01.jpg" width="240"> |
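
The recipes in this section differ only in which inputs are filled in. As a rough illustration, a few of them are written out below as parameter sets for the _/try-on-url_ endpoint. The image URLs are placeholders, and the names `RECIPES` and `recipe_fields` are hypothetical; the prompts are taken from the tables above.

```python
# Placeholder URLs: substitute the locations of your own images.
CLOTHING_URL = "https://example.com/clothing.jpg"
AVATAR_URL = "https://example.com/avatar.jpg"

RECIPES = {
    "image_based": {
        "clothing_image_url": CLOTHING_URL,
        "avatar_image_url": AVATAR_URL,
    },
    "with_background_prompt": {
        "clothing_image_url": CLOTHING_URL,
        "avatar_image_url": AVATAR_URL,
        "background_prompt": "in front of a snowy mountain",
    },
    "avatar_from_prompt": {
        "clothing_image_url": CLOTHING_URL,
        "avatar_prompt": "a beautiful blond girl with long hair",
    },
    "clothing_from_prompt": {
        "clothing_prompt": "a sheer blue sleeveless mini dress",
        "avatar_image_url": AVATAR_URL,
    },
    # Same image for clothing and avatar; the prompt changes body proportions.
    "modify_body": {
        "clothing_image_url": CLOTHING_URL,
        "avatar_image_url": CLOTHING_URL,
        "avatar_prompt": "a (plus size)+ woman",
    },
    "txt2img": {
        "clothing_prompt": "a white polka dot pattern dress",
        "avatar_prompt": "a beautiful petite blond woman",
        "background_prompt": "on a yacht",
    },
}


def recipe_fields(name, seed=-1):
    """Return a copy of the named recipe with a seed added (-1 requests a random seed)."""
    fields = dict(RECIPES[name])
    fields["seed"] = seed
    return fields
```

Inputs a recipe does not mention are simply absent from its dictionary, matching the convention that unmentioned inputs are empty.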

## Performance

Typically, one try-on request is processed in 5-10 seconds (depending on the type of conditions), excluding network latency.
To reduce network overhead you might want to compress your images before sending them to the API (e.g. using JPEG).
Please note that under high demand processing time might increase due to requests being queued, though we
constantly monitor our GPU cluster capacity and perform scaling as needed.
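
Because requests can be queued or rate-limited, a client may want a generous timeout and a retry policy. The sketch below retries only on HTTP 429 with exponential backoff; the policy, the delays and the name `call_with_backoff` are illustrative assumptions, not recommendations from the API provider. The `send` argument is any zero-argument callable that performs one request and returns `(status, headers, body)`.

```python
import time


def call_with_backoff(send, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call send() until it returns a status other than 429 or attempts run out.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... seconds between attempts
    and returns the last (status, headers, body) tuple.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, headers, body
        sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting. Retrying will not help when a 429 comes from an exhausted monthly quota, so the attempt limit should stay small.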

## Known Issues and Limitations

Like any generative model, our models are not perfect (though we constantly work on improvements):
- Prompt following might not be perfect, especially in case of long and sophisticated prompts. Prefer simpler and more
  straightforward prompts whenever possible. Also be explicit (e.g. use the word "plain" if you need something of
  solid color). Additionally, Compel weighting might be used to increase the weight of certain tokens.
- As usual, generative models struggle with hands, fingers and toes, though we try to mitigate it to a certain extent.
- Currently, we do not support trying on a single garment, only the full look.
- Hats and sunglasses are not currently transferred, but we are working on it.
- Backgrounds might lack some clarity as currently we focus more on clothing.
- When a background is specified, the hairstyle might change.
- Body shape of the avatar might change towards smaller sizes.