<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

<p align="center">
  <br>
  <img src="https://raw.githubusercontent.com/huggingface/diffusers/77aadfee6a891ab9fcfb780f87c693f7a5beeb8e/docs/source/imgs/diffusers_library.jpg" width="400"/>
  <br>
</p>
# Diffusers
🤗 Diffusers is the go-to library for state-of-the-art pretrained diffusion models for generating images, audio, and even 3D structures of molecules. Whether you're looking for a simple inference solution or want to train your own diffusion model, 🤗 Diffusers is a modular toolbox that supports both. Our library is designed with a focus on [usability over performance](conceptual/philosophy#usability-over-performance), [simple over easy](conceptual/philosophy#simple-over-easy), and [customizability over abstraction](conceptual/philosophy#tweakable-contributorfriendly-over-abstraction).
The library has three main components:

- State-of-the-art [diffusion pipelines](api/pipelines/overview) for running inference with just a few lines of code.
- Interchangeable [noise schedulers](api/schedulers/overview) for balancing trade-offs between generation speed and quality.
- Pretrained [models](api/models) that can be used as building blocks and combined with schedulers to create your own end-to-end diffusion systems (see the sketch after this list).
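To see how the three components fit together, here is a minimal sketch that loads a full pipeline, swaps its default scheduler for an interchangeable one, and runs inference in a few lines. The checkpoint name is only an example; any compatible checkpoint from the Hub works the same way.

```py
# Minimal sketch: pipeline + interchangeable scheduler + pretrained model.
# Assumes `pip install diffusers transformers accelerate` and a CUDA GPU.
import torch
from diffusers import DiffusionPipeline, EulerDiscreteScheduler

# Load a full pipeline (pretrained model weights + default scheduler) from the Hub.
pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)

# Schedulers are interchangeable: swap the default for a different one
# to trade generation speed against quality.
pipeline.scheduler = EulerDiscreteScheduler.from_config(pipeline.scheduler.config)
pipeline.to("cuda")

# A few lines of code are enough for inference.
image = pipeline("An astronaut riding a horse on Mars").images[0]
image.save("astronaut.png")
```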
<div class="mt-10">
  <div class="w-full flex flex-col space-y-4 md:space-y-0 md:grid md:grid-cols-2 md:gap-y-4 md:gap-x-5">
    <a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./tutorials/tutorial_overview"
      ><div class="w-full text-center bg-gradient-to-br from-blue-400 to-blue-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">Tutorials</div>
      <p class="text-gray-700">Learn the fundamental skills you need to start generating outputs, build your own diffusion system, and train a diffusion model. We recommend starting here if you're using 🤗 Diffusers for the first time!</p>
    </a>
    <a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./using-diffusers/loading_overview"
      ><div class="w-full text-center bg-gradient-to-br from-indigo-400 to-indigo-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">How-to guides</div>
      <p class="text-gray-700">Practical guides for helping you load pipelines, models, and schedulers. You'll also learn how to use pipelines for specific tasks, control how outputs are generated, optimize for inference speed, and apply different training techniques.</p>
    </a>
    <a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./conceptual/philosophy"
      ><div class="w-full text-center bg-gradient-to-br from-pink-400 to-pink-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">Conceptual guides</div>
      <p class="text-gray-700">Understand why the library was designed the way it was, and learn more about the ethical guidelines and safety implementations for using the library.</p>
    </a>
    <a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./api/models"
      ><div class="w-full text-center bg-gradient-to-br from-purple-400 to-purple-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">Reference</div>
      <p class="text-gray-700">Technical descriptions of how 🤗 Diffusers classes and methods work.</p>
    </a>
  </div>
</div>
## Supported pipelines
| Pipeline | Paper/Repository | Tasks |
|---|---|:---:|
| [alt_diffusion](./api/pipelines/alt_diffusion) | [AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities](https://arxiv.org/abs/2211.06679) | Image-to-Image Text-Guided Generation |
| [audio_diffusion](./api/pipelines/audio_diffusion) | [Audio Diffusion](https://github.com/teticio/audio-diffusion.git) | Unconditional Audio Generation |
| [controlnet](./api/pipelines/stable_diffusion/controlnet) | [Adding Conditional Control to Text-to-Image Diffusion Models](https://arxiv.org/abs/2302.05543) | Image-to-Image Text-Guided Generation |
| [cycle_diffusion](./api/pipelines/cycle_diffusion) | [Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance](https://arxiv.org/abs/2210.05559) | Image-to-Image Text-Guided Generation |
| [dance_diffusion](./api/pipelines/dance_diffusion) | [Dance Diffusion](https://github.com/williamberman/diffusers.git) | Unconditional Audio Generation |
| [ddpm](./api/pipelines/ddpm) | [Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2006.11239) | Unconditional Image Generation |
| [ddim](./api/pipelines/ddim) | [Denoising Diffusion Implicit Models](https://arxiv.org/abs/2010.02502) | Unconditional Image Generation |
| [if](./if) | [**IF**](./api/pipelines/if) | Image Generation |
| [if_img2img](./if) | [**IF**](./api/pipelines/if) | Image-to-Image Generation |
| [if_inpainting](./if) | [**IF**](./api/pipelines/if) | Image-to-Image Generation |
| [latent_diffusion](./api/pipelines/latent_diffusion) | [High-Resolution Image Synthesis with Latent Diffusion Models](https://arxiv.org/abs/2112.10752) | Text-to-Image Generation |
| [latent_diffusion](./api/pipelines/latent_diffusion) | [High-Resolution Image Synthesis with Latent Diffusion Models](https://arxiv.org/abs/2112.10752) | Super Resolution Image-to-Image |
| [latent_diffusion_uncond](./api/pipelines/latent_diffusion_uncond) | [High-Resolution Image Synthesis with Latent Diffusion Models](https://arxiv.org/abs/2112.10752) | Unconditional Image Generation |
| [paint_by_example](./api/pipelines/paint_by_example) | [Paint by Example: Exemplar-based Image Editing with Diffusion Models](https://arxiv.org/abs/2211.13227) | Image-Guided Image Inpainting |
| [pndm](./api/pipelines/pndm) | [Pseudo Numerical Methods for Diffusion Models on Manifolds](https://arxiv.org/abs/2202.09778) | Unconditional Image Generation |
| [score_sde_ve](./api/pipelines/score_sde_ve) | [Score-Based Generative Modeling through Stochastic Differential Equations](https://openreview.net/forum?id=PxTIG12RRHS) | Unconditional Image Generation |
| [score_sde_vp](./api/pipelines/score_sde_vp) | [Score-Based Generative Modeling through Stochastic Differential Equations](https://openreview.net/forum?id=PxTIG12RRHS) | Unconditional Image Generation |
| [semantic_stable_diffusion](./api/pipelines/semantic_stable_diffusion) | [Semantic Guidance](https://arxiv.org/abs/2301.12247) | Text-Guided Generation |
| [stable_diffusion_text2img](./api/pipelines/stable_diffusion/text2img) | [Stable Diffusion](https://stability.ai/blog/stable-diffusion-public-release) | Text-to-Image Generation |
| [stable_diffusion_img2img](./api/pipelines/stable_diffusion/img2img) | [Stable Diffusion](https://stability.ai/blog/stable-diffusion-public-release) | Image-to-Image Text-Guided Generation |
| [stable_diffusion_inpaint](./api/pipelines/stable_diffusion/inpaint) | [Stable Diffusion](https://stability.ai/blog/stable-diffusion-public-release) | Text-Guided Image Inpainting |
| [stable_diffusion_panorama](./api/pipelines/stable_diffusion/panorama) | [MultiDiffusion](https://multidiffusion.github.io/) | Text-to-Panorama Generation |
| [stable_diffusion_pix2pix](./api/pipelines/stable_diffusion/pix2pix) | [InstructPix2Pix: Learning to Follow Image Editing Instructions](https://arxiv.org/abs/2211.09800) | Text-Guided Image Editing |
| [stable_diffusion_pix2pix_zero](./api/pipelines/stable_diffusion/pix2pix_zero) | [Zero-shot Image-to-Image Translation](https://pix2pixzero.github.io/) | Text-Guided Image Editing |
| [stable_diffusion_attend_and_excite](./api/pipelines/stable_diffusion/attend_and_excite) | [Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models](https://arxiv.org/abs/2301.13826) | Text-to-Image Generation |
| [stable_diffusion_self_attention_guidance](./api/pipelines/stable_diffusion/self_attention_guidance) | [Improving Sample Quality of Diffusion Models Using Self-Attention Guidance](https://arxiv.org/abs/2210.00939) | Text-to-Image Generation, Unconditional Image Generation |
| [stable_diffusion_image_variation](./stable_diffusion/image_variation) | [Stable Diffusion Image Variations](https://github.com/LambdaLabsML/lambda-diffusers#stable-diffusion-image-variations) | Image-to-Image Generation |
| [stable_diffusion_latent_upscale](./stable_diffusion/latent_upscale) | [Stable Diffusion Latent Upscaler](https://twitter.com/StabilityAI/status/1590531958815064065) | Text-Guided Super Resolution Image-to-Image |
| [stable_diffusion_model_editing](./api/pipelines/stable_diffusion/model_editing) | [Editing Implicit Assumptions in Text-to-Image Diffusion Models](https://time-diffusion.github.io/) | Text-to-Image Model Editing |
| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [Stable Diffusion 2](https://stability.ai/blog/stable-diffusion-v2-release) | Text-to-Image Generation |
| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [Stable Diffusion 2](https://stability.ai/blog/stable-diffusion-v2-release) | Text-Guided Image Inpainting |
| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [Depth-Conditional Stable Diffusion](https://github.com/Stability-AI/stablediffusion#depth-conditional-stable-diffusion) | Depth-to-Image Generation |
| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [Stable Diffusion 2](https://stability.ai/blog/stable-diffusion-v2-release) | Text-Guided Super Resolution Image-to-Image |
| [stable_diffusion_safe](./api/pipelines/stable_diffusion_safe) | [Safe Stable Diffusion](https://arxiv.org/abs/2211.05105) | Text-Guided Generation |
| [stable_unclip](./stable_unclip) | Stable unCLIP | Text-to-Image Generation |
| [stable_unclip](./stable_unclip) | Stable unCLIP | Image-to-Image Text-Guided Generation |
| [stochastic_karras_ve](./api/pipelines/stochastic_karras_ve) | [Elucidating the Design Space of Diffusion-Based Generative Models](https://arxiv.org/abs/2206.00364) | Unconditional Image Generation |
| [text_to_video_sd](./api/pipelines/text_to_video) | [Modelscope's Text-to-video-synthesis Model in Open Domain](https://modelscope.cn/models/damo/text-to-video-synthesis/summary) | Text-to-Video Generation |
| [unclip](./api/pipelines/unclip) | [Hierarchical Text-Conditional Image Generation with CLIP Latents](https://arxiv.org/abs/2204.06125) (implementation by [kakaobrain](https://github.com/kakaobrain/karlo)) | Text-to-Image Generation |
| [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Text-to-Image Generation |
| [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Image Variations Generation |
| [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Dual Image and Text Guided Generation |
| [vq_diffusion](./api/pipelines/vq_diffusion) | [Vector Quantized Diffusion Model for Text-to-Image Synthesis](https://arxiv.org/abs/2111.14822) | Text-to-Image Generation |
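Each row in the table corresponds to a concrete pipeline class that can be loaded directly. As a minimal sketch, the example below runs the `stable_diffusion_img2img` pipeline from the table; the checkpoint name and the input image URL are illustrative placeholders, so substitute your own.

```py
# Sketch: running one pipeline from the table above (stable_diffusion_img2img).
# The checkpoint and image URL are illustrative; substitute your own.
from io import BytesIO

import requests
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Fetch a starting image and resize it to a resolution the model handles well.
url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
init_image = Image.open(BytesIO(requests.get(url).content)).convert("RGB").resize((768, 512))

# `strength` controls how much the initial image is altered (0 = none, 1 = fully).
result = pipe(
    prompt="A fantasy landscape, trending on artstation",
    image=init_image,
    strength=0.75,
    guidance_scale=7.5,
).images[0]
result.save("fantasy_landscape.png")
```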