---
license: cc-by-nc-4.0
library_name: diffusers
tags:
- text-to-image
- stable-diffusion
- diffusion distillation
---

# DMD2 Model Card

![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/63363b864067f020756275b7/YhssMfS_1e6q5fHKh9qrc.jpeg)

> [**Improved Distribution Matching Distillation for Fast Image Synthesis**](https://arxiv.org/abs/2405.14867),
> Tianwei Yin, Michaël Gharbi, Taesung Park, Richard Zhang, Eli Shechtman, Frédo Durand, William T. Freeman

## Contact

Feel free to contact us if you have any questions about the paper!

Tianwei Yin [tianweiy@mit.edu](mailto:tianweiy@mit.edu)

## Hugging Face Demo

Our 4-step (much higher quality, 2x slower) Text-to-Image demo is hosted at [DMD2-4step](https://913f7051c61c070e4e.gradio.live)

Our 1-step Text-to-Image demo is hosted at [DMD2-1step](https://154dfe6ee5c63946cc.gradio.live)

## Usage

The models can be used with the standard diffusers pipeline:

#### 4-step generation

```python
import torch
from diffusers import DiffusionPipeline, UNet2DConditionModel, LCMScheduler
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
repo_name = "tianweiy/DMD2"
ckpt_name = "dmd2_sdxl_4step_unet.bin"
# Load model.
# Build the SDXL UNet from the base model's config, then load the distilled DMD2 weights into it.
unet = UNet2DConditionModel.from_config(base_model_id, subfolder="unet").to("cuda", torch.float16)
unet.load_state_dict(torch.load(hf_hub_download(repo_name, ckpt_name), map_location="cuda"))
# Assemble the SDXL pipeline around the distilled UNet and switch to the LCM scheduler.
pipe = DiffusionPipeline.from_pretrained(base_model_id, unet=unet, torch_dtype=torch.float16, variant="fp16").to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
prompt = "a photo of a cat"
# guidance_scale=0 turns off classifier-free guidance.
image = pipe(prompt=prompt, num_inference_steps=4, guidance_scale=0).images[0]
```

#### 1-step generation

```python
import torch
from diffusers import DiffusionPipeline, UNet2DConditionModel, LCMScheduler
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
repo_name = "tianweiy/DMD2"
ckpt_name = "dmd2_sdxl_1step_unet.bin"
# Load model.
# Build the SDXL UNet from the base model's config, then load the distilled DMD2 weights into it.
unet = UNet2DConditionModel.from_config(base_model_id, subfolder="unet").to("cuda", torch.float16)
unet.load_state_dict(torch.load(hf_hub_download(repo_name, ckpt_name), map_location="cuda"))
# Assemble the SDXL pipeline around the distilled UNet and switch to the LCM scheduler.
pipe = DiffusionPipeline.from_pretrained(base_model_id, unet=unet, torch_dtype=torch.float16, variant="fp16").to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
prompt = "a photo of a cat"
# The 1-step model runs a single denoising step at timestep 399, with classifier-free guidance off.
image = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0, timesteps=[399]).images[0]
```

For more information, please refer to the [code repository](https://github.com/tianweiy/DMD2).

## License

Improved Distribution Matching Distillation is released under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en).
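## Switching Between the 1-step and 4-step Variants

The two Usage snippets differ only in the checkpoint file name and in the arguments passed to the pipeline call. The sketch below collects those differences in one place. The helper name `dmd2_settings` is hypothetical (it is not part of this repository or of diffusers); the values are copied from the snippets in the Usage section.

```python
def dmd2_settings(num_steps):
    """Return (checkpoint file name, pipeline call kwargs) for a DMD2 SDXL variant.

    Hypothetical helper for illustration only; the values mirror the
    4-step and 1-step Usage snippets in this model card.
    """
    if num_steps == 4:
        return "dmd2_sdxl_4step_unet.bin", {
            "num_inference_steps": 4,
            "guidance_scale": 0,
        }
    if num_steps == 1:
        # The 1-step snippet additionally pins the single timestep to 399.
        return "dmd2_sdxl_1step_unet.bin", {
            "num_inference_steps": 1,
            "guidance_scale": 0,
            "timesteps": [399],
        }
    raise ValueError("this sketch only covers the 1-step and 4-step checkpoints")


ckpt_name, call_kwargs = dmd2_settings(1)
print(ckpt_name, call_kwargs)
```

With a pipeline built as in the Usage section, the returned values would slot in as `hf_hub_download(repo_name, ckpt_name)` when loading the UNet and `pipe(prompt=prompt, **call_kwargs)` when generating.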

## Citation

If you find DMD2 useful or relevant to your research, please cite our papers:

```bib
@article{yin2024improved,
  title={Improved Distribution Matching Distillation for Fast Image Synthesis},
  author={Yin, Tianwei and Gharbi, Micha{\"e}l and Park, Taesung and Zhang, Richard and Shechtman, Eli and Durand, Fr{\'e}do and Freeman, William T},
  journal={arXiv:2405.14867},
  year={2024}
}

@inproceedings{yin2024onestep,
  title={One-step Diffusion with Distribution Matching Distillation},
  author={Yin, Tianwei and Gharbi, Micha{\"e}l and Zhang, Richard and Shechtman, Eli and Durand, Fr{\'e}do and Freeman, William T and Park, Taesung},
  booktitle={CVPR},
  year={2024}
}
```

## Acknowledgments

This work was done while Tianwei Yin was a full-time student at MIT. It was developed based on our reimplementation of the original DMD paper. This work was supported by the National Science Foundation under Cooperative Agreement PHY-2019786 (The NSF AI Institute for Artificial Intelligence and Fundamental Interactions, http://iaifi.org/), by NSF Grant 2105819, by NSF CISE award 1955864, and by funding from Google, GIST, Amazon, and Quanta Computer.