# Diffusers ControlNet Impl. & Reference Impl. generated image comparison
## Implementation Source Code & versions
- Diffusers (in development version): https://github.com/takuma104/diffusers/tree/e758682c00a7d23e271fd8c9fb7a48912838045c
- Reference Impl.: https://github.com/lllyasviel/ControlNet/tree/2f77609bf6a8a2243d9faa198365fc6222c5435f
## Environment
- OS: Ubuntu 22.04
- GPU: Nvidia RTX3060 (12GB)
## Scripts to generate plots:
- Create control image: [create_control_images.py](create_control_images.py)
- Diffusers generated image: [gen_diffusers_image.py](gen_diffusers_image.py)
- Reference generated image: [gen_reference_image.py](gen_reference_image.py)
- Create Plots: [create_plots.py](create_plots.py)
## Original image for control image:
- All images are from [test_imgs](https://github.com/lllyasviel/ControlNet/tree/main/test_imgs), except the Vermeer image. Each was cropped and resized to 512x512 px.
<img width="128" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/control_images/bird_512x512.png" />
<img width="128" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/control_images/human_512x512.png" />
<img width="128" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/control_images/room_512x512.png" />
<img width="128" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/control_images/vermeer_512x512.png" />
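The crop step can be sketched as below. This is only an illustration: the text above says the images were cropped and resized to 512x512 px but does not say how, so the centered square crop and the helper name `center_crop_box` are assumptions, not the method actually used here.

```python
from typing import Tuple


def center_crop_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square.

    Assumption: a centered crop. The original images may have been
    cropped differently (for example, by hand).
    """
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


# Hypothetical 768x512 input: the square keeps the full height
# and trims 128 px from each side.
box = center_crop_box(768, 512)
print(box)  # (128, 0, 640, 512)
```

The returned box can be passed to Pillow's `Image.crop`, followed by `Image.resize((512, 512))`, to produce the final 512x512 image.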
## Generation Settings:
#### Control Images:
The original images above were converted to control images with [controlnet_hinter](https://github.com/takuma104/controlnet_hinter).
#### Prompts:
All images were generated with the same prompt.
- Prompt: `best quality, extremely detailed, illustration, looking at viewer`
- Negative Prompt: `monochrome, lowres, bad anatomy, worst quality, low quality`
#### Other settings (common to both):
- sampler: DDIM
- guidance_scale: 9.0
- num_inference_steps: 20
- initial random latents: created on CPU using seed
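Creating the initial latents on the CPU from a fixed seed gives both implementations the same starting noise, independent of GPU random number generation. A minimal sketch of that idea with PyTorch follows. The function name `make_initial_latents` and the seed value are hypothetical, and the actual generation scripts may differ. The latent shape assumes Stable Diffusion's 4-channel latents at 1/8 of the image resolution.

```python
import torch


def make_initial_latents(seed: int, height: int = 512, width: int = 512,
                         batch_size: int = 1) -> torch.Tensor:
    """Draw the initial noise on the CPU so it is reproducible across devices."""
    # A CPU generator seeded explicitly; the seed value is up to the caller.
    generator = torch.Generator(device="cpu").manual_seed(seed)
    # Assumption: 4 latent channels, spatial size is image size // 8.
    shape = (batch_size, 4, height // 8, width // 8)
    return torch.randn(shape, generator=generator, device="cpu",
                       dtype=torch.float32)


latents = make_initial_latents(seed=0)  # hypothetical seed
print(latents.shape)  # torch.Size([1, 4, 64, 64])
```

The resulting tensor can then be moved to the GPU with `latents.to("cuda")` and handed to each implementation as its starting point.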
## Results:
[![canny](plots/figure_canny.png)](plots/figure_canny.png)
[![depth](plots/figure_depth.png)](plots/figure_depth.png)
[![hed](plots/figure_hed.png)](plots/figure_hed.png)
[![mlsd](plots/figure_mlsd.png)](plots/figure_mlsd.png)
[![normal](plots/figure_normal.png)](plots/figure_normal.png)
[![openpose](plots/figure_openpose.png)](plots/figure_openpose.png)
[![scribble](plots/figure_scribble.png)](plots/figure_scribble.png)
[![seg](plots/figure_seg.png)](plots/figure_seg.png)