---
license: openrail
pipeline_tag: text-to-image
tags:
- Stable Diffusion
- photorealistic
- sd_1.5
---

https://civitai.com/models/597300/boltmonkey-photoreal?modelVersionId=667353

This is an extremely high-quality photorealistic SD1.5 model that I created as an offshoot of a business project I work on in my spare time. I believe in the open-source nature of AI and am gradually releasing some of the work that I do not intend to use for my ongoing project. I have been developing this model slowly for roughly a year.

I have labelled this model as a merge, but it is already 30+ iterations deep, including a substantial number of block merges and multiple fine-tunes along the way.

The model is very realistic, especially for SD1.5.

Hands are generally five-fingered and not mangled, but overly complex or poorly structured prompting can result in amputations or distortions.

Most textures are well-rendered, but I have found that extremely dusty environments (such as in a mine tunnel) look a bit too generic for my liking.

Lighting and shadows are a strong point of the model. In particular, volumetric lighting (such as light rays through a misty or dusty atmosphere) is well-rendered.

Most of my showcase uses animals, but the model is adept at generating humans, architecture, natural environments, food, and so on, though I find that I have not trained it enough on most forms of transport.
## Suggestions for use

I have a lot to learn about prompting from the collective CivitAI userbase, but here are a couple of things that I have found work well:

TL;DR:

DDIM, 15-40 steps, CFG ~2-10, clip skip 1-4 (depending on use); LoRAs work well.
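As a sketch of how these settings map onto code, here is a minimal, hypothetical diffusers example. The checkpoint filename below is a placeholder, and this is just the settings suggested here, not an official recipe for this model:

```python
# Hypothetical sketch only: the checkpoint path is a placeholder, and these
# values simply mirror the suggestions in this card.
SETTINGS = {
    "num_inference_steps": 25,  # 15 is enough; 25-40 is typical
    "guidance_scale": 4.0,      # CFG: 2-4 works well, up to ~10 at times
    "width": 768,               # 768px+ works best; ~1024px risks duplication
    "height": 768,
    "clip_skip": 1,             # raise to 2-3 when merging disparate concepts
}

def generate(prompt: str, model_path: str = "./boltmonkey-photoreal.safetensors"):
    """Load the checkpoint and generate one image with the suggested settings."""
    import torch
    from diffusers import StableDiffusionPipeline, DDIMScheduler

    pipe = StableDiffusionPipeline.from_single_file(model_path, torch_dtype=torch.float16)
    # Swap in the DDIM sampler recommended above.
    pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
    pipe = pipe.to("cuda")
    return pipe(prompt, **SETTINGS).images[0]
```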

This model works well with both square and rectangular aspect ratios. Resolutions of 768px and above work best but will sometimes result in duplications around 1024px. Having said that, 512px and above will still produce good images.
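SD1.5 dimensions are conventionally kept to multiples of 64, so one way to sanity-check a requested size is a small helper like the following. This is a hypothetical utility, not part of the model; the 1024px duplication threshold is just the observation above:

```python
# Hypothetical helper: snap a requested size to the multiples of 64 that
# SD1.5 checkpoints conventionally use, and flag sizes where duplication
# artifacts can appear (around a 1024px long edge and beyond).
def snap_resolution(width: int, height: int) -> tuple[int, int, bool]:
    """Return (width, height) rounded to multiples of 64, plus a duplication-risk flag."""
    snap = lambda v: max(512, round(v / 64) * 64)
    w, h = snap(width), snap(height)
    risky = max(w, h) >= 1024  # long edge at ~1024px+ may duplicate subjects
    return w, h, risky

print(snap_resolution(768, 768))    # → (768, 768, False)
print(snap_resolution(1000, 1500))  # → (1024, 1472, True)
```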

The quality of this model's output is very realistic even with minimal prompting, but it is exceptional with well-structured prompts. Moreover, this model works very well with LoRAs so long as you are cognisant of the LoRA's training resolution (768px+ works best). I don't use anime LoRAs, so I can't offer any suggestions there, but I will be interested in your results if you try them.

Good-quality photorealistic images will result from extremely simple prompts (e.g., "cat"), but the model responds very well to quality-guidance prompts and some more complex prompting too.

The following prompt is my go-to:

"ultrarealistic photography, 32k UHD, absurdres, natural light and shadows, volumetric lighting, natural skin textures, accurate attention to details, depth of field, sharp focus"

Typically I would use DPM++_3m_SDE_GPU as my sampler with the SGM_uniform noise schedule, but I find that this model works best (to my taste) with the DDIM sampler and the DDIM_uniform noise schedule.

15 steps is enough to get good images most of the time, but I typically use 25-40. I have run a few generations at ComfyUI's maximum of 999 steps just to see how it fares; the results look great, of course, but I see no real need to go past 50 at most.

CFG is a difficult one to give a value for. A CFG of 2-4 works well, but sometimes I will take it as high as 10 depending on what I am generating. I suggest starting with a value of 4 and gauging it for yourself. Naturally, lower values give the model more freedom.
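For intuition about why lower CFG values give the model more freedom, here is a toy sketch of the standard classifier-free guidance formula that the CFG slider controls (an illustration of the general technique, not this model's internals):

```python
# Toy sketch of classifier-free guidance: the final noise prediction is the
# unconditional one pushed along the direction of the prompt-conditioned one,
# scaled by the CFG value.
def apply_cfg(uncond: list[float], cond: list[float], scale: float) -> list[float]:
    """eps = eps_uncond + scale * (eps_cond - eps_uncond)."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

uncond, cond = [0.1, 0.2], [0.5, 0.0]
print(apply_cfg(uncond, cond, 1.0))  # scale 1 reduces to the conditional prediction
print(apply_cfg(uncond, cond, 4.0))  # higher scale follows the prompt harder
```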

This model works well without clip skipping, but if you are merging several disparate concepts into one image, then it may pay to use a clip skip of 2 or 3 to give some fluidity to the concepts.
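To make "clip skip" concrete, here is a toy illustration of the common A1111/ComfyUI convention (an assumption about the UI you use): clip skip N takes the text encoder's hidden states from the Nth-from-last layer, so clip skip 1 means the final layer.

```python
# Toy sketch: SD1.5's CLIP text encoder has 12 transformer layers; clip skip
# selects which layer's hidden states condition the diffusion model.
layers = [f"clip_hidden_layer_{i}" for i in range(1, 13)]

def pick_layer(hidden_layers: list[str], clip_skip: int = 1) -> str:
    """clip_skip=1 → last layer; clip_skip=2 → second-to-last; and so on."""
    return hidden_layers[-clip_skip]

print(pick_layer(layers, 1))  # → clip_hidden_layer_12
print(pick_layer(layers, 2))  # → clip_hidden_layer_11
```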