---
license: openrail
pipeline_tag: text-to-image
tags:
- Stable Diffusion
- photorealistic
- sd_1.5
---

https://civitai.com/models/597300/boltmonkey-photoreal?modelVersionId=667353

This is an extremely high-quality photorealistic SD1.5 model that I created as an offshoot of a business project I work on in my spare time. I believe in the open-source nature of AI and am gradually releasing some of the work that I do not intend to use for that ongoing project. I have been slowly developing this model for roughly a year.
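
If you prefer to run the model outside of a UI, the checkpoint from the CivitAI link above can be loaded with Hugging Face diffusers. A minimal sketch, assuming the file has been downloaded locally as boltmonkey-photoreal.safetensors (an illustrative filename):

```python
# Minimal loading sketch using Hugging Face diffusers.
# Assumes the checkpoint from the CivitAI link above was saved locally as
# "boltmonkey-photoreal.safetensors" (illustrative filename).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_single_file(
    "boltmonkey-photoreal.safetensors",
    torch_dtype=torch.float16,  # fp16 is plenty for SD1.5 inference
)
pipe = pipe.to("cuda")

image = pipe("cat").images[0]  # even a one-word prompt produces photoreal output
image.save("cat.png")
```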

I have labelled this model as a merge, but it is already 30+ iterations deep, including a substantial number of block merges and multiple fine-tunes along the way.

The model is very realistic, especially for SD1.5.

Hands are generally five-fingered and not mangled, but overly complex or poorly structured prompts can result in amputations or distortions.

Most textures are well rendered, but I have found that extremely dusty environments (such as a mine tunnel) look a bit too generic for my liking.

Lighting and shadows are a strong point of the model. In particular, volumetric lighting (such as light rays through a misty or dusty atmosphere) is well rendered.

Most of my showcase uses animals, but the model is adept at generating humans, architecture, natural environments, food, and so on, though I find I have not trained it enough on most forms of transport.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/641e4b539128fb5692c61524/T8-3LROcMo6hSihTE4ddx.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/641e4b539128fb5692c61524/8mEs5A5HZp2el7i5nGhQA.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/641e4b539128fb5692c61524/2iKSk1Zx_muykrlepHiWL.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/641e4b539128fb5692c61524/PMBxD5xypOIMg4z1wHRZK.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/641e4b539128fb5692c61524/tb_zJ_Te6OvrMPNZuxOZC.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/641e4b539128fb5692c61524/j7v47skcSFbOCASEsGSOL.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/641e4b539128fb5692c61524/LAJwc4Bjj5zjCl3Nm545z.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/641e4b539128fb5692c61524/Ib2RGzy-YfxIiHEG-_9ew.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/641e4b539128fb5692c61524/L6h67yde-8cforZbRoPq7.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/641e4b539128fb5692c61524/w4QHbzjBedEL3SWCUeqti.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/641e4b539128fb5692c61524/pdjf16jNDgk00HCS7FEaz.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/641e4b539128fb5692c61524/SrIJbQTJ0ajNQPLuMJXWU.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/641e4b539128fb5692c61524/3dEkBxiJ6k9dJLC6jDD5f.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/641e4b539128fb5692c61524/_2xeTm1D1EiJMsB9T4_t9.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/641e4b539128fb5692c61524/A7b2RLXOLkluqSfxtIsKh.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/641e4b539128fb5692c61524/WXilW7ACN6ivAiAOFUza0.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/641e4b539128fb5692c61524/AaQlA3LufIfecXJoBwelr.png)

## Suggestions for use

I have a lot to learn about prompting from the collective CivitAI userbase, but here are a few things that I have found work well.

TL;DR: DDIM sampler, 15-40 steps, CFG ~2-10, clip skip 1-4 (depending on use); LoRAs work well.

This model works well with square and rectangular aspect ratios. Resolutions of 768x768 and above work best, but can sometimes produce duplications around 1024x1024. Having said that, 512x512 and up will still produce good images.

The quality of this model's output is very realistic even with minimal prompting, but is exceptional with well-structured prompts. Moreover, this model works very well with LoRAs so long as you are cognisant of the LoRA's training resolution (768+ works best); see the sketch below. I don't use anime LoRAs, so I can't offer any suggestions there, but I will be interested in your results if you try them.
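
As a sketch, attaching a LoRA in diffusers looks like this; the filename and strength value are illustrative, not specific recommendations:

```python
# Sketch: attaching a LoRA to the pipeline loaded earlier.
# "my_lora.safetensors" is an illustrative filename, not a real file.
pipe.load_lora_weights("my_lora.safetensors")

image = pipe(
    "ultrarealistic photography, portrait of a woman",  # illustrative prompt
    cross_attention_kwargs={"scale": 0.8},  # LoRA strength; tune to taste
    height=768,
    width=768,  # match the LoRA's training resolution where possible
).images[0]
```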

Good-quality photorealistic images will result from extremely simple prompts (e.g., "cat"), but the model responds very well to quality-guidance prompts and to more complex prompting too.

The following prompt is my go-to:
"ultrarealistic photography, 32k UHD, absurdres, natural light and shadows, volumetric lighting, natural skin textures, accurate attention to details, depth of field, sharp focus"

Typically I would use DPM++ 3M SDE GPU as my sampler with the SGM_uniform noise schedule, but I find that this model works best (to my taste) with the DDIM sampler and the DDIM_uniform noise schedule.
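
If you are using diffusers rather than a UI, switching to DDIM looks roughly like this; the timestep_spacing value is my assumption for approximating a uniform DDIM schedule, so treat it as a starting point:

```python
# Sketch: swapping the pipeline's sampler for DDIM.
from diffusers import DDIMScheduler

pipe.scheduler = DDIMScheduler.from_config(
    pipe.scheduler.config,
    timestep_spacing="linspace",  # assumed stand-in for ddim_uniform
)
```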

15 steps is enough to get good images most of the time, but I typically use 25-40. I have run a few generations with ComfyUI's maximum of 999 steps just to see how it fares; the results look great, but I see no real need to go past 50 at the most.

CFG is a difficult one to give a value for. A CFG of 2-4 works well, but sometimes I will take it as far as 10 depending on what I am generating. I suggest starting with a value of 4 and gauging it for yourself. Lower values give the model more freedom.

This model works well without clip skipping, but if you are merging several disparate concepts into one image, it may pay to skip 2 or 3 layers to give some fluidity to the concepts.
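
Pulling the suggestions above together (resolution, the go-to quality prompt, steps, CFG, and clip skip), a full generation call in diffusers might look like the sketch below. All values are starting points rather than fixed settings, and note that UIs and diffusers count clip skip slightly differently, so the exact mapping here is an assumption:

```python
# Sketch: one generation call combining the suggested settings.
quality = (
    "ultrarealistic photography, 32k UHD, absurdres, natural light and shadows, "
    "volumetric lighting, natural skin textures, accurate attention to details, "
    "depth of field, sharp focus"
)

image = pipe(
    prompt=f"{quality}, a tabby cat on a mossy log",  # subject is illustrative
    num_inference_steps=30,  # 15 often suffices; 25-40 is my usual range
    guidance_scale=4.0,      # start at 4; roam 2-10 depending on the subject
    height=768,
    width=768,               # 768+ works best; watch for duplication near 1024
    clip_skip=2,             # only worth it when blending disparate concepts
).images[0]
image.save("out.png")
```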