KBlueLeaf committed on
Commit
b21b792
·
verified ·
1 Parent(s): 51d4300

Update README.md

  1. README.md +64 -0
README.md CHANGED
@@ -1,3 +1,67 @@
1
  ---
2
  license: cc-by-nc-4.0
3
  ---
1
  ---
2
  license: cc-by-nc-4.0
3
+ language:
4
+ - en
5
+ tags:
6
+ - stable cascade
7
  ---
8
+
9
+ # Stable-Cascade FP16 fix
10
+
11
+ **A modified version of [Stable-Cascade](https://huggingface.co/stabilityai/stable-cascade) that is compatible with fp16 inference**
12
+
13
+ ## Demo
14
+ | FP16 | BF16 |
15
+ | - | - |
16
+ |![image/png](https://cdn-uploads.huggingface.co/production/uploads/630593e2fca1d8d92b81d2a1/fkWNY15JQbfh5pe1SY7wS.png)|![image/png](https://cdn-uploads.huggingface.co/production/uploads/630593e2fca1d8d92b81d2a1/XpfqkimqJTeDjggTaV4Mt.png)|
17
+
18
+ LPIPS difference: 0.088
19
+
20
+
21
+ | FP16 | BF16 |
22
+ | - | - |
23
+ |![image/png](https://cdn-uploads.huggingface.co/production/uploads/630593e2fca1d8d92b81d2a1/muOkoNjVK6CFv2rs6QyBr.png)|![image/png](https://cdn-uploads.huggingface.co/production/uploads/630593e2fca1d8d92b81d2a1/rrgb8yMuJDyjJu6wd366j.png)|
24
+
25
+ LPIPS difference: 0.012
26
+
27
+ ## How
28
+ After checking the L1 norm of each hidden state, I found that the last block group (8, 24, 24, 8 <- this one) makes the hidden states grow larger and larger.
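A minimal sketch of this kind of check, in plain Python so it runs without PyTorch. The block names other than `up_blocks.1`, the gains, and the helper functions are all hypothetical and only illustrate the idea of tracking the L1 norm after each block group and flagging the one whose output no longer fits in fp16.

```python
FP16_MAX = 65504.0  # largest finite float16 value


def l1_norm(values):
    """Sum of absolute values of a hidden state given as a flat list."""
    return sum(abs(v) for v in values)


def fits_fp16(values):
    """True if every value is within the finite float16 range."""
    return all(abs(v) <= FP16_MAX for v in values)


def make_gain_block(gain):
    """Hypothetical stand-in for a block group: multiplies the hidden state."""
    def block(hidden):
        return [gain * v for v in hidden]
    return block


def trace_norms(blocks, hidden):
    """Run `hidden` through each (name, fn) pair and record its L1 norm."""
    report = []
    for name, fn in blocks:
        hidden = fn(hidden)
        report.append((name, l1_norm(hidden), fits_fp16(hidden)))
    return report


# Made-up gains; only the growth pattern matters here.
toy_blocks = [
    ("down_blocks.0", make_gain_block(2.0)),
    ("down_blocks.1", make_gain_block(4.0)),
    ("up_blocks.0", make_gain_block(8.0)),
    ("up_blocks.1", make_gain_block(64.0)),
]
report = trace_norms(toy_blocks, [1.0, -2.0, 3.0, -50.0])
for name, norm, ok in report:
    print(f"{name}: L1={norm:.0f} fits_fp16={ok}")
```

On a real model the same bookkeeping would usually be done with forward hooks on the actual modules instead of toy functions.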
29
+
30
+ To fix this, I apply a transformation to the TimestepBlock that directly rescales the hidden state. (This is possible because it is not a residual block.)
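The sketch below shows why such a rescaling can be folded into the weights of a non-residual modulation block. It assumes, purely for illustration, that the block computes `x * (1 + a) + b` with `(a, b)` produced by a linear map of the timestep embedding; that form, the scalar shapes, and the names `mapper` and `fold_scale` are assumptions, not the contents of the modified "stable_cascade.py".

```python
def mapper(t, weight, bias):
    """Linear map from a scalar timestep embedding t to the pair (a, b)."""
    a = weight[0] * t + bias[0]
    b = weight[1] * t + bias[1]
    return a, b


def timestep_block(x, t, weight, bias):
    """Assumed form of the block: affine modulation, no skip connection."""
    a, b = mapper(t, weight, bias)
    return x * (1 + a) + b


def fold_scale(weight, bias, s):
    """Return new (weight, bias) so that the block output is multiplied by s.

    s * (x * (1 + a) + b) == x * (1 + a2) + b2
    with a2 = s * a + (s - 1) and b2 = s * b.
    """
    new_weight = [s * weight[0], s * weight[1]]
    new_bias = [s * bias[0] + (s - 1), s * bias[1]]
    return new_weight, new_bias


# Made-up numbers, chosen so the arithmetic is exact in binary floats.
weight, bias = [0.5, -2.0], [0.25, 1.0]
scaled_weight, scaled_bias = fold_scale(weight, bias, s=0.125)

x, t = 4096.0, 3.0
original = timestep_block(x, t, weight, bias)
rescaled = timestep_block(x, t, scaled_weight, scaled_bias)
print(original, rescaled)
```

In a residual block of the form `x + f(x)`, scaling the weights of `f` would only scale the `f(x)` term, so the output as a whole would not be scaled by the same factor. Whatever consumes the rescaled hidden state also has to account for the new scale.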
31
+
32
+ The transformation is implemented in the modified "stable_cascade.py". You can put the file into the stable-cascade branch of kohya-ss/sd-scripts and uncomment the relevant lines to check the weights or run the conversion yourself.
33
+
34
+
35
+ ### FP8
36
+ Some people may know the FP8 quantization used to run SDXL inference on low-VRAM cards. The technique can be applied to this model too.<br>
37
+ But since the last block group is basically ruined, it is recommended to leave it out of the conversion:<br>
38
+ ```python
39
+ for name, module in generator_c.named_modules():
40
+     if "up_blocks.1" in name: continue  # skip the last block group
41
+     if isinstance(module, torch.nn.Linear):
42
+         module.to(torch.float8_e5m2)
43
+     elif isinstance(module, torch.nn.Conv2d):
44
+         module.to(torch.float8_e5m2)
45
+     elif isinstance(module, torch.nn.MultiheadAttention):
46
+         module.to(torch.float8_e5m2)
47
+ ```
48
+
49
+ This sample code should convert 70% of the weights to fp8. (Storing FP8 weights together with a scale is a better solution, and implementing that is recommended.)
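As a rough illustration of what "FP8 weight with scale" could mean, here is a plain-Python sketch of per-tensor scaling. It simulates `float8_e5m2` by keeping the top byte of the float16 encoding, which truncates where a real cast would round to nearest. The function names and the per-tensor scheme are assumptions for illustration, not an existing API.

```python
import struct

E5M2_MAX = 57344.0  # largest finite float8_e5m2 value


def to_e5m2(x):
    """Simulate a float8_e5m2 cast (1 sign, 5 exponent, 2 mantissa bits).

    e5m2 shares its exponent layout with float16, so keeping only the
    high byte of the float16 bits gives a truncated e5m2 value.
    """
    high_byte = struct.pack(">e", x)[0]
    return struct.unpack(">e", bytes([high_byte, 0]))[0]


def quantize_with_scale(weights):
    """Scale weights so the largest magnitude maps to E5M2_MAX, then cast."""
    scale = max(abs(w) for w in weights) / E5M2_MAX or 1.0  # guard all-zero input
    return [to_e5m2(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Recover approximate weights from the fp8 values and the stored scale."""
    return [q * scale for q in quantized]


tiny_weights = [1e-6, -3e-6, 2e-6]
plain = [to_e5m2(w) for w in tiny_weights]        # direct cast, no scale
quantized, scale = quantize_with_scale(tiny_weights)
restored = dequantize(quantized, scale)
print(plain, restored)
```

With these made-up weights the direct cast flushes every value to zero, while the scaled version keeps them nonzero at the cost of storing one extra float per tensor.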
50
+
51
+ I have tried other transformation settings that are friendlier to FP8, but they make the difference from the original model more significant.
52
+
53
+ FP8 Demo (Same Seed):
54
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/630593e2fca1d8d92b81d2a1/wPoZeWGGhcPMck45--y_X.png)
55
+
56
+
57
+ ## Notice
58
+ The modified model is not compatible with LoRA/LyCORIS trained on the original weights. <br>
59
+ (Actually it can be made compatible: just apply the same transformation. I'm considering rewriting the script so that it uses key names to determine what to do.)
60
+
61
+ ControlNets will not be compatible either, unless you apply the required transformation to them as well.
62
+
63
+ I don't want to do all of this myself, so I hope others will.
64
+
65
+ ## License
66
+ Stable-Cascade is published under a non-commercial license, so I use CC-BY-NC 4.0 to publish this model.
67
+ **The source code used to make this model is published under the Apache-2.0 license.**