ZachNagengast committed
Commit 427a46e
1 Parent(s): 7c47b8a

Update models

Files changed (41)
  1. README.md +82 -0
  2. original/compiled/TextEncoder.mlmodelc/analytics/coremldata.bin +1 -1
  3. original/compiled/TextEncoder.mlmodelc/coremldata.bin +2 -2
  4. original/compiled/TextEncoder.mlmodelc/metadata.json +6 -5
  5. original/compiled/TextEncoder.mlmodelc/model.mil +0 -0
  6. original/compiled/TextEncoder.mlmodelc/weights/weight.bin +2 -2
  7. original/compiled/TextEncoder2.mlmodelc/analytics/coremldata.bin +1 -1
  8. original/compiled/TextEncoder2.mlmodelc/coremldata.bin +2 -2
  9. original/compiled/TextEncoder2.mlmodelc/metadata.json +5 -4
  10. original/compiled/TextEncoder2.mlmodelc/model.mil +0 -0
  11. original/compiled/TextEncoder2.mlmodelc/weights/weight.bin +2 -2
  12. original/compiled/Unet.mlmodelc/analytics/coremldata.bin +1 -1
  13. original/compiled/Unet.mlmodelc/coremldata.bin +2 -2
  14. original/compiled/Unet.mlmodelc/metadata.json +18 -17
  15. original/compiled/Unet.mlmodelc/model.mil +0 -0
  16. original/compiled/Unet.mlmodelc/weights/weight.bin +2 -2
  17. original/compiled/VAEDecoder.mlmodelc/analytics/coremldata.bin +1 -1
  18. original/compiled/VAEDecoder.mlmodelc/coremldata.bin +2 -2
  19. original/compiled/VAEDecoder.mlmodelc/metadata.json +2 -2
  20. original/compiled/VAEDecoder.mlmodelc/model.mil +0 -0
  21. original/compiled/VAEDecoder.mlmodelc/weights/weight.bin +1 -1
  22. original/compiled/VAEEncoder.mlmodelc/analytics/coremldata.bin +1 -1
  23. original/compiled/VAEEncoder.mlmodelc/coremldata.bin +2 -2
  24. original/compiled/VAEEncoder.mlmodelc/metadata.json +8 -7
  25. original/compiled/VAEEncoder.mlmodelc/model.mil +0 -0
  26. original/compiled/VAEEncoder.mlmodelc/weights/weight.bin +2 -2
  27. original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_text_encoder.mlpackage/Data/com.apple.CoreML/model.mlmodel +3 -0
  28. original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_text_encoder.mlpackage/Data/com.apple.CoreML/weights/weight.bin +3 -0
  29. original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_text_encoder.mlpackage/Manifest.json +18 -0
  30. original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_text_encoder_2.mlpackage/Data/com.apple.CoreML/model.mlmodel +3 -0
  31. original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_text_encoder_2.mlpackage/Data/com.apple.CoreML/weights/weight.bin +3 -0
  32. original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_text_encoder_2.mlpackage/Manifest.json +18 -0
  33. original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_unet.mlpackage/Data/com.apple.CoreML/model.mlmodel +3 -0
  34. original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_unet.mlpackage/Data/com.apple.CoreML/weights/weight.bin +3 -0
  35. original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_unet.mlpackage/Manifest.json +18 -0
  36. original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_vae_decoder.mlpackage/Data/com.apple.CoreML/model.mlmodel +3 -0
  37. original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_vae_decoder.mlpackage/Data/com.apple.CoreML/weights/weight.bin +3 -0
  38. original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_vae_decoder.mlpackage/Manifest.json +18 -0
  39. original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_vae_encoder.mlpackage/Data/com.apple.CoreML/model.mlmodel +3 -0
  40. original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_vae_encoder.mlpackage/Data/com.apple.CoreML/weights/weight.bin +3 -0
  41. original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_vae_encoder.mlpackage/Manifest.json +18 -0
README.md ADDED
@@ -0,0 +1,82 @@
+ ---
+ license: other
+ tags:
+ - stable-diffusion
+ - text-to-image
+ - core-ml
+ ---
+
+ # Stable Diffusion XL v0.9 Model Card
+
+ This model was generated using [Apple’s repository](https://github.com/apple/ml-stable-diffusion) which has [ASCL](https://github.com/apple/ml-stable-diffusion/blob/main/LICENSE.md).
+
+ This model card focuses on the model associated with the Stable Diffusion XL v0.9 Base model, codebase available [here](https://github.com/Stability-AI/generative-models).
+
+ SDXL v0.9 consists of a two-step pipeline for latent diffusion:
+ First, we use a base model to generate latents of the desired output size.
+ In the second step, we use a specialized high-resolution model and apply a technique called SDEdit (https://arxiv.org/abs/2108.01073, also known as "img2img")
+ to the latents generated in the first step, using the same prompt.
+
+ Only the base model is included here.
+
+ These weights here have been converted to Core ML for use on Apple Silicon hardware.
+
+ There are 2 variants of the Core ML weights:
+
+ ```
+ coreml-stable-diffusion-xl-v0-9-base
+ └── original
+     ├── compiled # Swift inference, "original" attention
+     └── packages # Python inference, "original"
+ ```
+
+ ### Model Description
+
+ - **Developed by:** Stability AI
+ - **Model type:** Diffusion-based text-to-image generative model
+ - **License:** [SDXL 0.9 Research License](https://huggingface.co/stabilityai/stable-diffusion-xl-base-0.9/blob/main/LICENSE.md)
+ - **Model Description:** This is a model that can be used to generate and modify images based on text prompts. It is a [Latent Diffusion Model](https://arxiv.org/abs/2112.10752) that uses two fixed, pretrained text encoders ([OpenCLIP-ViT/G](https://github.com/mlfoundations/open_clip) and [CLIP-ViT/L](https://github.com/openai/CLIP/tree/main)).
+ - **Resources for more information:** [GitHub Repository](https://github.com/Stability-AI/generative-models) [SDXL paper on arXiv](https://arxiv.org/abs/2307.01952).
+
+ ### Model Sources
+
+ <!-- Provide the basic links for the model. -->
+
+ - **Repository:** https://github.com/Stability-AI/generative-models
+ - **Demo [optional]:** https://clipdrop.co/stable-diffusion
+
+ ## Uses
+
+ ### Direct Use
+
+ The model is intended for research purposes only. Possible research areas and tasks include
+
+ - Generation of artworks and use in design and other artistic processes.
+ - Applications in educational or creative tools.
+ - Research on generative models.
+ - Safe deployment of models which have the potential to generate harmful content.
+ - Probing and understanding the limitations and biases of generative models.
+
+ Excluded uses are described below.
+
+ ### Out-of-Scope Use
+
+ The model was not trained to be factual or true representations of people or events, and therefore using the model to generate such content is out-of-scope for the abilities of this model.
+
+ ## Limitations and Bias
+
+ ### Limitations
+
+ - The model does not achieve perfect photorealism
+ - The model cannot render legible text
+ - The model struggles with more difficult tasks which involve compositionality, such as rendering an image corresponding to “A red cube on top of a blue sphere”
+ - Faces and people in general may not be generated properly.
+ - The autoencoding part of the model is lossy.
+
+ ### Bias
+ While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.
+
+ ## Evaluation
+ ![comparison](https://huggingface.co/stabilityai/stable-diffusion-xl-base-0.9/resolve/main/comparison.png)
+ The chart above evaluates user preference for SDXL (with and without refinement) over Stable Diffusion 1.5 and 2.1.
+ The SDXL base model performs significantly better than the previous variants, and the model combined with the refinement module achieves the best overall performance.
original/compiled/TextEncoder.mlmodelc/analytics/coremldata.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:1d41d9734f7cae96ac14e5a1a74bf2506518892f3181b371c796d6694d38aa59
+ oid sha256:2d61ad418454b8cd2e529d2ca1c2d6752b38920bb991afc50c9b7a18cf9deba9
  size 207
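Each `.bin` entry in this commit is a Git LFS pointer rather than the binary itself: three `key value` lines giving the spec version, a SHA-256 object id, and the size in bytes. As a minimal sketch (the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any Git LFS tooling), a pointer can be parsed and compared against downloaded bytes like this:

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the raw bytes have the size and SHA-256 digest the pointer records."""
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# New pointer for TextEncoder.mlmodelc/analytics/coremldata.bin, copied from the hunk above.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:2d61ad418454b8cd2e529d2ca1c2d6752b38920bb991afc50c9b7a18cf9deba9
size 207
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])
```

Passing the bytes of the downloaded file to `matches_pointer` would confirm that it corresponds to the new pointer. For the multi-gigabyte weight files, hashing in chunks would be preferable to reading the whole file into memory.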
original/compiled/TextEncoder.mlmodelc/coremldata.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:676e8a6290fbaeaf1271e8e9d6d3ce3fa08c3078df6bf330e5c52584ba98fce8
- size 857
+ oid sha256:31a51c152cd09c8f8e7313c74922e1cf9dcd219b84c27f583fc8c2bb0255aca4
+ size 983
original/compiled/TextEncoder.mlmodelc/metadata.json CHANGED
@@ -1,6 +1,6 @@
  [
  {
- "shortDescription" : "Stable Diffusion generates images conditioned on text and\/or other images as input through the diffusion process. Please refer to https:\/\/arxiv.org\/abs\/2112.10752 for details.",
+ "shortDescription" : "This is a model that can be used to generate and modify images based on text prompts.It is a Latent Diffusion Model that uses two fixed, pretrained text encoders (OpenCLIP-ViT\/G and CLIP-ViT\/L).Please refer to https:\/\/arxiv.org\/abs\/2307.01952 for details",
  "metadataOutputVersion" : "3.0",
  "outputSchema" : [
  {
@@ -30,21 +30,22 @@
  ],
  "author" : "Please refer to the Model Card available at huggingface.co\/stabilityai\/stable-diffusion-xl-base-0.9",
  "specificationVersion" : 7,
- "storagePrecision" : "Float32",
- "license" : "OpenRAIL (https:\/\/huggingface.co\/spaces\/CompVis\/stable-diffusion-license)",
+ "storagePrecision" : "Float16",
+ "license" : "Please refer to the Model Card available at huggingface.co\/stabilityai\/stable-diffusion-xl-base-0.9\/blob\/main\/LICENSE.md",
  "mlProgramOperationTypeHistogram" : {
  "Ios16.sigmoid" : 12,
  "Ios16.add" : 37,
  "Ios16.mul" : 36,
+ "Ios16.softmax" : 12,
  "Transpose" : 60,
  "Ios16.gather" : 1,
  "Ios16.linear" : 72,
  "Ios16.reshape" : 120,
  "Ios16.matmul" : 24,
  "Ios16.layerNorm" : 25,
- "Ios16.softmax" : 12
+ "Ios16.cast" : 2
  },
- "computePrecision" : "Mixed (Int32, Float32)",
+ "computePrecision" : "Mixed (Float32, Int32, Float16)",
  "isUpdatable" : "0",
  "availability" : {
  "macOS" : "13.0",
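The substantive change in this `metadata.json` is the move from Float32 to Float16 storage, with matching `Ios16.cast` operations appearing in the op histogram. The sketch below shows one way to pull those fields out of such a file. It assumes the structure visible in the diff (a top-level list holding one object); the embedded JSON is a trimmed stand-in with values copied from the new side of the hunk, and `summarize_precision` is a hypothetical helper name.

```python
import json

# Trimmed stand-in for TextEncoder.mlmodelc/metadata.json (new side of the diff).
METADATA_JSON = """
[
  {
    "specificationVersion" : 7,
    "storagePrecision" : "Float16",
    "computePrecision" : "Mixed (Float32, Int32, Float16)",
    "mlProgramOperationTypeHistogram" : {
      "Ios16.softmax" : 12,
      "Ios16.cast" : 2
    }
  }
]
"""


def summarize_precision(metadata_text):
    """Return storage precision, compute precisions, and cast-op count for the first entry."""
    entry = json.loads(metadata_text)[0]
    compute = entry["computePrecision"]
    # "Mixed (A, B, C)" -> ["A", "B", "C"]; a non-mixed value is kept as a single item.
    if "(" in compute:
        inner = compute[compute.index("(") + 1 : compute.rindex(")")]
        compute_types = [part.strip() for part in inner.split(",")]
    else:
        compute_types = [compute]
    return {
        "storage": entry["storagePrecision"],
        "compute": compute_types,
        "cast_ops": entry["mlProgramOperationTypeHistogram"].get("Ios16.cast", 0),
    }


summary = summarize_precision(METADATA_JSON)
print(summary)
```

Running the same helper over each compiled model's `metadata.json` would give a quick per-model view of which components were converted to Float16 in this commit.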
original/compiled/TextEncoder.mlmodelc/model.mil CHANGED
The diff for this file is too large to render. See raw diff
 
original/compiled/TextEncoder.mlmodelc/weights/weight.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:2110183f4dc569f0522faf609549ca716492a21801cbd3ece3289a0a11cdb85b
- size 492278308
+ oid sha256:6cca5606968298f28e3caaf87f17ff97929e6b19d512ecd988a6487252d13f3e
+ size 246145536
original/compiled/TextEncoder2.mlmodelc/analytics/coremldata.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:eeb16c04fd05c6cbfbc31e02f12b46cd763948b046f5090e155394caab371330
+ oid sha256:8ef8f4f1cf7c7b6982f2b471ed71745ba5fda876f0b1ffd121ccb3488e8351e2
  size 207
original/compiled/TextEncoder2.mlmodelc/coremldata.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e60db2852d139e6d33e5c6d4dd83a82e2af6da2ff48998b5a3efd46493bbb00e
- size 857
+ oid sha256:31a51c152cd09c8f8e7313c74922e1cf9dcd219b84c27f583fc8c2bb0255aca4
+ size 983
original/compiled/TextEncoder2.mlmodelc/metadata.json CHANGED
@@ -1,6 +1,6 @@
  [
  {
- "shortDescription" : "Stable Diffusion generates images conditioned on text and\/or other images as input through the diffusion process. Please refer to https:\/\/arxiv.org\/abs\/2112.10752 for details.",
+ "shortDescription" : "This is a model that can be used to generate and modify images based on text prompts.It is a Latent Diffusion Model that uses two fixed, pretrained text encoders (OpenCLIP-ViT\/G and CLIP-ViT\/L).Please refer to https:\/\/arxiv.org\/abs\/2307.01952 for details",
  "metadataOutputVersion" : "3.0",
  "outputSchema" : [
  {
@@ -30,9 +30,10 @@
  ],
  "author" : "Please refer to the Model Card available at huggingface.co\/stabilityai\/stable-diffusion-xl-base-0.9",
  "specificationVersion" : 7,
- "storagePrecision" : "Float32",
- "license" : "OpenRAIL (https:\/\/huggingface.co\/spaces\/CompVis\/stable-diffusion-license)",
+ "storagePrecision" : "Float16",
+ "license" : "Please refer to the Model Card available at huggingface.co\/stabilityai\/stable-diffusion-xl-base-0.9\/blob\/main\/LICENSE.md",
  "mlProgramOperationTypeHistogram" : {
+ "Ios16.cast" : 2,
  "Ios16.mul" : 32,
  "Ios16.layerNorm" : 65,
  "Stack" : 1,
@@ -47,7 +48,7 @@
  "Ios16.reshape" : 320,
  "Ios16.reduceArgmax" : 1
  },
- "computePrecision" : "Mixed (Int32, Float32)",
+ "computePrecision" : "Mixed (Float32, Int32, Float16)",
  "isUpdatable" : "0",
  "availability" : {
  "macOS" : "13.0",
original/compiled/TextEncoder2.mlmodelc/model.mil CHANGED
The diff for this file is too large to render. See raw diff
 
original/compiled/TextEncoder2.mlmodelc/weights/weight.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:20a38d041b247e8881e346ece58927362b11ee28ed7a92de5bf099ba8b72a73b
- size 2778701504
+ oid sha256:7ef8c4f33384b8c36d50cd67651a109021140636446560a77e957bed9ac4b865
+ size 1389367424
original/compiled/Unet.mlmodelc/analytics/coremldata.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:3aad8ab40f69863996c71e3be7476eecdf0148c0db96fd31eeeda470077deeaf
+ oid sha256:afab1e1bf38085ae7b9e976bd341abec70e994d89674f74c9728b2d8d185650a
  size 207
original/compiled/Unet.mlmodelc/coremldata.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:63184014d142d0c1aee7e320cfe06c5b55364fe9c72237d4ae0ddecea279bcd2
- size 1349
+ oid sha256:34e11991f2c2b1cfcd418ac7060b429c72b0a6fcb3940ca84ff850eac215485d
+ size 1717
original/compiled/Unet.mlmodelc/metadata.json CHANGED
@@ -1,6 +1,6 @@
  [
  {
- "shortDescription" : "Stable Diffusion generates images conditioned on text or other images as input through the diffusion process. Please refer to https:\/\/arxiv.org\/abs\/2112.10752 for details.",
+ "shortDescription" : "This is a model that can be used to generate and modify images based on text prompts.It is a Latent Diffusion Model that uses two fixed, pretrained text encoders (OpenCLIP-ViT\/G and CLIP-ViT\/L).Please refer to https:\/\/arxiv.org\/abs\/2307.01952 for details",
  "metadataOutputVersion" : "3.0",
  "outputSchema" : [
  {
@@ -20,8 +20,8 @@
  ],
  "author" : "Please refer to the Model Card available at huggingface.co\/stabilityai\/stable-diffusion-xl-base-0.9",
  "specificationVersion" : 7,
- "storagePrecision" : "Float32",
- "license" : "OpenRAIL (https:\/\/huggingface.co\/spaces\/CompVis\/stable-diffusion-license)",
+ "storagePrecision" : "Float16",
+ "license" : "Please refer to the Model Card available at huggingface.co\/stabilityai\/stable-diffusion-xl-base-0.9\/blob\/main\/LICENSE.md",
  "mlProgramOperationTypeHistogram" : {
  "UpsampleNearestNeighbor" : 2,
  "Ios16.reduceMean" : 512,
@@ -34,19 +34,20 @@
  "Ios16.square" : 46,
  "ExpandDims" : 6,
  "Ios16.sub" : 256,
+ "Ios16.cast" : 1,
  "Ios16.conv" : 794,
  "Ios16.gelu" : 70,
  "Ios16.matmul" : 280,
  "Ios16.reshape" : 676,
- "Ios16.rsqrt" : 210,
  "Ios16.batchNorm" : 46,
+ "Ios16.rsqrt" : 210,
  "Ios16.silu" : 38,
  "Ios16.sqrt" : 46,
  "SliceByIndex" : 4,
  "Ios16.mul" : 842,
  "Ios16.cos" : 2
  },
- "computePrecision" : "Mixed (Float32, Int32)",
+ "computePrecision" : "Mixed (Float32, Float16, Int32)",
  "isUpdatable" : "0",
  "availability" : {
  "macOS" : "13.0",
@@ -62,8 +63,8 @@
  {
  "hasShapeFlexibility" : "0",
  "isOptional" : "0",
- "dataType" : "Float32",
- "formattedType" : "MultiArray (Float32 2 × 4 × 128 × 128)",
+ "dataType" : "Float16",
+ "formattedType" : "MultiArray (Float16 2 × 4 × 128 × 128)",
  "shortDescription" : "The low resolution latent feature maps being denoised through reverse diffusion",
  "shape" : "[2, 4, 128, 128]",
  "name" : "sample",
@@ -72,8 +73,8 @@
  {
  "hasShapeFlexibility" : "0",
  "isOptional" : "0",
- "dataType" : "Float32",
- "formattedType" : "MultiArray (Float32 2)",
+ "dataType" : "Float16",
+ "formattedType" : "MultiArray (Float16 2)",
  "shortDescription" : "A value emitted by the associated scheduler object to condition the model on a given noise schedule",
  "shape" : "[2]",
  "name" : "timestep",
@@ -82,8 +83,8 @@
  {
  "hasShapeFlexibility" : "0",
  "isOptional" : "0",
- "dataType" : "Float32",
- "formattedType" : "MultiArray (Float32 2 × 2048 × 1 × 77)",
+ "dataType" : "Float16",
+ "formattedType" : "MultiArray (Float16 2 × 2048 × 1 × 77)",
  "shortDescription" : "Output embeddings from the associated text_encoder model to condition to generated image on text. A maximum of 77 tokens (~40 words) are allowed. Longer text is truncated. Shorter text does not reduce computation.",
  "shape" : "[2, 2048, 1, 77]",
  "name" : "encoder_hidden_states",
@@ -92,9 +93,9 @@
  {
  "hasShapeFlexibility" : "0",
  "isOptional" : "0",
- "dataType" : "Float32",
- "formattedType" : "MultiArray (Float32 2 × 1280)",
- "shortDescription" : "",
+ "dataType" : "Float16",
+ "formattedType" : "MultiArray (Float16 2 × 1280)",
+ "shortDescription" : "Additional embeddings passed to the unet based on the pooled output of the text encoders.",
  "shape" : "[2, 1280]",
  "name" : "text_embeds",
  "type" : "MultiArray"
@@ -102,9 +103,9 @@
  {
  "hasShapeFlexibility" : "0",
  "isOptional" : "0",
- "dataType" : "Float32",
- "formattedType" : "MultiArray (Float32 2 × 6)",
- "shortDescription" : "",
+ "dataType" : "Float16",
+ "formattedType" : "MultiArray (Float16 2 × 6)",
+ "shortDescription" : "Additional embeddings passed to the unet based on width and height dimensions.For SDXL, default values look like [1024, 1024, 0, 0, 1024, 1024]",
  "shape" : "[2, 6]",
  "name" : "time_ids",
  "type" : "MultiArray"
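The new `time_ids` description gives default values of `[1024, 1024, 0, 0, 1024, 1024]` for an input of shape `[2, 6]`. A minimal sketch of building that input follows. Two points are assumptions not stated in this diff: that the six values are the original size, the crop top-left offset, and the target size (the size conditioning described in the SDXL paper), and that the leading dimension of 2 holds one row each for the unconditional and conditional branches of classifier-free guidance. `build_time_ids` is a hypothetical helper name.

```python
def build_time_ids(original_size=(1024, 1024), crop_top_left=(0, 0),
                   target_size=(1024, 1024), batch=2):
    """Build a batch x 6 nested list of size-conditioning values for the Unet input."""
    # Assumed ordering: original size, crop offset, target size.
    row = [float(v) for v in (*original_size, *crop_top_left, *target_size)]
    # Copy the row per batch entry so rows can be modified independently.
    return [list(row) for _ in range(batch)]


time_ids = build_time_ids()
print(time_ids)
```

For actual inference the nested list would still need converting to a Float16 multi-array, since the diff above changes the `dataType` of this input from Float32 to Float16.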
original/compiled/Unet.mlmodelc/model.mil CHANGED
The diff for this file is too large to render. See raw diff
 
original/compiled/Unet.mlmodelc/weights/weight.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:230a9fc8889551b0918a06b8180d853a71a3989b03fea5e2e9a4deb01c42b70b
- size 10270025728
+ oid sha256:a4ea70231f6ddd496d7bdf5a2342804a3379cc07858894e3b7d0405ce6ddac51
+ size 5135067072
original/compiled/VAEDecoder.mlmodelc/analytics/coremldata.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:fb6787f60fb9b486d6446ac1f0a079338090db09327f5ac6205be584c090d6be
+ oid sha256:3cdeab9f108fbe64f0c0d010f12dc5ec310494651180c2d49ec841e2048eadaf
  size 207
original/compiled/VAEDecoder.mlmodelc/coremldata.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:da0f7a9e420bb3c984b5a0ec7ae9da197aec0173345e72cc12e99b3cb1c9f634
- size 789
+ oid sha256:8b118250857ca645b31395620884948e146637c4e3e3d5431ec921077de9d825
+ size 915
original/compiled/VAEDecoder.mlmodelc/metadata.json CHANGED
@@ -1,6 +1,6 @@
  [
  {
- "shortDescription" : "Stable Diffusion generates images conditioned on text and\/or other images as input through the diffusion process. Please refer to https:\/\/arxiv.org\/abs\/2112.10752 for details.",
+ "shortDescription" : "This is a model that can be used to generate and modify images based on text prompts.It is a Latent Diffusion Model that uses two fixed, pretrained text encoders (OpenCLIP-ViT\/G and CLIP-ViT\/L).Please refer to https:\/\/arxiv.org\/abs\/2307.01952 for details",
  "metadataOutputVersion" : "3.0",
  "outputSchema" : [
  {
@@ -21,7 +21,7 @@
  "author" : "Please refer to the Model Card available at huggingface.co\/stabilityai\/stable-diffusion-xl-base-0.9",
  "specificationVersion" : 7,
  "storagePrecision" : "Float32",
- "license" : "OpenRAIL (https:\/\/huggingface.co\/spaces\/CompVis\/stable-diffusion-license)",
+ "license" : "Please refer to the Model Card available at huggingface.co\/stabilityai\/stable-diffusion-xl-base-0.9\/blob\/main\/LICENSE.md",
  "mlProgramOperationTypeHistogram" : {
  "Ios16.mul" : 2,
  "Ios16.sqrt" : 30,
original/compiled/VAEDecoder.mlmodelc/model.mil CHANGED
The diff for this file is too large to render. See raw diff
 
original/compiled/VAEDecoder.mlmodelc/weights/weight.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:18fad6c79d09981d3a2762ad56067bffc8ce1075ff04d02c600d9f389acdc135
+ oid sha256:663fd963f7a13afa32b8c415944bd3652f3c249335d02e2f57ff02bc3ac447bc
  size 197977216
original/compiled/VAEEncoder.mlmodelc/analytics/coremldata.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4095a542239cda86e99e9bddb764910ef6849e945043d34f8f2687d0015ae403
+ oid sha256:c3ca3543a6edddd2a8268d7380d9abf1a0855d5e7cb897f6dd64ae71eda21628
  size 207
original/compiled/VAEEncoder.mlmodelc/coremldata.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:993e26b79e7209543da327b1e379ce4bdb4589bdc047068d3782c394428d6b29
- size 793
+ oid sha256:91f3577db670a798e18e4c89c20e64af914d145df096422944051a8aa843341f
+ size 919
original/compiled/VAEEncoder.mlmodelc/metadata.json CHANGED
@@ -1,6 +1,6 @@
  [
  {
- "shortDescription" : "Stable Diffusion generates images conditioned on text and\/or other images as input through the diffusion process. Please refer to https:\/\/arxiv.org\/abs\/2112.10752 for details.",
+ "shortDescription" : "This is a model that can be used to generate and modify images based on text prompts.It is a Latent Diffusion Model that uses two fixed, pretrained text encoders (OpenCLIP-ViT\/G and CLIP-ViT\/L).Please refer to https:\/\/arxiv.org\/abs\/2307.01952 for details",
  "metadataOutputVersion" : "3.0",
  "outputSchema" : [
  {
@@ -20,14 +20,15 @@
  ],
  "author" : "Please refer to the Model Card available at huggingface.co\/stabilityai\/stable-diffusion-xl-base-0.9",
  "specificationVersion" : 7,
- "storagePrecision" : "Float32",
- "license" : "OpenRAIL (https:\/\/huggingface.co\/spaces\/CompVis\/stable-diffusion-license)",
+ "storagePrecision" : "Float16",
+ "license" : "Please refer to the Model Card available at huggingface.co\/stabilityai\/stable-diffusion-xl-base-0.9\/blob\/main\/LICENSE.md",
  "mlProgramOperationTypeHistogram" : {
  "Pad" : 3,
+ "Ios16.cast" : 2,
  "Ios16.mul" : 2,
  "Ios16.sqrt" : 22,
  "Ios16.sub" : 22,
- "Transpose" : 7,
+ "Transpose" : 6,
  "Ios16.conv" : 28,
  "Ios16.add" : 34,
  "Ios16.linear" : 4,
@@ -37,10 +38,10 @@
  "Ios16.softmax" : 1,
  "Ios16.batchNorm" : 21,
  "Ios16.square" : 22,
- "Ios16.reshape" : 53,
+ "Ios16.reshape" : 49,
  "Ios16.silu" : 21
  },
- "computePrecision" : "Mixed (Float32, Int32)",
+ "computePrecision" : "Mixed (Float16, Float32, Int32)",
  "isUpdatable" : "0",
  "availability" : {
  "macOS" : "13.0",
@@ -60,7 +61,7 @@
  "formattedType" : "MultiArray (Float32 1 × 3 × 1024 × 1024)",
  "shortDescription" : "The input image to base the initial latents on normalized to range [-1, 1]",
  "shape" : "[1, 3, 1024, 1024]",
- "name" : "x",
+ "name" : "z",
  "type" : "MultiArray"
  }
  ],
original/compiled/VAEEncoder.mlmodelc/model.mil CHANGED
The diff for this file is too large to render. See raw diff
 
original/compiled/VAEEncoder.mlmodelc/weights/weight.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d81117fbeac1fc04756a65c45ab9b19216ad90a3e1f3671422c1058ff5a33c94
- size 136668992
+ oid sha256:d7da2b2913ea54fd16b4e333eb32c0037c9638b2d1c7d435693ed37126407b94
+ size 68338112
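The `weight.bin` sizes recorded in the LFS pointers line up with the `storagePrecision` changes in the metadata: the four models that moved from Float32 to Float16 shrink to roughly half their previous size, while VAEDecoder, whose `storagePrecision` stays Float32, keeps the same size. The arithmetic can be checked with a short sketch; the byte counts are copied from the pointer diffs in this commit, and `size_ratios` is a hypothetical helper name.

```python
# (old_size, new_size) in bytes, copied from the weight.bin LFS pointers in this commit.
WEIGHT_SIZES = {
    "TextEncoder": (492278308, 246145536),
    "TextEncoder2": (2778701504, 1389367424),
    "Unet": (10270025728, 5135067072),
    "VAEDecoder": (197977216, 197977216),
    "VAEEncoder": (136668992, 68338112),
}


def size_ratios(sizes):
    """Return new_size / old_size for each model."""
    return {name: new / old for name, (old, new) in sizes.items()}


ratios = size_ratios(WEIGHT_SIZES)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.4f}")
```

The ratios sit slightly above 0.5 rather than exactly at it, which would be consistent with a small amount of data in each file not being halved, although the diff itself does not say what that data is.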
original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_text_encoder.mlpackage/Data/com.apple.CoreML/model.mlmodel ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:36b3de7a2f96c0c39ac7d41a812772ce5b4c4afd16add7ffee23e416e81a22d0
+ size 160264
original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_text_encoder.mlpackage/Data/com.apple.CoreML/weights/weight.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6cca5606968298f28e3caaf87f17ff97929e6b19d512ecd988a6487252d13f3e
+ size 246145536
original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_text_encoder.mlpackage/Manifest.json ADDED
@@ -0,0 +1,18 @@
+ {
+ "fileFormatVersion": "1.0.0",
+ "itemInfoEntries": {
+ "0350E48E-A8C6-426E-BDD6-166A73298B13": {
+ "author": "com.apple.CoreML",
+ "description": "CoreML Model Weights",
+ "name": "weights",
+ "path": "com.apple.CoreML/weights"
+ },
+ "D2BA910D-6F75-4199-8750-1D4B0492A4F1": {
+ "author": "com.apple.CoreML",
+ "description": "CoreML Model Specification",
+ "name": "model.mlmodel",
+ "path": "com.apple.CoreML/model.mlmodel"
+ }
+ },
+ "rootModelIdentifier": "D2BA910D-6F75-4199-8750-1D4B0492A4F1"
+ }
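Each `.mlpackage` added here carries a `Manifest.json` that maps item identifiers to paths, with `rootModelIdentifier` naming the entry for the model specification. The sketch below resolves that entry to a path inside the package. It assumes the item paths are relative to the package's `Data` directory, as the file list in this commit suggests (`Data/com.apple.CoreML/model.mlmodel`); `root_model_path` is a hypothetical helper, not a Core ML API.

```python
import json
from pathlib import PurePosixPath

# Manifest for the text_encoder package, copied from the diff above.
MANIFEST_JSON = """
{
  "fileFormatVersion": "1.0.0",
  "itemInfoEntries": {
    "0350E48E-A8C6-426E-BDD6-166A73298B13": {
      "author": "com.apple.CoreML",
      "description": "CoreML Model Weights",
      "name": "weights",
      "path": "com.apple.CoreML/weights"
    },
    "D2BA910D-6F75-4199-8750-1D4B0492A4F1": {
      "author": "com.apple.CoreML",
      "description": "CoreML Model Specification",
      "name": "model.mlmodel",
      "path": "com.apple.CoreML/model.mlmodel"
    }
  },
  "rootModelIdentifier": "D2BA910D-6F75-4199-8750-1D4B0492A4F1"
}
"""


def root_model_path(manifest_text, data_dir="Data"):
    """Return the package-relative path of the root model named by the manifest."""
    manifest = json.loads(manifest_text)
    root_id = manifest["rootModelIdentifier"]
    entry = manifest["itemInfoEntries"][root_id]
    # Assumption: item paths are relative to the package's Data directory.
    return PurePosixPath(data_dir) / entry["path"]


print(root_model_path(MANIFEST_JSON))
```

Looking the root entry up by identifier rather than by position matters because the order of `itemInfoEntries` differs between the packages in this commit.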
original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_text_encoder_2.mlpackage/Data/com.apple.CoreML/model.mlmodel ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:528d719f21ac515b4a38b13c4d14045c408162cda304f98e1decacb81acbb069
+ size 418177
original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_text_encoder_2.mlpackage/Data/com.apple.CoreML/weights/weight.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7ef8c4f33384b8c36d50cd67651a109021140636446560a77e957bed9ac4b865
+ size 1389367424
original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_text_encoder_2.mlpackage/Manifest.json ADDED
@@ -0,0 +1,18 @@
+ {
+ "fileFormatVersion": "1.0.0",
+ "itemInfoEntries": {
+ "DCD7CD28-C7E6-41BF-9F4E-18E1D19CA36A": {
+ "author": "com.apple.CoreML",
+ "description": "CoreML Model Specification",
+ "name": "model.mlmodel",
+ "path": "com.apple.CoreML/model.mlmodel"
+ },
+ "F4A213FD-B395-4524-A121-16101CA91D5D": {
+ "author": "com.apple.CoreML",
+ "description": "CoreML Model Weights",
+ "name": "weights",
+ "path": "com.apple.CoreML/weights"
+ }
+ },
+ "rootModelIdentifier": "DCD7CD28-C7E6-41BF-9F4E-18E1D19CA36A"
+ }
original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_unet.mlpackage/Data/com.apple.CoreML/model.mlmodel ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:083d5f3088e9fa90ad7ad095af5670851ff4cbc2bb1d0ccffc84187d44c39760
+ size 2041217
original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_unet.mlpackage/Data/com.apple.CoreML/weights/weight.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a4ea70231f6ddd496d7bdf5a2342804a3379cc07858894e3b7d0405ce6ddac51
+ size 5135067072
original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_unet.mlpackage/Manifest.json ADDED
@@ -0,0 +1,18 @@
+ {
+     "fileFormatVersion": "1.0.0",
+     "itemInfoEntries": {
+         "276D98D0-72D2-43C4-9645-A5B77B1015F5": {
+             "author": "com.apple.CoreML",
+             "description": "CoreML Model Specification",
+             "name": "model.mlmodel",
+             "path": "com.apple.CoreML/model.mlmodel"
+         },
+         "8EEB7D0C-8CE2-48E2-AB89-6209C6DCB419": {
+             "author": "com.apple.CoreML",
+             "description": "CoreML Model Weights",
+             "name": "weights",
+             "path": "com.apple.CoreML/weights"
+         }
+     },
+     "rootModelIdentifier": "276D98D0-72D2-43C4-9645-A5B77B1015F5"
+ }
original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_vae_decoder.mlpackage/Data/com.apple.CoreML/model.mlmodel ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5002d25626e356a2a6322a2beb56b50b81a97ebb2852ee5fb891c0972db1366e
+ size 143856
original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_vae_decoder.mlpackage/Data/com.apple.CoreML/weights/weight.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:663fd963f7a13afa32b8c415944bd3652f3c249335d02e2f57ff02bc3ac447bc
+ size 197977216
original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_vae_decoder.mlpackage/Manifest.json ADDED
@@ -0,0 +1,18 @@
+ {
+     "fileFormatVersion": "1.0.0",
+     "itemInfoEntries": {
+         "71C0A52F-911E-4A1F-8AEB-41D269851E22": {
+             "author": "com.apple.CoreML",
+             "description": "CoreML Model Specification",
+             "name": "model.mlmodel",
+             "path": "com.apple.CoreML/model.mlmodel"
+         },
+         "A45AFD33-166B-41F4-ADD9-9457DB486071": {
+             "author": "com.apple.CoreML",
+             "description": "CoreML Model Weights",
+             "name": "weights",
+             "path": "com.apple.CoreML/weights"
+         }
+     },
+     "rootModelIdentifier": "71C0A52F-911E-4A1F-8AEB-41D269851E22"
+ }
original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_vae_encoder.mlpackage/Data/com.apple.CoreML/model.mlmodel ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cdc76cd1a7c13413fb720f20e3765a5750b297689cc73987d1312d734a0128df
+ size 119781
original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_vae_encoder.mlpackage/Data/com.apple.CoreML/weights/weight.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d7da2b2913ea54fd16b4e333eb32c0037c9638b2d1c7d435693ed37126407b94
+ size 68338112
original/packages/Stable_Diffusion_version_stabilityai_stable-diffusion-xl-base-0.9_vae_encoder.mlpackage/Manifest.json ADDED
@@ -0,0 +1,18 @@
+ {
+     "fileFormatVersion": "1.0.0",
+     "itemInfoEntries": {
+         "8663546A-C9EA-416B-9FB4-F20E32D69D13": {
+             "author": "com.apple.CoreML",
+             "description": "CoreML Model Weights",
+             "name": "weights",
+             "path": "com.apple.CoreML/weights"
+         },
+         "C73A6D8A-8FF2-4D7C-A47B-60917AD21914": {
+             "author": "com.apple.CoreML",
+             "description": "CoreML Model Specification",
+             "name": "model.mlmodel",
+             "path": "com.apple.CoreML/model.mlmodel"
+         }
+     },
+     "rootModelIdentifier": "C73A6D8A-8FF2-4D7C-A47B-60917AD21914"
+ }