Warlord-K committed · Commit f0ff74b · 1 Parent(s): 7df092c

Update README.md

Files changed (1)
  1. README.md +22 -22
README.md CHANGED
@@ -14,24 +14,24 @@ datasets:
  library_name: diffusers
  ---

- # SSD-Tiny Model Card
+ # Segmind-Vega Model Card


  ## 🔥🔥Join our [Discord](https://discord.gg/rF44ueRG) to give feedback on our models and get early access🔥🔥

  ## Demo

- Try out the SSD-Tiny model at [Segmind SSD-Tiny]() for ⚡ fastest inference. You can also explore it on [🤗 Spaces](https://huggingface.co/spaces/segmind/SSD-Tiny)
+ Try out the Segmind-Vega model at [Segmind-Vega]() for ⚡ fastest inference. You can also explore it on [🤗 Spaces](https://huggingface.co/spaces/segmind/Segmind-Vega)

  ## Model Description

- The SSD-Tiny Model is a distilled version of the Stable Diffusion XL (SDXL), offering a remarkable **70% reduction in size** and an impressive **80% speedup** while retaining high-quality text-to-image generation capabilities. Trained on diverse datasets, including Grit and Midjourney scrape data, it excels at creating a wide range of visual content based on textual prompts.
+ The Segmind-Vega Model is a distilled version of the Stable Diffusion XL (SDXL), offering a remarkable **70% reduction in size** and an impressive **100% speedup** while retaining high-quality text-to-image generation capabilities. Trained on diverse datasets, including Grit and Midjourney scrape data, it excels at creating a wide range of visual content based on textual prompts.

- Employing a knowledge distillation strategy, SSD-Tiny leverages the teachings of several expert models, including SDXL, ZavyChromaXL, and JuggernautXL, to combine their strengths and produce compelling visual outputs.
+ Employing a knowledge distillation strategy, Segmind-Vega leverages the teachings of several expert models, including SDXL, ZavyChromaXL, and JuggernautXL, to combine their strengths and produce compelling visual outputs.

  Special thanks to the HF team 🤗, especially [Sayak](https://huggingface.co/sayakpaul), [Patrick](https://github.com/patrickvonplaten), and [Poli](https://huggingface.co/multimodalart), for their collaboration and guidance on this work.

- ## Image Comparison (SDXL-1.0 vs SSD-Tiny)
+ ## Image Comparison (SDXL-1.0 vs Segmind-Vega)

  ## Usage:
  This model can be used via the 🧨 Diffusers library.
@@ -52,7 +52,7 @@ To use the model, you can run the following:
  from diffusers import StableDiffusionXLPipeline
  import torch

- pipe = StableDiffusionXLPipeline.from_pretrained("segmind/SSD-Tiny", torch_dtype=torch.float16, use_safetensors=True, variant="fp16")
+ pipe = StableDiffusionXLPipeline.from_pretrained("segmind/Segmind-Vega", torch_dtype=torch.float16, use_safetensors=True, variant="fp16")
  pipe.to("cuda")
  # if using torch < 2.0
  # pipe.enable_xformers_memory_efficient_attention()
@@ -73,17 +73,17 @@ image = pipe(prompt=prompt, negative_prompt=neg_prompt).images[0]

  ### Key Features

- - **Text-to-Image Generation:** The SSD-Tiny model excels at generating images from text prompts, enabling a wide range of creative applications.
+ - **Text-to-Image Generation:** The Segmind-Vega model excels at generating images from text prompts, enabling a wide range of creative applications.

- - **Distilled for Speed:** Designed for efficiency, this model offers an impressive 80% speedup, making it suitable for real-time applications and scenarios where rapid image generation is essential.
+ - **Distilled for Speed:** Designed for efficiency, this model offers an impressive 100% speedup, making it suitable for real-time applications and scenarios where rapid image generation is essential.

  - **Diverse Training Data:** Trained on diverse datasets, the model can handle a variety of textual prompts and generate corresponding images effectively.

- - **Knowledge Distillation:** By distilling knowledge from multiple expert models, the SSD-Tiny Model combines their strengths and minimizes their limitations, resulting in improved performance.
+ - **Knowledge Distillation:** By distilling knowledge from multiple expert models, the Segmind-Vega Model combines their strengths and minimizes their limitations, resulting in improved performance.

  ### Model Architecture

- The SSD-Tiny Model is a compact version with a remarkable 70% reduction in size compared to the Base SDXL Model.
+ The Segmind-Vega Model is a compact version with a remarkable 70% reduction in size compared to the Base SDXL Model.

  ### Training Info

@@ -98,20 +98,20 @@ These are the key hyperparameters used during training:

  ### Speed Comparison

- SSD-Tiny has demonstrated an impressive 80% speedup compared to the Base SDXL Model. Below is a comparison on an A100 80GB.
+ Segmind-Vega has demonstrated an impressive 100% speedup compared to the Base SDXL Model. Below is a comparison on an A100 80GB.

  Below are the speed-up metrics on an RTX 4090 GPU.


  ### Model Sources

- For research and development purposes, the SSD-Tiny Model can be accessed via the Segmind AI platform. For more information and access details, please visit [Segmind](https://www.segmind.com/models/ssd-tiny).
+ For research and development purposes, the Segmind-Vega Model can be accessed via the Segmind AI platform. For more information and access details, please visit [Segmind](https://www.segmind.com/models/Segmind-Vega).

  ## Uses

  ### Direct Use

- The SSD-Tiny Model is suitable for research and practical applications in various domains, including:
+ The Segmind-Vega Model is suitable for research and practical applications in various domains, including:

  - **Art and Design:** It can be used to generate artworks, designs, and other creative content, providing inspiration and enhancing the creative process.

@@ -125,11 +125,11 @@ The SSD-Tiny Model is suitable for research and practical applications in variou

  ### Downstream Use

- The SSD-Tiny Model can also be used directly with the 🧨 Diffusers library training scripts for further training, including:
+ The Segmind-Vega Model can also be used directly with the 🧨 Diffusers library training scripts for further training, including:

  - **[LoRA](https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image_lora_sdxl.py):**
  ```bash
- export MODEL_NAME="segmind/SSD-Tiny"
+ export MODEL_NAME="segmind/Segmind-Vega"
  export VAE_NAME="madebyollin/sdxl-vae-fp16-fix"
  export DATASET_NAME="lambdalabs/pokemon-blip-captions"

@@ -143,14 +143,14 @@ accelerate launch train_text_to_image_lora_sdxl.py \
  --learning_rate=1e-04 --lr_scheduler="constant" --lr_warmup_steps=0 \
  --mixed_precision="fp16" \
  --seed=42 \
- --output_dir="sd-pokemon-model-lora-tiny" \
+ --output_dir="vega-pokemon-model-lora" \
  --validation_prompt="cute dragon creature" --report_to="wandb" \
  --push_to_hub
  ```

  - **[Fine-Tune](https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image_sdxl.py):**
  ```bash
- export MODEL_NAME="segmind/SSD-Tiny"
+ export MODEL_NAME="segmind/Segmind-Vega"
  export VAE_NAME="madebyollin/sdxl-vae-fp16-fix"
  export DATASET_NAME="lambdalabs/pokemon-blip-captions"

@@ -170,15 +170,15 @@ accelerate launch train_text_to_image_sdxl.py \
  --report_to="wandb" \
  --validation_prompt="a cute Sundar Pichai creature" --validation_epochs 5 \
  --checkpointing_steps=5000 \
- --output_dir="ssd-pokemon-model-tiny" \
+ --output_dir="vega-pokemon-model" \
  --push_to_hub
  ```

  - **[Dreambooth LoRA](https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_lora_sdxl.py):**
  ```bash
- export MODEL_NAME="segmind/SSD-Tiny"
+ export MODEL_NAME="segmind/Segmind-Vega"
  export INSTANCE_DIR="dog"
- export OUTPUT_DIR="lora-trained-tiny"
+ export OUTPUT_DIR="lora-trained-vega"
  export VAE_PATH="madebyollin/sdxl-vae-fp16-fix"

  accelerate launch train_dreambooth_lora_sdxl.py \
@@ -204,9 +204,9 @@ accelerate launch train_dreambooth_lora_sdxl.py \

  ### Out-of-Scope Use

- The SSD-Tiny Model is not suitable for creating factual or accurate representations of people, events, or real-world information. It is not intended for tasks requiring high precision and accuracy.
+ The Segmind-Vega Model is not suitable for creating factual or accurate representations of people, events, or real-world information. It is not intended for tasks requiring high precision and accuracy.

  ## Limitations and Bias

  **Limitations & Bias:**
- The SSD-Tiny Model faces challenges in achieving absolute photorealism, especially in human depictions. While it may encounter difficulties in incorporating clear text and maintaining the fidelity of complex compositions due to its autoencoding approach, these challenges present opportunities for future enhancements. Importantly, the model's exposure to a diverse dataset, though not a cure-all for ingrained societal and digital biases, represents a foundational step toward more equitable technology. Users are encouraged to interact with this pioneering tool with an understanding of its current limitations, fostering an environment of conscious engagement and anticipation for its continued evolution.
+ The Segmind-Vega Model faces challenges in achieving absolute photorealism, especially in human depictions. While it may encounter difficulties in incorporating clear text and maintaining the fidelity of complex compositions due to its autoencoding approach, these challenges present opportunities for future enhancements. Importantly, the model's exposure to a diverse dataset, though not a cure-all for ingrained societal and digital biases, represents a foundational step toward more equitable technology. Users are encouraged to interact with this pioneering tool with an understanding of its current limitations, fostering an environment of conscious engagement and anticipation for its continued evolution.
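
Besides the rename, the one numeric change in this commit is the advertised speedup, which goes from 80% to 100%. The README does not define the term, so the sketch below assumes that "N% speedup" means N% more images per unit time than base SDXL; under the other common reading (N% less time per image), a 100% figure would be meaningless. The helper names `throughput_ratio` and `relative_latency` are hypothetical and only illustrate the arithmetic.

```python
def throughput_ratio(speedup_percent: float) -> float:
    """Images per second relative to the baseline (1.0 means same speed).

    Assumption: 'N% speedup' is read as N% higher throughput than base SDXL.
    """
    return 1.0 + speedup_percent / 100.0


def relative_latency(speedup_percent: float) -> float:
    """Time per image relative to the baseline (1.0 means same time)."""
    return 1.0 / throughput_ratio(speedup_percent)


# Compare the figure removed by the commit with the figure it adds.
for label, pct in [("old README (SSD-Tiny)", 80.0), ("new README (Segmind-Vega)", 100.0)]:
    print(
        f"{label}: {pct:.0f}% speedup -> "
        f"{throughput_ratio(pct):.1f}x throughput, "
        f"{relative_latency(pct):.2f}x time per image"
    )
```

Under this reading, the new figure corresponds to roughly half the per-image time of base SDXL, while the old figure corresponded to a little more than half.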