Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)

bloom-560m-finetuned-sd-prompts - bnb 4bits
- Model creator: https://huggingface.co/mrm8488/
- Original model: https://huggingface.co/mrm8488/bloom-560m-finetuned-sd-prompts/
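
For reference, here is a minimal sketch of loading the original checkpoint in 4-bit with bitsandbytes through `transformers`. It assumes the `bitsandbytes` package and a CUDA GPU are available; the NF4/fp16 settings are common defaults, not taken from this upload:

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

ckpt = 'mrm8488/bloom-560m-finetuned-sd-prompts'

# Quantize to 4-bit on load; NF4 with fp16 compute is a common choice.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(
    ckpt,
    quantization_config=bnb_config,
    device_map='auto',
)
```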

Original model description:
---
license: bigscience-bloom-rail-1.0
tags:
- generated_from_trainer
- stable-diffusion
- diffusion
model-index:
- name: bloom-560m-finetuned-sd-prompts
  results: []

datasets:
- Gustavosta/Stable-Diffusion-Prompts

widget:
- text: "<s>Prompt: young, curly haired, redhead Natalie Portman as a"
- text: "<s>Prompt: a powerful energy woman, by alexander fedosav"

inference:
  parameters:
    eos_token_id: 2
    max_length: 128

---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# bloom-560m-finetuned-sd-prompts

This model is a fine-tuned version of [bigscience/bloom-560m](https://huggingface.co/bigscience/bloom-560m) on the [Gustavosta/Stable-Diffusion-Prompts](https://huggingface.co/datasets/Gustavosta/Stable-Diffusion-Prompts) dataset.
It achieves the following results on the evaluation set:
- Loss: 0.8742

## Example of usage

```py
import torch
from transformers import BloomTokenizerFast, BloomForCausalLM

device = 'cuda' if torch.cuda.is_available() else 'cpu'
ckpt = 'mrm8488/bloom-560m-finetuned-sd-prompts'

tokenizer = BloomTokenizerFast.from_pretrained(ckpt)
model = BloomForCausalLM.from_pretrained(ckpt).to(device)

def generate_prompt(text):
    inputs = tokenizer(text, return_tensors='pt')
    input_ids = inputs.input_ids.to(device)
    attention_mask = inputs.attention_mask.to(device)
    # Greedy decoding with a mild repetition penalty; generation stops at the </s> token.
    output = model.generate(
        input_ids,
        attention_mask=attention_mask,
        repetition_penalty=1.05,
        max_length=2048,
        eos_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output[0], skip_special_tokens=False)

text = "<s>Prompt: pikachu dinning in the eiffel tower"

generate_prompt(text)

# Output: <s>Prompt: pikachu dinning in the eiffel tower, intricate, elegant, highly detailed, digital painting, artstation, concept art, smooth, sharp focus, illustration, art by artgerm and greg rutkowski and alphonse mucha</s>
```

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
- mixed_precision_training: Native AMP
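
These settings translate roughly into `transformers.TrainingArguments` as sketched below. This is a hypothetical reconstruction, not the author's actual training script; `output_dir` is a placeholder, and the Adam betas/epsilon listed above are the Trainer defaults:

```py
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir='bloom-560m-finetuned-sd-prompts',  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=4,   # effective train batch size: 1 x 4 = 4
    lr_scheduler_type='linear',
    num_train_epochs=2,
    fp16=True,                       # Native AMP mixed precision
)
```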

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.6743        | 0.17  | 100  | 2.0891          |
| 1.8919        | 0.33  | 200  | 1.7191          |
| 1.5907        | 0.5   | 300  | 1.4454          |
| 1.3865        | 0.67  | 400  | 1.3247          |
| 1.2487        | 0.83  | 500  | 1.2150          |
| 1.1565        | 1.0   | 600  | 1.1031          |
| 0.896         | 1.17  | 700  | 1.0612          |
| 0.8389        | 1.33  | 800  | 0.9994          |
| 0.8071        | 1.5   | 900  | 0.9530          |
| 0.7628        | 1.67  | 1000 | 0.9206          |
| 0.7423        | 1.83  | 1100 | 0.8883          |
| 0.7155        | 2.0   | 1200 | 0.8742          |

### Framework versions

- Transformers 4.22.1
- Pytorch 1.12.1+cu113
- Datasets 2.5.1
- Tokenizers 0.12.1