Delta-Vector committed · verified · Commit 2add0ff · Parent(s): 5cbbed4

Create README.md

---
license: agpl-3.0
language:
- en
tags:
- chat
datasets:
- NewEden/CivitAI-SD-Prompts
base_model: NewEden/Qwen-1.5B-Claude
pipeline_tag: text-generation
---

This is the first in a line of models dedicated to creating Stable Diffusion prompts from a character-appearance description. It has been fine-tuned on top of [NewEden/Qwen-1.5B-Claude](https://huggingface.co/NewEden/Qwen-1.5B-Claude).

## Prompting

The model has been tuned with Alpaca formatting. A typical input looks like this:
```
### Instruction:
Create a prompt for Stable Diffusion based on the information below.
### Input:
Rae has short dark brown hair and brown eyes. She is commonly seen wearing her Royal Academy uniform, which consists of a red jacket with gold lines, a white ruffled necktie, a red bow tie with an attached blue gem, and a long black skirt with white lines. Along with her uniform, she wears black leggings and brown shoes.
### Response:
```

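As a convenience, a minimal Python helper along these lines (the function name is my own, not part of this repo) can assemble that format:

```python
def build_sd_prompt(instruction: str, character_description: str) -> str:
    """Assemble the Alpaca-style prompt this model was tuned on."""
    return (
        "### Instruction:\n"
        f"{instruction}\n"
        "### Input:\n"
        f"{character_description}\n"
        "### Response:\n"
    )
```
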
## System Prompting

I would highly recommend using the following system prompt for this model.

```
Create a prompt for Stable Diffusion based on the information below.
```

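Putting the two together, an end-to-end sketch with `transformers` might look like the following. It reuses the `build_sd_prompt` helper above; the repo id is a placeholder, since this card does not state the model's own id, so swap in the correct one:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder id: substitute this model's actual Hugging Face repo id.
model_id = "NewEden/Qwen-1.5B-Claude"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = build_sd_prompt(
    "Create a prompt for Stable Diffusion based on the information below.",
    "Rae has short dark brown hair and brown eyes. She is commonly seen "
    "wearing her Royal Academy uniform.",
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens (the Stable Diffusion prompt).
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
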
## Axolotl Config

<details><summary>See Axolotl Trainer config</summary>

```yaml
base_model: NewEden/Qwen-1.5B-Claude
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

trust_remote_code: true

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: civit-slop-combined.jsonl
    type: alpaca
    conversation: mpt-30b-instruct

chat_template: alpaca

dataset_prepared_path:
val_set_size: 0.05
output_dir: ./outputs/sd-prompter
sequence_len: 2048
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

adapter:
lora_model_dir:
lora_r:
lora_alpha:
lora_dropout:
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project: SDprompt-qwen
wandb_entity:
wandb_watch:
wandb_name: qwen1.5b-2
wandb_log_model:

gradient_accumulation_steps: 64
micro_batch_size: 2
num_epochs: 3
optimizer: adamw_torch
lr_scheduler: cosine
learning_rate: 0.00002

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: true

gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_ratio: 0.05
evals_per_epoch: 4
saves_per_epoch: 1
debug:
#deepspeed: deepspeed_configs/zero2.json
#deepspeed: /training/axolotl/axolotl/deepspeed_configs/zero2.json
weight_decay: 0.0
#fsdp:
#fsdp_config:
# fsdp_limit_all_gathers: true
# fsdp_sync_module_states: true
# fsdp_offload_params: true
# fsdp_use_orig_params: false
# fsdp_cpu_ram_efficient_loading: true
# fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
# fsdp_transformer_layer_cls_to_wrap: Qwen2DecoderLayer
# fsdp_state_dict_type: FULL_STATE_DICT
special_tokens:
```
</details><br>

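For reference, the effective global batch size implied by this config is `micro_batch_size × gradient_accumulation_steps × number of GPUs`; a quick sanity check, assuming the two-GPU setup described under Training below:

```python
# Effective global batch size implied by the Axolotl config above.
micro_batch_size = 2
gradient_accumulation_steps = 64
num_gpus = 2  # assumption: the 2x RTX 6000 setup mentioned in the Training section

effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch_size)  # 256 sequences (each packed up to sequence_len=2048) per optimizer step
```
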
## Credits

Thank you to [Kubernetes Bad](https://huggingface.co/kubernetes-bad), [Lucy Knada](https://huggingface.co/lucyknada), [CelineDion](https://huggingface.co/CelineDion), [Intervitens](https://huggingface.co/intervitens), [Kalomaze](https://huggingface.co/kalomaze), and the rest of [Anthracite](https://huggingface.co/anthracite-org) (but not Alpin).

## Training

The training was done for 2 epochs. I used two [RTX 6000](https://www.nvidia.com/en-us/design-visualization/rtx-6000/) GPUs graciously provided by [Kubernetes Bad](https://huggingface.co/kubernetes-bad) for the full-parameter fine-tuning of the model.