---
license: agpl-3.0
language:
- en
pipeline_tag: text-generation
base_model: NewEden/Qwen-1.5B-Claude
tags:
- chat
datasets:
- NewEden/CivitAI-SD-Prompts
---

---
### exl2 quant (measurement.json in main branch)
---
### Check revisions for quants
---

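If you are after one of the quants, here is a hypothetical sketch for pulling a specific revision with `huggingface_hub`; both the repo id and the revision name below are placeholders, so check this repo's branch list for the real ones:

```python
# Hypothetical sketch: download a quant revision with huggingface_hub.
# The repo id and revision below are placeholders, not values from this card.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="path/to/this-model",  # placeholder: this model's repo id
    revision="6.0bpw",             # placeholder: pick a quant branch from the repo
)
print(local_dir)
```
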
This is the first in a line of models dedicated to creating Stable Diffusion prompts from a description of a character's appearance. It was fine-tuned on top of
[NewEden/Qwen-1.5B-Claude](https://huggingface.co/NewEden/Qwen-1.5B-Claude).

## Prompting

The model was tuned with the Alpaca format. A typical input looks like this:
```
### Instruction:
Create a prompt for Stable Diffusion based on the information below.
### Input:
Rae has short dark brown hair and brown eyes. She is commonly seen wearing her Royal Academy uniform, which consists of a red jacket with gold lines, a white ruffled necktie, a red bow tie with an attached blue gem, and a long black skirt with white lines. Along with her uniform, she wears black leggings and brown shoes.
### Response:
```

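For reference, a minimal generation sketch using this prompt format with the `transformers` library. The `MODEL_ID`, dtype, and sampling settings are assumptions for illustration, not values from this card:

```python
# Minimal generation sketch for the Alpaca format above, using transformers.
# MODEL_ID is a placeholder for this model's repo id; dtype and sampling
# settings are illustrative defaults, not recommendations from the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "path/to/this-model"  # placeholder: substitute the actual repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

appearance = "Rae has short dark brown hair and brown eyes. She is commonly seen wearing her Royal Academy uniform."

prompt = (
    "### Instruction:\n"
    "Create a prompt for Stable Diffusion based on the information below.\n"
    "### Input:\n"
    f"{appearance}\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens (the Stable Diffusion prompt).
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
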
## System Prompting

I would highly recommend using the following system prompt for this model:

```
Create a prompt for Stable Diffusion based on the information below.
```

## Axolotl Config

<details><summary>See Axolotl Trainer config</summary>

```yaml
base_model: NewEden/Qwen-1.5B-Claude
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

trust_remote_code: true

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: civit-slop-combined.jsonl
    type: alpaca
    conversation: mpt-30b-instruct

chat_template: alpaca

dataset_prepared_path:
val_set_size: 0.05
output_dir: ./outputs/sd-prompter
sequence_len: 2048
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

adapter:
lora_model_dir:
lora_r:
lora_alpha:
lora_dropout:
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project: SDprompt-qwen
wandb_entity:
wandb_watch:
wandb_name: qwen1.5b-2
wandb_log_model:

gradient_accumulation_steps: 64
micro_batch_size: 2
num_epochs: 3
optimizer: adamw_torch
lr_scheduler: cosine
learning_rate: 0.00002

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: true

gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_ratio: 0.05
evals_per_epoch: 4
saves_per_epoch: 1
debug:
#deepspeed: deepspeed_configs/zero2.json
#deepspeed: /training/axolotl/axolotl/deepspeed_configs/zero2.json
weight_decay: 0.0
#fsdp:
#fsdp_config:
#  fsdp_limit_all_gathers: true
#  fsdp_sync_module_states: true
#  fsdp_offload_params: true
#  fsdp_use_orig_params: false
#  fsdp_cpu_ram_efficient_loading: true
#  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
#  fsdp_transformer_layer_cls_to_wrap: Qwen2DecoderLayer
#  fsdp_state_dict_type: FULL_STATE_DICT
special_tokens:
```
</details><br>
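As a sanity check on the numbers above, the effective batch size this config implies can be worked out directly; a sketch, where `num_gpus` assumes the two-GPU setup described under Training below:

```python
# Effective batch arithmetic implied by the config above.
# num_gpus is an assumption based on the two-GPU setup in the Training section.
gradient_accumulation_steps = 64
micro_batch_size = 2
num_gpus = 2
sequence_len = 2048

sequences_per_step = gradient_accumulation_steps * micro_batch_size * num_gpus
tokens_per_step = sequences_per_step * sequence_len  # upper bound; sample packing fills toward this

print(sequences_per_step)  # 256
print(tokens_per_step)     # 524288
```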

## Credits

Thank you to [Kubernetes Bad](https://huggingface.co/kubernetes-bad).

## Training

The training was done for 2 epochs. I used two [RTX 6000](https://www.nvidia.com/en-us/design-visualization/rtx-6000/) GPUs graciously provided by [Kubernetes Bad](https://huggingface.co/kubernetes-bad) for the full-parameter fine-tuning of the model.