Delta-Vector
/

SD-Prompter-1.5B-V0.1-EXL2

chat

Delta-Vector committed 25 days ago

Commit

c1f00ec

•

1 Parent(s): 9f357ae

Create README.md

Files changed (1)

README.md +137 -0

	@@ -0,0 +1,137 @@

+---
+license: agpl-3.0
+language:
+- en
+pipeline_tag: text-generation
+base_model: NewEden/Qwen-1.5B-Claude
+datasets:
+- NewEden/CivitAI-SD-Prompts
+tags:
+- chat
+---
+---
+### exl2 quant (measurement.json in main branch)
+---
+### check revisions for quants
+---
+This is the first in a line of models dedicated to creating Stable Diffusion prompts from a description of a character's appearance. It has been fine-tuned on top of
+[NewEden/Qwen-1.5B-Claude](https://huggingface.co/NewEden/Qwen-1.5B-Claude).
+## Prompting
+The model has been tuned with the Alpaca format. A typical input looks like this:
+```
+### Instruction:
+Create a prompt for Stable Diffusion based on the information below.
+### Input:
+Rae has short dark brown hair and brown eyes. She is commonly seen wearing her Royal Academy uniform, which consists of a red jacket with gold lines, a white ruffled necktie, a red bow tie with an attached blue gem, and a long black skirt with white lines. Along with her uniform, she wears black leggings and brown shoes.
+### Response:
+```
+## System Prompting
+I would highly recommend using the following system prompt for this model.
+```
+Create a prompt for Stable Diffusion based on the information below.
+```
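The Alpaca layout and the recommended instruction text above can be assembled programmatically. The following is a minimal sketch, not the author's own tooling: the names `INSTRUCTION` and `build_prompt` are hypothetical, and the section spacing (no blank lines between headers) is an assumption copied from the sample input shown in the README.

```python
# Instruction text copied from the README's recommended system prompt.
INSTRUCTION = "Create a prompt for Stable Diffusion based on the information below."


def build_prompt(appearance: str, instruction: str = INSTRUCTION) -> str:
    """Return an Alpaca-style prompt that ends with an open Response header.

    Assumption: sections are separated by single newlines, mirroring the
    sample input in the README rather than any official template.
    """
    return (
        f"### Instruction:\n{instruction}\n"
        f"### Input:\n{appearance.strip()}\n"
        "### Response:\n"
    )


if __name__ == "__main__":
    # Hypothetical character description, shortened from the README sample.
    print(build_prompt("Rae has short dark brown hair and brown eyes."))
```

The returned string would be passed as the raw prompt to whichever backend loads the quant; the model's completion after `### Response:` is the Stable Diffusion prompt.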
+## Axolotl Config
+<details><summary>See Axolotl Trainer config</summary>
+
+```yaml
+base_model: NewEden/Qwen-1.5B-Claude
+model_type: AutoModelForCausalLM
+tokenizer_type: AutoTokenizer
+trust_remote_code: true
+load_in_8bit: false
+load_in_4bit: false
+strict: false
+datasets:
+  - path: civit-slop-combined.jsonl
+    type: alpaca
+    conversation: mpt-30b-instruct
+chat_template: alpaca
+dataset_prepared_path:
+val_set_size: 0.05
+output_dir: ./outputs/sd-prompter
+sequence_len: 2048
+sample_packing: true
+eval_sample_packing: false
+pad_to_sequence_len: true
+adapter:
+lora_model_dir:
+lora_r:
+lora_alpha:
+lora_dropout:
+lora_target_linear: true
+lora_fan_in_fan_out:
+wandb_project: SDprompt-qwen
+wandb_entity:
+wandb_watch:
+wandb_name: qwen1.5b-2
+wandb_log_model:
+gradient_accumulation_steps: 64
+micro_batch_size: 2
+num_epochs: 3
+optimizer: adamw_torch
+lr_scheduler: cosine
+learning_rate: 0.00002
+train_on_inputs: false
+group_by_length: false
+bf16: auto
+fp16:
+tf32: true
+gradient_checkpointing: true
+gradient_checkpointing_kwargs:
+  use_reentrant: false
+early_stopping_patience:
+resume_from_checkpoint:
+local_rank:
+logging_steps: 1
+xformers_attention:
+flash_attention: true
+warmup_ratio: 0.05
+evals_per_epoch: 4
+saves_per_epoch: 1
+debug:
+#deepspeed: deepspeed_configs/zero2.json
+#deepspeed: /training/axolotl/axolotl/deepspeed_configs/zero2.json
+weight_decay: 0.0
+#fsdp:
+#fsdp_config:
+#  fsdp_limit_all_gathers: true
+#  fsdp_sync_module_states: true
+#  fsdp_offload_params: true
+#  fsdp_use_orig_params: false
+#  fsdp_cpu_ram_efficient_loading: true
+#  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
+#  fsdp_transformer_layer_cls_to_wrap: Qwen2DecoderLayer
+#  fsdp_state_dict_type: FULL_STATE_DICT
+special_tokens:
+```
+</details><br>
+## Credits
+Thank you to [Kubernetes Bad](https://huggingface.co/kubernetes-bad).
+## Training
+The model was trained for 2 epochs. I used 2 x [RTX 6000](https://www.nvidia.com/en-us/design-visualization/rtx-6000/) GPUs, graciously provided by [Kubernetes Bad](https://huggingface.co/kubernetes-bad), for the full-parameter fine-tuning of the model.
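As a rough illustration of what the config values imply, the sketch below computes the effective batch size from `micro_batch_size: 2`, `gradient_accumulation_steps: 64`, and the two GPUs mentioned above. It assumes plain data parallelism across both GPUs, which the README does not state explicitly; the function name `effective_batch_size` is hypothetical.

```python
# Values copied from the Axolotl config and the training note.
MICRO_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 64
NUM_GPUS = 2          # assumption: both GPUs act as data-parallel workers
SEQUENCE_LEN = 2048   # sequences are packed and padded to this length


def effective_batch_size(micro: int, accum: int, gpus: int) -> int:
    """Number of packed sequences consumed per optimizer step."""
    return micro * accum * gpus


sequences_per_step = effective_batch_size(
    MICRO_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_GPUS
)
# Upper bound on tokens per step, since every sequence is padded to SEQUENCE_LEN.
tokens_per_step = sequences_per_step * SEQUENCE_LEN

if __name__ == "__main__":
    print(f"sequences per optimizer step: {sequences_per_step}")
    print(f"tokens per optimizer step (upper bound): {tokens_per_step}")
```

Under these assumptions each optimizer step covers 256 packed sequences, or at most 524,288 tokens.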