Fischerboot committed
Commit: bc065e0
Parent: 8bcede7

Upload README.md

Files changed (1): README.md (+148, -3)
README.md CHANGED
@@ -1,3 +1,148 @@
- ---
- license: apache-2.0
- ---
+ ---
+ base_model:
+ - concedo/KobbleTinyV2-1.1B
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+
+ ---
+ # Tinyllama-2B
+
+ This is a merge and a finetune that creates a small but very usable model, and I have to say, it is very good.
+
+ ## Basic Question:
+
+ <img src="https://huggingface.co/Aculi/Tinyllama-2B/resolve/main/.huggingface/Screenshot%202024-07-29%20073647.jpg" alt="download.png" width="800" />
+
+ ## Prompt Template
+
+ Tinyllama-2B uses Alpaca:
+
+ ```
+ ### Instruction:
+ {prompt}
+
+ ### Response:
+ ```
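
For anyone who wants to try the template quickly, here is a minimal inference sketch using the transformers library. It assumes the model is published under the Aculi/Tinyllama-2B repo id referenced by the image above; the example instruction and the sampling settings are illustrative, not taken from the card.

```python
# Minimal sketch: generate with the Alpaca prompt format described above.
# Assumes the model lives at "Aculi/Tinyllama-2B" (the repo referenced by the
# image URL in this card); sampling settings are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Aculi/Tinyllama-2B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

prompt = "### Instruction:\nName three uses for a paperclip.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
# Strip the prompt tokens so only the model's reply is printed.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```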
+
+ ## Merge Info:
+
+ This is a frankenmerge of [concedo/KobbleTinyV2-1.1B](https://huggingface.co/concedo/KobbleTinyV2-1.1B).
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ dtype: bfloat16
+ merge_method: passthrough
+ slices:
+ - sources:
+   - layer_range: [0, 16]
+     model: concedo/KobbleTinyV2-1.1B
+ - sources:
+   - layer_range: [5, 16]
+     model: concedo/KobbleTinyV2-1.1B
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.0
+       - filter: down_proj
+         value: 0.0
+       - value: 1.0
+ - sources:
+   - layer_range: [5, 16]
+     model: concedo/KobbleTinyV2-1.1B
+     parameters:
+       scale:
+       - filter: o_proj
+         value: 0.0
+       - filter: down_proj
+         value: 0.0
+       - value: 1.0
+ - sources:
+   - layer_range: [16, 22]
+     model: concedo/KobbleTinyV2-1.1B
+ ```
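
A config like the one above is normally handed to mergekit. The sketch below is an assumption-laden illustration, not part of the original commit: it presumes mergekit's Python entry points (MergeConfiguration, run_merge, MergeOptions), that the YAML is saved locally as merge-config.yml, and an arbitrary output directory; the mergekit-yaml command-line tool is the usual one-step alternative.

```python
# Sketch of running the passthrough merge above with mergekit's Python API.
# Assumes mergekit is installed and the YAML above is saved as merge-config.yml;
# exact option names may differ between mergekit versions.
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("merge-config.yml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./tinyllama-2b-merged",  # output directory for the merged weights
    options=MergeOptions(
        cuda=False,           # set True to merge on GPU
        copy_tokenizer=True,  # copy the source model's tokenizer into the output
        lazy_unpickle=True,   # lower peak memory while loading shards
    ),
)
```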
+
+ ## Finetune Info:
+
+ The following YAML configuration was used to finetune this model:
+
+ ```yaml
+ base_model: Fischerboot/2b-tiny-llama-alpaca-instr
+ model_type: LlamaForCausalLM
+ tokenizer_type: LlamaTokenizer
+
+ load_in_8bit: false
+ load_in_4bit: true
+ strict: false
+
+ datasets:
+   - path: Fischerboot/freedom-rp-alpaca-shortend
+     type: alpaca
+   - path: diffnamehard/toxic-dpo-v0.1-NoWarning-alpaca
+     type: alpaca
+   - path: Fischerboot/alpaca-undensored-fixed-50k
+     type: alpaca
+   - path: Fischerboot/DAN-alpaca
+     type: alpaca
+   - path: Fischerboot/rp-alpaca-next-oone
+     type: alpaca
+
+ dataset_prepared_path:
+ val_set_size: 0.05
+ output_dir: ./outputs/24r
+
+ adapter: qlora
+ lora_model_dir:
+
+ sequence_len: 2048
+ sample_packing: true
+ eval_sample_packing: false
+ pad_to_sequence_len: true
+
+ lora_r: 32
+ lora_alpha: 16
+ lora_dropout: 0.05
+ lora_target_modules:
+ lora_target_linear: true
+ lora_fan_in_fan_out:
+
+ wandb_project:
+ wandb_entity:
+ wandb_watch:
+ wandb_name:
+ wandb_log_model:
+
+ gradient_accumulation_steps: 4
+ micro_batch_size: 2
+ num_epochs: 4
+ optimizer: paged_adamw_32bit
+ lr_scheduler: cosine
+ learning_rate: 0.0002
+
+ train_on_inputs: false
+ group_by_length: false
+ bf16: auto
+ fp16:
+ tf32: false
+
+ gradient_checkpointing: true
+ early_stopping_patience:
+ resume_from_checkpoint:
+ local_rank:
+ logging_steps: 1
+ xformers_attention: true
+ flash_attention: true
+
+ warmup_steps: 10
+ evals_per_epoch: 2
+ saves_per_epoch: 1
+ debug:
+ deepspeed:
+ weight_decay: 0.0
+ fsdp:
+ fsdp_config:
+ special_tokens:
+ ```
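
Since this run trains a QLoRA adapter (adapter: qlora) rather than full weights, the adapter would typically be merged back into the base model before publishing. The sketch below is a hypothetical post-processing step, not something stated in the card: it assumes the trained adapter was saved to ./outputs/24r (the output_dir above) and uses an illustrative output directory name.

```python
# Sketch: fold the trained QLoRA adapter back into the base model's weights.
# Assumes the adapter produced by the training run above was saved to ./outputs/24r.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Fischerboot/2b-tiny-llama-alpaca-instr"  # base_model from the config
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Attach the LoRA adapter, then merge its deltas into the base weights.
model = PeftModel.from_pretrained(base, "./outputs/24r")
model = model.merge_and_unload()

# Save a standalone, adapter-free checkpoint.
model.save_pretrained("./tinyllama-2b-finetuned")
tokenizer.save_pretrained("./tinyllama-2b-finetuned")
```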