chameleon-lizard committed (verified)
Commit 9970ed7 · Parent(s): 7e48234

Update README.md

Files changed (1):
  1. README.md +28 -1

README.md CHANGED
@@ -13,7 +13,29 @@ A continued pretrained version of unsloth/Qwen2.5-7B model using unsloth's low r

  For pretraining, posts from [SubMaroon/DTF_comments_Responses_Counts](https://huggingface.co/datasets/SubMaroon/DTF_Comments_Responses_Counts) were selected, deduplicated with a simple `df.unique` and filtered by token length (1000 < x < 128000).

- Hyperparameters:
+ LoRA hyperparameters:
+
+ ```
+ r=32
+ target_modules=[
+     "q_proj",
+     "k_proj",
+     "v_proj",
+     "o_proj",
+     "gate_proj",
+     "up_proj",
+     "down_proj",
+ ]
+ lora_alpha=16
+ lora_dropout=0
+ bias="none"
+ use_gradient_checkpointing='unsloth'
+ use_rslora=True
+ random_state=42
+ ```
+
+ Training hyperparameters:

  ```
  num_train_epochs=2
@@ -29,6 +51,11 @@ packing=True,
  seed=42
  ```

+ Training time:
+
+ - NVIDIA Tesla A100 80GB: ~8.5 hours
+ - NVIDIA RTX 3090 Ti: ~33.5 hours
+
  [Wandb](https://wandb.ai/a_okshus/DTF_comments/runs/fr5hfq6g?nw=nwusera_okshus)

  [GitHub: TODO]()
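
For reference, the dedup-and-filter step described in the diff could look roughly like the sketch below; the polars frame, the `train` split, the `text` column name, and counting tokens with the base model's tokenizer are assumptions, not details taken from the commit:

```python
# Sketch of the preprocessing step: load the posts, drop exact
# duplicates, and keep texts whose token length x satisfies
# 1000 < x < 128000. Split, column name, and tokenizer are assumed.
import polars as pl
from datasets import load_dataset
from transformers import AutoTokenizer

ds = load_dataset("SubMaroon/DTF_Comments_Responses_Counts", split="train")
df = pl.from_pandas(ds.to_pandas()).unique()  # the `df.unique` dedup step

tok = AutoTokenizer.from_pretrained("unsloth/Qwen2.5-7B")

def n_tokens(text: str) -> int:
    # Length of the text in tokens under the base model's tokenizer.
    return len(tok(text).input_ids)

df = df.filter(
    pl.col("text")
    .map_elements(n_tokens, return_dtype=pl.Int64)
    .is_between(1000, 128000, closed="none")  # strict 1000 < x < 128000
)
```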
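The LoRA hyperparameters in the diff read like keyword arguments to unsloth's `FastLanguageModel.get_peft_model`. A minimal sketch under that assumption; `max_seq_length` and `load_in_4bit` are likewise assumed, not stated in the commit:

```python
# Sketch: the diff's LoRA hyperparameters wired into unsloth.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-7B",
    max_seq_length=128000,  # assumed to match the 128k token filter cap
    load_in_4bit=True,      # assumed
)

model = FastLanguageModel.get_peft_model(
    model,
    r=32,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
    use_rslora=True,
    random_state=42,
)
```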
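The visible training hyperparameters (`num_train_epochs=2`, `packing=True`, `seed=42`) would plug into unsloth's continued-pretraining recipe roughly as below. The `UnslothTrainer` usage, the `output_dir`, and the `dataset` variable are assumptions, and the hyperparameters elided between the diff's two hunks are deliberately not reconstructed:

```python
# Sketch: the visible training hyperparameters passed to unsloth's
# continued-pretraining trainer. Everything not shown in the diff
# (dataset wiring, output_dir, text column) is an assumption.
from unsloth import UnslothTrainer, UnslothTrainingArguments

trainer = UnslothTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,       # e.g. the filtered posts from above
    dataset_text_field="text",   # assumed column name
    packing=True,
    args=UnslothTrainingArguments(
        num_train_epochs=2,
        seed=42,
        output_dir="outputs",    # assumed
    ),
)
trainer.train()
```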