vietphuon committed on
Commit
a33157e
1 Parent(s): 82d35f3

Update README.md

Files changed (1)
  1. README.md +99 -0
README.md CHANGED
@@ -10,6 +10,105 @@ tags:
  - llama
  - trl
  ---
+ DATASET
+ ------------------------------
+ - **What's new?:** Uses version 3.2 of the dataset (Langfuse + AWS), which has better quality:
+   - Removed all 10- and 15-question samples; only the 5-question count is kept (a filtering sketch follows this list)
+   - Fixed all the Vietnamese quizzes (made sure the output is actually in Vietnamese)
+   - Fixed some lazily duplicated topics (Biglead, Computing)
+   - Removed the Paragraph type, replacing it with MCQ for all data points
+ - Trained using the default training config (60 steps, linear LR)
+
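The cleanup steps above could be scripted along these lines with the `datasets` library. This is only a hedged sketch: the file name and the field names (`question_count`, `question_type`, `language`) are hypothetical placeholders, and the real v3.2 pipeline (Langfuse + AWS) may look quite different.

```python
from datasets import load_dataset

# Hypothetical quiz dataset; the path and column names are placeholders.
ds = load_dataset("json", data_files="quiz_dataset_v3_2.jsonl", split="train")

# Keep only the 5-question samples (the 10- and 15-question variants are dropped).
ds = ds.filter(lambda ex: ex["question_count"] == 5)

# Keep only MCQ data points (in the real pipeline, Paragraph items were
# replaced with MCQ rather than simply dropped).
ds = ds.filter(lambda ex: ex["question_type"] == "mcq")

# Keep only samples whose output is actually Vietnamese.
ds = ds.filter(lambda ex: ex["language"] == "vi")

print(ds)
```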
+ TRAINING
+ ------------------------------
+ - Overview:
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64952a1e5ba8e6c66e1a0fa8/QBR1IUoD7REKoGG_kJtRS.png)
+ - Used a low rank (r = 8) to avoid overfitting and preserve the model's generalization (see the config sketch after this list)
+
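A minimal sketch of a matching setup: a rank-8 LoRA adapter trained for 60 steps with a linear learning-rate schedule, using TRL's `SFTTrainer` (the repo tags mention `trl`). The base model name, learning rate, batch size, and target modules are assumptions, not the exact values behind the run above.

```python
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Low-rank adapters: r = 8 keeps the number of trainable parameters small,
# which helps avoid overfitting and preserves the base model's generalization.
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)

# "Default" config from the notes above: 60 steps with a linear LR schedule.
args = SFTConfig(
    output_dir="outputs",
    max_steps=60,
    lr_scheduler_type="linear",
    learning_rate=2e-4,              # assumed
    per_device_train_batch_size=2,   # assumed
    gradient_accumulation_steps=4,   # assumed
    logging_steps=1,                 # log the loss at every step, as in the table below
)

trainer = SFTTrainer(
    model="meta-llama/Llama-3.1-8B-Instruct",  # assumed base model
    args=args,
    train_dataset=ds,                          # cleaned dataset from the sketch above
    peft_config=peft_config,
)
trainer.train()
```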
+ | Step | Training Loss |
+ | ---- | ------------- |
+ | 1 | 1.216600 |
+ | 2 | 1.181100 |
+ | 3 | 1.236900 |
+ | 4 | 1.157100 |
+ | 5 | 1.184100 |
+ | 6 | 1.103500 |
+ | 7 | 1.150900 |
+ | 8 | 1.112900 |
+ | 9 | 1.074600 |
+ | 10 | 1.095700 |
+ | 11 | 0.966400 |
+ | 12 | 0.977000 |
+ | 13 | 1.004500 |
+ | 14 | 0.931500 |
+ | 15 | 0.869900 |
+ | 16 | 0.886300 |
+ | 17 | 0.900000 |
+ | 18 | 0.792500 |
+ | 19 | 0.814200 |
+ | 20 | 0.808900 |
+ | 21 | 0.815200 |
+ | 22 | 0.771100 |
+ | 23 | 0.800000 |
+ | 24 | 0.782500 |
+ | 25 | 0.772700 |
+ | 26 | 0.698300 |
+ | 27 | 0.759500 |
+ | 28 | 0.718500 |
+ | 29 | 0.711400 |
+ | 30 | 0.759400 |
+ | 31 | 0.717000 |
+ | 32 | 0.708700 |
+ | 33 | 0.726800 |
+ | 34 | 0.724500 |
+ | 35 | 0.747800 |
+ | 36 | 0.715600 |
+ | 37 | 0.708100 |
+ | 38 | 0.648300 |
+ | 39 | 0.677900 |
+ | 40 | 0.685600 |
+ | 41 | 0.726100 |
+ | 42 | 0.687300 |
+ | 43 | 0.663100 |
+ | 44 | 0.628600 |
+ | 45 | 0.663300 |
+ | 46 | 0.683500 |
+ | 47 | 0.673800 |
+ | 48 | 0.651100 |
+ | 49 | 0.683700 |
+ | 50 | 0.702400 |
+ | 51 | 0.664400 |
+ | 52 | 0.671800 |
+ | 53 | 0.673000 |
+ | 54 | 0.704000 |
+ | 55 | 0.621100 |
+ | 56 | 0.668200 |
+ | 57 | 0.686000 |
+ | 58 | 0.639500 |
+ | 59 | 0.665400 |
+ | 60 | 0.680900 |
+
+ - 4757.667 seconds (79.29 minutes) used for training.
+ - Peak reserved memory = 13.857 GB (see the memory-stats sketch below).
+ - Peak reserved memory for training = 12.73 GB.
+ - Peak reserved memory % of max memory = 93.959 %.
+ - Peak reserved memory for training % of max memory = 86.317 %.
+ - Final loss = 0.680900
+ - View the full training run here: https://wandb.ai/vietphuongnguyen2602-rockship/huggingface/runs/ns2ym0hr
+
+
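The memory figures above follow PyTorch's reserved-memory counters; a minimal sketch of how they can be read back (the before-training baseline is assumed to have been captured right after model load):

```python
import torch

# Captured right after loading the model, before trainer.train().
start_reserved_gb = torch.cuda.max_memory_reserved() / 1024**3

# ... trainer.train() runs here ...

max_memory_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3

# Peak memory reserved by the CUDA caching allocator over the whole run.
peak_reserved_gb = torch.cuda.max_memory_reserved() / 1024**3
training_reserved_gb = peak_reserved_gb - start_reserved_gb

print(f"Peak reserved memory = {peak_reserved_gb:.3f} GB.")
print(f"Peak reserved memory for training = {training_reserved_gb:.3f} GB.")
print(f"Peak reserved memory % of max memory = {peak_reserved_gb / max_memory_gb * 100:.3f} %.")
print(f"Peak reserved memory for training % of max memory = {training_reserved_gb / max_memory_gb * 100:.3f} %.")
```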
+ FINAL BENCHMARKING
+ ------------------------------
+ - **Time to First Token (TTFT):** 0.002 s (see the measurement sketch below)
+ - **Time Per Output Token (TPOT):** 40.85 ms/token
+ - **Throughput:** 25.66 tokens/s
+ - **Average Token Latency:** 40.90 ms/token
+ - **Total Generation Time:** 63.015 s
+ - **Input Tokenization Time:** 0.008 s
+ - **Input Tokens:** 1909
+ - **Output Tokens:** 984
+ - **Total Tokens:** 2892
+ - **GPU Memory Usage:** 1.49 GB
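A hedged sketch of how metrics of this kind can be measured with `transformers`: a `TextIteratorStreamer` timestamps the first generated chunk (TTFT), and wall-clock timing over the stream gives TPOT and throughput. The model path, prompt, and generation settings are placeholders, not the script that produced the numbers above.

```python
import time
from threading import Thread

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

model_id = "path/to/finetuned-model"  # placeholder for the uploaded checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="cuda"
)

prompt = "Generate a 5-question quiz in Vietnamese about computing."  # illustrative

t0 = time.perf_counter()
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
tokenize_time = time.perf_counter() - t0

# Stream tokens so the arrival of the first one can be timestamped.
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True)
thread = Thread(
    target=model.generate,
    kwargs=dict(**inputs, streamer=streamer, max_new_tokens=1024),
)

start = time.perf_counter()
thread.start()
ttft = None
chunks = 0
for _ in streamer:  # each chunk roughly corresponds to one generated token
    if ttft is None:
        ttft = time.perf_counter() - start  # Time to First Token
    chunks += 1
total_time = time.perf_counter() - start
thread.join()

tpot = (total_time - ttft) / max(chunks - 1, 1)  # Time Per Output Token (approx.)
print(f"TTFT: {ttft:.3f}s")
print(f"TPOT: {tpot * 1000:.2f}ms/token")
print(f"Throughput: {chunks / total_time:.2f} tokens/s")
print(f"Input Tokenization Time: {tokenize_time:.3f}s")
```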
 
  # Uploaded model