Update README.md
tags:
- llama
- trl
---

DATASET
------------------------------
- **What's new?** This model uses version 3.2 of the dataset (Langfuse + AWS), which has better quality:
  - Removed all 10- and 15-question variants; only the 5-question count is kept
  - Fixed all Vietnamese quizzes (ensuring the output is in Vietnamese)
  - Fixed some lazily duplicated topics (Biglead, Computing)
  - Removed Paragraph questions, replacing Paragraph with MCQ for all data points
- Trained using the default training config (60 steps, linear LR)

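The cleanup steps above can be sketched as follows (the `question_count` and `question_type` field names are hypothetical; the actual dataset schema may differ):

```python
# Sketch of the v3.2 cleanup: keep only 5-question quizzes and turn
# Paragraph questions into MCQ. Field names are assumptions for illustration.
def clean(records):
    out = []
    for r in records:
        if r["question_count"] != 5:           # drop 10- and 15-question variants
            continue
        if r["question_type"] == "Paragraph":  # replace Paragraph with MCQ
            r = {**r, "question_type": "MCQ"}
        out.append(r)
    return out

sample = [
    {"question_count": 5, "question_type": "Paragraph"},
    {"question_count": 10, "question_type": "MCQ"},
]
print(clean(sample))  # → [{'question_count': 5, 'question_type': 'MCQ'}]
```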
TRAINING
------------------------------
- Overview:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64952a1e5ba8e6c66e1a0fa8/QBR1IUoD7REKoGG_kJtRS.png)

- Uses a low rank of 8 to avoid overfitting and preserve the model's generalization

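As a rough illustration of why a low rank keeps the trainable-parameter count small (the 4096 hidden size here is an assumed example, not a figure from this run):

```python
# LoRA trains two small matrices A (d_in x r) and B (r x d_out) per adapted
# weight instead of the full d_in x d_out matrix, so parameters scale with r.
def lora_params(d_in, d_out, r):
    return d_in * r + r * d_out

full = 4096 * 4096  # a full square projection at an assumed hidden size of 4096
print(f"{100 * lora_params(4096, 4096, 8) / full:.2f}%")  # → 0.39% of the full matrix
```

A higher rank would fit the training set more closely but risks the overfitting the low rank is meant to avoid.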
| Step | Training Loss | Step | Training Loss | Step | Training Loss | Step | Training Loss |
|-----:|--------------:|-----:|--------------:|-----:|--------------:|-----:|--------------:|
|    1 |      1.216600 |   16 |      0.886300 |   31 |      0.717000 |   46 |      0.683500 |
|    2 |      1.181100 |   17 |      0.900000 |   32 |      0.708700 |   47 |      0.673800 |
|    3 |      1.236900 |   18 |      0.792500 |   33 |      0.726800 |   48 |      0.651100 |
|    4 |      1.157100 |   19 |      0.814200 |   34 |      0.724500 |   49 |      0.683700 |
|    5 |      1.184100 |   20 |      0.808900 |   35 |      0.747800 |   50 |      0.702400 |
|    6 |      1.103500 |   21 |      0.815200 |   36 |      0.715600 |   51 |      0.664400 |
|    7 |      1.150900 |   22 |      0.771100 |   37 |      0.708100 |   52 |      0.671800 |
|    8 |      1.112900 |   23 |      0.800000 |   38 |      0.648300 |   53 |      0.673000 |
|    9 |      1.074600 |   24 |      0.782500 |   39 |      0.677900 |   54 |      0.704000 |
|   10 |      1.095700 |   25 |      0.772700 |   40 |      0.685600 |   55 |      0.621100 |
|   11 |      0.966400 |   26 |      0.698300 |   41 |      0.726100 |   56 |      0.668200 |
|   12 |      0.977000 |   27 |      0.759500 |   42 |      0.687300 |   57 |      0.686000 |
|   13 |      1.004500 |   28 |      0.718500 |   43 |      0.663100 |   58 |      0.639500 |
|   14 |      0.931500 |   29 |      0.711400 |   44 |      0.628600 |   59 |      0.665400 |
|   15 |      0.869900 |   30 |      0.759400 |   45 |      0.663300 |   60 |      0.680900 |

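A quick check on the curve above, using the step-1 and step-60 values:

```python
# First and last training-loss values, copied from the table above.
first, last = 1.216600, 0.680900
drop = 100 * (first - last) / first
print(f"{drop:.1f}% reduction over 60 steps")  # → 44.0% reduction over 60 steps
```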
- 4757.667 seconds (79.29 minutes) used for training.
- Peak reserved memory = 13.857 GB.
- Peak reserved memory for training = 12.73 GB.
- Peak reserved memory as % of max memory = 93.959 %.
- Peak reserved memory for training as % of max memory = 86.317 %.
- Final loss = 0.680900
- View the full training run here: https://wandb.ai/vietphuongnguyen2602-rockship/huggingface/runs/ns2ym0hr

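The two percentages are mutually consistent; both imply a device total of about 14.748 GB. That total is inferred from the figures above, not stated in the run:

```python
# Back out the implied max memory from the reported peaks and percentages.
peak, peak_train = 13.857, 12.73  # GB, from the stats above
max_mem = 14.748                  # GB, inferred: 13.857 / 0.93959
print(round(100 * peak / max_mem, 3))        # → 93.959
print(round(100 * peak_train / max_mem, 3))  # → 86.317
```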

FINAL BENCHMARKING
------------------------------
- **Time to First Token (TTFT):** 0.002 s
- **Time Per Output Token (TPOT):** 40.85 ms/token
- **Throughput:** 25.66 tokens/s
- **Average Token Latency:** 40.90 ms/token
- **Total Generation Time:** 63.015 s
- **Input Tokenization Time:** 0.008 s
- **Input Tokens:** 1909
- **Output Tokens:** 984
- **Total Tokens:** 2892
- **Memory Usage (GPU):** 1.49 GB
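For reference, a minimal sketch of how TTFT, TPOT, and throughput can be measured around a streaming generator (`generate_stream` is a stub standing in for the actual model; the real benchmark harness is not shown here):

```python
import time

def generate_stream(prompt):
    # Stub: a real run would stream tokens from the model here.
    for tok in ["Q1:", "What", "is", "...?"]:
        yield tok

def benchmark(prompt):
    start = time.perf_counter()
    ttft, n = None, 0
    for _ in generate_stream(prompt):
        n += 1
        if ttft is None:
            ttft = time.perf_counter() - start    # time to first token
    total = time.perf_counter() - start
    tpot = (total - ttft) / max(n - 1, 1)         # per-token time after the first
    return {"ttft_s": ttft, "tpot_s": tpot, "throughput_tok_s": n / total}
```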

# Uploaded model