---
base_model: unsloth/Llama-3.2-1B-Instruct-bnb-4bit
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
---
DATASET
------------------------------
- **What's new?:** Uses version 3.2 of the dataset (Langfuse + AWS), which has better quality (see the filtering sketch below):
  - Removed all 10- and 15-question variants; only 5-question quizzes are kept
  - Fixed all Vietnamese quizzes (ensured the output is actually in Vietnamese)
  - Fixed some lazily duplicated topics (Biglead, Computing)
  - Removed the Paragraph question type, replacing it with MCQ for all data points
- Trained with the default training config (60 steps, linear LR schedule); see the training sketch in the TRAINING section
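A minimal sketch of the clean-up rules above, assuming a JSON-lines export with hypothetical field names (`question_count`, `questions`, `question_type`); the real Langfuse + AWS schema may differ:

```python
from datasets import load_dataset

# Hypothetical export path and field names; not the actual v3.2 schema.
raw = load_dataset("json", data_files="quiz_dataset_v3.2.jsonl", split="train")

def keep_example(example):
    # Keep only 5-question quizzes; drop the 10- and 15-question variants.
    return example["question_count"] == 5

def to_mcq(example):
    # Replace any remaining Paragraph questions with MCQ.
    example["questions"] = [
        {**q, "question_type": "MCQ"} if q.get("question_type") == "Paragraph" else q
        for q in example["questions"]
    ]
    return example

cleaned = raw.filter(keep_example).map(to_mcq)
```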
TRAINING
------------------------------
- Overview:
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64952a1e5ba8e6c66e1a0fa8/QBR1IUoD7REKoGG_kJtRS.png)
- Used a low LoRA rank (r = 8) to avoid overfitting and preserve the model's generalization (see the adapter sketch below)
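A minimal sketch of the adapter setup with Unsloth's `FastLanguageModel`; only the base model and `r=8` come from this card, while `max_seq_length`, `lora_alpha`, and the target modules are assumptions based on common Unsloth defaults:

```python
from unsloth import FastLanguageModel

# Load the 4-bit base model this card fine-tunes from.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-1B-Instruct-bnb-4bit",
    max_seq_length=2048,   # assumption; not stated in the card
    load_in_4bit=True,
)

# Attach LoRA adapters with the low rank (r = 8) mentioned above.
model = FastLanguageModel.get_peft_model(
    model,
    r=8,                                   # low rank to limit overfitting
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,                         # assumption; common default
    lora_dropout=0,
    bias="none",
)
```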
| Step | Training Loss |
|------|---------------|
| 1 | 1.216600 |
| 2 | 1.181100 |
| 3 | 1.236900 |
| 4 | 1.157100 |
| 5 | 1.184100 |
| 6 | 1.103500 |
| 7 | 1.150900 |
| 8 | 1.112900 |
| 9 | 1.074600 |
| 10 | 1.095700 |
| 11 | 0.966400 |
| 12 | 0.977000 |
| 13 | 1.004500 |
| 14 | 0.931500 |
| 15 | 0.869900 |
| 16 | 0.886300 |
| 17 | 0.900000 |
| 18 | 0.792500 |
| 19 | 0.814200 |
| 20 | 0.808900 |
| 21 | 0.815200 |
| 22 | 0.771100 |
| 23 | 0.800000 |
| 24 | 0.782500 |
| 25 | 0.772700 |
| 26 | 0.698300 |
| 27 | 0.759500 |
| 28 | 0.718500 |
| 29 | 0.711400 |
| 30 | 0.759400 |
| 31 | 0.717000 |
| 32 | 0.708700 |
| 33 | 0.726800 |
| 34 | 0.724500 |
| 35 | 0.747800 |
| 36 | 0.715600 |
| 37 | 0.708100 |
| 38 | 0.648300 |
| 39 | 0.677900 |
| 40 | 0.685600 |
| 41 | 0.726100 |
| 42 | 0.687300 |
| 43 | 0.663100 |
| 44 | 0.628600 |
| 45 | 0.663300 |
| 46 | 0.683500 |
| 47 | 0.673800 |
| 48 | 0.651100 |
| 49 | 0.683700 |
| 50 | 0.702400 |
| 51 | 0.664400 |
| 52 | 0.671800 |
| 53 | 0.673000 |
| 54 | 0.704000 |
| 55 | 0.621100 |
| 56 | 0.668200 |
| 57 | 0.686000 |
| 58 | 0.639500 |
| 59 | 0.665400 |
| 60 | 0.680900 |
- Training time: 4757.667 seconds (79.29 minutes).
- Peak reserved memory = 13.857 GB.
- Peak reserved memory for training = 12.73 GB.
- Peak reserved memory % of max memory = 93.959%.
- Peak reserved memory for training % of max memory = 86.317%.
- Final loss = 0.680900
- View full training here: https://wandb.ai/vietphuongnguyen2602-rockship/huggingface/runs/ns2ym0hr
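A minimal sketch of the training run under the assumptions above, using TRL's `SFTTrainer` plus the peak-memory bookkeeping these numbers come from; only `max_steps=60` and the linear LR schedule come from this card, while batch size, learning rate, and the text field name are assumptions:

```python
import torch
from trl import SFTTrainer
from transformers import TrainingArguments

trainer = SFTTrainer(
    model=model,                     # LoRA-wrapped model from the sketch above
    tokenizer=tokenizer,
    train_dataset=cleaned,           # cleaned v3.2 dataset from the first sketch
    dataset_text_field="text",       # assumption; depends on prompt formatting
    max_seq_length=2048,             # assumption
    args=TrainingArguments(
        per_device_train_batch_size=2,   # assumption; common default
        gradient_accumulation_steps=4,   # assumption
        max_steps=60,                    # 60 training steps, as in the card
        learning_rate=2e-4,              # assumption
        lr_scheduler_type="linear",      # linear LR schedule, as in the card
        logging_steps=1,
        output_dir="outputs",
        report_to="wandb",
    ),
)

trainer_stats = trainer.train()

# Peak-memory bookkeeping, as reported above.
peak_reserved_gb = round(torch.cuda.max_memory_reserved() / 1024**3, 3)
print(f"{trainer_stats.metrics['train_runtime']} seconds used for training.")
print(f"Peak reserved memory = {peak_reserved_gb} GB.")
```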
FINAL BENCHMARKING
------------------------------
- **Time to First Token (TTFT):** 0.002 s
- **Time Per Output Token (TPOT):** 37.15 ms/token
- **Throughput:** 27.00 tokens/s
- **Average Token Latency:** 37.21 ms/token
- **Total Generation Time:** 19.171 s
- **Input Tokenization Time:** 0.008 s
- **Input Tokens:** 1909
- **Output Tokens:** 517
- **Total Tokens:** 2426
- **Memory Usage (GPU):** 1.38 GB
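A minimal sketch of how metrics like these can be measured with `transformers`' streaming generation; the prompt, generation settings, and measurement details are assumptions, not the benchmark script actually used, and the token count below is approximate because the streamer yields decoded text chunks rather than raw tokens:

```python
import time
from threading import Thread
from transformers import TextIteratorStreamer

prompt = "..."  # benchmark prompt (1909 input tokens in the run above)

t0 = time.perf_counter()
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
tokenize_time = time.perf_counter() - t0   # input tokenization time

streamer = TextIteratorStreamer(tokenizer, skip_prompt=True)
thread = Thread(target=model.generate,
                kwargs=dict(**inputs, max_new_tokens=600, streamer=streamer))

start = time.perf_counter()
thread.start()
first_token_time, n_chunks = None, 0
for _ in streamer:                           # iterate as chunks are produced
    if first_token_time is None:
        first_token_time = time.perf_counter() - start   # TTFT
    n_chunks += 1                            # roughly one chunk per token
total = time.perf_counter() - start
thread.join()

print(f"TTFT: {first_token_time:.3f} s")
print(f"Throughput: {n_chunks / total:.2f} tokens/s")
print(f"Avg token latency: {1000 * total / n_chunks:.2f} ms/token")
```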
# Uploaded model
- **Developed by:** vietphuon
- **License:** apache-2.0
- **Finetuned from model:** unsloth/Llama-3.2-1B-Instruct-bnb-4bit
This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)