---
language:
- ms
- en
- zh
- ta
---

# Llama 3.2 1B Malaysian Reasoning

Continued finetuning of https://huggingface.co/meta-llama/Llama-3.2-1B on a highly curated 1.2B-token Malaysian instruction dataset that includes reasoning data.

## Improvement

1. 128k context length.
2. Supports responses in Mandarin, Tamil, Jawi, Manglish, and the Johor, Kedah, Kelantan, Pahang, Perak, Sabah, Sarawak, Selangor, Negeri Sembilan and Terengganu dialects.
3. Able to code in Mandarin, Tamil, Jawi, Manglish, and the Johor, Kedah, Kelantan, Pahang, Perak, Sabah, Sarawak, Selangor, Negeri Sembilan and Terengganu dialects.
4. Multi-turn conversations grounded in Malaysian context, such as Malaysian legislation, politics, religions and languages.
5. Standard RAG (see the prompt sketch after this list).
6. Reasoning! Supports minimal reasoning in Mandarin, Tamil, Jawi, Manglish, and the Johor, Kedah, Kelantan, Pahang, Perak, Sabah, Sarawak, Selangor, Negeri Sembilan and Terengganu dialects.

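As a rough illustration of the RAG-style usage above, the sketch below builds a chat prompt that puts a retrieved passage and a Malay question in the user turn. The prompt layout, the example passage and question, and the assumption that the adapter repository linked later in this card ships the tokenizer and a Llama-3-style chat template are illustrative assumptions, not something this card prescribes.

```python
from transformers import AutoTokenizer

# Adapter repo from this card; assumed to also host the tokenizer and chat
# template. If it does not, load the tokenizer from the merged model instead.
tokenizer = AutoTokenizer.from_pretrained(
    "malayloraenjoyer/Llama-3.2-1B-Malaysian-Reasoning-LoRA"
)

# RAG-style prompt: a retrieved passage plus a Malay question in the user turn.
# The exact layout is an assumption, not a format required by the model.
context = "Akta Hak Cipta 1987 ialah undang-undang utama berkaitan hak cipta di Malaysia."
question = "Akta apakah yang melindungi hak cipta di Malaysia? Jelaskan secara ringkas."

messages = [
    {"role": "user", "content": f"Konteks:\n{context}\n\nSoalan: {question}"},
]

prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```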
## MalayMMLU

```
                              Model   Accuracy   shot  by_letter        category
0  Llama-3.2-1B-Malaysian-Reasoning  48.939419  0shot       True            STEM
1  Llama-3.2-1B-Malaysian-Reasoning  42.529898  0shot       True        Language
2  Llama-3.2-1B-Malaysian-Reasoning  45.995663  0shot       True  Social science
3  Llama-3.2-1B-Malaysian-Reasoning  49.323099  0shot       True          Others
4  Llama-3.2-1B-Malaysian-Reasoning  49.043231  0shot       True      Humanities
{'Social science': 6918, 'Language': 6288, 'Humanities': 4395, 'Others': 4169, 'STEM': 2443}
Model : Llama-3.2-1B-Malaysian-Reasoning
Metric : first
Shot : 0shot
average accuracy 47.16626209232134
accuracy for STEM 48.93941874744167
accuracy for Language 42.529898218829516
accuracy for Social science 45.99566348655681
accuracy for Others 49.323099064523866
accuracy for Humanities 49.04323094425484
```
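For reference, the reported average matches the unweighted (macro) mean of the five per-category accuracies above, not a mean weighted by the per-category sample counts; the quick check below reproduces it.

```python
# Sanity check: the reported "average accuracy" is the unweighted mean of the
# five per-category accuracies printed above.
scores = {
    "STEM": 48.93941874744167,
    "Language": 42.529898218829516,
    "Social science": 45.99566348655681,
    "Others": 49.323099064523866,
    "Humanities": 49.04323094425484,
}
print(sum(scores.values()) / len(scores))  # ~47.16626209232134
```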
## Training session

We did two stages of training:

1. Finetune on [Malaysian SFT](https://huggingface.co/datasets/mesolitica/Malaysian-SFT) to make the model understand Malaysian context.
   - Wandb at https://wandb.ai/huseinzol05/lora-embedding-256-llama3.2-1b-small-malaysian-reasoning
2. Continue finetuning on [Malaysian Reasoning](https://huggingface.co/datasets/mesolitica/Malaysian-Reasoning), mixed with small samples of [Malaysian SFT](https://huggingface.co/datasets/mesolitica/Malaysian-SFT), to turn it into a reasoning model; a sketch for downloading both datasets follows this list.
   - Wandb at https://wandb.ai/huseinzol05/lora-embedding-256-llama3.2-1b-small-malaysian-reasoning-cont

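Both datasets are public dataset repositories on the Hugging Face Hub. A minimal download sketch is below; it only fetches the raw repos, since their file layout and split names are not described on this card.

```python
from huggingface_hub import snapshot_download

# Dataset repo ids are taken from the links above. snapshot_download fetches
# the raw files; inspect the layout (or use datasets.load_dataset with explicit
# data_files) rather than assuming a fixed schema.
sft_dir = snapshot_download("mesolitica/Malaysian-SFT", repo_type="dataset")
reasoning_dir = snapshot_download("mesolitica/Malaysian-Reasoning", repo_type="dataset")

print(sft_dir)
print(reasoning_dir)
```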
## How we train

1. LoRA on `["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "embed_tokens", "lm_head"]`.
2. Rank 256 with alpha 512, i.e. an alpha-to-rank scaling of 2.0 (a config sketch follows this list).
3. Multipacking with proper SDPA causal masking to prevent cross-document contamination, and with correct position ids for each packed document.
4. Forked CCE loss for the LoRA `lm_head` to reduce memory consumption.

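A minimal sketch of the adapter setup described in the list, using `peft`. The rank, alpha and target modules come from the list above; dropout, task type and the rest of the training setup (packing, optimizer, CCE loss) are not shown here, and the values used for them are only illustrative.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Rank, alpha and target modules follow the list above; the remaining
# arguments are illustrative defaults, not the exact training configuration.
lora_config = LoraConfig(
    r=256,
    lora_alpha=512,  # alpha / rank = 2.0
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
        "embed_tokens", "lm_head",
    ],
    lora_dropout=0.0,
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```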
Low-rank adapters are published at [malayloraenjoyer/Llama-3.2-1B-Malaysian-Reasoning-LoRA](https://huggingface.co/malayloraenjoyer/Llama-3.2-1B-Malaysian-Reasoning-LoRA).

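A minimal inference sketch that attaches those adapters to the base model. It assumes the adapter repo also hosts the tokenizer and chat template (fall back to the base tokenizer if not); the dtype, the example question and the generation settings are illustrative only.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Llama-3.2-1B"
adapter_id = "malayloraenjoyer/Llama-3.2-1B-Malaysian-Reasoning-LoRA"

# Assumes the adapter repo also ships the tokenizer / chat template.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(model, adapter_id)

messages = [
    {"role": "user", "content": "Terangkan secara ringkas kedudukan Bahasa Melayu dalam Perlembagaan Persekutuan."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```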
Source code at https://github.com/mesolitica/malaya/tree/master/session/small-malaysian-reasoning