loubnabnl (HF staff) committed · verified
Commit 82e8923 · 1 Parent(s): 5b03bc3

Update README.md

Files changed (1)
  1. README.md +2 -3
README.md CHANGED
@@ -14,12 +14,11 @@ base_model:
 
 This is a continual-pre-training of [Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B) on a mix of 📐 [FineMath](https://huggingface.co/datasets/HuggingFaceTB/finemath) (our new high quality math dataset) and [FineWeb-Edu](https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu).
 
-The model demonstrates superior math performance compared to Llama 3.2 3B, while having similar performance on Knowledge, reasoning and Common sense benchmarks:
+The model demonstrates superior math performance compared to Llama 3.2 3B, while maintaining similar performance on knowledge, reasoning, and common sense benchmarks:
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61c141342aac764ce1654e43/HZ6KOc8IVXXOABrdv0dyK.png)
 
-It was trained on **160B tokens** using a mix of 40% FineWeb-Edu and 30% FineMath-4+ and 30% InfiWebMath-4+ from FineMath. We use [nanotron](https://github.com/huggingface/smollm/tree/main/pre-training/continual-pretraining) for the training. You can find the training scripts in our [SmolLM2 GitHub repo](https://github.com/huggingface/smollm).
-
+It was trained on **160B tokens** using a mix of 40% FineWeb-Edu and 60% from FineMath (30% FineMath-4+ subset and 30% InfiWebMath-4+ subset). We use [nanotron](https://github.com/huggingface/smollm/tree/main/pre-training/continual-pretraining) for the training, and you can find the training scripts in our [SmolLM2 GitHub repo](https://github.com/huggingface/smollm).
 
 ## Use
 
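The hunk ends at the README's `## Use` section. For orientation, a minimal usage sketch of the kind that section typically contains might look like the following; the repo id `HuggingFaceTB/FineMath-Llama-3B` is an assumption and is not shown in this diff:

```python
# Minimal usage sketch for the continually pre-trained model described above.
# Assumption: the checkpoint is published as "HuggingFaceTB/FineMath-Llama-3B";
# substitute the actual repo id from the model page.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/FineMath-Llama-3B"  # hypothetical id, not stated in the diff
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Base (non-instruct) model: prompt with plain text and let it continue.
prompt = "To compute 12 * 15, first"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```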