---
base_model:
- unsloth/SmolLM2-1.7B-Instruct-bnb-4bit
- HuggingFaceTB/SmolLM2-1.7B-Instruct
tags:
- text-generation-inference
- transformers
- unsloth
- trl
- sft
license: apache-2.0
language:
- en
datasets:
- AI-MO/NuminaMath-TIR
---

# Uploaded model

- **Developed by:** Qurtana
- **License:** apache-2.0
- **Finetuned from model:** unsloth/SmolLM2-1.7B-Instruct-bnb-4bit

This model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

It was trained using rank-stabilized QLoRA with r = 64 and alpha = 5 for one epoch, with the data formatted using the ChatML template. The following modules were targeted: "q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "embed_tokens", and "lm_head".

I believe this should achieve better performance than the original model, particularly in math and reasoning; hopefully the MUSR and MATH Lvl 5 evaluations reflect this.

Dataset citation:

```bibtex
@misc{numina_math_datasets,
  author = {Jia LI and Edward Beeching and Lewis Tunstall and Ben Lipkin and Roman Soletskyi and Shengyi Costa Huang and Kashif Rasul and Longhui Yu and Albert Jiang and Ziju Shen and Zihan Qin and Bin Dong and Li Zhou and Yann Fleureau and Guillaume Lample and Stanislas Polu},
  title = {NuminaMath TIR},
  year = {2024},
  publisher = {Numina},
  journal = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/AI-MO/NuminaMath-TIR}}
}
```
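
For reference, below is a minimal sketch of what the described recipe might look like with Unsloth and TRL. Only r = 64, alpha = 5, the target modules, the ChatML formatting, and the single epoch come from this card; everything else (sequence length, batch size, learning rate, and the exact Unsloth/TRL API shape, which varies across versions) is an assumption, not the actual training script.

```python
# Hedged sketch of the described setup: rank-stabilized QLoRA (r=64, alpha=5)
# on SmolLM2-1.7B-Instruct via Unsloth + TRL, with ChatML-formatted NuminaMath-TIR.
# Hyperparameters other than r/alpha/epoch count are illustrative assumptions.
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

max_seq_length = 2048  # assumption; not stated in the card

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/SmolLM2-1.7B-Instruct-bnb-4bit",
    max_seq_length=max_seq_length,
    load_in_4bit=True,  # QLoRA: 4-bit quantized base weights
)

model = FastLanguageModel.get_peft_model(
    model,
    r=64,             # LoRA rank, as stated in the card
    lora_alpha=5,     # alpha = 5, as stated
    use_rslora=True,  # rank-stabilized LoRA (scales by alpha / sqrt(r))
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
        "embed_tokens", "lm_head",
    ],
    lora_dropout=0,
    bias="none",
)

# "ChatML" data prep: render each conversation with the ChatML template.
tokenizer = get_chat_template(tokenizer, chat_template="chatml")

def to_text(example):
    # NuminaMath-TIR stores each sample as a list of chat messages.
    return {"text": tokenizer.apply_chat_template(
        example["messages"], tokenize=False, add_generation_prompt=False)}

dataset = load_dataset("AI-MO/NuminaMath-TIR", split="train").map(to_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    args=TrainingArguments(
        num_train_epochs=1,             # one epoch, as stated
        per_device_train_batch_size=2,  # assumption
        learning_rate=2e-4,             # assumption
        output_dir="outputs",
    ),
)
trainer.train()
```

Note that alpha = 5 with r = 64 gives an unusually small effective scale; under rank-stabilized LoRA the adapter output is scaled by alpha / sqrt(r) = 5/8 rather than the plain-LoRA alpha / r = 5/64, which is presumably why rsLoRA was paired with this configuration.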