---
base_model:
- unsloth/SmolLM2-1.7B-Instruct-bnb-4bit
- HuggingFaceTB/SmolLM2-1.7B-Instruct
tags:
- text-generation-inference
- transformers
- unsloth
- trl
- sft
license: apache-2.0
language:
- en
datasets:
- AI-MO/NuminaMath-TIR
---

# Uploaded model

- **Developed by:** Qurtana
- **License:** apache-2.0
- **Finetuned from model:** unsloth/SmolLM2-1.7B-Instruct-bnb-4bit

This model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

Trained for one epoch using rank-stabilized QLoRA with r = 64 and alpha = 5, using the ChatML data prep. The following modules were targeted: "q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "embed_tokens", and "lm_head". A reference sketch of this setup is given below.

I expect this to outperform the original model, particularly on math and reasoning. Hopefully the MUSR and MATH Lvl 5 evaluations reflect this.
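For reference, here is a minimal sketch of how this configuration might be reproduced with Unsloth and TRL. The rank, alpha, rank-stabilized LoRA, target modules, ChatML template, dataset, and single epoch come from this card; the sequence length, batch size, learning rate, and the assumption that the dataset exposes a `messages` column are illustrative guesses, not the values actually used.

```python
# Sketch of the training setup described above. Hyperparameters marked
# "assumption" are not stated in the card.
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the 4-bit base model.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/SmolLM2-1.7B-Instruct-bnb-4bit",
    max_seq_length=2048,  # assumption
    load_in_4bit=True,
)

# Attach rank-stabilized LoRA adapters to the modules listed in the card.
model = FastLanguageModel.get_peft_model(
    model,
    r=64,
    lora_alpha=5,
    use_rslora=True,  # rank-stabilized LoRA
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
        "embed_tokens", "lm_head",
    ],
)

# ChatML data prep: render each conversation with the ChatML template.
tokenizer = get_chat_template(tokenizer, chat_template="chatml")

dataset = load_dataset("AI-MO/NuminaMath-TIR", split="train")

def to_text(examples):
    # Assumes the dataset exposes a "messages" column of chat turns.
    return {
        "text": [
            tokenizer.apply_chat_template(m, tokenize=False)
            for m in examples["messages"]
        ]
    }

dataset = dataset.map(to_text, batched=True)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        output_dir="outputs",
        num_train_epochs=1,             # one epoch, per the card
        per_device_train_batch_size=2,  # assumption
        gradient_accumulation_steps=4,  # assumption
        learning_rate=2e-4,             # assumption
        fp16=True,
    ),
)
trainer.train()
```

Note that rank-stabilized LoRA scales the adapter output by alpha / sqrt(r) rather than alpha / r, which is why a small alpha of 5 remains a sensible pairing with r = 64.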