Uploaded model

  • Developed by: yasserrmd
  • License: apache-2.0
  • Finetuned from model : Qwen/Qwen2.5-3B-Instruct

This llama model was trained 2x faster with Unsloth and Huggingface's TRL library.

Downloads last month
238
Safetensors
Model size
3.09B params
Tensor type
FP16
·
Inference Providers NEW
This model is not currently available via any of the supported third-party Inference Providers, and the model is not deployed on the HF Inference API.

Model tree for yasserrmd/Coder-GRPO-3B

Base model

Qwen/Qwen2.5-3B
Quantized
(106)
this model
Quantizations
2 models

Dataset used to train yasserrmd/Coder-GRPO-3B