Model Card for OrpoLlama-3.2-1B-V1 (Quantized)

This model is a quantized version of bhuvana-ak7/OrpoLlama-3.2-1B-V1. OrpoLlama-3.2-1B-V1 is a fine-tuned version of meta-llama/Llama-3.2-1B, trained with the ORPO (Odds Ratio Preference Optimization) trainer on the mlabonne/orpo-dpo-mix-40k dataset. Only 1,000 samples from that dataset were used in order to keep the ORPO training run short.
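The sketch below shows one way to load and run the model with the Transformers library. It is a minimal example, not an official snippet from this card: the repo id points at the base fine-tuned model because the exact repo id of this quantized checkpoint is not stated here, and the prompt is only illustrative.

```python
# Minimal usage sketch with Hugging Face Transformers (assumptions noted below).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id: replace with the repo id of this quantized checkpoint.
model_id = "bhuvana-ak7/OrpoLlama-3.2-1B-V1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # the card lists FP16 tensors
    device_map="auto",
)

prompt = "Explain preference optimization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```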

| Tasks     | Version | Filter | n-shot | Metric     | Value  | Stderr   |
|-----------|---------|--------|--------|------------|--------|----------|
| hellaswag | 1       | none   | 0      | acc ↑      | 0.4772 | ± 0.0050 |
| hellaswag | 1       | none   | 0      | acc_norm ↑ | 0.6366 | ± 0.0048 |
| Tasks     | Metric   | Base Model | Quantized Model | Changed |
|-----------|----------|------------|-----------------|---------|
| hellaswag | acc      | 0.4772     | 0.4772          | No      |
| hellaswag | acc_norm | 0.6366     | 0.6366          | No      |
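The HellaSwag numbers above follow the lm-evaluation-harness output format. A hedged reproduction sketch using the harness's Python entry point is shown below; the repo id and the dtype argument are assumptions, and the exact scores may vary slightly with harness version and hardware.

```python
# Sketch: re-running the zero-shot HellaSwag evaluation with
# EleutherAI's lm-evaluation-harness (pip install lm-eval).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    # Placeholder repo id: point this at the checkpoint you want to score.
    model_args="pretrained=bhuvana-ak7/OrpoLlama-3.2-1B-V1,dtype=float16",
    tasks=["hellaswag"],
    num_fewshot=0,
)
# Prints acc and acc_norm with their standard errors for hellaswag.
print(results["results"]["hellaswag"])
```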
Format: Safetensors · Model size: 1.24B params · Tensor type: FP16