Model Card
This model is a quantized version of bhuvana-ak7/OrpoLlama-3.2-1B-V1. OrpoLlama-3.2-1B-V1 is a fine-tuned version of meta-llama/Llama-3.2-1B, trained with the ORPO (Odds Ratio Preference Optimization) trainer on the mlabonne/orpo-dpo-mix-40k dataset. Only 1,000 samples were used so that the ORPO run would finish quickly.
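A minimal sketch of such an ORPO run with TRL's `ORPOTrainer` is shown below. It is not the author's original training script: the hyperparameters are illustrative assumptions, and any chat-template preprocessing of the conversational preference data is omitted.

```python
# Sketch of an ORPO fine-tune of Llama-3.2-1B on 1,000 preference samples.
# Hyperparameters are assumptions, not the values used for OrpoLlama-3.2-1B-V1.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_model = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# The card states only 1,000 samples of the 40k-sample mix were used.
# NOTE: the dataset stores chosen/rejected as conversations; the base tokenizer
# may need a chat template before these can be tokenized (omitted here).
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train").select(range(1000))

config = ORPOConfig(
    output_dir="OrpoLlama-3.2-1B-V1",
    beta=0.1,                      # weight of the odds-ratio preference term
    max_length=1024,
    per_device_train_batch_size=2,
    num_train_epochs=1,
    learning_rate=8e-6,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,    # `tokenizer=` in older TRL versions
)
trainer.train()
```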
Evaluation results (0-shot hellaswag):

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| hellaswag | 1 | none | 0 | acc ↑ | 0.4772 | ± 0.0050 |
| | | none | 0 | acc_norm ↑ | 0.6366 | ± 0.0048 |
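The table follows the output format of the lm-evaluation-harness. A hedged sketch of reproducing these numbers through the harness' Python API is below; the repo id passed to `pretrained=` is a placeholder for whichever checkpoint (base or quantized) is being scored, since this card does not name the quantized repo.

```python
# Sketch: reproduce the 0-shot hellaswag scores with lm-evaluation-harness.
# The repo id is a placeholder; substitute the checkpoint you want to score.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=bhuvana-ak7/OrpoLlama-3.2-1B-V1,dtype=float16",
    tasks=["hellaswag"],
    num_fewshot=0,
)

# Expect roughly acc ≈ 0.4772 and acc_norm ≈ 0.6366 per the table above.
print(results["results"]["hellaswag"])
```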
Comparison with the base (unquantized) model:

| Tasks | Metric | Base Model | Quantized Model | Changed |
|---|---|---|---|---|
| hellaswag | acc | 0.4772 | 0.4772 | No |
| hellaswag | acc_norm | 0.6366 | 0.6366 | No |
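Because this section does not state the quantized repo's id or quantization format, the usage sketch below loads the upstream fine-tune bhuvana-ak7/OrpoLlama-3.2-1B-V1 with transformers; swap in the quantized checkpoint once its repo id and format are known.

```python
# Usage sketch: plain text generation with the upstream ORPO fine-tune.
# The prompt and generation settings are only examples.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bhuvana-ak7/OrpoLlama-3.2-1B-V1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

inputs = tokenizer("What is ORPO fine-tuning?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```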