Model Card for Model ID

This is a fine-tuned version of EleutherAI's GPT-Neo 1.3B model, trained with ORPO (Odds Ratio Preference Optimization) on the `mlabonne/orpo-dpo-mix-40k` dataset. Fine-tuning used LoRA (Low-Rank Adaptation) for parameter-efficient training.
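As background, ORPO augments the standard supervised fine-tuning loss with an odds-ratio penalty that favors the chosen completion over the rejected one. The following is a minimal sketch of that penalty term, assuming `logp_chosen` and `logp_rejected` are sequence-level log-probabilities in (-inf, 0); the function names are illustrative, not part of any library API.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def odds(logp: float) -> float:
    # odds(y|x) = p / (1 - p), computed from a log-probability
    p = math.exp(logp)
    return p / (1.0 - p)

def orpo_penalty(logp_chosen: float, logp_rejected: float) -> float:
    # L_OR = -log sigmoid( log( odds(chosen) / odds(rejected) ) )
    log_odds_ratio = math.log(odds(logp_chosen) / odds(logp_rejected))
    return -math.log(sigmoid(log_odds_ratio))
```

When the two completions are equally likely, the penalty is log 2; it shrinks as the model assigns higher odds to the chosen completion relative to the rejected one.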

Evaluation results

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| hellaswag | 1 | none | 0 | acc ↑ | 0.3859 | ± 0.0049 |
| | | none | 0 | acc_norm ↑ | 0.4891 | ± 0.0050 |
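For context on the two metrics above: on multiple-choice tasks like HellaSwag, `acc` selects the candidate continuation with the highest raw log-likelihood, while `acc_norm` divides each log-likelihood by the continuation's byte length before comparing, which removes the bias toward shorter answers. A toy sketch of the difference, with hypothetical log-likelihood values:

```python
# Hypothetical per-candidate log-likelihoods for one question.
candidates = {
    "a short answer": -12.0,
    "a much longer candidate answer": -18.0,
}

# acc: pick the candidate with the highest raw log-likelihood.
acc_pick = max(candidates, key=candidates.get)

# acc_norm: normalize by continuation length in bytes before picking.
acc_norm_pick = max(
    candidates,
    key=lambda c: candidates[c] / len(c.encode("utf-8")),
)
```

Here the two metrics disagree: the raw score prefers the short answer, while length normalization lets the longer candidate win.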
Model size: 1.32B params (Safetensors, F32)