Model Card for Model ID

This is a fine-tuned version of EleutherAI's GPT-Neo 1.3B model, trained with ORPO (Odds Ratio Preference Optimization) on the `mlabonne/orpo-dpo-mix-40k` dataset. Fine-tuning used LoRA (Low-Rank Adaptation) for parameter-efficient training.
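As background, ORPO augments the standard supervised fine-tuning loss with an odds-ratio penalty that favors the chosen completion over the rejected one. The following is a minimal sketch of that penalty term, assuming `logp_chosen` and `logp_rejected` are sequence-level log-probabilities in (-inf, 0); the function names are illustrative, not part of any library API.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def odds(logp: float) -> float:
    # odds(y|x) = p / (1 - p), computed from a log-probability
    p = math.exp(logp)
    return p / (1.0 - p)

def orpo_penalty(logp_chosen: float, logp_rejected: float) -> float:
    # L_OR = -log sigmoid( log( odds(chosen) / odds(rejected) ) )
    log_odds_ratio = math.log(odds(logp_chosen) / odds(logp_rejected))
    return -math.log(sigmoid(log_odds_ratio))
```

When the two completions are equally likely, the penalty is log 2; it shrinks as the model assigns higher odds to the chosen completion relative to the rejected one.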

Evaluation results

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| hellaswag | 1 | none | 0 | acc ↑ | 0.3859 | ± 0.0049 |
| | | none | 0 | acc_norm ↑ | 0.4891 | ± 0.0050 |
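For context on the two metrics above: on multiple-choice tasks like HellaSwag, `acc` selects the candidate continuation with the highest raw log-likelihood, while `acc_norm` divides each log-likelihood by the continuation's byte length before comparing, which removes the bias toward shorter answers. A toy sketch of the difference, with hypothetical log-likelihood values:

```python
# Hypothetical per-candidate log-likelihoods for one question.
candidates = {
    "a short answer": -12.0,
    "a much longer candidate answer": -18.0,
}

# acc: pick the candidate with the highest raw log-likelihood.
acc_pick = max(candidates, key=candidates.get)

# acc_norm: normalize by continuation length in bytes before picking.
acc_norm_pick = max(
    candidates,
    key=lambda c: candidates[c] / len(c.encode("utf-8")),
)
```

Here the two metrics disagree: the raw score prefers the short answer, while length normalization lets the longer candidate win.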
Model size: 1.32B params (Safetensors, F32)