Model Card for OrpoLlama-3.2-1B-V1 (Quantized)

This model is a quantized version of bhuvana-ak7/OrpoLlama-3.2-1B-V1. OrpoLlama-3.2-1B-V1 is a fine-tuned version of meta-llama/Llama-3.2-1B, trained with the ORPO (Odds Ratio Preference Optimization) trainer on the mlabonne/orpo-dpo-mix-40k dataset. Only 1,000 samples from that dataset were used in order to keep the ORPO training run short.
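The sketch below shows one way to load and run the model with the Transformers library. It is a minimal example, not an official snippet from this card: the repo id points at the base fine-tuned model because the exact repo id of this quantized checkpoint is not stated here, and the prompt is only illustrative.

```python
# Minimal usage sketch with Hugging Face Transformers (assumptions noted below).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id: replace with the repo id of this quantized checkpoint.
model_id = "bhuvana-ak7/OrpoLlama-3.2-1B-V1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # the card lists FP16 tensors
    device_map="auto",
)

prompt = "Explain preference optimization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```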

| Tasks     | Version | Filter | n-shot | Metric     | Value  | Stderr   |
|-----------|---------|--------|--------|------------|--------|----------|
| hellaswag | 1       | none   | 0      | acc ↑      | 0.4772 | ± 0.0050 |
| hellaswag | 1       | none   | 0      | acc_norm ↑ | 0.6366 | ± 0.0048 |
| Tasks     | Metric   | Base Model | Quantized Model | Changed |
|-----------|----------|------------|-----------------|---------|
| hellaswag | acc      | 0.4772     | 0.4772          | No      |
| hellaswag | acc_norm | 0.6366     | 0.6366          | No      |
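The HellaSwag numbers above follow the lm-evaluation-harness output format. A hedged reproduction sketch using the harness's Python entry point is shown below; the repo id and the dtype argument are assumptions, and the exact scores may vary slightly with harness version and hardware.

```python
# Sketch: re-running the zero-shot HellaSwag evaluation with
# EleutherAI's lm-evaluation-harness (pip install lm-eval).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    # Placeholder repo id: point this at the checkpoint you want to score.
    model_args="pretrained=bhuvana-ak7/OrpoLlama-3.2-1B-V1,dtype=float16",
    tasks=["hellaswag"],
    num_fewshot=0,
)
# Prints acc and acc_norm with their standard errors for hellaswag.
print(results["results"]["hellaswag"])
```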
Format: Safetensors · Model size: 1.24B params · Tensor type: FP16