4season/model_eval_test

Introduction

This model is test version, alignment-tuned model.

We utilize state-of-the-art instruction fine-tuning methods including direct preference optimization (DPO). After DPO training, we linearly merged models to boost performance.

Downloads last month
13
Safetensors
Model size
21.4B params
Tensor type
BF16
·
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.

Model tree for 4season/alignment_model_test

Quantizations
1 model