---
license: apache-2.0
datasets:
- johannhartmann/steroids
- johannhartmann/oh25_mistral_dpo_de
language:
- de
- en
---

This is a simple experiment: one epoch of German ORPO training on Vezora/Mistral-22B-v0.2, using QLoRA and Unsloth.
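
For readers unfamiliar with ORPO, the objective it optimizes can be sketched in a few lines. This is a toy, per-example illustration and not the training code used for this model: the function names, the use of length-averaged log-probabilities as inputs, and the weight `beta=0.1` are all assumptions made for the example.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp), where avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_logp: float, rejected_logp: float, beta: float = 0.1) -> float:
    """Toy ORPO objective for a single preference pair.

    chosen_logp / rejected_logp: average per-token log-probability the model
    assigns to the chosen and rejected responses (assumed inputs).
    beta: weight of the odds-ratio term (illustrative value only).
    """
    # Supervised term: negative log-likelihood of the chosen response.
    sft_term = -chosen_logp
    # Odds-ratio term: -log(sigmoid(log-odds difference)),
    # written as log(1 + exp(-x)).
    log_odds_ratio = log_odds(chosen_logp) - log_odds(rejected_logp)
    or_term = math.log1p(math.exp(-log_odds_ratio))
    return sft_term + beta * or_term


if __name__ == "__main__":
    # The loss is lower when the model prefers the chosen response.
    print(orpo_loss(-0.5, -2.0))
    print(orpo_loss(-2.0, -0.5))
```

Because the preference signal is folded into the supervised loss, ORPO needs no separate reference model, which keeps memory use down when combined with QLoRA.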