This model is a fine-tuned version of alignment-handbook/zephyr-7b-sft-full on the yangzhao02/ListUltraFeedback dataset. It achieves the following results on the evaluation set:
More information needed
The following hyperparameters were used during training:
Training Loss | Epoch | Step | Validation Loss | Logps | Logits |
---|---|---|---|---|---|
-0.9788 | 0.4275 | 200 | -0.9776 | -289.3082 | -3.0493 |
-0.9812 | 0.8549 | 400 | -0.9810 | -291.1517 | -2.9495 |
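
The logged (epoch, step) pairs in the table above imply a roughly constant number of optimizer steps per epoch. The following is a minimal sketch of that arithmetic, using only values copied from the table; the names `rows` and `steps_per_epoch` are illustrative and are not part of the original training code.

```python
# Rows copied from the training results table: (epoch, step, validation loss).
rows = [
    (0.4275, 200, -0.9776),
    (0.8549, 400, -0.9810),
]


def steps_per_epoch(epoch: float, step: int) -> float:
    """Estimate the optimizer steps in one full epoch from a logged (epoch, step) pair."""
    return step / epoch


# One estimate per logged row, then their average.
estimates = [steps_per_epoch(epoch, step) for epoch, step, _ in rows]
mean_steps = sum(estimates) / len(estimates)

# Change in validation loss between the two logged evaluations.
val_delta = rows[1][2] - rows[0][2]

print(f"steps per epoch ~ {mean_steps:.1f}")
print(f"validation loss change: {val_delta:+.4f}")
```

Both rows give an estimate of about 468 steps per epoch, so the 400-step checkpoint sits a little short of one full pass over the training data.
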

Base model: mistralai/Mistral-7B-v0.1