pbevan11/Mistral-Nemo-MCAI-SFT-DPO
Text Generation
Transformers
TensorBoard
Safetensors
pbevan11/multilingual-constitutional-preference-pairs
pbevan11/ultrafeedback_binarized_multilingual
mistral
alignment-handbook
trl
dpo
Generated from Trainer
conversational
text-generation-inference
Inference Endpoints
License: apache-2.0
all_results.json
pbevan11, "Model save", commit 0736e78 (verified), 4 months ago
216 Bytes
{
  "epoch": 1.0,
  "total_flos": 0.0,
  "train_loss": 0.5964088123964976,
  "train_runtime": 1125.1043,
  "train_samples": 3969,
  "train_samples_per_second": 3.528,
  "train_steps_per_second": 0.074
}
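
The throughput fields above can be cross-checked against each other. The following is a minimal sketch, not part of the repository: it assumes `train_runtime` is in seconds (as the Transformers Trainer reports it) and embeds the file contents inline rather than downloading them. The step count and samples-per-step figures are estimates only, because `train_steps_per_second` is rounded to three decimals.

```python
import json

# Contents of all_results.json, copied from the listing above.
RAW = """
{
  "epoch": 1.0,
  "total_flos": 0.0,
  "train_loss": 0.5964088123964976,
  "train_runtime": 1125.1043,
  "train_samples": 3969,
  "train_samples_per_second": 3.528,
  "train_steps_per_second": 0.074
}
"""

metrics = json.loads(RAW)

# Recompute throughput from the sample count and wall-clock runtime
# (assumed to be seconds).
samples_per_second = metrics["train_samples"] / metrics["train_runtime"]

# Estimate the number of optimizer steps; the reported rate is rounded,
# so this is approximate.
estimated_steps = metrics["train_steps_per_second"] * metrics["train_runtime"]

# Rough number of samples consumed per optimizer step (an estimate of the
# effective batch size, not a value stated in the file).
samples_per_step = metrics["train_samples"] / estimated_steps

print(f"samples/s (recomputed): {samples_per_second:.3f}")
print(f"estimated steps:        {estimated_steps:.1f}")
print(f"samples per step:       {samples_per_step:.1f}")
```

The recomputed samples-per-second value should agree with the reported one to three decimals. The file itself does not record the batch size or step count, so treat the derived numbers as a sanity check rather than as training configuration.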