qwen2.5-0.5b-expo-L1EXPO-ES-1

This model is a fine-tuned version of hZzy/qwen2.5-0.5b-sft-news-IFT on the hZzy/train_pairwise dataset. It achieves the following results on the evaluation set:

  • Loss: 4.8354
  • Logps: -80.1753
  • Logits: -0.6936
  • Objective: 4.8114
  • Dpo Loss: 2.5735 (see the sketch after this list)
  • Regularize: 4.8114
  • Ranking Simple: 0.5248
  • Ranking Idealized: 0.5295
  • Ranking Idealized Expo: 0.5212
  • Wo Beta: 13.9356
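
Several of these metrics come from the preference-optimization objective. As a point of reference, below is a minimal sketch of the standard DPO loss (Rafailov et al., 2023), which the Dpo Loss column presumably tracks; whether L1EXPO uses exactly this formulation, and with which beta, is not stated in the card and is an assumption here.

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO loss over per-sequence log-probabilities.

    beta=0.1 is illustrative; the card does not state the value used,
    and the L1EXPO objective may modify this formulation.
    """
    # Implicit reward margins of the policy over the reference model.
    chosen_margin = policy_chosen_logps - ref_chosen_logps
    rejected_margin = policy_rejected_logps - ref_rejected_logps
    # Encourage the chosen margin to exceed the rejected margin.
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()
```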

Model description

This is a 494M-parameter causal language model (stored as F32 safetensors), fine-tuned from hZzy/qwen2.5-0.5b-sft-news-IFT on pairwise preference data. The Dpo Loss and ranking metrics above indicate a DPO-style preference-optimization (EXPO) objective; further details have not been provided.

Intended uses & limitations

More information needed
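
Until the intended-use guidance is filled in, the snippet below shows one way to load the checkpoint for text generation. It assumes the model works with the standard transformers causal-LM classes (the base model is a Qwen2.5 causal LM); the prompt is purely illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hZzy/qwen2.5-0.5b-expo-L1EXPO-ES-1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Summarize today's top story:"  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```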

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent configuration is sketched after the list):

  • learning_rate: 5e-06
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 3
  • gradient_accumulation_steps: 12
  • total_train_batch_size: 144
  • total_eval_batch_size: 12
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
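
As a cross-check of the batch-size arithmetic, here is a hedged sketch of an equivalent transformers TrainingArguments configuration; the actual training script is not published, and the output_dir is hypothetical.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="qwen2.5-0.5b-expo-L1EXPO-ES-1",  # hypothetical output path
    learning_rate=5e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=12,
    num_train_epochs=5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)

# Effective batch sizes reported in the card:
#   total train: 4 per device * 3 GPUs * 12 accumulation steps = 144
#   total eval:  4 per device * 3 GPUs                         = 12
```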

Training results

| Training Loss | Epoch  | Step | Validation Loss | Logps    | Logits  | Objective | Dpo Loss | Regularize | Ranking Simple | Ranking Idealized | Ranking Idealized Expo | Wo Beta |
|---------------|--------|------|-----------------|----------|---------|-----------|----------|------------|----------------|-------------------|------------------------|---------|
| 0.4306        | 0.1417 | 50   | 0.5493          | -90.4264 | -1.4289 | 0.5433    | 0.7632   | 0.5433     | 0.5212         | 0.5295            | 0.5212                 | 16.2237 |
| 1.748         | 0.2834 | 100  | 1.6975          | -88.0491 | -1.2535 | 1.6864    | 1.1354   | 1.6864     | 0.5228         | 0.5295            | 0.5212                 | 15.6834 |
| 2.8697        | 0.4251 | 150  | 2.9624          | -82.4967 | -1.2524 | 2.8923    | 1.6846   | 2.8923     | 0.5243         | 0.5295            | 0.5212                 | 15.1970 |
| 3.5268        | 0.5668 | 200  | 4.0302          | -75.9716 | -0.9581 | 3.9597    | 2.1590   | 3.9597     | 0.5238         | 0.5295            | 0.5212                 | 14.5792 |
| 3.7241        | 0.7085 | 250  | 4.2694          | -81.3047 | -0.7680 | 4.2728    | 2.3310   | 4.2728     | 0.5259         | 0.5295            | 0.5212                 | 14.5615 |
| 3.6109        | 0.8503 | 300  | 4.4908          | -83.9815 | -0.6388 | 4.4573    | 2.4072   | 4.4573     | 0.5264         | 0.5295            | 0.5212                 | 14.3464 |
| 3.36          | 0.9920 | 350  | 4.6586          | -80.7491 | -0.5030 | 4.6212    | 2.4991   | 4.6212     | 0.5212         | 0.5295            | 0.5212                 | 14.3467 |
| 3.112         | 1.1337 | 400  | 4.7244          | -82.4974 | -0.5664 | 4.7293    | 2.5403   | 4.7293     | 0.5186         | 0.5295            | 0.5212                 | 14.4038 |
| 2.9448        | 1.2754 | 450  | 4.8354          | -80.1753 | -0.6936 | 4.8114    | 2.5735   | 4.8114     | 0.5248         | 0.5295            | 0.5212                 | 13.9356 |
| 2.8517        | 1.4171 | 500  | 5.0044          | -80.7676 | -0.5973 | 5.0058    | 2.6782   | 5.0058     | 0.5269         | 0.5295            | 0.5212                 | 14.2626 |
| 2.632         | 1.5588 | 550  | 4.8777          | -80.5219 | -0.6149 | 4.8844    | 2.5752   | 4.8844     | 0.5223         | 0.5295            | 0.5212                 | 14.1469 |
| 2.5208        | 1.7005 | 600  | 4.9258          | -80.1775 | -0.5875 | 4.9621    | 2.5974   | 4.9621     | 0.5243         | 0.5295            | 0.5212                 | 14.2669 |
| 2.4198        | 1.8422 | 650  | 5.0327          | -81.0550 | -0.5441 | 5.0454    | 2.6345   | 5.0454     | 0.5269         | 0.5295            | 0.5212                 | 14.2479 |
| 2.2699        | 1.9839 | 700  | 4.9659          | -79.7376 | -0.5594 | 4.9951    | 2.6292   | 4.9951     | 0.5212         | 0.5295            | 0.5212                 | 14.1755 |

The evaluation results reported at the top of this card match the step-450 checkpoint, which has the lowest Wo Beta (13.9356). Together with training ending at epoch 1.98 of the configured 5 epochs, this is consistent with the early stopping ("ES") suggested by the model name.

Framework versions

  • Transformers 4.42.0
  • Pytorch 2.3.0+cu121
  • Datasets 3.2.0
  • Tokenizers 0.19.1