qwen2.5-0.5b-expo-L2EXPO-W0-noES5-0.05

This model is a fine-tuned version of hZzy/qwen2.5-0.5b-sft-news-IFT on the hZzy/train_pairwise_weighted dataset. It achieves the following results on the evaluation set:

  • Loss: 179.0654
  • Logps: -93.4070
  • Logits: -1.6430
  • Objective: 176.1436
  • Dpo Loss: 0.6785
  • Regularize: 0.3996
  • Ranking Simple: 0.5342
  • Wo Beta: 17.0972
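
The checkpoint can be loaded like any other causal language model on the Hub. The snippet below is a minimal sketch using the standard transformers text-generation API; the prompt is illustrative only and nothing here is specific to the training recipe described in this card.

```python
# Minimal loading sketch (standard causal-LM usage; not an official example
# for this training recipe).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hZzy/qwen2.5-0.5b-expo-L2EXPO-W0-noES5-0.05"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Write a one-sentence news summary:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```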

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 1e-06
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 3
  • gradient_accumulation_steps: 12
  • total_train_batch_size: 144
  • total_eval_batch_size: 12
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3
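
For readers who want to set up a comparable run, the hyperparameters above map onto transformers `TrainingArguments` roughly as shown below. This is a sketch under the assumption of a standard Trainer-style script; the actual EXPO/L2EXPO training code is not included in this card, and `output_dir` is only a placeholder.

```python
# Hedged sketch: maps the listed hyperparameters onto TrainingArguments.
# The real training script (and the L2EXPO objective) is not part of this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen2.5-0.5b-expo-L2EXPO-W0-noES5-0.05",  # placeholder
    learning_rate=1e-6,
    per_device_train_batch_size=4,   # x 3 GPUs x 12 accumulation steps = 144 total
    per_device_eval_batch_size=4,    # x 3 GPUs = 12 total
    gradient_accumulation_steps=12,
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```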

Training results

| Training Loss | Epoch | Step | Validation Loss | Logps | Logits | Objective | Dpo Loss | Regularize | Ranking Simple | Wo Beta |
|:-------------:|:-----:|:----:|:---------------:|:-----:|:------:|:---------:|:--------:|:----------:|:--------------:|:-------:|
| 181.8716 | 0.1417 | 50 | 182.0715 | -90.3592 | -1.4456 | 180.1049 | 0.6894 | 0.4083 | 0.5269 | 16.4217 |
| 157.5861 | 0.2834 | 100 | 180.9991 | -91.7146 | -1.5597 | 179.7807 | 0.6852 | 0.4085 | 0.5388 | 16.6421 |
| 147.4354 | 0.4251 | 150 | 179.6252 | -88.3708 | -1.5794 | 177.2645 | 0.6811 | 0.4007 | 0.5383 | 17.0744 |
| 138.6462 | 0.5668 | 200 | 179.4377 | -90.4436 | -1.5522 | 175.5756 | 0.6805 | 0.3987 | 0.5342 | 16.6929 |
| 122.5651 | 0.7085 | 250 | 178.9847 | -93.0301 | -1.6388 | 176.6755 | 0.6795 | 0.4017 | 0.5300 | 17.1382 |
| 110.1558 | 0.8503 | 300 | 178.2535 | -94.1108 | -1.6237 | 176.3972 | 0.6795 | 0.4012 | 0.5311 | 17.0693 |
| 99.307 | 0.9920 | 350 | 180.1735 | -96.0474 | -1.5854 | 177.3710 | 0.6788 | 0.4026 | 0.5336 | 17.1627 |
| 87.6253 | 1.1337 | 400 | 178.9584 | -94.0894 | -1.5779 | 176.3463 | 0.6783 | 0.3997 | 0.5336 | 17.1200 |
| 77.9665 | 1.2754 | 450 | 178.7482 | -93.8793 | -1.6468 | 175.9861 | 0.6798 | 0.4001 | 0.5321 | 16.9068 |
| 70.9202 | 1.4171 | 500 | 178.7919 | -94.5267 | -1.6244 | 175.6359 | 0.6787 | 0.3987 | 0.5342 | 16.9906 |
| 68.4 | 1.5588 | 550 | 179.6713 | -93.1219 | -1.6340 | 175.8635 | 0.6783 | 0.3992 | 0.5336 | 17.0272 |
| 62.6522 | 1.7005 | 600 | 179.6970 | -93.5471 | -1.6273 | 176.5027 | 0.6786 | 0.4002 | 0.5362 | 17.1271 |
| 62.1281 | 1.8422 | 650 | 178.4731 | -92.9689 | -1.6053 | 175.4037 | 0.6782 | 0.3979 | 0.5362 | 17.0350 |
| 58.8228 | 1.9839 | 700 | 178.7972 | -93.4236 | -1.6276 | 176.0926 | 0.6786 | 0.3992 | 0.5362 | 17.0199 |
| 45.2464 | 2.1256 | 750 | 178.6497 | -93.8837 | -1.6225 | 175.6834 | 0.6780 | 0.3985 | 0.5342 | 16.9906 |
| 46.081 | 2.2674 | 800 | 179.1421 | -93.3818 | -1.6332 | 176.3395 | 0.6787 | 0.4002 | 0.5331 | 17.0616 |
| 38.6939 | 2.4091 | 850 | 178.9382 | -93.4067 | -1.6352 | 175.9686 | 0.6784 | 0.3992 | 0.5362 | 17.0858 |
| 39.6509 | 2.5508 | 900 | 179.1196 | -93.3231 | -1.6445 | 176.1842 | 0.6785 | 0.3996 | 0.5357 | 17.1093 |
| 37.5296 | 2.6925 | 950 | 179.0316 | -93.3121 | -1.6429 | 176.0845 | 0.6784 | 0.3994 | 0.5347 | 17.0760 |
| 38.0286 | 2.8342 | 1000 | 179.0622 | -93.4092 | -1.6428 | 176.1425 | 0.6785 | 0.3996 | 0.5342 | 17.0993 |
| 41.7608 | 2.9759 | 1050 | 179.0654 | -93.4070 | -1.6430 | 176.1437 | 0.6785 | 0.3996 | 0.5342 | 17.0972 |
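
The Dpo Loss column tracks a pairwise preference term. The exact L2EXPO objective used for this run is not documented in this card; for orientation only, the standard sigmoid DPO loss that such a column usually corresponds to can be sketched as follows. All names and the default beta below are illustrative assumptions, not values taken from the training code.

```python
# Reference-only sketch of the standard sigmoid DPO loss; the actual L2EXPO
# objective and beta used for this run are not documented in this model card.
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Per-example -log sigmoid(beta * (chosen log-ratio - rejected log-ratio))."""
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen_logratio - rejected_logratio))
```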

Framework versions

  • Transformers 4.42.0
  • Pytorch 2.3.0+cu121
  • Datasets 3.2.0
  • Tokenizers 0.19.1