
qwen2.5-0.5b-expo-L2EXPO-EXPERIMENT-0.05-5e6

This model is a fine-tuned version of hZzy/qwen2.5-0.5b-sft-news-IFT on the hZzy/train_pairwise dataset. It achieves the following results on the evaluation set (the DPO objective behind these metrics is sketched after the list):

  • Loss: 0.4402
  • Logps: -77.2011
  • Logits: -0.8985
  • Objective: 0.4385
  • DPO Loss: 0.6860
  • Regularize: 0.4385
  • Ranking Simple: 0.5320
  • Ranking Idealized: 0.6570
  • Ranking Idealized Expo: 0.5114
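
The DPO Loss values sit just below ln 2 ≈ 0.6931, which is the value of the standard DPO objective at zero reward margin. For reference, and assuming this card's metric follows the usual definition (Rafailov et al., 2023):

$$
\mathcal{L}_{\mathrm{DPO}} = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]
$$

where $y_w$ and $y_l$ are the preferred and rejected responses, $\pi_{\mathrm{ref}}$ is the frozen reference policy, and $\beta$ scales the implicit reward. If the metric follows this definition, a value near ln 2 corresponds to a small average preference margin.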

Model description

More information needed

Intended uses & limitations

More information needed
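
Pending fuller documentation, the checkpoint should load like any Qwen2.5-family causal language model via transformers. A minimal usage sketch; the prompt and generation settings are illustrative assumptions, not documented recommendations:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "hZzy/qwen2.5-0.5b-expo-L2EXPO-EXPERIMENT-0.05-5e6"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Illustrative prompt; the card does not document an intended prompt format.
inputs = tokenizer("The news today:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```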

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch reproducing them follows the list):

  • learning_rate: 5e-06
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 6
  • gradient_accumulation_steps: 12
  • total_train_batch_size: 288
  • total_eval_batch_size: 24
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
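
For reference, these settings map onto a transformers TrainingArguments configuration roughly as below. This is a minimal sketch, not the actual training script, and the output directory name is an illustrative assumption. The effective train batch size works out to 4 per device × 6 GPUs × 12 accumulation steps = 288, matching the value above.

```python
from transformers import TrainingArguments

# Sketch reproducing the hyperparameters listed above.
# Effective train batch size: 4 (per device) x 6 (GPUs) x 12 (accumulation) = 288.
args = TrainingArguments(
    output_dir="qwen2.5-0.5b-expo-L2EXPO-EXPERIMENT-0.05-5e6",  # illustrative name
    learning_rate=5e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=12,
    num_train_epochs=5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```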

Training results

| Training Loss | Epoch  | Step | Validation Loss | Logps    | Logits  | Objective | DPO Loss | Regularize | Ranking Simple | Ranking Idealized | Ranking Idealized Expo |
|:-------------:|:------:|:----:|:---------------:|:--------:|:-------:|:---------:|:--------:|:----------:|:--------------:|:-----------------:|:----------------------:|
| 0.3566        | 0.2834 | 50   | 0.4133          | -96.4831 | -1.6203 | 0.4237    | 0.6898   | 0.4237     | 0.5165         | 0.6570            | 0.5114                 |
| 0.3027        | 0.5668 | 100  | 0.4142          | -88.4100 | -1.3063 | 0.4151    | 0.6862   | 0.4151     | 0.5217         | 0.6570            | 0.5114                 |
| 0.2706        | 0.8503 | 150  | 0.4262          | -87.3674 | -1.1981 | 0.4277    | 0.6857   | 0.4277     | 0.5279         | 0.6570            | 0.5114                 |
| 0.2256        | 1.1337 | 200  | 0.4347          | -81.8119 | -1.2023 | 0.4344    | 0.6862   | 0.4344     | 0.5248         | 0.6570            | 0.5114                 |
| 0.2005        | 1.4171 | 250  | 0.4292          | -81.8212 | -1.0616 | 0.4289    | 0.6815   | 0.4289     | 0.5227         | 0.6570            | 0.5114                 |
| 0.1870        | 1.7005 | 300  | 0.4369          | -80.0077 | -1.0398 | 0.4362    | 0.6845   | 0.4362     | 0.5258         | 0.6570            | 0.5114                 |
| 0.1664        | 1.9839 | 350  | 0.4382          | -79.6308 | -0.9982 | 0.4359    | 0.6842   | 0.4359     | 0.5289         | 0.6570            | 0.5114                 |
| 0.1368        | 2.2674 | 400  | 0.4408          | -80.2038 | -1.0155 | 0.4378    | 0.6859   | 0.4378     | 0.5320         | 0.6570            | 0.5114                 |
| 0.1220        | 2.5508 | 450  | 0.4415          | -78.4288 | -0.8946 | 0.4404    | 0.6863   | 0.4404     | 0.5258         | 0.6570            | 0.5114                 |
| 0.1063        | 2.8342 | 500  | 0.4411          | -78.1278 | -0.8683 | 0.4384    | 0.6861   | 0.4384     | 0.5300         | 0.6570            | 0.5114                 |
| 0.0878        | 3.1176 | 550  | 0.4406          | -77.6391 | -0.8292 | 0.4378    | 0.6848   | 0.4378     | 0.5331         | 0.6570            | 0.5114                 |
| 0.0719        | 3.4010 | 600  | 0.4396          | -77.4923 | -0.8875 | 0.4373    | 0.6851   | 0.4373     | 0.5310         | 0.6570            | 0.5114                 |
| 0.0618        | 3.6845 | 650  | 0.4395          | -77.1838 | -0.9103 | 0.4386    | 0.6855   | 0.4386     | 0.5269         | 0.6570            | 0.5114                 |
| 0.0551        | 3.9679 | 700  | 0.4402          | -77.7209 | -0.9137 | 0.4388    | 0.6859   | 0.4388     | 0.5289         | 0.6570            | 0.5114                 |
| 0.0388        | 4.2513 | 750  | 0.4404          | -77.0700 | -0.8976 | 0.4386    | 0.6859   | 0.4386     | 0.5310         | 0.6570            | 0.5114                 |
| 0.0382        | 4.5347 | 800  | 0.4402          | -77.2473 | -0.8972 | 0.4384    | 0.6859   | 0.4384     | 0.5320         | 0.6570            | 0.5114                 |
| 0.0320        | 4.8181 | 850  | 0.4402          | -77.2053 | -0.8983 | 0.4385    | 0.6860   | 0.4385     | 0.5320         | 0.6570            | 0.5114                 |

Framework versions

  • Transformers 4.42.0
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
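
Reproducing the reported numbers is most reliable with matching versions. A quick sanity check, assuming a standard Python environment:

```python
# Verify that installed packages match the versions used for training.
from importlib.metadata import version

for pkg, expected in [("transformers", "4.42.0"), ("torch", "2.3.0+cu121"),
                      ("datasets", "2.19.1"), ("tokenizers", "0.19.1")]:
    installed = version(pkg)
    status = "OK" if installed == expected else f"expected {expected}"
    print(f"{pkg}: {installed} ({status})")
```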