
qwen2.5-0.5b-expo-L2EXPO-noES-0.1

This model is a fine-tuned version of hZzy/qwen2.5-0.5b-sft-news-IFT on the hZzy/train_pairwise_weighted dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4111
  • Logps: -89.2333
  • Logits: -1.4016
  • Objective: 0.4056
  • Dpo Loss: 0.6787
  • Regularize: 0.4056
  • Ranking Simple: 0.5352
  • Ranking Idealized: 0.6025
  • Ranking Idealized Expo: 0.5233
  • Wo Beta: 16.2455
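
For reference, the Dpo Loss reported above presumably corresponds to the standard DPO objective shown below (an assumption; the exact L2EXPO variant used for this run is not documented in this card), where $\pi_\theta$ is the trained policy, $\pi_{\mathrm{ref}}$ the reference model, and $\beta$ the preference-strength coefficient:

$$
\mathcal{L}_{\mathrm{DPO}} = -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]
$$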

Model description

More information needed

Intended uses & limitations

More information needed
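
Pending more detail from the authors, below is a minimal usage sketch for this 494M-parameter checkpoint, assuming it exposes the standard transformers causal-LM interface; the prompt is illustrative only (chosen because the base model name suggests news-style SFT) and is not taken from the card.

```python
# Minimal loading/generation sketch (assumes standard transformers causal-LM support).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hZzy/qwen2.5-0.5b-expo-L2EXPO-noES-0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Hypothetical prompt, not from the card.
prompt = "Write a one-sentence news headline about renewable energy."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```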

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-06
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 3
  • gradient_accumulation_steps: 12
  • total_train_batch_size: 144
  • total_eval_batch_size: 12
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3
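
As a rough reproduction aid, the sketch below maps these values onto standard transformers TrainingArguments. The effective train batch size of 144 follows from 4 (per device) × 3 (GPUs) × 12 (gradient accumulation steps). The output_dir is a placeholder, and the original run may have used a custom EXPO/L2EXPO trainer rather than the stock Trainer, so treat this as an approximation.

```python
# Sketch only: the card's hyperparameters expressed as transformers TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen2.5-0.5b-expo-L2EXPO-noES-0.1",  # placeholder path
    learning_rate=1e-6,
    per_device_train_batch_size=4,   # 4 x 3 GPUs x 12 accumulation steps = 144 effective
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=12,
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    adam_beta1=0.9,                  # Adam betas/epsilon as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
)
# Multi-GPU training across 3 devices would be handled by the launcher, e.g. accelerate or torchrun.
```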

Training results

| Training Loss | Epoch | Step | Validation Loss | Logps | Logits | Objective | Dpo Loss | Regularize | Ranking Simple | Ranking Idealized | Ranking Idealized Expo | Wo Beta |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| 0.4024 | 0.1417 | 50 | 0.4110 | -90.5653 | -1.4391 | 0.4097 | 0.6885 | 0.4097 | 0.5264 | 0.6025 | 0.5233 | 16.2893 |
| 0.3435 | 0.2834 | 100 | 0.4067 | -93.4765 | -1.4915 | 0.4041 | 0.6822 | 0.4041 | 0.5305 | 0.6025 | 0.5233 | 16.4079 |
| 0.3184 | 0.4251 | 150 | 0.4066 | -91.6151 | -1.4215 | 0.4020 | 0.6786 | 0.4020 | 0.5342 | 0.6025 | 0.5233 | 16.5344 |
| 0.2935 | 0.5668 | 200 | 0.4100 | -91.5667 | -1.3884 | 0.4060 | 0.6791 | 0.4060 | 0.5336 | 0.6025 | 0.5233 | 16.4082 |
| 0.2854 | 0.7085 | 250 | 0.4143 | -90.3560 | -1.4712 | 0.4087 | 0.6802 | 0.4087 | 0.5342 | 0.6025 | 0.5233 | 16.3706 |
| 0.249 | 0.8503 | 300 | 0.4091 | -89.8744 | -1.4821 | 0.4058 | 0.6798 | 0.4058 | 0.5336 | 0.6025 | 0.5233 | 16.2613 |
| 0.2289 | 0.9920 | 350 | 0.4118 | -89.6124 | -1.4819 | 0.4047 | 0.6786 | 0.4047 | 0.5362 | 0.6025 | 0.5233 | 16.3965 |
| 0.2105 | 1.1337 | 400 | 0.4060 | -88.2954 | -1.3976 | 0.4024 | 0.6778 | 0.4024 | 0.5352 | 0.6025 | 0.5233 | 16.3633 |
| 0.1773 | 1.2754 | 450 | 0.4122 | -89.4120 | -1.3974 | 0.4051 | 0.6770 | 0.4051 | 0.5373 | 0.6025 | 0.5233 | 16.2171 |
| 0.1579 | 1.4171 | 500 | 0.4140 | -89.1284 | -1.3760 | 0.4073 | 0.6801 | 0.4073 | 0.5378 | 0.6025 | 0.5233 | 16.2211 |
| 0.1534 | 1.5588 | 550 | 0.4124 | -87.6963 | -1.3890 | 0.4048 | 0.6781 | 0.4048 | 0.5388 | 0.6025 | 0.5233 | 16.2085 |
| 0.1396 | 1.7005 | 600 | 0.4126 | -88.8736 | -1.4152 | 0.4050 | 0.6781 | 0.4050 | 0.5357 | 0.6025 | 0.5233 | 16.2840 |
| 0.1433 | 1.8422 | 650 | 0.4109 | -89.4824 | -1.3995 | 0.4050 | 0.6781 | 0.4050 | 0.5357 | 0.6025 | 0.5233 | 16.2822 |
| 0.1202 | 1.9839 | 700 | 0.4113 | -89.1037 | -1.3927 | 0.4061 | 0.6790 | 0.4061 | 0.5336 | 0.6025 | 0.5233 | 16.2384 |
| 0.0927 | 2.1256 | 750 | 0.4115 | -89.5013 | -1.4006 | 0.4053 | 0.6785 | 0.4053 | 0.5362 | 0.6025 | 0.5233 | 16.1916 |
| 0.0932 | 2.2674 | 800 | 0.4109 | -88.9918 | -1.4040 | 0.4055 | 0.6784 | 0.4055 | 0.5357 | 0.6025 | 0.5233 | 16.2422 |
| 0.076 | 2.4091 | 850 | 0.4112 | -89.0524 | -1.4000 | 0.4056 | 0.6788 | 0.4056 | 0.5352 | 0.6025 | 0.5233 | 16.2403 |
| 0.0802 | 2.5508 | 900 | 0.4114 | -89.2338 | -1.4061 | 0.4059 | 0.6787 | 0.4059 | 0.5352 | 0.6025 | 0.5233 | 16.2290 |
| 0.0696 | 2.6925 | 950 | 0.4111 | -89.2200 | -1.4037 | 0.4056 | 0.6787 | 0.4056 | 0.5347 | 0.6025 | 0.5233 | 16.2510 |
| 0.0722 | 2.8342 | 1000 | 0.4111 | -89.2367 | -1.4019 | 0.4057 | 0.6787 | 0.4057 | 0.5352 | 0.6025 | 0.5233 | 16.2505 |
| 0.0733 | 2.9759 | 1050 | 0.4111 | -89.2333 | -1.4016 | 0.4056 | 0.6787 | 0.4056 | 0.5352 | 0.6025 | 0.5233 | 16.2455 |

Framework versions

  • Transformers 4.42.0
  • Pytorch 2.3.0+cu121
  • Datasets 3.2.0
  • Tokenizers 0.19.1