# qwen2.5-0.5b-expo-L2EXPO-ES-0.1-W0
This model is a fine-tuned version of hZzy/qwen2.5-0.5b-sft-news-IFT on the hZzy/train_pairwise_weighted dataset. It achieves the following results on the evaluation set:
- Loss: 283.0078
- Logps: -81.0689
- Logits: -0.5212
- Objective: 277.3703
- Dpo Loss: 0.7209
- Regularize: 0.6310
- Ranking Simple: 0.5331
- Ranking Idealized: 0.6030
- Ranking Idealized Expo: 0.5223
- Wo Beta: 14.2695
## Model description
More information needed
## Intended uses & limitations
More information needed
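In the absence of documented usage, below is a minimal loading sketch assuming the standard transformers causal-LM API; the repo id is taken from the card title and the prompt is a placeholder.

```python
# Minimal loading sketch (assumption: standard transformers causal-LM usage;
# repo id taken from the card title, prompt is a placeholder).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hZzy/qwen2.5-0.5b-expo-L2EXPO-ES-0.1-W0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The latest news:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```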
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-06
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 3
- gradient_accumulation_steps: 12
- total_train_batch_size: 144
- total_eval_batch_size: 12
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5
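These settings give an effective train batch size of 4 per device × 3 GPUs × 12 accumulation steps = 144. A minimal sketch of how they would map onto transformers `TrainingArguments` (an illustration only; the actual EXPO training script is not shown in this card):

```python
# Sketch of the hyperparameters above as transformers TrainingArguments
# (assumption: the real training script may configure these differently).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen2.5-0.5b-expo-L2EXPO-ES-0.1-W0",
    learning_rate=5e-6,
    per_device_train_batch_size=4,   # x 3 GPUs x 12 accumulation steps = 144 effective
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=12,
    num_train_epochs=5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```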
### Training results
| Training Loss | Epoch | Step | Dpo Loss | Logits | Logps | Validation Loss | Objective | Ranking Idealized | Ranking Idealized Expo | Ranking Simple | Regularize | Wo Beta |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 176.8183 | 0.1417 | 50 | 0.6874 | -1.4745 | -93.3291 | 185.6754 | 183.7365 | 0.6030 | 0.5223 | 0.5212 | 0.4178 | 16.4509 |
| 168.1755 | 0.2834 | 100 | 0.6819 | -1.4487 | -93.7829 | 195.2546 | 190.2426 | 0.6030 | 0.5223 | 0.5342 | 0.4320 | 16.2920 |
| 182.7148 | 0.4251 | 150 | 0.6969 | -1.2730 | -89.7329 | 218.5299 | 213.4884 | 0.6030 | 0.5223 | 0.5336 | 0.4859 | 15.8045 |
| 203.0993 | 0.5668 | 200 | 0.7051 | -1.0447 | -79.7062 | 251.9243 | 242.6406 | 0.6030 | 0.5223 | 0.5326 | 0.5518 | 14.6949 |
| 207.5481 | 0.7085 | 250 | 0.7055 | -1.0362 | -80.0940 | 251.9905 | 244.3510 | 0.6030 | 0.5223 | 0.5305 | 0.5542 | 14.8158 |
| 193.4843 | 0.8503 | 300 | 0.7150 | -0.7137 | -80.4296 | 266.7107 | 258.3957 | 0.6030 | 0.5223 | 0.5290 | 0.5881 | 14.5431 |
| 182.6922 | 0.9920 | 350 | 0.7073 | -0.6448 | -76.3638 | 262.3346 | 254.6360 | 0.6030 | 0.5223 | 0.5357 | 0.5802 | 14.6176 |
| 166.9683 | 1.1337 | 400 | 0.7152 | -0.6392 | -78.3482 | 272.3288 | 264.9111 | 0.6030 | 0.5223 | 0.5274 | 0.6056 | 14.6513 |
| 155.9364 | 1.2754 | 450 | 0.7186 | -0.4207 | -80.5230 | 275.0490 | 268.8637 | 0.6030 | 0.5223 | 0.5321 | 0.6129 | 14.7777 |
| 143.4724 | 1.4171 | 500 | 0.7209 | -0.5141 | -80.5587 | 275.9663 | 270.0383 | 0.6030 | 0.5223 | 0.5269 | 0.6150 | 14.4364 |
| 141.3444 | 1.5588 | 550 | 0.7139 | -0.6338 | -81.1271 | 275.0851 | 269.2189 | 0.6030 | 0.5223 | 0.5378 | 0.6159 | 14.6425 |
| 136.1720 | 1.7029 | 600 | 0.7111 | -0.5857 | -79.4221 | 273.6681 | 264.6510 | 0.6030 | 0.5223 | 0.5373 | 0.6012 | 14.5631 |
| 130.7133 | 1.8446 | 650 | 0.7193 | -0.4215 | -80.2130 | 276.3609 | 269.6939 | 0.6030 | 0.5223 | 0.5342 | 0.6141 | 14.5456 |
| 122.6240 | 1.9863 | 700 | 0.7178 | -0.5263 | -80.9968 | 278.4690 | 271.4757 | 0.6030 | 0.5223 | 0.5378 | 0.6190 | 14.4664 |
| 108.7022 | 2.1280 | 750 | 0.7207 | -0.4657 | -84.0088 | 282.5668 | 276.0201 | 0.6030 | 0.5223 | 0.5347 | 0.6302 | 14.4517 |
| 104.1923 | 2.2697 | 800 | 0.7166 | -0.4640 | -81.6313 | 278.0555 | 272.7622 | 0.6030 | 0.5223 | 0.5383 | 0.6210 | 14.4307 |
| 99.0867 | 2.4114 | 850 | 0.7209 | -0.5212 | -81.0689 | 283.0078 | 277.3703 | 0.6030 | 0.5223 | 0.5331 | 0.6310 | 14.2695 |
| 91.7475 | 2.5531 | 900 | 0.7200 | -0.5149 | -81.6144 | 279.6676 | 275.1769 | 0.6030 | 0.5223 | 0.5373 | 0.6279 | 14.3570 |
| 87.8681 | 2.6949 | 950 | 0.7191 | -0.4428 | -81.8544 | 281.5718 | 275.7560 | 0.6030 | 0.5223 | 0.5362 | 0.6277 | 14.3509 |
| 81.7420 | 2.8366 | 1000 | 0.7197 | -0.4951 | -81.4412 | 279.1324 | 274.5647 | 0.6030 | 0.5223 | 0.5336 | 0.6257 | 14.3551 |
| 76.4372 | 2.9783 | 1050 | 0.7184 | -0.4502 | -82.3960 | 279.1884 | 273.9026 | 0.6030 | 0.5223 | 0.5336 | 0.6249 | 14.3203 |
| 67.4698 | 3.1200 | 1100 | 0.7169 | -0.4190 | -82.9107 | 280.5317 | 274.7932 | 0.6030 | 0.5223 | 0.5326 | 0.6260 | 14.3418 |
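Training stopped at epoch 3.12 of the configured 5 epochs, consistent with the early-stopping setup suggested by the model name (ES-0.1), and the evaluation results reported at the top of this card match the step-850 row. As a rough sanity check on the schedule (an inference from the logged values above, not from the training script):

```python
# Rough arithmetic inferred from the table and hyperparameters above.
steps_per_epoch = 50 / 0.1417           # step 50 was logged at epoch 0.1417 -> ~353 steps/epoch
total_train_batch_size = 4 * 3 * 12     # per-device batch x 3 GPUs x 12 accumulation steps = 144
approx_train_pairs = steps_per_epoch * total_train_batch_size
print(f"~{steps_per_epoch:.0f} steps/epoch, ~{approx_train_pairs:,.0f} training pairs")
```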
### Framework versions
- Transformers 4.42.0
- Pytorch 2.3.0+cu121
- Datasets 3.2.0
- Tokenizers 0.19.1