qwen2.5-0.5b-expo-L2EXPO-W2-ES-0.1
This model is a fine-tuned version of hZzy/qwen2.5-0.5b-sft-news-IFT on the hZzy/train_pairwise_weighted dataset. It achieves the following results on the evaluation set:
- Loss: 0.0580
- Logps: -81.6322
- Logits: -0.3472
- Objective: 0.0564
- DPO Loss: 0.7177
- Regularize: 0.6547
- Ranking Simple: 0.5450
- Ranking Idealized: 0.6030
- Ranking Idealized Expo: 0.5223
- Wo Beta: 14.1503
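The reported DPO loss follows the standard Direct Preference Optimization objective. Below is a minimal sketch of how such a loss is computed from per-sequence log-probabilities; the function name and the `beta=0.1` default are illustrative assumptions, as this card does not include the training script:

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO loss; beta=0.1 is an assumed value, not from this card."""
    # Implicit reward: beta-scaled log-ratio of policy vs. reference model
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # -log(sigmoid(margin)) pushes chosen responses above rejected ones
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```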
Model description
More information needed
Intended uses & limitations
More information needed
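As a minimal usage sketch, the checkpoint can be loaded like any causal LM on the Hugging Face Hub (the prompt below is illustrative, not from this card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hZzy/qwen2.5-0.5b-expo-L2EXPO-W2-ES-0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Write a one-sentence news summary:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```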
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training (see the sketch after this list):
- learning_rate: 5e-06
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 3
- gradient_accumulation_steps: 12
- total_train_batch_size: 144
- total_eval_batch_size: 12
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5
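As a rough sketch, the hyperparameters above map onto `transformers.TrainingArguments` as follows; `output_dir` and anything not listed above are assumptions, since the actual training script is not included in this card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen2.5-0.5b-expo-L2EXPO-W2-ES-0.1",  # assumed name
    learning_rate=5e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=12,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=5,
    adam_beta1=0.9,   # Adam betas/epsilon as reported above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
# Effective train batch size: 4 per device x 3 GPUs x 12 accumulation steps = 144
```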
Training results
| Training Loss | Epoch | Step | Validation Loss | Logps | Logits | Objective | DPO Loss | Regularize | Ranking Simple | Ranking Idealized | Ranking Idealized Expo | Wo Beta |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.0387 | 0.1417 | 50 | 0.0387 | -89.7823 | -1.4878 | 0.0386 | 0.6849 | 0.4152 | 0.5259 | 0.6030 | 0.5223 | 16.3818 |
| 0.0376 | 0.2834 | 100 | 0.0400 | -87.3577 | -1.4813 | 0.0392 | 0.6794 | 0.4266 | 0.5326 | 0.6030 | 0.5223 | 16.1307 |
| 0.041 | 0.4251 | 150 | 0.0453 | -80.2072 | -1.3514 | 0.0446 | 0.6965 | 0.4931 | 0.5280 | 0.6030 | 0.5223 | 15.6134 |
| 0.0451 | 0.5668 | 200 | 0.0473 | -78.7022 | -0.9891 | 0.0464 | 0.6948 | 0.5121 | 0.5300 | 0.6030 | 0.5223 | 15.2852 |
| 0.0483 | 0.7085 | 250 | 0.0523 | -73.9930 | -0.9959 | 0.0507 | 0.7054 | 0.5778 | 0.5393 | 0.6030 | 0.5223 | 15.1111 |
| 0.0487 | 0.8503 | 300 | 0.0531 | -79.5956 | -1.0801 | 0.0509 | 0.7126 | 0.5977 | 0.5342 | 0.6030 | 0.5223 | 14.5847 |
| 0.0485 | 0.9920 | 350 | 0.0548 | -76.9095 | -0.8726 | 0.0533 | 0.7110 | 0.6159 | 0.5378 | 0.6030 | 0.5223 | 14.4121 |
| 0.0529 | 1.1337 | 400 | 0.0587 | -78.7635 | -0.4139 | 0.0575 | 0.7255 | 0.6577 | 0.5378 | 0.6030 | 0.5223 | 14.3951 |
| 0.0493 | 1.2754 | 450 | 0.0584 | -78.9623 | -0.4738 | 0.0572 | 0.7243 | 0.6702 | 0.5430 | 0.6030 | 0.5223 | 14.5363 |
| 0.0447 | 1.4171 | 500 | 0.0572 | -78.1551 | -0.4434 | 0.0565 | 0.7180 | 0.6433 | 0.5336 | 0.6030 | 0.5223 | 14.5089 |
| 0.0421 | 1.5588 | 550 | 0.0577 | -78.4112 | -0.3865 | 0.0563 | 0.7126 | 0.6425 | 0.5399 | 0.6030 | 0.5223 | 14.5141 |
| 0.0415 | 1.7005 | 600 | 0.0583 | -80.4593 | -0.2526 | 0.0569 | 0.7205 | 0.6520 | 0.5352 | 0.6030 | 0.5223 | 14.5863 |
| 0.0409 | 1.8422 | 650 | 0.0573 | -78.7705 | -0.3179 | 0.0556 | 0.7195 | 0.6460 | 0.5409 | 0.6030 | 0.5223 | 14.3763 |
| 0.0377 | 1.9839 | 700 | 0.0579 | -79.7789 | -0.4899 | 0.0557 | 0.7221 | 0.6579 | 0.5450 | 0.6030 | 0.5223 | 14.5156 |
| 0.0339 | 2.1256 | 750 | 0.0577 | -80.8265 | -0.4062 | 0.0555 | 0.7193 | 0.6551 | 0.5455 | 0.6030 | 0.5223 | 14.2194 |
| 0.0346 | 2.2674 | 800 | 0.0577 | -81.8186 | -0.2681 | 0.0559 | 0.7190 | 0.6534 | 0.5440 | 0.6030 | 0.5223 | 14.3033 |
| 0.0334 | 2.4091 | 850 | 0.0585 | -83.2126 | -0.2941 | 0.0564 | 0.7213 | 0.6627 | 0.5419 | 0.6030 | 0.5223 | 14.4189 |
| 0.032 | 2.5508 | 900 | 0.0580 | -82.8344 | -0.2672 | 0.0564 | 0.7173 | 0.6562 | 0.5404 | 0.6030 | 0.5223 | 14.2070 |
| 0.029 | 2.6925 | 950 | 0.0580 | -81.6322 | -0.3472 | 0.0564 | 0.7177 | 0.6547 | 0.5450 | 0.6030 | 0.5223 | 14.1503 |
| 0.0242 | 2.8342 | 1000 | 0.0572 | -81.8476 | -0.3613 | 0.0555 | 0.7141 | 0.6463 | 0.5435 | 0.6030 | 0.5223 | 14.2684 |
| 0.0262 | 2.9759 | 1050 | 0.0582 | -82.2240 | -0.3030 | 0.0566 | 0.7193 | 0.6593 | 0.5409 | 0.6030 | 0.5223 | 14.2806 |
| 0.0234 | 3.1176 | 1100 | 0.0584 | -83.6653 | -0.2790 | 0.0568 | 0.7198 | 0.6624 | 0.5404 | 0.6030 | 0.5223 | 14.3429 |
| 0.022 | 3.2593 | 1150 | 0.0581 | -83.5282 | -0.3076 | 0.0563 | 0.7167 | 0.6564 | 0.5440 | 0.6030 | 0.5223 | 14.3960 |
| 0.021 | 3.4010 | 1200 | 0.0574 | -82.2867 | -0.3495 | 0.0557 | 0.7152 | 0.6455 | 0.5393 | 0.6030 | 0.5223 | 14.2067 |
Framework versions
- Transformers 4.42.0
- Pytorch 2.3.0+cu121
- Datasets 3.2.0
- Tokenizers 0.19.1