# qwen2.5-0.5b-expo-L2EXPO-ES-1000
This model is a fine-tuned version of hZzy/qwen2.5-0.5b-sft-news-IFT on the hZzy/train_pairwise dataset. It achieves the following results on the evaluation set:
- Loss: 5280.8613
- Logps: -85.1920
- Logits: -0.4645
- Objective: 5329.0571
- Dpo Loss: 2703.0312
- Regularize: 5329.0571
- Ranking Simple: 0.5264
- Ranking Idealized: 0.5212
- Ranking Idealized Expo: 0.5212
- Wo Beta: 14.0257
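For context on the metric names above: "Dpo Loss" presumably tracks a preference-optimization loss in the DPO family, and "Regularize" mirrors the "Objective" column, consistent with the L2-regularized EXPO variant suggested by the model name. For reference only, the standard DPO objective is written below; the scale of the reported values indicates the objective used here is a modified variant rather than this exact log-sigmoid form.

$$
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}}) =
-\,\mathbb{E}_{(x,\, y_w,\, y_l)\sim\mathcal{D}}
\left[\log \sigma\!\left(
\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
- \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
\right)\right]
$$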
## Model description
More information needed
## Intended uses & limitations
More information needed
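No usage guidance is given in the original card. As a minimal, unverified starting point, the checkpoint should load with the standard 🤗 Transformers causal-LM API; the prompt and generation settings in the sketch below are placeholder assumptions.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Minimal sketch (not from the original card): load the checkpoint with the
# standard causal-LM classes and run a short generation.
model_id = "hZzy/qwen2.5-0.5b-expo-L2EXPO-ES-1000"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Write a one-sentence summary of today's market news."  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```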
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 3
- gradient_accumulation_steps: 12
- total_train_batch_size: 144
- total_eval_batch_size: 12
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5
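These values map onto the standard 🤗 Transformers `TrainingArguments` roughly as in the sketch below. This is an illustration only; the actual EXPO/DPO-style training script and any extra objective-specific arguments are not published in this card.

```python
from transformers import TrainingArguments

# Illustrative mapping of the hyperparameters listed above onto TrainingArguments.
# The run used 3 GPUs, so the effective train batch size is 4 * 3 * 12 = 144.
training_args = TrainingArguments(
    output_dir="qwen2.5-0.5b-expo-L2EXPO-ES-1000",
    learning_rate=5e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=12,
    num_train_epochs=5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```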
### Training results
Training Loss | Epoch | Step | Dpo Loss | Logits | Logps | Validation Loss | Objective | Ranking Idealized | Ranking Idealized Expo | Ranking Simple | Regularize | Wo Beta |
---|---|---|---|---|---|---|---|---|---|---|---|---|
423.0596 | 0.1417 | 50 | 269.6528 | -1.3995 | -90.7221 | 547.3513 | 547.0782 | 0.5212 | 0.5212 | 0.5238 | 547.0782 | 16.2386 |
1707.6314 | 0.2834 | 100 | 848.0440 | -1.3188 | -86.9761 | 1694.1810 | 1675.6274 | 0.5212 | 0.5212 | 0.5202 | 1675.6274 | 15.6407 |
2824.1562 | 0.4251 | 150 | 1511.3630 | -1.2862 | -82.2178 | 3014.1465 | 2968.7720 | 0.5212 | 0.5212 | 0.5274 | 2968.7720 | 15.0323 |
3551.5363 | 0.5668 | 200 | 1970.6587 | -0.7799 | -81.0056 | 3928.5837 | 3925.4968 | 0.5212 | 0.5212 | 0.5248 | 3925.4968 | 14.6132 |
3769.9247 | 0.7085 | 250 | 2167.5466 | -0.7280 | -80.5808 | 4317.6050 | 4303.1143 | 0.5212 | 0.5212 | 0.5269 | 4303.1143 | 14.5829 |
3591.3281 | 0.8503 | 300 | 2308.4351 | -0.5846 | -82.7713 | 4553.7632 | 4559.6348 | 0.5212 | 0.5212 | 0.5248 | 4559.6348 | 14.5914 |
3315.5613 | 0.9920 | 350 | 2326.0144 | -0.7541 | -80.8051 | 4667.9404 | 4670.2617 | 0.5212 | 0.5212 | 0.5331 | 4670.2617 | 14.3052 |
3140.2284 | 1.1337 | 400 | 2524.6191 | -0.6474 | -81.5771 | 4876.3184 | 4879.0815 | 0.5212 | 0.5212 | 0.5228 | 4879.0815 | 14.3271 |
2984.025 | 1.2754 | 450 | 2466.7131 | -0.7908 | -84.2705 | 4773.4326 | 4785.7534 | 0.5212 | 0.5212 | 0.5248 | 4785.7534 | 14.3213 |
2769.3719 | 1.4171 | 500 | 2513.8191 | -0.7098 | -81.4917 | 4863.7148 | 4866.6235 | 0.5212 | 0.5212 | 0.5192 | 4866.6235 | 14.1934 |
2620.0086 | 1.5588 | 550 | 2463.1169 | -0.5653 | -81.8307 | 4887.2939 | 4877.4683 | 0.5212 | 0.5212 | 0.5248 | 4877.4683 | 14.1757 |
2530.9462 | 1.7005 | 600 | 2522.0715 | -0.4886 | -82.8727 | 4965.4233 | 5013.2871 | 0.5212 | 0.5212 | 0.5233 | 5013.2871 | 14.2573 |
2445.0009 | 1.8422 | 650 | 2509.7644 | -0.5173 | -81.8303 | 4964.3994 | 4986.9541 | 0.5212 | 0.5212 | 0.5243 | 4986.9541 | 14.2557 |
2287.7192 | 1.9839 | 700 | 2561.1602 | -0.5354 | -83.8738 | 5034.0654 | 5065.8521 | 0.5212 | 0.5212 | 0.5217 | 5065.8521 | 14.0847 |
2066.9519 | 2.1256 | 750 | 2654.1794 | -0.4949 | -82.1944 | 5229.8853 | 5264.4932 | 0.5212 | 0.5212 | 0.5254 | 5264.4932 | 14.0981 |
1963.7713 | 2.2674 | 800 | 2636.3833 | -0.4790 | -82.2307 | 5180.2388 | 5235.7695 | 0.5212 | 0.5212 | 0.5243 | 5235.7695 | 14.0378 |
1854.7628 | 2.4091 | 850 | 2612.3875 | -0.4900 | -82.9664 | 5130.6069 | 5171.9189 | 0.5212 | 0.5212 | 0.5269 | 5171.9189 | 14.1142 |
1711.9678 | 2.5508 | 900 | 2703.0312 | -0.4645 | -85.1920 | 5280.8613 | 5329.0571 | 0.5212 | 0.5212 | 0.5264 | 5329.0571 | 14.0257 |
1682.3781 | 2.6925 | 950 | 2644.8484 | -0.4320 | -83.9376 | 5177.0815 | 5195.8457 | 0.5212 | 0.5212 | 0.5254 | 5195.8457 | 14.1691 |
1508.6941 | 2.8342 | 1000 | 2632.4006 | -0.5014 | -83.4235 | 5124.7144 | 5131.5728 | 0.5212 | 0.5212 | 0.5243 | 5131.5728 | 14.1501 |
1432.2169 | 2.9759 | 1050 | 2638.4963 | -0.4687 | -83.8074 | 5215.6191 | 5232.5947 | 0.5212 | 0.5212 | 0.5295 | 5232.5947 | 14.2389 |
1247.6562 | 3.1223 | 1100 | 2631.7661 | -0.5461 | -84.1614 | 5184.8696 | 5190.6357 | 0.5212 | 0.5212 | 0.5264 | 5190.6357 | 14.1529 |
1136.2859 | 3.2641 | 1150 | 2590.1838 | -0.5632 | -83.8852 | 5110.2056 | 5112.0278 | 0.5212 | 0.5212 | 0.5280 | 5112.0278 | 14.0933 |
1042.7762 | 3.4058 | 1200 | 2612.2661 | -0.5505 | -83.9630 | 5146.4077 | 5162.1665 | 0.5212 | 0.5212 | 0.5274 | 5162.1665 | 14.1122 |
978.7787 | 3.5475 | 1250 | 2605.3420 | -0.4993 | -83.8987 | 5115.5093 | 5140.9258 | 0.5212 | 0.5212 | 0.5280 | 5140.9258 | 14.1279 |
864.8715 | 3.6892 | 1300 | 2621.0728 | -0.5245 | -84.2929 | 5143.4609 | 5173.7549 | 0.5212 | 0.5212 | 0.5259 | 5173.7549 | 14.1584 |
### Framework versions
- Transformers 4.42.0
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1