# qwen2.5-0.5b-expo-L2EXPO-ES-0.001
This model is a fine-tuned version of hZzy/qwen2.5-0.5b-sft-news-IFT on the hZzy/train_pairwise_weighted dataset. It achieves the following results on the evaluation set:
- Loss: 0.3942
- Logps: -573.5075
- Logits: -8.8910
- Objective: 0.3931
- Dpo Loss: 0.6728
- Regularize: 0.3931
- Ranking Simple: 0.6102
- Ranking Idealized: 0.9871
- Ranking Idealized Expo: 0.6320
- Wo Beta: 160.3578
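
The snippet below is a minimal usage sketch, assuming the checkpoint is published under the repository id above and loads as a standard causal language model; the prompt is purely illustrative.

```python
# Minimal usage sketch: load the checkpoint as a standard causal LM.
# The repository id comes from this card; the prompt is illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hZzy/qwen2.5-0.5b-expo-L2EXPO-ES-0.001"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Write a short news headline about renewable energy.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```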
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 3
- gradient_accumulation_steps: 12
- total_train_batch_size: 144
- total_eval_batch_size: 12
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5
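
For context, here is a rough sketch of how these settings map onto transformers `TrainingArguments`. The actual training script and any EXPO/DPO-specific trainer options are not included in this card, so the output directory and the use of this particular class are assumptions for illustration; the effective train batch size of 144 follows from 4 per device × 3 GPUs × 12 gradient-accumulation steps.

```python
# Rough mapping of the listed hyperparameters onto transformers TrainingArguments.
# The real run used a preference-optimization trainer whose extra options are
# not documented in this card; output_dir is a hypothetical placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen2.5-0.5b-expo-L2EXPO-ES-0.001",  # hypothetical
    learning_rate=5e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=12,  # 4 per device * 3 GPUs * 12 steps = 144 effective
    num_train_epochs=5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
)
```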
### Training results
| Training Loss | Epoch | Step | Dpo Loss | Logits | Logps | Validation Loss | Objective | Ranking Idealized | Ranking Idealized Expo | Ranking Simple | Regularize | Wo Beta |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.418 | 0.1417 | 50 | 0.6927 | -1.8105 | -107.4392 | 0.4150 | 0.4128 | 0.9871 | 0.6320 | 0.5352 | 0.4128 | 22.0269 |
| 0.416 | 0.2834 | 100 | 0.6896 | -2.0485 | -230.8855 | 0.4087 | 0.4081 | 0.9871 | 0.6320 | 0.5559 | 0.4081 | 52.9494 |
| 0.387 | 0.4251 | 150 | 0.6844 | -3.9840 | -343.5519 | 0.4032 | 0.4021 | 0.9871 | 0.6320 | 0.5766 | 0.4021 | 90.2215 |
| 0.3587 | 0.5668 | 200 | 0.6754 | -6.1681 | -390.3867 | 0.3917 | 0.3893 | 0.9871 | 0.6320 | 0.6004 | 0.3893 | 124.6577 |
| 0.3299 | 0.7085 | 250 | 0.6765 | -7.7444 | -474.0688 | 0.3968 | 0.3968 | 0.9871 | 0.6320 | 0.5958 | 0.3968 | 147.7626 |
| 0.294 | 0.8503 | 300 | 0.6728 | -8.8910 | -573.5075 | 0.3942 | 0.3931 | 0.9871 | 0.6320 | 0.6102 | 0.3931 | 160.3578 |
| 0.2753 | 0.9920 | 350 | 0.6731 | -9.9981 | -593.1101 | 0.3965 | 0.3960 | 0.9871 | 0.6320 | 0.5937 | 0.3960 | 171.5761 |
| 0.2316 | 1.1337 | 400 | 0.6718 | -9.6479 | -564.7661 | 0.3966 | 0.3956 | 0.9871 | 0.6320 | 0.5875 | 0.3956 | 171.6054 |
| 0.2205 | 1.2754 | 450 | 0.6725 | -10.9673 | -599.2516 | 0.3962 | 0.3983 | 0.9871 | 0.6320 | 0.5859 | 0.3983 | 182.4877 |
| 0.2058 | 1.4171 | 500 | 0.6741 | -9.6175 | -589.5045 | 0.4005 | 0.4029 | 0.9871 | 0.6320 | 0.5797 | 0.4029 | 188.1013 |
| 0.2027 | 1.5588 | 550 | 0.6730 | -10.3937 | -622.4691 | 0.3995 | 0.4000 | 0.9871 | 0.6320 | 0.5947 | 0.4000 | 185.8620 |
| 0.1897 | 1.7029 | 600 | 0.6716 | -11.5540 | -755.1119 | 0.4028 | 0.4023 | 0.9871 | 0.6320 | 0.5952 | 0.4023 | 201.2357 |
| 0.1797 | 1.8446 | 650 | 0.6730 | -10.8193 | -673.7770 | 0.3997 | 0.3992 | 0.9871 | 0.6320 | 0.5942 | 0.3992 | 188.3079 |
| 0.1689 | 1.9863 | 700 | 0.6713 | -11.0772 | -653.8336 | 0.3985 | 0.3970 | 0.9871 | 0.6320 | 0.5911 | 0.3970 | 182.3852 |
| 0.1492 | 2.1280 | 750 | 0.6708 | -11.4717 | -624.3672 | 0.3959 | 0.3956 | 0.9871 | 0.6320 | 0.6025 | 0.3956 | 182.7602 |
| 0.143 | 2.2697 | 800 | 0.6701 | -11.2559 | -657.3067 | 0.3955 | 0.3958 | 0.9871 | 0.6320 | 0.6009 | 0.3958 | 190.5371 |
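
The Dpo Loss and Ranking columns above are preference-optimization metrics computed from the log-probabilities the policy and reference models assign to chosen and rejected responses. As a point of reference, the sketch below shows the standard sigmoid DPO loss these columns are presumably derived from; the L2EXPO regularization tracked in the Regularize and Objective columns is not documented in this card, and beta=0.1 is only a placeholder value.

```python
# Minimal sketch of the standard sigmoid DPO loss on which the "Dpo Loss"
# column is presumably based. The L2EXPO regularizer tracked in the
# "Regularize"/"Objective" columns is not reproduced here, and beta=0.1 is
# only a placeholder.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO loss: -log sigmoid(beta * (policy margin - reference margin))."""
    policy_margin = policy_chosen_logps - policy_rejected_logps
    ref_margin = ref_chosen_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (policy_margin - ref_margin)).mean()
```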
### Framework versions
- Transformers 4.42.0
- Pytorch 2.3.0+cu121
- Datasets 3.2.0
- Tokenizers 0.19.1