metadata
license: apache-2.0
base_model: hZzy/qwen2.5-0.5b-sft-news-IFT
tags:
  - trl
  - expo
  - generated_from_trainer
model-index:
  - name: qwen2.5-0.5b-expo-L2EXPO-ES-1
    results: []


qwen2.5-0.5b-expo-L2EXPO-ES-1

This model is a fine-tuned version of hZzy/qwen2.5-0.5b-sft-news-IFT on an unspecified dataset. On the evaluation set it achieves the following results, which match the last logged checkpoint (step 650, epoch 1.8422):

  • Loss: 5.0286
  • Logps: -83.8820
  • Logits: -0.4938
  • Objective: 5.0013
  • Dpo Loss: 2.6194
  • Regularize: 5.0013
  • Ranking Simple: 0.5197
  • Ranking Idealized: 0.5295
  • Ranking Idealized Expo: 0.5212
  • Wo Beta: 14.2504
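Since the card describes a fine-tuned causal language model, a minimal loading sketch may help. The repository id below is an assumption, built from the owner of the base model and the name in the metadata; the prompt and generation settings are illustrative only and are not taken from the training setup.

```python
# Assumed repository id: owner taken from base_model, name from model-index.
# Adjust it if the model is hosted under a different path.
REPO_ID = "hZzy/qwen2.5-0.5b-expo-L2EXPO-ES-1"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the checkpoint and return the decoded continuation of `prompt`."""
    # Imported lazily so this file can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    model = AutoModelForCausalLM.from_pretrained(REPO_ID)

    # Tokenize to PyTorch tensors and generate with default decoding settings.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the weights on first use):
# print(generate("Summarize the following news article: ..."))
```

The first call downloads the weights, so the example call is left commented out.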

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 3
  • gradient_accumulation_steps: 12
  • total_train_batch_size: 144
  • total_eval_batch_size: 12
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
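The total batch sizes follow from the per-device values listed above, and the learning rate follows a cosine curve after a linear warmup. The sketch below reproduces the batch-size arithmetic and approximates the schedule shape. The helper names are hypothetical, `total_steps` is not stated in this card and must be supplied by the caller, and the schedule formula is a common formulation of cosine decay with warmup, not a copy of the trainer's code.

```python
import math


def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Examples consumed per optimizer step (or per eval step when grad_accum_steps is 1)."""
    return per_device * num_devices * grad_accum_steps


def lr_at_step(step: int, total_steps: int, base_lr: float = 5e-06, warmup_ratio: float = 0.1) -> float:
    """Linear warmup to base_lr, then cosine decay to zero at total_steps."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


train_total = effective_batch_size(per_device=4, num_devices=3, grad_accum_steps=12)
eval_total = effective_batch_size(per_device=4, num_devices=3)
print(train_total)  # 144
print(eval_total)   # 12
```

Evaluation uses no gradient accumulation, which is why its total is only 4 × 3 = 12.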

Training results

| Training Loss | Epoch | Step | Validation Loss | Logps | Logits | Objective | Dpo Loss | Regularize | Ranking Simple | Ranking Idealized | Ranking Idealized Expo | Wo Beta |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.6418 | 0.1417 | 50 | 0.7369 | -89.5788 | -1.4384 | 0.7343 | 0.7480 | 0.7343 | 0.5248 | 0.5295 | 0.5212 | 16.0414 |
| 1.7208 | 0.2834 | 100 | 1.7082 | -87.8064 | -1.3168 | 1.6950 | 1.0867 | 1.6950 | 0.5228 | 0.5295 | 0.5212 | 15.5148 |
| 2.841 | 0.4251 | 150 | 2.9302 | -83.1791 | -1.1086 | 2.8768 | 1.6352 | 2.8768 | 0.5300 | 0.5295 | 0.5212 | 15.0680 |
| 3.5072 | 0.5668 | 200 | 4.2317 | -80.2960 | -0.8688 | 4.2210 | 2.3120 | 4.2210 | 0.5155 | 0.5295 | 0.5212 | 14.5319 |
| 3.7707 | 0.7085 | 250 | 4.3648 | -80.5389 | -0.7639 | 4.3627 | 2.2988 | 4.3627 | 0.5212 | 0.5295 | 0.5212 | 14.5663 |
| 3.5773 | 0.8503 | 300 | 4.3904 | -83.8565 | -0.5388 | 4.3972 | 2.2955 | 4.3972 | 0.5238 | 0.5295 | 0.5212 | 14.3098 |
| 3.359 | 0.9920 | 350 | 4.6868 | -82.1212 | -0.5555 | 4.6293 | 2.4176 | 4.6293 | 0.5264 | 0.5295 | 0.5212 | 14.3177 |
| 3.0892 | 1.1337 | 400 | 4.8991 | -80.1851 | -0.4846 | 4.9208 | 2.5732 | 4.9208 | 0.5238 | 0.5295 | 0.5212 | 14.1271 |
| 3.001 | 1.2754 | 450 | 4.8651 | -82.0773 | -0.5097 | 4.8038 | 2.4966 | 4.8038 | 0.5233 | 0.5295 | 0.5212 | 14.2309 |
| 2.8358 | 1.4171 | 500 | 4.8734 | -81.9592 | -0.4937 | 4.8544 | 2.5685 | 4.8544 | 0.5243 | 0.5295 | 0.5212 | 14.2662 |
| 2.6622 | 1.5588 | 550 | 4.8760 | -81.5020 | -0.5513 | 4.9098 | 2.5441 | 4.9098 | 0.5243 | 0.5295 | 0.5212 | 14.2522 |
| 2.5417 | 1.7005 | 600 | 5.0324 | -83.9181 | -0.5043 | 5.0251 | 2.5863 | 5.0251 | 0.5259 | 0.5295 | 0.5212 | 14.2325 |
| 2.435 | 1.8422 | 650 | 5.0286 | -83.8820 | -0.4938 | 5.0013 | 2.6194 | 5.0013 | 0.5197 | 0.5295 | 0.5212 | 14.2504 |
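The Epoch and Step columns allow a rough estimate of the run's scale. The sketch below derives steps per epoch from the last logged row, then estimates the training-set size and the step count that the configured 5 epochs would imply. These are approximations: the epoch values are rounded, and the size estimate assumes every optimizer step consumes the full total train batch of 144 examples. The card does not say why logging ends at step 650.

```python
# Values copied from the last logged row and from the hyperparameter list.
LAST_STEP = 650
LAST_EPOCH = 1.8422
TOTAL_TRAIN_BATCH_SIZE = 144
NUM_EPOCHS = 5

# Optimizer steps per epoch, inferred from the rounded epoch fraction.
steps_per_epoch = LAST_STEP / LAST_EPOCH

# Rough training-set size, assuming full batches at every step.
approx_train_examples = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

# Steps a complete 5-epoch run would take, and the share that was logged.
planned_steps = steps_per_epoch * NUM_EPOCHS
fraction_logged = LAST_STEP / planned_steps

print(f"steps per epoch:     ~{steps_per_epoch:.0f}")
print(f"training examples:   ~{approx_train_examples:.0f}")
print(f"planned total steps: ~{planned_steps:.0f}")
print(f"fraction logged:     ~{fraction_logged:.2f}")
```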

Framework versions

  • Transformers 4.42.0
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
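To reproduce the environment, it can help to compare locally installed packages against the versions listed above. The following sketch uses only the standard library; `find_mismatches` is a hypothetical helper, and the mapping assumes the usual PyPI distribution names (`torch` for Pytorch).

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by distribution name.
EXPECTED = {
    "transformers": "4.42.0",
    "torch": "2.3.0+cu121",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def find_mismatches(expected=EXPECTED, get_version=version):
    """Return {package: (wanted, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = get_version(package)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


for package, (wanted, installed) in find_mismatches().items():
    print(f"{package}: card lists {wanted}, found {installed}")
```

An exact string match is deliberately strict; a different CUDA build suffix on `torch` is reported as a mismatch even when the base version agrees.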