---
library_name: transformers
tags:
  - trl
  - cpo
  - generated_from_trainer
model-index:
  - name: OpenELM-1_1B-CPO
    results: []
---

# OpenELM-1_1B-CPO

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 2.1904
- Rewards/chosen: -3.6406
- Rewards/rejected: -4.4375
- Rewards/accuracies: 0.5918
- Rewards/margins: 0.8008
- Logps/rejected: -444.0
- Logps/chosen: -364.0
- Logits/rejected: -7.5312
- Logits/chosen: -8.875
- Nll Loss: 1.1719
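
These are the standard metrics logged by trl's `CPOTrainer`; the reading below is an assumption based on trl's implementation, not something stated in this card. The rewards are the policy's summed log-probabilities scaled by the CPO temperature $\beta$, and the margin is their difference:

$$
\begin{aligned}
\text{rewards/chosen} &= \beta \log \pi_\theta(y_w \mid x) \\
\text{rewards/rejected} &= \beta \log \pi_\theta(y_l \mid x) \\
\text{rewards/margins} &= \text{rewards/chosen} - \text{rewards/rejected}
\end{aligned}
$$

As a sanity check against the final eval numbers: $-3.6406 - (-4.4375) = 0.7969 \approx 0.8008$, with the gap down to bf16 rounding; the ratio of rewards to logps ($-3.6406 / -364.0 \approx 0.01$) suggests $\beta \approx 0.01$ for this run. `Rewards/accuracies` is the fraction of pairs whose chosen reward exceeds the rejected one, and `Loss` is the CPO preference term plus the NLL term reported separately as `Nll Loss`.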

## Model description

More information needed

## Intended uses & limitations

More information needed
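
In the absence of documented usage, here is a minimal, hypothetical loading sketch. The Hub repo id and the tokenizer behavior are assumptions, as is the need for `trust_remote_code=True` (OpenELM uses custom modeling code); verify against the actual repository.

```python
# Hypothetical usage sketch: the repo id and tokenizer behavior are
# assumptions, not confirmed by this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "CharlesLi/OpenELM-1_1B-CPO"  # assumed Hub id for this checkpoint

# OpenELM's modeling code is custom, so remote code must be trusted.
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```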

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training; a sketch of how they map onto trl's `CPOConfig` appears after the list:

- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
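
The sketch below reconstructs this setup with a trl version contemporary to Transformers 4.44 (where `CPOTrainer` still takes `tokenizer=`). The base checkpoint, tokenizer, `beta`, and the toy dataset are assumptions not recorded in this card; this is not the original training script.

```python
# Reconstruction sketch only: the base checkpoint, tokenizer, beta, and the
# toy dataset below are assumptions; the card does not record them.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import CPOConfig, CPOTrainer

# Assumed base model; OpenELM's code lives in the repo, hence trust_remote_code.
model = AutoModelForCausalLM.from_pretrained(
    "apple/OpenELM-1_1B", trust_remote_code=True
)
# OpenELM ships no tokenizer of its own; the Llama-2 tokenizer is the usual pairing.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# Toy stand-in for the (unknown) preference dataset: CPOTrainer expects
# prompt / chosen / rejected columns.
train_dataset = Dataset.from_dict({
    "prompt": ["What is the capital of France?"],
    "chosen": [" Paris."],
    "rejected": [" The moon."],
})

args = CPOConfig(
    output_dir="OpenELM-1_1B-CPO",
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # train_batch_size above
    per_device_eval_batch_size=16,   # eval_batch_size above
    gradient_accumulation_steps=2,   # 4 GPUs x 8 x 2 = 64 effective
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    beta=0.01,  # inferred from rewards ~ 0.01 x logps in the table; not recorded
)

trainer = CPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```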

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss |
|:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:--------:|
| 2.4271 | 0.1047 | 100 | 2.2959 | -3.3594 | -3.2812 | 0.4980 | -0.0850 | -328.0 | -336.0 | -12.125 | -12.3125 | 1.0859 |
| 2.2538 | 0.2093 | 200 | 2.1836 | -3.3906 | -3.4531 | 0.5234 | 0.0640 | -346.0 | -338.0 | -9.5 | -9.875 | 1.0938 |
| 2.1253 | 0.3140 | 300 | 2.1307 | -3.4531 | -3.5938 | 0.5176 | 0.1416 | -360.0 | -346.0 | -11.0 | -11.4375 | 1.1172 |
| 2.0609 | 0.4186 | 400 | 2.1359 | -3.3281 | -3.4375 | 0.5293 | 0.1187 | -344.0 | -332.0 | -10.625 | -11.125 | 1.0703 |
| 2.1905 | 0.5233 | 500 | 2.1286 | -3.375 | -3.5156 | 0.5254 | 0.1357 | -352.0 | -338.0 | -8.5 | -9.3125 | 1.0859 |
| 2.1304 | 0.6279 | 600 | 2.1410 | -3.6094 | -3.9688 | 0.5723 | 0.3672 | -398.0 | -360.0 | -9.625 | -10.625 | 1.1562 |
| 2.2554 | 0.7326 | 700 | 2.1848 | -3.7344 | -4.1562 | 0.5664 | 0.4258 | -416.0 | -374.0 | -8.5625 | -9.6875 | 1.2031 |
| 2.0796 | 0.8373 | 800 | 2.1224 | -3.4531 | -3.75 | 0.5469 | 0.2852 | -374.0 | -346.0 | -7.0312 | -7.8438 | 1.1172 |
| 2.1021 | 0.9419 | 900 | 2.1099 | -3.5 | -3.9062 | 0.5723 | 0.4062 | -390.0 | -350.0 | -5.2812 | -6.2812 | 1.1328 |
| 1.5182 | 1.0471 | 1000 | 2.1662 | -3.5 | -3.8594 | 0.5664 | 0.3633 | -386.0 | -350.0 | -9.375 | -10.625 | 1.125 |
| 1.4917 | 1.1518 | 1100 | 2.1588 | -3.5625 | -4.0 | 0.5703 | 0.4395 | -400.0 | -356.0 | -6.4688 | -7.875 | 1.1484 |
| 1.5219 | 1.2564 | 1200 | 2.1449 | -3.625 | -4.1875 | 0.5938 | 0.5586 | -420.0 | -364.0 | -6.6562 | -7.7812 | 1.1719 |
| 1.5292 | 1.3611 | 1300 | 2.1489 | -3.5312 | -4.0 | 0.5742 | 0.4785 | -402.0 | -354.0 | -7.75 | -8.875 | 1.1406 |
| 1.4257 | 1.4657 | 1400 | 2.1193 | -3.5781 | -4.0938 | 0.5801 | 0.5156 | -410.0 | -358.0 | -7.7188 | -9.25 | 1.1562 |
| 1.4366 | 1.5704 | 1500 | 2.0983 | -3.5938 | -4.1562 | 0.5898 | 0.5586 | -416.0 | -358.0 | -7.6875 | -8.9375 | 1.1562 |
| 1.5246 | 1.6750 | 1600 | 2.1191 | -3.5781 | -4.2188 | 0.5938 | 0.625 | -420.0 | -358.0 | -5.4688 | -6.9062 | 1.1562 |
| 1.4534 | 1.7797 | 1700 | 2.0829 | -3.4688 | -4.0312 | 0.5762 | 0.5625 | -404.0 | -348.0 | -9.0625 | -10.0625 | 1.1172 |
| 1.4551 | 1.8844 | 1800 | 2.1033 | -3.5625 | -4.1562 | 0.5898 | 0.6016 | -416.0 | -356.0 | -6.8438 | -8.1875 | 1.1484 |
| 1.4969 | 1.9890 | 1900 | 2.1046 | -3.5312 | -4.125 | 0.5762 | 0.5938 | -412.0 | -354.0 | -8.125 | -9.3125 | 1.1406 |
| 0.9984 | 2.0937 | 2000 | 2.1806 | -3.6406 | -4.2812 | 0.5781 | 0.6367 | -428.0 | -364.0 | -7.9375 | -9.1875 | 1.1719 |
| 0.9885 | 2.1983 | 2100 | 2.1927 | -3.6875 | -4.5 | 0.5801 | 0.7930 | -448.0 | -370.0 | -7.4062 | -8.6875 | 1.1875 |
| 0.9814 | 2.3030 | 2200 | 2.1867 | -3.625 | -4.3438 | 0.5742 | 0.7266 | -436.0 | -362.0 | -7.5 | -8.8125 | 1.1719 |
| 0.9844 | 2.4076 | 2300 | 2.1905 | -3.6875 | -4.5312 | 0.5996 | 0.8438 | -452.0 | -368.0 | -7.125 | -8.375 | 1.1875 |
| 0.9931 | 2.5123 | 2400 | 2.1843 | -3.6406 | -4.4375 | 0.5820 | 0.7930 | -442.0 | -364.0 | -7.375 | -8.6875 | 1.1719 |
| 0.9537 | 2.6170 | 2500 | 2.1907 | -3.6406 | -4.4688 | 0.5898 | 0.8125 | -446.0 | -364.0 | -7.5 | -8.8125 | 1.1719 |
| 0.9512 | 2.7216 | 2600 | 2.1918 | -3.6406 | -4.4375 | 0.5898 | 0.8086 | -446.0 | -364.0 | -7.5 | -8.8125 | 1.1719 |
| 0.9604 | 2.8263 | 2700 | 2.1906 | -3.6406 | -4.4375 | 0.5879 | 0.7969 | -442.0 | -364.0 | -7.5312 | -8.875 | 1.1719 |
| 1.0208 | 2.9309 | 2800 | 2.1904 | -3.6406 | -4.4375 | 0.5918 | 0.8008 | -444.0 | -364.0 | -7.5312 | -8.875 | 1.1719 |

### Framework versions

- Transformers 4.44.2
- Pytorch 2.3.0
- Datasets 3.0.0
- Tokenizers 0.19.1
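
To approximate this environment, something like `pip install transformers==4.44.2 torch==2.3.0 datasets==3.0.0 tokenizers==0.19.1 trl` should work; the trl version is not recorded in this card, so it is left unpinned here (an assumption).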