spin-v-diverse

This model is a fine-tuned version of alignment-handbook/zephyr-7b-sft-full (the training dataset is not specified in this card). It achieves the following results on the evaluation set:

  • Loss: 0.0027
  • Rewards/real: -2.6757
  • Rewards/generated: -21.8763
  • Rewards/accuracies: 1.0
  • Rewards/margins: 19.2006
  • Logps/generated: -346.5988
  • Logps/real: -161.4224
  • Logits/generated: -2.5880
  • Logits/real: -2.4315
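In DPO/SPIN-style training, the reported margin is simply the gap between the reward assigned to real (ground-truth) responses and the reward assigned to model-generated ones. A minimal sketch of that relation, using the final evaluation numbers from this card:

```python
# Rewards/margins = Rewards/real - Rewards/generated: a larger margin means
# the model separates real responses from its own generations more cleanly.
rewards_real = -2.6757        # reward on real (chosen) responses
rewards_generated = -21.8763  # reward on model-generated (rejected) responses

margin = rewards_real - rewards_generated
print(round(margin, 4))  # matches the reported Rewards/margins of 19.2006
```

Note that Rewards/accuracies of 1.0 means the real response received the higher reward on every evaluation pair.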

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-07
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1
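The listed totals follow from the per-device settings: with 4 devices and a per-device batch size of 8, the effective batch size is 32, which implies a gradient accumulation of 1 (an assumption; accumulation is not listed in the card). A quick sketch of the arithmetic:

```python
# Effective (total) batch size in multi-GPU training is the per-device batch
# size times the number of devices times gradient accumulation steps.
train_batch_size = 8             # per-device, as listed above
num_devices = 4                  # multi-GPU, as listed above
gradient_accumulation_steps = 1  # assumption: implied by the totals below

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
print(total_train_batch_size)  # 32, matching total_train_batch_size above
```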

Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/real | Rewards/generated | Rewards/accuracies | Rewards/margins | Logps/generated | Logps/real | Logits/generated | Logits/real |
|---------------|-------|------|-----------------|--------------|-------------------|--------------------|-----------------|-----------------|------------|------------------|-------------|
| 0.0257        | 0.06  | 100  | 0.0288          | 1.0058       | -5.7769           | 0.9928             | 6.7828          | -185.6055       | -124.6072  | -2.8843          | -2.6520     |
| 0.0096        | 0.13  | 200  | 0.0126          | -0.1554      | -12.6258          | 0.9984             | 12.4704         | -254.0941       | -136.2193  | -2.5945          | -2.2413     |
| 0.024         | 0.19  | 300  | 0.0126          | 0.1173       | -11.0946          | 0.9968             | 11.2119         | -238.7820       | -133.4925  | -2.7227          | -2.5040     |
| 0.0065        | 0.26  | 400  | 0.0082          | -0.1964      | -13.6305          | 0.9984             | 13.4341         | -264.1411       | -136.6298  | -2.7028          | -2.4738     |
| 0.0073        | 0.32  | 500  | 0.0081          | 0.0850       | -13.4368          | 0.9984             | 13.5218         | -262.2040       | -133.8156  | -2.6477          | -2.4285     |
| 0.0035        | 0.38  | 600  | 0.0071          | -2.8739      | -18.4641          | 1.0                | 15.5902         | -312.4772       | -163.4043  | -2.5956          | -2.3811     |
| 0.0097        | 0.45  | 700  | 0.0077          | -2.2908      | -16.9898          | 0.9984             | 14.6989         | -297.7338       | -157.5739  | -2.5210          | -2.2045     |
| 0.0052        | 0.51  | 800  | 0.0065          | -1.6983      | -19.8323          | 0.9992             | 18.1340         | -326.1593       | -151.6484  | -2.7183          | -2.5409     |
| 0.0037        | 0.58  | 900  | 0.0067          | -1.2826      | -16.6590          | 0.9984             | 15.3763         | -294.4258       | -147.4920  | -2.6881          | -2.5334     |
| 0.0023        | 0.64  | 1000 | 0.0047          | -1.9423      | -19.2263          | 1.0                | 17.2840         | -320.0990       | -154.0886  | -2.6404          | -2.4694     |
| 0.0041        | 0.7   | 1100 | 0.0050          | -2.4756      | -19.3047          | 1.0                | 16.8290         | -320.8827       | -159.4218  | -2.6368          | -2.4329     |
| 0.0033        | 0.77  | 1200 | 0.0037          | -2.8600      | -20.2625          | 1.0                | 17.4025         | -330.4614       | -163.2654  | -2.6240          | -2.4681     |
| 0.0042        | 0.83  | 1300 | 0.0032          | -2.6738      | -20.7669          | 1.0                | 18.0931         | -335.5057       | -161.4039  | -2.5974          | -2.4463     |
| 0.0031        | 0.9   | 1400 | 0.0030          | -2.1767      | -20.6456          | 0.9992             | 18.4690         | -334.2925       | -156.4323  | -2.6144          | -2.4595     |
| 0.0015        | 0.96  | 1500 | 0.0027          | -2.6757      | -21.8763          | 1.0                | 19.2006         | -346.5988       | -161.4224  | -2.5880          | -2.4315     |

Framework versions

  • Transformers 4.37.0
  • Pytorch 2.1.2+cu121
  • Datasets 2.14.6
  • Tokenizers 0.15.2