	---
	license: apache-2.0
	base_model: alignment-handbook/zephyr-7b-sft-full
	tags:
	- generated_from_trainer
	model-index:
	- name: spin-v-diverse
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# spin-v-diverse

	This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full). The Trainer did not record the name of the training dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.0027
	- Rewards/real: -2.6757
	- Rewards/generated: -21.8763
	- Rewards/accuracies: 1.0
	- Rewards/margins: 19.2006
	- Logps/generated: -346.5988
	- Logps/real: -161.4224
	- Logits/generated: -2.5880
	- Logits/real: -2.4315
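
The reported reward margin can be cross-checked against the two reward values. The sketch below assumes the convention common in preference-style trainers, where the margin is the reward on real responses minus the reward on generated responses; the card itself does not state the formula.

```python
# Final evaluation metrics copied from the list above.
rewards_real = -2.6757
rewards_generated = -21.8763
reported_margin = 19.2006

# Assumption: margin = reward on real responses minus reward on generated ones.
computed_margin = rewards_real - rewards_generated

# Allow a small tolerance because the reported values are rounded to four decimals.
is_consistent = abs(computed_margin - reported_margin) < 1e-3

print(f"computed margin: {computed_margin:.4f}")
print(f"consistent with reported margin: {is_consistent}")
```

If the assumption holds, a large positive margin together with an accuracy of 1.0 means the model scored the real response above the generated one for every evaluation pair.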

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-07
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- distributed_type: multi-GPU
	- num_devices: 4
	- total_train_batch_size: 32
	- total_eval_batch_size: 32
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 1
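
To make the batch size and schedule settings concrete, here is a minimal sketch in plain Python. It assumes no gradient accumulation (the list above does not mention it) and uses a hypothetical `total_steps` value, since the exact number of optimizer steps is not given. The function `linear_schedule_lr` is an illustrative helper, not part of any library.

```python
import math


def linear_schedule_lr(step, total_steps, peak_lr=5e-7, warmup_ratio=0.1):
    """Learning rate at `step`: linear warmup to `peak_lr`, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to the peak rate.
        return peak_lr * (step / max(1, warmup_steps))
    # Decay phase: scale linearly from the peak rate down to 0 at the last step.
    remaining = max(0, total_steps - step)
    return peak_lr * (remaining / max(1, total_steps - warmup_steps))


# Effective batch size from the per-device batch size and device count.
train_batch_size = 8
num_devices = 4
gradient_accumulation_steps = 1  # assumption: not listed in the card
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
print("total_train_batch_size:", total_train_batch_size)

total_steps = 1000  # hypothetical value, for illustration only
for step in (0, 50, 100, 550, 1000):
    print(step, linear_schedule_lr(step, total_steps))
```

With these settings the learning rate peaks at 5e-07 after the first 10% of training and then falls linearly to zero by the final step.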

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rewards/real | Rewards/generated | Rewards/accuracies | Rewards/margins | Logps/generated | Logps/real | Logits/generated | Logits/real |
	|:-------------:|:-----:|:----:|:---------------:|:------------:|:-----------------:|:------------------:|:---------------:|:---------------:|:----------:|:----------------:|:-----------:|
	| 0.0257 | 0.06 | 100 | 0.0288 | 1.0058 | -5.7769 | 0.9928 | 6.7828 | -185.6055 | -124.6072 | -2.8843 | -2.6520 |
	| 0.0096 | 0.13 | 200 | 0.0126 | -0.1554 | -12.6258 | 0.9984 | 12.4704 | -254.0941 | -136.2193 | -2.5945 | -2.2413 |
	| 0.024 | 0.19 | 300 | 0.0126 | 0.1173 | -11.0946 | 0.9968 | 11.2119 | -238.7820 | -133.4925 | -2.7227 | -2.5040 |
	| 0.0065 | 0.26 | 400 | 0.0082 | -0.1964 | -13.6305 | 0.9984 | 13.4341 | -264.1411 | -136.6298 | -2.7028 | -2.4738 |
	| 0.0073 | 0.32 | 500 | 0.0081 | 0.0850 | -13.4368 | 0.9984 | 13.5218 | -262.2040 | -133.8156 | -2.6477 | -2.4285 |
	| 0.0035 | 0.38 | 600 | 0.0071 | -2.8739 | -18.4641 | 1.0 | 15.5902 | -312.4772 | -163.4043 | -2.5956 | -2.3811 |
	| 0.0097 | 0.45 | 700 | 0.0077 | -2.2908 | -16.9898 | 0.9984 | 14.6989 | -297.7338 | -157.5739 | -2.5210 | -2.2045 |
	| 0.0052 | 0.51 | 800 | 0.0065 | -1.6983 | -19.8323 | 0.9992 | 18.1340 | -326.1593 | -151.6484 | -2.7183 | -2.5409 |
	| 0.0037 | 0.58 | 900 | 0.0067 | -1.2826 | -16.6590 | 0.9984 | 15.3763 | -294.4258 | -147.4920 | -2.6881 | -2.5334 |
	| 0.0023 | 0.64 | 1000 | 0.0047 | -1.9423 | -19.2263 | 1.0 | 17.2840 | -320.0990 | -154.0886 | -2.6404 | -2.4694 |
	| 0.0041 | 0.7 | 1100 | 0.0050 | -2.4756 | -19.3047 | 1.0 | 16.8290 | -320.8827 | -159.4218 | -2.6368 | -2.4329 |
	| 0.0033 | 0.77 | 1200 | 0.0037 | -2.8600 | -20.2625 | 1.0 | 17.4025 | -330.4614 | -163.2654 | -2.6240 | -2.4681 |
	| 0.0042 | 0.83 | 1300 | 0.0032 | -2.6738 | -20.7669 | 1.0 | 18.0931 | -335.5057 | -161.4039 | -2.5974 | -2.4463 |
	| 0.0031 | 0.9 | 1400 | 0.0030 | -2.1767 | -20.6456 | 0.9992 | 18.4690 | -334.2925 | -156.4323 | -2.6144 | -2.4595 |
	| 0.0015 | 0.96 | 1500 | 0.0027 | -2.6757 | -21.8763 | 1.0 | 19.2006 | -346.5988 | -161.4224 | -2.5880 | -2.4315 |
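
The last row of the table gives a rough way to estimate the size of one epoch. This is only a back-of-the-envelope sketch: the epoch column is rounded to two decimals, and the calculation assumes each optimizer step consumes exactly `total_train_batch_size` examples (no gradient accumulation), which the card does not confirm.

```python
# Last logged row of the table above: step 1500 at epoch 0.96 (rounded).
last_step = 1500
last_epoch = 0.96
total_train_batch_size = 32  # from the hyperparameter list

# Estimated optimizer steps in one full epoch.
steps_per_epoch = last_step / last_epoch

# Estimated training examples seen per epoch (assumes 32 examples per step).
approx_examples = steps_per_epoch * total_train_batch_size

print(f"about {steps_per_epoch:.0f} optimizer steps per epoch")
print(f"about {approx_examples:.0f} training examples per epoch")
```

Treat both numbers as approximate, since the rounding of the epoch value alone shifts the step estimate by several steps in either direction.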


	### Framework versions

	- Transformers 4.37.0
	- PyTorch 2.1.2+cu121
	- Datasets 2.14.6
	- Tokenizers 0.15.2
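
To reproduce results, it can help to compare a local environment against the versions listed above. The following is a minimal sketch using only the standard library. The helper names `installed_version` and `find_mismatches` are illustrative, and the mapping assumes the usual PyPI distribution names (PyTorch is installed as `torch`).

```python
from importlib import metadata

# Versions listed above, keyed by PyPI distribution name.
EXPECTED = {
    "transformers": "4.37.0",
    "torch": "2.1.2",  # local build tag "+cu121" is dropped for comparison
    "datasets": "2.14.6",
    "tokenizers": "0.15.2",
}


def installed_version(package):
    """Return the installed version without any local build tag, or None if missing."""
    try:
        return metadata.version(package).split("+")[0]
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose found version differs to an (expected, found) pair."""
    mismatches = {}
    for name, want in expected.items():
        found = lookup(name)
        if found != want:
            mismatches[name] = (want, found)
    return mismatches


if __name__ == "__main__":
    for name, (want, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: card lists {want}, found {found}")
```

A mismatch does not necessarily mean the model will fail to load; it only flags where the local setup differs from the training environment.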