	---
	license: apache-2.0
	base_model: hZzy/qwen2.5-0.5b-sft-news-IFT
	tags:
	- trl
	- expo
	- generated_from_trainer
	model-index:
	- name: qwen2.5-0.5b-expo-DPO-noES-0.1
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/zhiyuzha-university-of-florida/huggingface/runs/5jjpvn9b)
	# qwen2.5-0.5b-expo-DPO-noES-0.1

	This model is a fine-tuned version of [hZzy/qwen2.5-0.5b-sft-news-IFT](https://huggingface.co/hZzy/qwen2.5-0.5b-sft-news-IFT) on an unknown dataset.
	It achieves the following results on the evaluation set at the final logged checkpoint (step 1050, epoch 2.9759):
	- Loss: 0.8493
	- Logps: -132.8567
	- Logits: -1.8165
	- Objective: 0.8653
	- Dpo Loss: 0.8653
	- Regularize: 0.8653
	- Ranking Simple: 0.5347
	- Wo Beta: 10.9418
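The card does not define these metrics. As a reading aid only, the sketch below assumes that `Dpo Loss` follows the standard DPO objective: the negative log-sigmoid of `beta` times the difference between the policy's and the reference model's log-probability margins for the chosen and rejected responses. The function name, the argument names, and `beta=0.1` are illustrative assumptions, not values confirmed by the training code.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for a single preference pair (hypothetical helper)."""
    # Implicit reward margin: how much more strongly the policy prefers the
    # chosen response over the rejected one, relative to the reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # -log(sigmoid(x)) evaluated in a numerically stable way (softplus of -x).
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))

# ASSUMPTION: made-up log-probabilities where policy and reference agree,
# so the margin is zero and the loss equals ln 2.
print(round(dpo_loss(-90.0, -95.0, -90.0, -95.0), 4))  # 0.6931
```

Under this assumption, a policy that has not yet moved away from the reference scores ln 2 ≈ 0.6931, which is close to the first logged evaluation value of 0.6879.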

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 1e-06
	- train_batch_size: 4
	- eval_batch_size: 4
	- seed: 42
	- distributed_type: multi-GPU
	- num_devices: 3
	- gradient_accumulation_steps: 12
	- total_train_batch_size: 144
	- total_eval_batch_size: 12
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 3
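The batch-size totals and the learning-rate schedule follow from the values listed above. The sketch below recomputes the totals and approximates a linear-warmup, cosine-decay schedule in plain Python. The helper `cosine_lr` is hypothetical, and the total of 1059 optimizer steps is an estimate derived from the results table (step 1050 at epoch 2.9759), not a value stated in the card.

```python
import math

# Values copied from the hyperparameter list.
per_device_batch = 4
num_devices = 3
grad_accum_steps = 12
peak_lr = 1e-6
warmup_ratio = 0.1

# Effective batch sizes: samples seen per optimizer step and per eval step.
total_train_batch = per_device_batch * num_devices * grad_accum_steps
total_eval_batch = per_device_batch * num_devices

def cosine_lr(step, total_steps, peak=peak_lr, ratio=warmup_ratio):
    """Linear warmup followed by half-cosine decay to zero (hypothetical helper)."""
    warmup_steps = math.ceil(total_steps * ratio)
    if step < warmup_steps:
        return peak * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))

# ASSUMPTION: about 1059 optimizer steps in total, estimated from the table.
TOTAL_STEPS = 1059

print(total_train_batch, total_eval_batch)  # 144 12
for step in (0, 106, 500, 1059):
    print(step, cosine_lr(step, TOTAL_STEPS))
```

With these assumptions the learning rate climbs to 1e-06 over roughly the first 106 steps and then decays toward zero by the end of the third epoch.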

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Logps | Logits | Objective | Dpo Loss | Regularize | Ranking Simple | Wo Beta |
	|:-------------:|:------:|:----:|:---------------:|:---------:|:-------:|:---------:|:--------:|:----------:|:--------------:|:-------:|
	| 0.6719 | 0.1417 | 50 | 0.6856 | -89.6776 | -1.4697 | 0.6879 | 0.6879 | 0.6879 | 0.5269 | 7.9221 |
	| 0.6459 | 0.2834 | 100 | 0.6765 | -92.9954 | -1.6511 | 0.6793 | 0.6793 | 0.6793 | 0.5347 | 7.8727 |
	| 0.5993 | 0.4251 | 150 | 0.6771 | -95.2729 | -1.6963 | 0.6805 | 0.6805 | 0.6805 | 0.5347 | 8.2155 |
	| 0.5557 | 0.5668 | 200 | 0.6858 | -115.4680 | -1.8150 | 0.6866 | 0.6866 | 0.6866 | 0.5295 | 7.9607 |
	| 0.5428 | 0.7085 | 250 | 0.6745 | -102.5668 | -1.8495 | 0.6741 | 0.6741 | 0.6741 | 0.5367 | 7.9891 |
	| 0.4987 | 0.8503 | 300 | 0.7119 | -110.0949 | -1.9277 | 0.7203 | 0.7203 | 0.7203 | 0.5373 | 8.9267 |
	| 0.4599 | 0.9920 | 350 | 0.6886 | -104.9833 | -1.8474 | 0.6912 | 0.6912 | 0.6912 | 0.5352 | 8.3749 |
	| 0.3498 | 1.1337 | 400 | 0.7463 | -115.0889 | -1.8807 | 0.7518 | 0.7518 | 0.7518 | 0.5518 | 9.5505 |
	| 0.3361 | 1.2754 | 450 | 0.7563 | -116.8004 | -1.8356 | 0.7673 | 0.7673 | 0.7673 | 0.5419 | 9.7252 |
	| 0.3584 | 1.4171 | 500 | 0.7635 | -117.5167 | -1.8626 | 0.7695 | 0.7695 | 0.7695 | 0.5419 | 9.6319 |
	| 0.3343 | 1.5588 | 550 | 0.7698 | -123.3863 | -1.8209 | 0.7814 | 0.7814 | 0.7814 | 0.5352 | 9.8258 |
	| 0.3105 | 1.7005 | 600 | 0.7679 | -119.8231 | -1.7866 | 0.7761 | 0.7761 | 0.7761 | 0.5383 | 9.8031 |
	| 0.3412 | 1.8422 | 650 | 0.7750 | -122.2944 | -1.8323 | 0.7848 | 0.7848 | 0.7848 | 0.5383 | 9.9494 |
	| 0.3156 | 1.9839 | 700 | 0.8013 | -126.3939 | -1.8338 | 0.8139 | 0.8139 | 0.8139 | 0.5378 | 10.3247 |
	| 0.2183 | 2.1256 | 750 | 0.8467 | -131.1257 | -1.7999 | 0.8604 | 0.8604 | 0.8604 | 0.5352 | 10.8931 |
	| 0.2338 | 2.2674 | 800 | 0.8480 | -132.1160 | -1.8070 | 0.8641 | 0.8641 | 0.8641 | 0.5352 | 10.9810 |
	| 0.2015 | 2.4091 | 850 | 0.8572 | -133.3811 | -1.8018 | 0.8720 | 0.8720 | 0.8720 | 0.5378 | 11.0252 |
	| 0.2348 | 2.5508 | 900 | 0.8530 | -133.6796 | -1.8114 | 0.8675 | 0.8675 | 0.8675 | 0.5378 | 10.9423 |
	| 0.2268 | 2.6925 | 950 | 0.8525 | -133.2829 | -1.8136 | 0.8684 | 0.8684 | 0.8684 | 0.5336 | 10.9785 |
	| 0.2198 | 2.8342 | 1000 | 0.8493 | -132.8809 | -1.8167 | 0.8652 | 0.8652 | 0.8652 | 0.5342 | 10.9383 |
	| 0.2221 | 2.9759 | 1050 | 0.8493 | -132.8567 | -1.8165 | 0.8653 | 0.8653 | 0.8653 | 0.5347 | 10.9418 |
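In the logged values, training loss falls steadily while validation loss bottoms out early and then rises, a pattern consistent with overfitting. The sketch below shows one way to locate the lowest-validation-loss checkpoint from these numbers; the tuple layout and variable names are illustrative, and the values are copied from the table.

```python
# (step, training loss, validation loss), copied from the results table.
rows = [
    (50, 0.6719, 0.6856), (100, 0.6459, 0.6765), (150, 0.5993, 0.6771),
    (200, 0.5557, 0.6858), (250, 0.5428, 0.6745), (300, 0.4987, 0.7119),
    (350, 0.4599, 0.6886), (400, 0.3498, 0.7463), (450, 0.3361, 0.7563),
    (500, 0.3584, 0.7635), (550, 0.3343, 0.7698), (600, 0.3105, 0.7679),
    (650, 0.3412, 0.7750), (700, 0.3156, 0.8013), (750, 0.2183, 0.8467),
    (800, 0.2338, 0.8480), (850, 0.2015, 0.8572), (900, 0.2348, 0.8530),
    (950, 0.2268, 0.8525), (1000, 0.2198, 0.8493), (1050, 0.2221, 0.8493),
]

# Checkpoint with the lowest validation loss.
best_step, _, best_val = min(rows, key=lambda r: r[2])

# Final logged checkpoint, i.e. the one summarized at the top of the card.
last_step, _, last_val = rows[-1]

print(best_step, best_val)                 # 250 0.6745
print(last_step, last_val)                 # 1050 0.8493
print(round(last_val - best_val, 4))       # 0.1748
```

If intermediate checkpoints were saved, the one near step 250 may be worth evaluating alongside the final weights.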


	### Framework versions

	- Transformers 4.42.0
	- PyTorch 2.3.0+cu121
	- Datasets 3.2.0
	- Tokenizers 0.19.1