	---
	license: apache-2.0
	base_model: hZzy/qwen2.5-0.5b-sft-news-IFT
	tags:
	- alignment-handbook
	- ndcg
	- trl
	- expo
	- generated_from_trainer
	datasets:
	- hZzy/train_pairwise
	model-index:
	- name: qwen2.5-0.5b-expo-L2EXPO-EXPERIMENT-0.005-5e6
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/zhiyuzha-university-of-florida/huggingface/runs/3gvck2ki)
	# qwen2.5-0.5b-expo-L2EXPO-EXPERIMENT-0.005-5e6

	This model is a fine-tuned version of [hZzy/qwen2.5-0.5b-sft-news-IFT](https://huggingface.co/hZzy/qwen2.5-0.5b-sft-news-IFT) on the hZzy/train_pairwise dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.3951
	- Logps: -195.4572
	- Logits: -3.2699
	- Objective: 0.3956
	- Dpo Loss: 0.6771
	- Regularize: 0.3956
	- Ranking Simple: 0.5661
	- Ranking Idealized: 0.9194
	- Ranking Idealized Expo: 0.5310
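As a reading aid for the `Dpo Loss` value above, the sketch below compares it with the chance-level value of a pairwise logistic loss. It assumes, since the card does not define the metric, that `Dpo Loss` is the standard DPO objective `-log(sigmoid(m))` averaged over pairs, where `m` is the scaled reward margin between the chosen and rejected responses. The function names are illustrative and not part of the training code.

```python
import math


def dpo_loss_at_margin(scaled_margin: float) -> float:
    """Standard DPO loss -log(sigmoid(m)) for a scaled reward margin m."""
    # -log(sigmoid(m)) == log(1 + exp(-m))
    return math.log1p(math.exp(-scaled_margin))


def margin_for_loss(loss: float) -> float:
    """Invert -log(sigmoid(m)) = loss to recover the margin m."""
    # log(1 + exp(-m)) = loss  =>  m = -log(exp(loss) - 1)
    return -math.log(math.expm1(loss))


# A model that cannot separate chosen from rejected (m = 0) scores ln 2.
chance_loss = dpo_loss_at_margin(0.0)

# Value reported on the evaluation set in this card.
reported_dpo_loss = 0.6771

# Margin that would give the reported loss if every pair had the same margin
# (an assumption; the mean of a convex loss does not pin down the mean margin).
implied_margin = margin_for_loss(reported_dpo_loss)

print(f"chance-level loss: {chance_loss:.4f}")
print(f"reported loss:     {reported_dpo_loss:.4f}")
print(f"implied margin:    {implied_margin:.4f}")
```

Under that assumption, the reported value sits only slightly below the chance level of ln 2 ≈ 0.6931, which is in line with the modest `Ranking Simple` score.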

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-06
	- train_batch_size: 4
	- eval_batch_size: 4
	- seed: 42
	- distributed_type: multi-GPU
	- num_devices: 6
	- gradient_accumulation_steps: 12
	- total_train_batch_size: 288
	- total_eval_batch_size: 24
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 5
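
The totals in the list above follow from the per-device settings, and the learning-rate curve follows from the cosine schedule with warmup. The sketch below reproduces both. The total step count is not stated in this card: it is estimated here from the step-to-epoch ratio in the training results table, and the warmup length is assumed to be the ceiling of `warmup_ratio * total_steps`. The function `cosine_with_warmup` is an illustrative reimplementation, not the Trainer's own code.

```python
import math

# Values taken from the hyperparameter list.
per_device_train_batch_size = 4
per_device_eval_batch_size = 4
num_devices = 6
gradient_accumulation_steps = 12
learning_rate = 5e-6
warmup_ratio = 0.1
num_epochs = 5

# Effective batch sizes: accumulation applies to training only.
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
total_eval_batch_size = per_device_eval_batch_size * num_devices

# Assumption: estimate optimizer steps per epoch from the first logged row
# of the results table (step 50 at epoch 0.2834).
steps_per_epoch_estimate = 50 / 0.2834
total_steps_estimate = round(num_epochs * steps_per_epoch_estimate)
warmup_steps_estimate = math.ceil(total_steps_estimate * warmup_ratio)


def cosine_with_warmup(step: int, total_steps: int, warmup_steps: int,
                       peak_lr: float = learning_rate) -> float:
    """Linear warmup to peak_lr, then half-cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(total_train_batch_size, total_eval_batch_size)
for step in (0, warmup_steps_estimate, total_steps_estimate // 2,
             total_steps_estimate):
    lr = cosine_with_warmup(step, total_steps_estimate, warmup_steps_estimate)
    print(f"step {step:4d}: lr = {lr:.3e}")
```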

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Logps | Logits | Objective | Dpo Loss | Regularize | Ranking Simple | Ranking Idealized | Ranking Idealized Expo |
	|:-------------:|:------:|:----:|:---------------:|:---------:|:-------:|:---------:|:--------:|:----------:|:--------------:|:-----------------:|:----------------------:|
	| 0.4052 | 0.2834 | 50 | 0.4107 | -129.0883 | -1.8292 | 0.4120 | 0.6914 | 0.4120 | 0.5372 | 0.9194 | 0.5310 |
	| 0.3407 | 0.5668 | 100 | 0.4017 | -173.3319 | -2.5066 | 0.4063 | 0.6839 | 0.4063 | 0.5548 | 0.9194 | 0.5310 |
	| 0.2596 | 0.8503 | 150 | 0.4017 | -188.6395 | -2.4464 | 0.4052 | 0.6806 | 0.4052 | 0.5424 | 0.9194 | 0.5310 |
	| 0.1965 | 1.1337 | 200 | 0.4002 | -193.1247 | -2.5977 | 0.4041 | 0.6801 | 0.4041 | 0.5589 | 0.9194 | 0.5310 |
	| 0.1784 | 1.4171 | 250 | 0.3990 | -189.4701 | -2.7528 | 0.4023 | 0.6802 | 0.4023 | 0.5620 | 0.9194 | 0.5310 |
	| 0.1717 | 1.7005 | 300 | 0.4021 | -195.7304 | -2.8777 | 0.4042 | 0.6799 | 0.4042 | 0.5455 | 0.9194 | 0.5310 |
	| 0.1527 | 1.9839 | 350 | 0.3960 | -211.6068 | -3.1101 | 0.3970 | 0.6760 | 0.3970 | 0.5558 | 0.9194 | 0.5310 |
	| 0.1267 | 2.2674 | 400 | 0.3981 | -201.0368 | -3.2515 | 0.3998 | 0.6776 | 0.3998 | 0.5620 | 0.9194 | 0.5310 |
	| 0.1121 | 2.5508 | 450 | 0.3957 | -192.7809 | -2.9523 | 0.3976 | 0.6782 | 0.3976 | 0.5620 | 0.9194 | 0.5310 |
	| 0.1063 | 2.8342 | 500 | 0.3941 | -195.7920 | -3.2835 | 0.3949 | 0.6760 | 0.3949 | 0.5671 | 0.9194 | 0.5310 |
	| 0.0891 | 3.1176 | 550 | 0.3956 | -196.1659 | -3.1953 | 0.3960 | 0.6777 | 0.3960 | 0.5610 | 0.9194 | 0.5310 |
	| 0.0749 | 3.4010 | 600 | 0.3962 | -194.1237 | -3.1966 | 0.3973 | 0.6781 | 0.3973 | 0.5744 | 0.9194 | 0.5310 |
	| 0.062 | 3.6845 | 650 | 0.3956 | -195.3244 | -3.2412 | 0.3967 | 0.6778 | 0.3967 | 0.5702 | 0.9194 | 0.5310 |
	| 0.0583 | 3.9679 | 700 | 0.3956 | -196.4469 | -3.2432 | 0.3961 | 0.6772 | 0.3961 | 0.5640 | 0.9194 | 0.5310 |
	| 0.0451 | 4.2513 | 750 | 0.3952 | -195.4398 | -3.2666 | 0.3955 | 0.6771 | 0.3955 | 0.5671 | 0.9194 | 0.5310 |
	| 0.0438 | 4.5347 | 800 | 0.3952 | -195.2319 | -3.2693 | 0.3956 | 0.6771 | 0.3956 | 0.5661 | 0.9194 | 0.5310 |
	| 0.0408 | 4.8181 | 850 | 0.3951 | -195.5095 | -3.2704 | 0.3956 | 0.6771 | 0.3956 | 0.5661 | 0.9194 | 0.5310 |
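
To make the table easier to scan, the sketch below picks out the evaluation step with the lowest validation loss and the step with the highest `Ranking Simple` score. The tuples are copied from the table above. The variable names are illustrative, and the card does not say whether intermediate checkpoints were saved, so this only identifies the logged steps, not loadable checkpoints.

```python
# (step, validation_loss, ranking_simple), copied from the training results table.
ROWS = [
    (50, 0.4107, 0.5372),
    (100, 0.4017, 0.5548),
    (150, 0.4017, 0.5424),
    (200, 0.4002, 0.5589),
    (250, 0.3990, 0.5620),
    (300, 0.4021, 0.5455),
    (350, 0.3960, 0.5558),
    (400, 0.3981, 0.5620),
    (450, 0.3957, 0.5620),
    (500, 0.3941, 0.5671),
    (550, 0.3956, 0.5610),
    (600, 0.3962, 0.5744),
    (650, 0.3956, 0.5702),
    (700, 0.3956, 0.5640),
    (750, 0.3952, 0.5671),
    (800, 0.3952, 0.5661),
    (850, 0.3951, 0.5661),
]

# Lowest validation loss and highest pairwise ranking score across logged steps.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_ranking = max(ROWS, key=lambda row: row[2])

print(f"lowest validation loss: {best_by_loss[1]} at step {best_by_loss[0]}")
print(f"highest Ranking Simple: {best_by_ranking[2]} at step {best_by_ranking[0]}")
```

The two criteria select different steps, and the training loss keeps falling while validation loss stays near 0.395 from roughly step 350 onward, so the choice of selection metric matters more here than training to the final step.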


	### Framework versions

	- Transformers 4.42.0
	- Pytorch 2.3.0+cu121
	- Datasets 2.19.1
	- Tokenizers 0.19.1