AA_preference_cosi_new_step10_0_80

This model is a fine-tuned version of llava-hf/llava-v1.6-mistral-7b-hf on the AA_preference_cosi_new_step10_0_80 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5448
  • Rewards/chosen: 0.4294
  • Rewards/rejected: -2.4664
  • Rewards/accuracies: 0.7969
  • Rewards/margins: 2.8958
  • Logps/rejected: -238.7556
  • Logps/chosen: -244.1195
  • Logits/rejected: -2.2467
  • Logits/chosen: -2.2820
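
The snippet below is a minimal, illustrative loading sketch for this fine-tune, assuming the standard `transformers` LLaVA-NeXT classes used by the base model `llava-hf/llava-v1.6-mistral-7b-hf`. The repository id (taken from the model-tree entry), prompt format, and image path are assumptions and may need adjusting for your setup.

```python
# Minimal inference sketch (not part of the original card).
import torch
from PIL import Image
from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration

# Assumed repository id for this fine-tune
model_id = "htlou/mm-interp-AA_preference_cosi_new_step10_0_80"

processor = LlavaNextProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

image = Image.open("example.jpg")  # placeholder image path
# Mistral-style prompt format used by llava-v1.6-mistral-7b-hf
prompt = "[INST] <image>\nWhat is shown in this image? [/INST]"

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```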

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-06
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 256
  • total_eval_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 10
  • num_epochs: 3.0
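
For illustration only, the hyperparameters listed above can be expressed roughly as a `transformers` `TrainingArguments` object, as in the sketch below. The output directory and `bf16` flag are assumptions (the card only reports a BF16 checkpoint), and the actual training script used by the authors is not part of this card.

```python
# Hyperparameter sketch mirroring the values reported above (assumptions noted inline).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="AA_preference_cosi_new_step10_0_80",  # assumed name
    learning_rate=1e-6,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,  # 8 per device x 8 GPUs x 4 steps = 256 effective train batch
    num_train_epochs=3.0,
    lr_scheduler_type="cosine",
    warmup_steps=10,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    bf16=True,  # assumption, inferred from the BF16 weights
)
```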

Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.5529 | 0.4673 | 50 | 0.5947 | 0.8739 | -0.2594 | 0.7318 | 1.1333 | -216.6857 | -239.6745 | -2.0297 | -2.0593 |
| 0.5159 | 0.9346 | 100 | 0.5286 | -0.2159 | -2.0191 | 0.7812 | 1.8031 | -234.2824 | -250.5727 | -1.9168 | -1.9646 |
| 0.2666 | 1.4019 | 150 | 0.5667 | 0.7904 | -1.6811 | 0.7891 | 2.4715 | -230.9029 | -240.5096 | -2.2443 | -2.2847 |
| 0.3127 | 1.8692 | 200 | 0.5356 | 0.6480 | -1.8158 | 0.8047 | 2.4639 | -232.2502 | -241.9330 | -2.3879 | -2.4148 |
| 0.152 | 2.3364 | 250 | 0.5442 | 0.6088 | -2.0845 | 0.7891 | 2.6933 | -234.9365 | -242.3255 | -2.2443 | -2.2814 |
| 0.1431 | 2.8037 | 300 | 0.5450 | 0.4225 | -2.4743 | 0.7943 | 2.8968 | -238.8347 | -244.1887 | -2.2467 | -2.2822 |
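
As a sanity check of the reported numbers, and assuming the usual DPO convention that the margin is the chosen reward minus the rejected reward, the final row (step 300) reproduces the reported margin exactly:

```latex
% Assumed convention: margin = r_chosen - r_rejected (values from the step-300 row)
\[
\text{rewards/margins} = 0.4225 - (-2.4743) = 2.8968
\]
```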

Framework versions

  • Transformers 4.45.2
  • PyTorch 2.4.0+cu121
  • Datasets 2.21.0
  • Tokenizers 0.20.3