---
library_name: transformers
license: other
base_model: llava-hf/llava-v1.6-mistral-7b-hf
tags:
- llama-factory
- full
- generated_from_trainer
model-index:
- name: AA_preference_cocour_new_step10_0_90
  results: []
---
# AA_preference_cocour_new_step10_0_90
This model is a fine-tuned version of [llava-hf/llava-v1.6-mistral-7b-hf](https://huggingface.co/llava-hf/llava-v1.6-mistral-7b-hf) on the AA_preference_cocour_new_step10_0_90 dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):
- Loss: 0.4735
- Rewards/chosen: 0.5420
- Rewards/rejected: -2.3019
- Rewards/accuracies: 0.8218
- Rewards/margins: 2.8439
- Logps/rejected: -225.0065
- Logps/chosen: -243.1698
- Logits/rejected: -2.6339
- Logits/chosen: -2.6492
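
These metrics match DPO-style preference-training logs: `Rewards/margins` is `Rewards/chosen` minus `Rewards/rejected` (0.5420 - (-2.3019) = 2.8439), and `Rewards/accuracies` is the fraction of preference pairs where the chosen response receives the higher reward. Since the base model is LLaVA-NeXT (Mistral-7B), the checkpoint should load with the standard `transformers` LLaVA-NeXT classes. A minimal inference sketch; the model path, image URL, and prompt below are placeholders, not taken from the card:

```python
import torch
import requests
from PIL import Image
from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration

# Placeholder path: point this at the local checkpoint or its Hub repo id.
model_id = "path/to/AA_preference_cocour_new_step10_0_90"

processor = LlavaNextProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Placeholder input image.
url = "https://www.ilankelman.org/stopsigns/australia.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Format the prompt with the processor's chat template
# (Mistral-style for this base model).
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```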
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 1e-06
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 4
- total_train_batch_size: 256
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 3.0
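
For reference, a sketch of how these values map onto `transformers.TrainingArguments`. The run itself was launched through LLaMA-Factory on 8 GPUs; the output directory and precision flag below are assumptions, not taken from the card:

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters listed above.
args = TrainingArguments(
    output_dir="AA_preference_cocour_new_step10_0_90",  # assumed
    learning_rate=1e-6,
    per_device_train_batch_size=8,   # x 8 GPUs x 4 accumulation steps = 256 total
    per_device_eval_batch_size=8,    # x 8 GPUs = 64 total
    gradient_accumulation_steps=4,
    num_train_epochs=3.0,
    lr_scheduler_type="cosine",
    warmup_steps=10,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    bf16=True,  # assumption: training precision is not stated in the card
)
```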
### Training results
| Training Loss | Epoch  | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.6087        | 0.4158 | 50   | 0.5778          | 0.7419         | -0.5074          | 0.7870             | 1.2493          | -207.0612      | -241.1710    | -2.5285         | -2.5456       |
| 0.5039        | 0.8316 | 100  | 0.5226          | 0.2943         | -1.5654          | 0.8056             | 1.8596          | -217.6406      | -245.6472    | -2.5932         | -2.6105       |
| 0.2346        | 1.2474 | 150  | 0.4851          | 0.5179         | -1.8870          | 0.8356             | 2.4048          | -220.8566      | -243.4111    | -2.5511         | -2.5729       |
| 0.26          | 1.6632 | 200  | 0.4692          | 0.8149         | -1.7120          | 0.8264             | 2.5269          | -219.1066      | -240.4409    | -2.6651         | -2.6766       |
| 0.1628        | 2.0790 | 250  | 0.4654          | 0.2522         | -2.4566          | 0.8264             | 2.7088          | -226.5530      | -246.0683    | -2.6802         | -2.6929       |
| 0.1808        | 2.4948 | 300  | 0.4721          | 0.8229         | -2.0114          | 0.8241             | 2.8342          | -222.1007      | -240.3612    | -2.6392         | -2.6528       |
| 0.1514        | 2.9106 | 350  | 0.4736          | 0.5409         | -2.3033          | 0.8218             | 2.8442          | -225.0204      | -243.1809    | -2.6346         | -2.6500       |
### Framework versions
- Transformers 4.45.2
- PyTorch 2.4.0+cu121
- Datasets 2.21.0
- Tokenizers 0.20.3
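
A quick sketch for checking a local environment against these pinned versions:

```python
# Compare installed package versions against the versions used for this run.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.45.2",
    "torch": "2.4.0+cu121",
    "datasets": "2.21.0",
    "tokenizers": "0.20.3",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    status = "OK" if installed[name] == want else f"mismatch (have {installed[name]})"
    print(f"{name}=={want}: {status}")
```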