metadata
library_name: transformers
license: other
base_model: llava-hf/llava-v1.6-mistral-7b-hf
tags:
  - llama-factory
  - full
  - generated_from_trainer
model-index:
  - name: AA_preference_cocour_new_step10_0_90
    results: []

AA_preference_cocour_new_step10_0_90

This model is a fine-tuned version of llava-hf/llava-v1.6-mistral-7b-hf on the AA_preference_cocour_new_step10_0_90 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4735
  • Rewards/chosen: 0.5420
  • Rewards/rejected: -2.3019
  • Rewards/accuracies: 0.8218
  • Rewards/margins: 2.8439
  • Logps/rejected: -225.0065
  • Logps/chosen: -243.1698
  • Logits/rejected: -2.6339
  • Logits/chosen: -2.6492
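
The reward metrics above are related by simple arithmetic. A minimal check, assuming the usual convention in preference-optimization trainers that Rewards/margins is the mean chosen reward minus the mean rejected reward (the card itself does not define the metric):

```python
# Reported evaluation metrics, copied from the list above.
rewards_chosen = 0.5420
rewards_rejected = -2.3019
reported_margin = 2.8439

# Assumed definition: margin = chosen reward - rejected reward.
margin = rewards_chosen - rewards_rejected

print(f"computed margin: {margin:.4f}")
print(f"consistent with reported value: {abs(margin - reported_margin) < 1e-4}")
```

Under the same assumed convention, Rewards/accuracies (0.8218) would be the fraction of evaluation pairs in which the chosen response received the higher reward.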

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-06
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 256
  • total_eval_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 10
  • num_epochs: 3.0
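
The total batch sizes follow from the per-device sizes, the device count, and gradient accumulation. The sketch below reproduces that arithmetic and adds an illustrative cosine schedule with linear warmup. The schedule formula and the `ASSUMED_TOTAL_STEPS` value are assumptions: the formula is the common linear-warmup-plus-cosine-decay form, and 360 steps is only an estimate from the epoch/step ratio in the training results table, not a figure stated in this card.

```python
import math

# Values copied from the hyperparameter list above.
train_batch_size = 8          # per device
eval_batch_size = 8           # per device
num_devices = 8
gradient_accumulation_steps = 4
peak_lr = 1e-6
warmup_steps = 10

# Effective batch sizes: accumulation applies to training only.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

# Assumption: roughly 120 optimizer steps per epoch over 3 epochs.
ASSUMED_TOTAL_STEPS = 360


def cosine_lr(step, total_steps, peak=peak_lr, warmup=warmup_steps):
    """Linear warmup to `peak`, then cosine decay to zero (assumed form)."""
    if step < warmup:
        return peak * step / max(1, warmup)
    progress = (step - warmup) / max(1, total_steps - warmup)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))


print("total_train_batch_size:", total_train_batch_size)
print("total_eval_batch_size:", total_eval_batch_size)
for step in (0, 5, 10, 185, 360):
    print(f"step {step:>3}: lr = {cosine_lr(step, ASSUMED_TOTAL_STEPS):.3e}")
```

With only 10 warmup steps, the learning rate reaches its peak of 1e-06 almost immediately and spends the rest of training on the cosine decay.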

Training results

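| Training Loss | Epoch  | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|---------------|--------|------|-----------------|----------------|------------------|--------------------|-----------------|----------------|--------------|-----------------|---------------|
| 0.6087        | 0.4158 | 50   | 0.5778          | 0.7419         | -0.5074          | 0.7870             | 1.2493          | -207.0612      | -241.1710    | -2.5285         | -2.5456       |
| 0.5039        | 0.8316 | 100  | 0.5226          | 0.2943         | -1.5654          | 0.8056             | 1.8596          | -217.6406      | -245.6472    | -2.5932         | -2.6105       |
| 0.2346        | 1.2474 | 150  | 0.4851          | 0.5179         | -1.8870          | 0.8356             | 2.4048          | -220.8566      | -243.4111    | -2.5511         | -2.5729       |
| 0.26          | 1.6632 | 200  | 0.4692          | 0.8149         | -1.7120          | 0.8264             | 2.5269          | -219.1066      | -240.4409    | -2.6651         | -2.6766       |
| 0.1628        | 2.0790 | 250  | 0.4654          | 0.2522         | -2.4566          | 0.8264             | 2.7088          | -226.5530      | -246.0683    | -2.6802         | -2.6929       |
| 0.1808        | 2.4948 | 300  | 0.4721          | 0.8229         | -2.0114          | 0.8241             | 2.8342          | -222.1007      | -240.3612    | -2.6392         | -2.6528       |
| 0.1514        | 2.9106 | 350  | 0.4736          | 0.5409         | -2.3033          | 0.8218             | 2.8442          | -225.0204      | -243.1809    | -2.6346         | -2.6500       |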
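
The evaluation log can be scanned to see where each metric peaks. A small sketch using the step, validation loss, reward accuracy, and reward margin values copied from the results above; the variable names are illustrative and not part of any training framework:

```python
# (step, validation_loss, rewards_accuracy, rewards_margin) from the training results.
eval_log = [
    (50, 0.5778, 0.7870, 1.2493),
    (100, 0.5226, 0.8056, 1.8596),
    (150, 0.4851, 0.8356, 2.4048),
    (200, 0.4692, 0.8264, 2.5269),
    (250, 0.4654, 0.8264, 2.7088),
    (300, 0.4721, 0.8241, 2.8342),
    (350, 0.4736, 0.8218, 2.8442),
]

# Lowest validation loss, highest pairwise accuracy, and largest margin.
best_by_loss = min(eval_log, key=lambda row: row[1])
best_by_accuracy = max(eval_log, key=lambda row: row[2])
best_by_margin = max(eval_log, key=lambda row: row[3])

print("lowest validation loss at step", best_by_loss[0])
print("highest reward accuracy at step", best_by_accuracy[0])
print("largest reward margin at step", best_by_margin[0])
```

In this log the three criteria disagree: validation loss bottoms out at step 250, accuracy peaks at step 150, and the margin keeps growing through step 350, while training loss continues to fall. Which checkpoint is preferable depends on the selection criterion, which this card does not specify.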

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.4.0+cu121
  • Datasets 2.21.0
  • Tokenizers 0.20.3
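
To compare a local environment against the versions listed above, one option is a short check with the standard-library `importlib.metadata` module. This is a convenience sketch, not part of the original training setup; the helper names are hypothetical, and the distribution name `torch` for Pytorch is the usual PyPI name.

```python
from importlib import metadata

# Versions copied from the list above, keyed by distribution name.
EXPECTED = {
    "transformers": "4.45.2",
    "torch": "2.4.0+cu121",
    "datasets": "2.21.0",
    "tokenizers": "0.20.3",
}


def installed_versions(packages):
    """Return {name: version or None} for each distribution name."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None  # not installed in this environment
    return found


def version_mismatches(expected, installed):
    """Return {name: (expected, installed)} for every differing package."""
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }


mismatches = version_mismatches(EXPECTED, installed_versions(EXPECTED))
for name, (want, have) in mismatches.items():
    print(f"{name}: expected {want}, found {have}")
if not mismatches:
    print("environment matches the listed framework versions")
```

A mismatch does not necessarily mean the model will fail to load; it only flags a difference from the environment reported in this card.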