
shawgpt-ft4

This model is a fine-tuned version of TheBloke/Mistral-7B-Instruct-v0.2-GPTQ on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.7517
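
For inference, the LoRA adapter can be attached to the quantized base model with PEFT. The sketch below is a minimal example, assuming the adapter is published as nour-sam/shawgpt-ft4 and that a GPTQ-capable backend (e.g. auto-gptq via optimum) and a CUDA GPU are available:

```python
# Hedged loading sketch: attach the fine-tuned adapter to the GPTQ base model.
# Assumes a CUDA GPU and a GPTQ backend (e.g. pip install auto-gptq optimum).
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "TheBloke/Mistral-7B-Instruct-v0.2-GPTQ"
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, "nour-sam/shawgpt-ft4")
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Mistral-Instruct prompt format: wrap the user turn in [INST] ... [/INST].
prompt = "[INST] What does fine-tuning do? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```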

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments reconstruction follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 15
  • mixed_precision_training: Native AMP
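
The effective batch size follows from the per-device batch size times the gradient accumulation steps: 32 × 4 = 128. As a hedged reconstruction, the list above maps onto transformers.TrainingArguments roughly as follows (output_dir is an assumed name; fp16=True stands in for "Native AMP"):

```python
# Hedged reconstruction of the TrainingArguments implied by the list above.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="shawgpt-ft4",       # assumption: output name matching the repo
    learning_rate=2e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=4,  # 32 * 4 = 128 effective batch size
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=15,
    fp16=True,                      # "Native AMP" mixed precision
)
```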

Training results

Training Loss   Epoch   Step   Validation Loss
2.1354          1.0     1      4.2318
2.1268          2.0     2      4.1433
2.0717          3.0     3      3.9443
1.9780          4.0     4      3.7584
1.8766          5.0     5      3.5920
1.7922          6.0     6      3.4372
1.7215          7.0     7      3.2960
1.6398          8.0     8      3.1710
1.5820          9.0     9      3.0635
1.5468          10.0    10     2.9731
1.4888          11.0    11     2.8988
1.4533          12.0    12     2.8398
1.4204          13.0    13     2.7961
1.3977          14.0    14     2.7667
1.3837          15.0    15     2.7517

Note that each epoch comprises a single optimization step, which implies the training set fits within one effective batch (at most 128 examples).

Framework versions

  • PEFT 0.13.2
  • Transformers 4.44.2
  • Pytorch 2.5.0+cu124
  • Datasets 3.0.2
  • Tokenizers 0.19.1
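
The sketch below is an optional environment check against the pinned versions above (the +cu124 build suffix on the PyTorch pin is not checked):

```python
# Assert that locally installed packages match the versions pinned in this card.
import datasets, peft, tokenizers, torch, transformers

pins = {
    peft: "0.13.2",
    transformers: "4.44.2",
    torch: "2.5.0",      # card pins 2.5.0+cu124; the build suffix is ignored
    datasets: "3.0.2",
    tokenizers: "0.19.1",
}
for mod, want in pins.items():
    assert mod.__version__.startswith(want), (
        f"{mod.__name__} {mod.__version__} != {want}"
    )
```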