results_llama_8b_fim

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B on an unspecified dataset. It achieves the following result on the evaluation set:

  • Loss: 1.0074

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 2
  • mixed_precision_training: Native AMP
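
For reference, the settings above map onto the Hugging Face TrainingArguments API roughly as in the sketch below. The output_dir, the evaluation/logging cadence, and the choice of fp16 (rather than bf16) for mixed precision are assumptions not stated in this card.

```python
from transformers import TrainingArguments

# Minimal sketch of the configuration above (Transformers 4.47.1 API).
# output_dir, eval/logging cadence, and fp16 (vs. bf16) are assumptions.
training_args = TrainingArguments(
    output_dir="results_llama_8b_fim",
    learning_rate=5e-5,
    per_device_train_batch_size=2,   # train_batch_size
    per_device_eval_batch_size=8,    # eval_batch_size
    seed=42,
    gradient_accumulation_steps=2,   # total_train_batch_size = 2 * 2 = 4
    optim="adamw_torch",             # AdamW, betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=2,
    fp16=True,                       # "Native AMP" mixed precision
    eval_strategy="steps",
    eval_steps=100,                  # matches the 100-step cadence in the results table
    logging_steps=100,
)
```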

Training results

Training Loss   Epoch    Step   Validation Loss
2.3771          0.1145    100   1.0760
1.0804          0.2290    200   1.0491
1.1121          0.3434    300   1.0381
1.1140          0.4579    400   1.0310
1.0847          0.5724    500   1.0264
1.0152          0.6869    600   1.0229
1.0289          0.8014    700   1.0203
1.0648          0.9159    800   1.0180
1.0885          1.0298    900   1.0156
1.0486          1.1442   1000   1.0122
1.1167          1.2587   1100   1.0108
1.0189          1.3732   1200   1.0098
1.0281          1.4877   1300   1.0090
1.0438          1.6022   1400   1.0084
1.0715          1.7167   1500   1.0079
1.0117          1.8311   1600   1.0076
1.0240          1.9456   1700   1.0074

Framework versions

  • PEFT 0.14.0
  • Transformers 4.47.1
  • PyTorch 2.5.1+cu124
  • Datasets 2.17.0
  • Tokenizers 0.21.0
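
Because this repository holds a PEFT adapter rather than full model weights, it is loaded on top of the base model. Below is a minimal sketch, assuming the repository id gui8600k/results_llama_8b_fim, the base model meta-llama/Meta-Llama-3-8B, and an illustrative prompt; device placement and dtype are only suggestions.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B"
adapter_id = "gui8600k/results_llama_8b_fim"   # this repository

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,   # assumes a GPU with bf16 support
    device_map="auto",            # requires `accelerate` to be installed
)

# Attach the fine-tuned adapter weights to the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)

# Illustrative generation; the prompt is an assumption, not from this card.
prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```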