fine-tuned-llama3-8b-fim-newsGenerator-PTBR
---
library_name: peft
license: llama3
base_model: meta-llama/Meta-Llama-3-8B
tags:
  - generated_from_trainer
model-index:
  - name: results_llama_8b_fim
    results: []
---

results_llama_8b_fim

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B (the training dataset is not documented in this card). It achieves the following results on the evaluation set:

  • Loss: 1.0074
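
Because the card lists library_name: peft, this repository holds an adapter that is loaded on top of the base model rather than a standalone checkpoint. Below is a minimal loading sketch, assuming the adapter is published under the repo id gui8600k/fine-tuned-llama3-8b-fim-newsGenerator-PTBR (inferred from the card header) and that you already have access to the gated Llama 3 base weights; it is an illustration, not code from this repository.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B"          # gated; requires granted access
adapter_id = "gui8600k/fine-tuned-llama3-8b-fim-newsGenerator-PTBR"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",                          # needs accelerate installed
)

# Attach the PEFT adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Placeholder Portuguese news-style prefix, purely illustrative.
prompt = "O presidente anunciou nesta quarta-feira que"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```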

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 2
  • mixed_precision_training: Native AMP
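
A sketch of how these values map onto transformers.TrainingArguments; the actual training script is not included in this card, so output_dir and the evaluation cadence are assumptions (the 100-step interval is inferred from the results table below).

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported run configuration.
training_args = TrainingArguments(
    output_dir="results_llama_8b_fim",  # assumed, based on the card title
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,      # effective train batch size of 4
    num_train_epochs=2,
    lr_scheduler_type="linear",
    warmup_steps=100,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    fp16=True,                          # "Native AMP" mixed precision
    eval_strategy="steps",
    eval_steps=100,                     # matches the 100-step evaluation cadence below
    logging_steps=100,
)
```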

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.3771        | 0.1145 | 100  | 1.0760          |
| 1.0804        | 0.2290 | 200  | 1.0491          |
| 1.1121        | 0.3434 | 300  | 1.0381          |
| 1.114         | 0.4579 | 400  | 1.0310          |
| 1.0847        | 0.5724 | 500  | 1.0264          |
| 1.0152        | 0.6869 | 600  | 1.0229          |
| 1.0289        | 0.8014 | 700  | 1.0203          |
| 1.0648        | 0.9159 | 800  | 1.0180          |
| 1.0885        | 1.0298 | 900  | 1.0156          |
| 1.0486        | 1.1442 | 1000 | 1.0122          |
| 1.1167        | 1.2587 | 1100 | 1.0108          |
| 1.0189        | 1.3732 | 1200 | 1.0098          |
| 1.0281        | 1.4877 | 1300 | 1.0090          |
| 1.0438        | 1.6022 | 1400 | 1.0084          |
| 1.0715        | 1.7167 | 1500 | 1.0079          |
| 1.0117        | 1.8311 | 1600 | 1.0076          |
| 1.024         | 1.9456 | 1700 | 1.0074          |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.47.1
  • Pytorch 2.5.1+cu124
  • Datasets 2.17.0
  • Tokenizers 0.21.0