---
license: other
base_model: meta-llama/Meta-Llama-3-70B
tags:
  - llama-factory
  - full
  - generated_from_trainer
model-index:
  - name: C013_Meta-Llama-3-70B_pretrain_20240508_200642
    results: []
---

C013_Meta-Llama-3-70B_pretrain_20240508_200642

This model is a fine-tuned version of meta-llama/Meta-Llama-3-70B (loaded from the local path /mnt/fl/models/llama3/Meta-Llama-3-70B) on the C013_data dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7400
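A minimal inference sketch with Transformers is shown below. The checkpoint path is a placeholder (substitute the actual repository id or local directory for this model), and the bf16 dtype and `device_map="auto"` settings are assumptions made to fit a 70B model across available GPUs, not settings taken from this card.

```python
# Minimal sketch: load the fine-tuned checkpoint and generate text.
# The checkpoint path below is a placeholder, not the actual repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "path/to/C013_Meta-Llama-3-70B_pretrain_20240508_200642"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    torch_dtype=torch.bfloat16,  # assumption: bf16 to halve memory for 70B weights
    device_map="auto",           # assumption: shard layers across available GPUs
)

inputs = tokenizer("The key findings of the study were", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```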

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an approximate code sketch follows the list):

  • learning_rate: 3e-06
  • train_batch_size: 2
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 32
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 128
  • total_eval_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: polynomial
  • lr_scheduler_warmup_ratio: 0.075
  • num_epochs: 4.0
  • mixed_precision_training: Native AMP
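For reference, the settings above can be mirrored with `transformers.TrainingArguments` as sketched below. This is only an illustration: the run was launched with LLaMA-Factory on 32 GPUs, and the `output_dir` value and the choice of bf16 for mixed precision are assumptions rather than values from the original configuration.

```python
# Approximate reconstruction of the hyperparameters listed above.
# Not the actual LLaMA-Factory config used for this run.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="C013_Meta-Llama-3-70B_pretrain_20240508_200642",  # assumed name
    learning_rate=3e-6,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,   # 2 per device x 2 accumulation x 32 GPUs = 128 effective
    num_train_epochs=4.0,
    seed=42,
    lr_scheduler_type="polynomial",
    warmup_ratio=0.075,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    bf16=True,  # "Native AMP" on the card; bf16 is an assumption
)
```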

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.8776        | 0.2090 | 7    | 0.7902          |
| 0.8473        | 0.4179 | 14   | 0.7703          |
| 0.8293        | 0.6269 | 21   | 0.7603          |
| 0.8173        | 0.8358 | 28   | 0.7481          |
| 0.7415        | 1.0448 | 35   | 0.7402          |
| 0.6794        | 1.2537 | 42   | 0.7419          |
| 0.6688        | 1.4627 | 49   | 0.7392          |
| 0.6498        | 1.6716 | 56   | 0.7367          |
| 0.6701        | 1.8806 | 63   | 0.7358          |
| 0.664         | 2.0896 | 70   | 0.7355          |
| 0.6447        | 2.2985 | 77   | 0.7361          |
| 0.6412        | 2.5075 | 84   | 0.7373          |
| 0.6458        | 2.7164 | 91   | 0.7383          |
| 0.6356        | 2.9254 | 98   | 0.7387          |
| 0.6398        | 3.1343 | 105  | 0.7387          |
| 0.6228        | 3.3433 | 112  | 0.7391          |
| 0.6139        | 3.5522 | 119  | 0.7395          |
| 0.591         | 3.7612 | 126  | 0.7398          |

Framework versions

  • Transformers 4.40.2
  • PyTorch 2.3.0
  • Datasets 2.19.1
  • Tokenizers 0.19.1
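A small optional sketch for checking that a local environment matches these pinned versions (the expected version strings are taken from the list above):

```python
# Verify that locally installed library versions match the ones listed above.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.40.2",
    "torch": "2.3.0",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    got = installed[name]
    # startswith tolerates build suffixes such as "2.3.0+cu121"
    status = "OK" if got.startswith(want) else "MISMATCH"
    print(f"{name}: installed {got}, expected {want} -> {status}")
```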