---
license: llama2
tags:
  - generated_from_trainer
base_model: codellama/CodeLlama-7b-hf
model-index:
  - name: sqlcoder_7b_fullft_ds7_linear
    results: []
---

# sqlcoder_7b_fullft_ds7_linear

This model is a fine-tuned version of [codellama/CodeLlama-7b-hf](https://huggingface.co/codellama/CodeLlama-7b-hf) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.3517
- Sql Exact Match String: 0
- Tokens Match Avg: 0.9014
- First Index Mismatch Avg: 2.2356
- Mean Mismatch I Diff Avg: 12.5313
- Count Mismatch I Diff Avg: 6.2756

## Model description

More information needed

## Intended uses & limitations

More information needed
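Since the card does not document an inference format, the sketch below shows one plausible way to assemble a text-to-SQL prompt for this model. The template layout, section labels, and example schema are assumptions (the prompt format this checkpoint was trained on is not stated here), so adjust them to match your training data.

```python
def build_sqlcoder_prompt(question: str, schema: str) -> str:
    """Assemble a text-to-SQL prompt.

    The section headers below are illustrative; the actual template
    used during fine-tuning is not documented in this model card.
    """
    return (
        "### Task\n"
        f"Generate a SQL query to answer the following question: {question}\n\n"
        "### Database Schema\n"
        f"{schema}\n\n"
        "### SQL\n"
    )

# Hypothetical schema and question, purely for illustration.
prompt = build_sqlcoder_prompt(
    "How many users signed up in 2023?",
    "CREATE TABLE users (id INT, signup_date DATE);",
)
print(prompt)
```

The resulting string would then be tokenized and passed to the model's `generate` method, stopping at the end of the generated SQL statement.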

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 600
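The reported `total_train_batch_size` follows from the per-device batch size and gradient accumulation. A minimal sketch of that arithmetic (variable names are illustrative; a single training device is assumed):

```python
# Hyperparameters as reported above.
train_batch_size = 4             # per-device batch size
gradient_accumulation_steps = 4
training_steps = 600

# Effective batch size per optimizer step = per-device size x accumulation
# steps, which matches the reported total_train_batch_size of 16.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)    # 16

# Examples processed over the whole run (assuming one device).
examples_seen = total_train_batch_size * training_steps
print(examples_seen)             # 9600
```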

### Training results

| Training Loss | Epoch | Step | Validation Loss | Sql Exact Match String | Tokens Match Avg | First Index Mismatch Avg | Mean Mismatch I Diff Avg | Count Mismatch I Diff Avg |
|:-------------:|:-----:|:----:|:---------------:|:----------------------:|:----------------:|:------------------------:|:------------------------:|:-------------------------:|
| 0.14          | 0.1   | 100  | 0.3510          | 0                      | 0.8940           | 2.0844                   | 11.4371                  | 6.88                      |
| 0.1083        | 0.2   | 200  | 0.3677          | 0                      | 0.8930           | 2.1733                   | 11.3445                  | 6.6044                    |
| 0.0912        | 0.3   | 300  | 0.3710          | 0                      | 0.8953           | 2.2444                   | 12.0020                  | 6.44                      |
| 0.0699        | 0.4   | 400  | 0.3598          | 0                      | 0.8996           | 2.1778                   | 12.3582                  | 6.3289                    |
| 0.0619        | 0.5   | 500  | 0.3516          | 0                      | 0.9010           | 2.2489                   | 12.6065                  | 6.2756                    |
| 0.0766        | 0.6   | 600  | 0.3517          | 0                      | 0.9014           | 2.2356                   | 12.5313                  | 6.2756                    |
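The Epoch and Step columns also let us estimate the training-set size: each 100 optimizer steps advance the epoch counter by 0.1, so one epoch is about 1000 steps. A short sketch of that inference (an approximation, assuming the effective batch size of 16 from the hyperparameters section):

```python
# Each 100 steps corresponds to 0.1 epoch in the table above;
# round() guards against floating-point division noise.
steps_per_epoch = round(100 / 0.1)
print(steps_per_epoch)           # 1000

# With an effective batch size of 16, one epoch covers roughly
# 16,000 training examples.
total_train_batch_size = 16
examples_per_epoch = steps_per_epoch * total_train_batch_size
print(examples_per_epoch)        # 16000
```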

### Framework versions

- Transformers 4.37.2
- Pytorch 2.1.2+cu121
- Datasets 2.16.1
- Tokenizers 0.15.1