Built with Axolotl

See axolotl config

axolotl version: 0.4.1

adapter: lora
base_model: unsloth/gemma-2b-it
batch_size: 8
bf16: true
chat_template: tokenizer_default_fallback_alpaca
datasets:
- data_files:
  - 88cfea977fe74782_train_data.json
  ds_type: json
  format: custom
  path: /workspace/input_data/88cfea977fe74782_train_data.json
  type:
    field_instruction: smiles
    field_output: molt5
    format: '{instruction}'
    no_input_format: '{instruction}'
    system_format: '{system}'
    system_prompt: ''
evals_per_epoch: 1
flash_attention: true
gpu_memory_limit: 80GiB
gradient_checkpointing: true
group_by_length: true
hub_model_id: willtensora/f0d6caa9-89a9-4666-9a6d-c8cda2015281
hub_strategy: checkpoint
learning_rate: 0.0002
logging_steps: 10
lora_alpha: 256
lora_dropout: 0.1
lora_r: 128
lora_target_linear: true
lr_scheduler: cosine
micro_batch_size: 1
model_type: AutoModelForCausalLM
num_epochs: 100
optimizer: adamw_bnb_8bit
output_dir: miner_id_24
pad_to_sequence_len: true
resize_token_embeddings_to_32x: false
sample_packing: false
saves_per_epoch: 2
sequence_len: 2048
tokenizer_type: GemmaTokenizerFast
train_on_inputs: false
trust_remote_code: true
val_set_size: 0.1
wandb_entity: ''
wandb_mode: online
wandb_project: Gradients-On-Demand
wandb_run: your_name
wandb_runid: default
warmup_ratio: 0.05
xformers_attention: true
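
Below is a minimal sketch (with a hypothetical record, not taken from the dataset) of how one JSON record is turned into a prompt/completion pair by the custom dataset format above: `field_instruction: smiles` fills `{instruction}`, and `field_output: molt5` is the target text the model learns to produce.

```python
# Minimal sketch of the dataset field mapping configured above.
# The record contents are hypothetical placeholders.
record = {
    "smiles": "CC(=O)OC1=CC=CC=C1C(=O)O",            # hypothetical SMILES input
    "molt5": "A hypothetical molecule description.",  # hypothetical target text
}

system_prompt = ""  # system_prompt: ''
prompt = "{instruction}".format(instruction=record["smiles"])  # format / no_input_format
completion = record["molt5"]  # the completion the model is trained on

print(system_prompt + prompt)
print(completion)
```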

f0d6caa9-89a9-4666-9a6d-c8cda2015281

This model is a LoRA fine-tune of unsloth/gemma-2b-it on the custom JSON dataset described in the config above (88cfea977fe74782_train_data.json). It achieves the following results on the evaluation set:

  • Loss: 1.9427

Model description

More information needed

Intended uses & limitations

More information needed
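
As a usage sketch (not an official example), the published LoRA adapter can be loaded on top of the base model with Transformers and PEFT. The SMILES prompt below is a hypothetical placeholder for the kind of input the adapter was trained on.

```python
# Usage sketch: load the base model and apply the published LoRA adapter with PEFT.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "unsloth/gemma-2b-it"
adapter_id = "willtensora/f0d6caa9-89a9-4666-9a6d-c8cda2015281"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)

# Hypothetical SMILES prompt; the adapter maps SMILES inputs to the molt5 target field.
prompt = "CC(=O)OC1=CC=CC=C1C(=O)O"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```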

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • total_train_batch_size: 8
  • total_eval_batch_size: 8
  • optimizer: AdamW (8-bit, bitsandbytes) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 12
  • num_epochs: 100
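
The total batch sizes above follow from the per-device settings; a quick arithmetic check (assuming one gradient-accumulation step, consistent with batch_size: 8 and micro_batch_size: 1 in the config):

```python
# Arithmetic check of the effective batch size from the settings above.
micro_batch_size = 1         # per-device train batch size
num_devices = 8              # multi-GPU
grad_accumulation_steps = 1  # assumed: batch_size (8) / (micro_batch_size * num_devices) = 1

total_train_batch_size = micro_batch_size * num_devices * grad_accumulation_steps
print(total_train_batch_size)  # 8, matching total_train_batch_size above
```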

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 0.05 | 1 | 3.5172 |
| 1.0375 | 1.0 | 20 | 0.9186 |
| 0.5932 | 2.0 | 40 | 0.9190 |
| 0.4433 | 3.0 | 60 | 0.9756 |
| 0.3115 | 4.0 | 80 | 0.9780 |
| 0.2432 | 5.0 | 100 | 1.0348 |
| 0.219 | 6.0 | 120 | 1.1386 |
| 0.1868 | 7.0 | 140 | 1.0399 |
| 0.1624 | 8.0 | 160 | 1.2174 |
| 0.2109 | 9.0 | 180 | 1.1489 |
| 0.1223 | 10.0 | 200 | 1.2047 |
| 0.1149 | 11.0 | 220 | 1.2123 |
| 0.114 | 12.0 | 240 | 1.2854 |
| 0.0914 | 13.0 | 260 | 1.3633 |
| 0.0823 | 14.0 | 280 | 1.2355 |
| 0.0901 | 15.0 | 300 | 1.2453 |
| 0.093 | 16.0 | 320 | 1.3146 |
| 0.077 | 17.0 | 340 | 1.4159 |
| 0.0797 | 18.0 | 360 | 1.3376 |
| 0.0839 | 19.0 | 380 | 1.4419 |
| 0.0506 | 20.0 | 400 | 1.3841 |
| 0.0582 | 21.0 | 420 | 1.3847 |
| 0.0644 | 22.0 | 440 | 1.3697 |
| 0.0524 | 23.0 | 460 | 1.4068 |
| 0.0602 | 24.0 | 480 | 1.3840 |
| 0.0597 | 25.0 | 500 | 1.4276 |
| 0.0371 | 26.0 | 520 | 1.5041 |
| 0.0448 | 27.0 | 540 | 1.4607 |
| 0.0494 | 28.0 | 560 | 1.4608 |
| 0.042 | 29.0 | 580 | 1.5975 |
| 0.0334 | 30.0 | 600 | 1.4700 |
| 0.0403 | 31.0 | 620 | 1.5470 |
| 0.043 | 32.0 | 640 | 1.5968 |
| 0.0349 | 33.0 | 660 | 1.5662 |
| 0.0412 | 34.0 | 680 | 1.6331 |
| 0.0263 | 35.0 | 700 | 1.6191 |
| 0.0249 | 36.0 | 720 | 1.6646 |
| 0.0365 | 37.0 | 740 | 1.4995 |
| 0.0176 | 38.0 | 760 | 1.7255 |
| 0.0426 | 39.0 | 780 | 1.5561 |
| 0.0174 | 40.0 | 800 | 1.6246 |
| 0.0259 | 41.0 | 820 | 1.7055 |
| 0.0182 | 42.0 | 840 | 1.6314 |
| 0.013 | 43.0 | 860 | 1.5924 |
| 0.0194 | 44.0 | 880 | 1.7000 |
| 0.0194 | 45.0 | 900 | 1.6371 |
| 0.0171 | 46.0 | 920 | 1.7760 |
| 0.0094 | 47.0 | 940 | 1.7117 |
| 0.0061 | 48.0 | 960 | 1.7486 |
| 0.004 | 49.0 | 980 | 1.7964 |
| 0.003 | 50.0 | 1000 | 1.8029 |
| 0.0047 | 51.0 | 1020 | 1.7653 |
| 0.0033 | 52.0 | 1040 | 1.7602 |
| 0.0028 | 53.0 | 1060 | 1.7846 |
| 0.0091 | 54.0 | 1080 | 1.7363 |
| 0.0009 | 55.0 | 1100 | 1.7427 |
| 0.0005 | 56.0 | 1120 | 1.7763 |
| 0.0003 | 57.0 | 1140 | 1.8004 |
| 0.0004 | 58.0 | 1160 | 1.8191 |
| 0.0004 | 59.0 | 1180 | 1.8343 |
| 0.0004 | 60.0 | 1200 | 1.8433 |
| 0.0002 | 61.0 | 1220 | 1.8534 |
| 0.0003 | 62.0 | 1240 | 1.8619 |
| 0.0003 | 63.0 | 1260 | 1.8702 |
| 0.0002 | 64.0 | 1280 | 1.8774 |
| 0.0002 | 65.0 | 1300 | 1.8829 |
| 0.0003 | 66.0 | 1320 | 1.8894 |
| 0.0003 | 67.0 | 1340 | 1.8937 |
| 0.0001 | 68.0 | 1360 | 1.8985 |
| 0.0001 | 69.0 | 1380 | 1.9014 |
| 0.0003 | 70.0 | 1400 | 1.9057 |
| 0.0 | 71.0 | 1420 | 1.9103 |
| 0.0001 | 72.0 | 1440 | 1.9126 |
| 0.0003 | 73.0 | 1460 | 1.9165 |
| 0.0002 | 74.0 | 1480 | 1.9191 |
| 0.0002 | 75.0 | 1500 | 1.9210 |
| 0.0003 | 76.0 | 1520 | 1.9238 |
| 0.0001 | 77.0 | 1540 | 1.9273 |
| 0.0002 | 78.0 | 1560 | 1.9279 |
| 0.0002 | 79.0 | 1580 | 1.9301 |
| 0.0002 | 80.0 | 1600 | 1.9313 |
| 0.0003 | 81.0 | 1620 | 1.9321 |
| 0.0001 | 82.0 | 1640 | 1.9346 |
| 0.0 | 83.0 | 1660 | 1.9355 |
| 0.0004 | 84.0 | 1680 | 1.9356 |
| 0.0 | 85.0 | 1700 | 1.9385 |
| 0.0003 | 86.0 | 1720 | 1.9385 |
| 0.0001 | 87.0 | 1740 | 1.9396 |
| 0.0002 | 88.0 | 1760 | 1.9398 |
| 0.0001 | 89.0 | 1780 | 1.9407 |
| 0.0001 | 90.0 | 1800 | 1.9418 |
| 0.0002 | 91.0 | 1820 | 1.9418 |
| 0.0002 | 92.0 | 1840 | 1.9414 |
| 0.0003 | 93.0 | 1860 | 1.9418 |
| 0.0 | 94.0 | 1880 | 1.9427 |
| 0.0002 | 95.0 | 1900 | 1.9436 |
| 0.0003 | 96.0 | 1920 | 1.9425 |
| 0.0002 | 97.0 | 1940 | 1.9429 |
| 0.0003 | 98.0 | 1960 | 1.9430 |
| 0.0001 | 99.0 | 1980 | 1.9433 |
| 0.0002 | 100.0 | 2000 | 1.9427 |

Framework versions

  • PEFT 0.13.2
  • Transformers 4.46.0
  • Pytorch 2.5.0+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1