---
library_name: transformers
license: apache-2.0
base_model: Trelis/SmolLM-135M-Instruct-layer-pruned-90M-raw
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: 99-v9
  results: []
---

# 99-v9

This model is a fine-tuned version of [Trelis/SmolLM-135M-Instruct-layer-pruned-90M-raw](https://huggingface.co/Trelis/SmolLM-135M-Instruct-layer-pruned-90M-raw) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.7495

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.002
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 4
- total_train_batch_size: 256
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.005
- lr_scheduler_warmup_steps: 89
- training_steps: 17894

### Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 0.6331        | 0.0500 | 894   | 0.6004          |
| 0.5667        | 0.0999 | 1788  | 0.5463          |
| 0.5423        | 0.1499 | 2682  | 0.5138          |
| 0.5749        | 0.1998 | 3576  | 0.7377          |
| 0.5378        | 0.2498 | 4470  | 0.7542          |
| 0.506         | 0.2998 | 5364  | 0.7902          |
| 0.5561        | 0.3497 | 6258  | 0.7810          |
| 0.5259        | 0.3997 | 7152  | 0.7914          |
| 0.5516        | 0.4496 | 8046  | 0.7611          |
| 0.5131        | 0.4996 | 8940  | 0.6860          |
| 0.5069        | 0.5496 | 9834  | 0.7247          |
| 0.4977        | 0.5995 | 10728 | 0.7375          |
| 0.4976        | 0.6495 | 11622 | 0.7436          |
| 0.5018        | 0.6995 | 12516 | 0.7520          |
| 0.537         | 0.7494 | 13410 | 0.7613          |
| 0.5018        | 0.7994 | 14304 | 0.6922          |
| 0.4891        | 0.8493 | 15198 | 0.7322          |
| 0.4808        | 0.8993 | 16092 | 0.7430          |
| 0.5231        | 0.9493 | 16986 | 0.7546          |
| 0.5103        | 0.9992 | 17880 | 0.7495          |

### Framework versions

- Transformers 4.44.2
- Pytorch 2.1.1+cu121
- Datasets 3.0.0
- Tokenizers 0.19.1
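The `trl` and `sft` tags together with the hyperparameters above suggest a TRL `SFTTrainer` run. The sketch below shows roughly how those settings map onto an `SFTConfig`; it is an assumption-laden reconstruction, not the actual training script. The dataset paths are placeholders (the card names no data), and the multi-GPU launch (8 devices via `accelerate` or `torchrun`) is likewise assumed.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: the card does not name the training data.
train_ds = load_dataset("your/dataset", split="train")
eval_ds = load_dataset("your/dataset", split="test")

args = SFTConfig(
    output_dir="99-v9",
    learning_rate=2e-3,
    per_device_train_batch_size=8,  # x 8 GPUs x 4 accumulation = 256 effective
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,
    lr_scheduler_type="linear",
    warmup_steps=89,                # ~0.005 of the 17,894 total steps
    max_steps=17894,
    seed=42,
    eval_strategy="steps",
    eval_steps=894,                 # matches the evaluation cadence in the table
)

trainer = SFTTrainer(
    model="Trelis/SmolLM-135M-Instruct-layer-pruned-90M-raw",
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
)
trainer.train()
```

Note that when both a warmup ratio and a warmup step count are set, as in the hyperparameter list above, the step count takes precedence in `transformers`.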
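Since the usage sections above are still marked "More information needed", here is a minimal inference sketch with `transformers`. The repo id `99-v9` is a placeholder; substitute the model's actual Hub path or a local checkpoint directory. Applying the chat template assumes this checkpoint inherits one from its SmolLM-Instruct base.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: replace "99-v9" with the actual Hub repo id or local path.
model_id = "99-v9"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the prompt with the chat template (assumed to be inherited
# from the SmolLM-Instruct base model's tokenizer).
messages = [{"role": "user", "content": "What is gravity?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

outputs = model.generate(inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```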