# BL-pythia-31m-simplepile-lite-2048-scratch
This model was trained from scratch using the configuration of EleutherAI/pythia-31m (the training dataset is not specified in the card metadata). It achieves the following results on the evaluation set (a quick perplexity conversion follows the metrics):
- Loss: 3.9891
- Accuracy: 0.3498
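For intuition, the reported evaluation loss (mean cross-entropy in nats per token) can be converted to perplexity:

```python
import math

eval_loss = 3.9891  # final evaluation loss reported above
print(f"perplexity ~= {math.exp(eval_loss):.1f}")  # ~= 54.0
```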
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):
- learning_rate: 0.0005
- train_batch_size: 2
- eval_batch_size: 1
- seed: 80085
- gradient_accumulation_steps: 64
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.99) and epsilon=1e-07
- lr_scheduler_type: inverse_sqrt
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 2.0
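Below is a minimal, hypothetical sketch of how these settings could be expressed with `transformers.TrainingArguments`. The actual training script is not included in this card, so anything not listed above (e.g. `output_dir`, the single-device assumption) is an assumption.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported hyperparameters;
# the real training script is not part of this card.
training_args = TrainingArguments(
    output_dir="BL-pythia-31m-simplepile-lite-2048-scratch",  # assumed output path
    learning_rate=5e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=1,
    seed=80085,
    gradient_accumulation_steps=64,  # 2 x 64 = 128 total train batch size (assuming one device)
    lr_scheduler_type="inverse_sqrt",
    warmup_ratio=0.05,
    num_train_epochs=2.0,
    adam_beta1=0.9,
    adam_beta2=0.99,
    adam_epsilon=1e-7,
)
```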
### Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
7.4089 | 0.07 | 100 | 7.3885 | 0.1133 |
6.2774 | 0.13 | 200 | 6.2091 | 0.1621 |
5.7019 | 0.2 | 300 | 5.7450 | 0.1890 |
5.4922 | 0.27 | 400 | 5.4697 | 0.2080 |
5.233 | 0.33 | 500 | 5.2846 | 0.2195 |
5.0523 | 0.4 | 600 | 5.1479 | 0.2296 |
4.9396 | 0.47 | 700 | 5.0391 | 0.2376 |
4.7633 | 0.53 | 800 | 4.9366 | 0.2458 |
4.7516 | 0.6 | 900 | 4.8339 | 0.2559 |
4.5937 | 0.67 | 1000 | 4.7286 | 0.2676 |
4.5079 | 0.73 | 1100 | 4.6293 | 0.2798 |
4.4608 | 0.8 | 1200 | 4.5433 | 0.2903 |
4.3426 | 0.87 | 1300 | 4.4719 | 0.2988 |
4.1722 | 0.93 | 1400 | 4.4089 | 0.3057 |
4.1655 | 1.0 | 1500 | 4.3585 | 0.3107 |
4.0927 | 1.07 | 1600 | 4.3101 | 0.3161 |
4.1439 | 1.13 | 1700 | 4.2714 | 0.3206 |
4.0064 | 1.2 | 1800 | 4.2330 | 0.3249 |
4.0633 | 1.27 | 1900 | 4.2015 | 0.3281 |
3.9948 | 1.33 | 2000 | 4.1702 | 0.3311 |
3.9389 | 1.4 | 2100 | 4.1439 | 0.3338 |
3.8833 | 1.47 | 2200 | 4.1200 | 0.3367 |
3.8411 | 1.53 | 2300 | 4.0949 | 0.3395 |
3.8481 | 1.6 | 2400 | 4.0764 | 0.3408 |
3.8397 | 1.67 | 2500 | 4.0578 | 0.3420 |
3.8897 | 1.73 | 2600 | 4.0383 | 0.3440 |
3.8785 | 1.8 | 2700 | 4.0206 | 0.3459 |
3.8126 | 1.87 | 2800 | 4.0044 | 0.3478 |
3.783 | 1.93 | 2900 | 3.9891 | 0.3498 |
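The Validation Loss and Accuracy columns are the usual causal-LM evaluation metrics: mean cross-entropy over the shifted target tokens and token-level next-token accuracy. A minimal sketch of how such metrics are computed from model outputs (illustrative only; the actual evaluation code is not included in this card):

```python
import torch
import torch.nn.functional as F

def causal_lm_eval_metrics(logits: torch.Tensor, labels: torch.Tensor):
    # logits: (batch, seq_len, vocab_size); labels: (batch, seq_len).
    # The prediction at position t is scored against the token at t + 1.
    shift_logits = logits[:, :-1, :]
    shift_labels = labels[:, 1:]
    loss = F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
    )
    preds = shift_logits.argmax(dim=-1)
    accuracy = (preds == shift_labels).float().mean()
    return loss.item(), accuracy.item()
```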
### Framework versions
- Transformers 4.33.1
- Pytorch 2.2.0.dev20230907+cu118
- Datasets 2.14.5
- Tokenizers 0.13.3
## Open LLM Leaderboard Evaluation Results
Detailed results can be found here.
Metric | Value |
---|---|
Avg. | 24.7 |
ARC (25-shot) | 21.59 |
HellaSwag (10-shot) | 25.79 |
MMLU (5-shot) | 24.99 |
TruthfulQA (0-shot) | 50.62 |
Winogrande (5-shot) | 48.62 |
GSM8K (5-shot) | 0.0 |
DROP (3-shot) | 1.32 |
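The Avg. row is the arithmetic mean of the seven benchmark scores:

```python
scores = [21.59, 25.79, 24.99, 50.62, 48.62, 0.0, 1.32]
print(round(sum(scores) / len(scores), 1))  # 24.7
```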