Model Details
Model Description
Trismegistus for Llama 3.2 1B: a fine-tune of Llama 3.2 1B on the Trismegistus dataset. Credit to teknium for the dataset and the original model.
Model Sources
- Base model: Llama 3.2 1B
Uses
- Use for esoteric joy.
Bias, Risks, and Limitations
May be biased as hell.
Recommendation:
- Don't take it personally.
How to Get Started with the Model
- Run it.
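For anyone who wants slightly more than "run it", here is a minimal sketch using the Hugging Face `transformers` library. It makes several assumptions: `MODEL_ID` is a placeholder and not this model's real repository path, the repository is assumed to hold weights that `AutoModelForCausalLM` can load directly, and no particular prompt template is applied because the card does not specify one.

```python
# Placeholder repo id (hypothetical): replace with this model's actual Hugging Face path.
MODEL_ID = "your-username/trismegistus-llama-3.2-1b"


def generate(prompt: str, model_id: str = MODEL_ID, max_new_tokens: int = 128) -> str:
    """Load the model and return the decoded completion for `prompt`."""
    # Imported lazily so this file can be read or imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize the raw prompt; no chat template is assumed here.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # The decoded text includes the prompt followed by the generated continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("What is the principle of correspondence?")` downloads the weights on first use, so expect a wait. Loading the model on every call is wasteful; for repeated use, hoist the tokenizer and model out of the function.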
Training Data
Training Hyperparameters
- LoRA fine-tuning in 4-bit, using PEFT
Speeds, Sizes, Times
- global_step: 16905
- training_loss: 1.169401215731269
- train_runtime: 21882.4747 (seconds, about 6.1 hours)
- train_samples_per_second: 3.09
- train_steps_per_second: 0.773
- total_flos: 4.437195883099177e+17
- train_loss: 1.169401215731269
- epoch: 5.0
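The numbers above hang together, and a few useful quantities fall out of them. The sketch below is a quick sanity check, assuming `train_runtime` is in seconds (the Hugging Face Trainer convention). The effective batch size and per-epoch example count it derives are inferences from the logged throughput, not values stated by the author.

```python
# Values copied from the training log.
global_step = 16905
epochs = 5.0
train_runtime_s = 21882.4747   # assumed to be seconds
samples_per_second = 3.09
steps_per_second = 0.773

# Optimizer steps taken in each epoch.
steps_per_epoch = global_step / epochs

# Wall-clock training time in hours.
runtime_hours = train_runtime_s / 3600

# Samples consumed per optimizer step, i.e. the inferred effective batch size.
samples_per_step = samples_per_second / steps_per_second

# Inferred number of training examples seen per epoch.
samples_per_epoch = steps_per_epoch * round(samples_per_step)

print(f"steps per epoch:       {steps_per_epoch:.0f}")
print(f"runtime (hours):       {runtime_hours:.2f}")
print(f"samples per step:      {samples_per_step:.2f}")
print(f"samples per epoch (~): {samples_per_epoch:.0f}")
```

Under these assumptions the run took roughly six hours, with an effective batch size of about 4 and on the order of 13.5k training examples per epoch.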
Evaluation and Metrics
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| arc_challenge | 1 | none | 0 | acc | ↑ | 0.3345 | ± | 0.0138 |
| | | none | 0 | acc_norm | ↑ | 0.3695 | ± | 0.0141 |
| arc_easy | 1 | none | 0 | acc | ↑ | 0.6044 | ± | 0.0100 |
| | | none | 0 | acc_norm | ↑ | 0.5694 | ± | 0.0102 |
| boolq | 2 | none | 0 | acc | ↑ | 0.6410 | ± | 0.0084 |
| hellaswag | 1 | none | 0 | acc | ↑ | 0.4400 | ± | 0.0050 |
| | | none | 0 | acc_norm | ↑ | 0.5728 | ± | 0.0049 |
| openbookqa | 1 | none | 0 | acc | ↑ | 0.2260 | ± | 0.0187 |
| | | none | 0 | acc_norm | ↑ | 0.3540 | ± | 0.0214 |
| piqa | 1 | none | 0 | acc | ↑ | 0.7002 | ± | 0.0107 |
| | | none | 0 | acc_norm | ↑ | 0.7024 | ± | 0.0107 |
| winogrande | 1 | none | 0 | acc | ↑ | 0.5785 | ± | 0.0139 |
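To read the ± column, one option is to turn each score and its standard error into an approximate 95% confidence interval. The sketch below assumes a normal approximation (z = 1.96), which is a common convention and not something the card itself specifies. The helper name `confidence_interval` is made up for illustration.

```python
def confidence_interval(value: float, stderr: float, z: float = 1.96) -> tuple[float, float]:
    """Return an approximate (low, high) interval around `value` under a normal approximation."""
    half_width = z * stderr
    return (value - half_width, value + half_width)


# A few (accuracy, stderr) pairs copied from the table.
results = {
    "arc_challenge": (0.3345, 0.0138),
    "boolq": (0.6410, 0.0084),
    "winogrande": (0.5785, 0.0139),
}

for task, (acc, stderr) in results.items():
    low, high = confidence_interval(acc, stderr)
    print(f"{task}: {acc:.4f} (approx. 95% CI {low:.4f} to {high:.4f})")
```

Differences between models that are smaller than these intervals should be treated with caution.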
Environmental Impact
Will steal your horse and kill your cat.