---
library_name: transformers
base_model:
- meta-llama/Llama-3.3-70B-Instruct
tags:
- generated_from_trainer
model-index:
- name: 70B-L3.3-Cirrus-x1
  results: []
license: llama3.3
---
*my mental when things do not go well*
# 70B-L3.3-Cirrus-x1
I quite liked it after messing around with it. It uses the same data composition as Freya, applied differently.
It has occasional brainfarts, which are fixed with a regen; that's the price for more creative outputs.
## Recommended Model Settings

Look, I just use these, and they work well enough. I don't even know how DRY or other meme samplers work. Your system prompt matters more anyway.
- Prompt Format: Llama-3-Instruct
- Temperature: 1.1
- min_p: 0.05
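For anyone curious what those two sampler settings actually do: temperature rescales the logits before softmax, and min_p drops any token whose probability falls below min_p times the top token's probability. A minimal NumPy sketch of that filtering step (not any backend's actual implementation; the function name is made up):

```python
import numpy as np

def sample_filter(logits, temperature=1.1, min_p=0.05):
    """Apply temperature, then min_p filtering, and return a renormalized distribution."""
    scaled = np.asarray(logits, dtype=float) / temperature
    # Softmax, shifted by the max for numerical stability
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    # min_p: keep only tokens with prob >= min_p * (top token's probability)
    keep = probs >= min_p * probs.max()
    filtered = np.where(keep, probs, 0.0)
    return filtered / filtered.sum()

# A token far below the top candidates is filtered out entirely,
# while the remaining mass is renormalized over the survivors.
dist = sample_filter([10.0, 9.5, 0.0])
```

A higher temperature flattens the distribution (more creative, more brainfarts), and min_p then prunes the long tail relative to the best candidate, which is why the two are usually tuned together.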
Training took ~22 hours in total on an 8xH100 node.
Then, ~3 hours were spent merging checkpoints and experimenting with the model on a 2xH200 node.
Contact: https://sao10k.carrd.co/