---
library_name: transformers
base_model:
- meta-llama/Llama-3.3-70B-Instruct
tags:
- generated_from_trainer
model-index:
- name: 70B-L3.3-Cirrus-x1
  results: []
license: llama3.3
---
|
|
|
![yeah](https://huggingface.co/Sao10K/70B-L3.3-mhnnn-x1/resolve/main/Huh.jpg)

*my mental when things do not go well*
|
|
|
# 70B-L3.3-Cirrus-x1
|
|
|
I quite liked it after messing around. Same data composition as Freya, applied differently.
|
|
|
Has occasional brainfarts that are fixed with a regen; that's the price for more creative outputs.
|
|
|
Recommended Model Settings | *Look, I just use these, they work fine enough. I don't even know how DRY or other meme samplers work. Your system prompt matters more anyway.*
|
```
Prompt Format: Llama-3-Instruct
Temperature: 1.1
min_p: 0.05
```
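
Below is a minimal sketch of how those settings could be applied with `transformers`; it is not from the original card. The repo id (`Sao10K/70B-L3.3-Cirrus-x1`), dtype, and example prompt are assumptions, and `min_p` sampling needs a reasonably recent `transformers` release.

```python
# A minimal usage sketch, not from the original card: loading the model and
# applying the settings above. The repo id, dtype, and example messages are
# assumptions; adjust them to your setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Sao10K/70B-L3.3-Cirrus-x1"  # assumed repo id, taken from the card title
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Llama-3-Instruct prompt format via the tokenizer's bundled chat template
messages = [
    {"role": "system", "content": "You are a creative writing assistant."},
    {"role": "user", "content": "Write the opening scene of a slow-burn mystery."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Recommended samplers from above: Temperature 1.1, min_p 0.05
output_ids = model.generate(
    input_ids,
    do_sample=True,
    temperature=1.1,
    min_p=0.05,
    max_new_tokens=512,
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```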
|
|
|
```
Training time in total was ~22 Hours on an 8xH100 Node.
Then, ~3 Hours were spent merging checkpoints and on model experimentation on a 2xH200 Node.
```
|
|
|
https://sao10k.carrd.co/ for contact.
|
|
|
---
|
|
|
|