Behold, one of the first fine-tunes of Mistral's 7B v0.2 base model. SatoshiN was trained for 4 epochs on a diverse custom dataset at a 2e-4 learning rate (cosine schedule), followed by a polishing pass over the same dataset at a 1e-4 learning rate (linear schedule). The result is a friendly assistant that isn't afraid to ask questions and gather additional information before responding to user prompts.

| | SatoshiN | Base model |
|---|---|---|
| Wikitext perplexity | 6.27 | 5.4 |
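For reference, perplexity is just the exponential of the mean negative log-likelihood per token on the evaluation text; lower means the model is less "surprised" by Wikitext. A minimal sketch (the per-token probabilities below are made-up, purely illustrative values):

```python
import math

def perplexity(token_log_probs):
    # Perplexity = exp(mean negative log-likelihood per token).
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# Hypothetical per-token probabilities the model assigned to a held-out text:
log_probs = [math.log(p) for p in (0.25, 0.10, 0.50, 0.20)]
ppl = perplexity(log_probs)  # equals the geometric mean of 1/p per token
```

A perfect model (probability 1 on every token) would score a perplexity of 1; anything worse scores higher.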

**Like many SOTA models, this one runs a bit hot; try temperatures below 0.5 if you see any nonsense.**
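To see why lowering the temperature tames a "hot" model: sampling divides the logits by the temperature before the softmax, so values below 1.0 sharpen the distribution toward the top token and suppress low-probability (nonsense) continuations. A self-contained sketch with illustrative logits:

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    # Divide logits by temperature before softmax:
    # T < 1 sharpens the distribution, T > 1 flattens it.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]          # made-up next-token logits
hot = softmax_with_temperature(logits, 1.0)
cool = softmax_with_temperature(logits, 0.4)
# The top token's probability rises as the temperature drops,
# so unlikely tokens are sampled less often.
```

With most inference stacks this is just a sampling parameter (e.g. `temperature=0.4`), not something you implement yourself.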

Model size: 7.24B params (Safetensors, BF16)