Exllamav2 quant (exl2 / 8.0 bpw) made with ExLlamaV2 v0.1.1

Other EXL2 quants:

Quant Model Size lm_head
2.2
7777 MB
6
2.5
8521 MB
6
3.0
9944 MB
6
3.5
11355 MB
6
3.75
12070 MB
6
4.0
12785 MB
6
4.25
13504 MB
6
5.0
15634 MB
6
6.0
18589 MB
8
6.5
19948 MB
8
8.0
24070 MB
8

GGUF

Experimental RP-oriented MoE, the idea was to get a model that would be equal to or better than Mixtral 8x7B and it's finetunes in RP/ERP tasks.

There's:

Llama 3 SnowStorm v1.15B 4x8B

base_model: Sao10K_L3-8B-Stheno-v3.1
gate_mode: random
dtype: bfloat16
experts_per_token: 2
experts:
  - source_model: Nitral-AI_Poppy_Porpoise-1.0-L3-8B
  - source_model: NeverSleep_Llama-3-Lumimaid-8B-v0.1-OAS
  - source_model: openlynn_Llama-3-Soliloquy-8B-v2
  - source_model: Sao10K_L3-8B-Stheno-v3.1

Models used

Difference(from SnowStorm v1.0)

Vision

llama3_mmproj

image/png

Prompt format: Llama 3

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric Value
Avg. 68.01
AI2 Reasoning Challenge (25-Shot) 60.67
HellaSwag (10-Shot) 81.60
MMLU (5-Shot) 68.12
TruthfulQA (0-shot) 51.69
Winogrande (5-shot) 76.56
GSM8k (5-shot) 69.45
Downloads last month
13
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Evaluation results