ashishdatta committed · Commit 38e8060 · Parent: f16df1b · README.md
---
language:
- en
license: other
tags:
- causal-lm
datasets:
- HuggingFaceH4/ultrachat_200k
- allenai/ultrafeedback_binarized_cleaned
- meta-math/MetaMathQA
- WizardLM/WizardLM_evol_instruct_V2_196k
- openchat/openchat_sharegpt4_dataset
- LDJnr/Capybara
- Intel/orca_dpo_pairs
- hkust-nlp/deita-10k-v0
- Anthropic/hh-rlhf
- glaiveai/glaive-function-calling-v2
extra_gated_fields:
  Name: text
  Email: text
  Country: text
  Organization or Affiliation: text
  I ALLOW Stability AI to email me about new model releases: checkbox
---
|
25 |
+
# `StableLM 2 12B Chat`
|
26 |
+
|
27 |
+
## Model Description
|
28 |
+
|
29 |
+
`Stable LM 2 12B Chat` is a 12 billion parameter instruction tuned language model trained on a mix of publicly available datasets and synthetic datasets, utilizing [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290).
|
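For reference, DPO (as defined in the linked paper) tunes the policy $\pi_\theta$ directly on preference pairs $(x, y_w, y_l)$, where $y_w$ is the preferred response, against a frozen reference model $\pi_{\mathrm{ref}}$, minimizing:

$$
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}}) = -\,\mathbb{E}_{(x, y_w, y_l) \sim \mathcal{D}} \left[ \log \sigma \left( \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)} \right) \right]
$$

where $\beta$ controls the strength of the implicit KL constraint toward the reference model.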
GGUF files were generated with the [b2684](https://github.com/ggerganov/llama.cpp/releases/tag/b2684) release of llama.cpp.
## Usage

`StableLM 2 12B Chat` uses the ChatML instruction format. For a quick test with the `llama.cpp` `main` example:

```bash
./main -m stablelm-2-12b-q4_k_m.gguf -p "Implement snake game using pygame"
```
39 |
+
|
40 |
+
## Model Details
|
41 |
+
|
42 |
+
* **Developed by**: [Stability AI](https://stability.ai/)
|
43 |
+
* **Model type**: `StableLM 2 12B Chat` model is an auto-regressive language model based on the transformer decoder architecture.
|
44 |
+
* **Language(s)**: English
|
45 |
+
* **Paper**: [Stable LM 2 Chat Technical Report]((https://arxiv.org/abs/2402.17834)
|
46 |
+
* **Library**: [Alignment Handbook](https://github.com/huggingface/alignment-handbook.git)
|
47 |
+
* **Finetuned from model**:
|
48 |
+
* **License**: [StabilityAI Non-Commercial Research Community License](https://huggingface.co/stabilityai/stablelm-2-zephyr-1_6b/blob/main/LICENSE). If you want to use this model for your commercial products or purposes, please contact us [here](https://stability.ai/contact) to learn more.
|
49 |
+
* **Contact**: For questions and comments about the model, please email `[email protected]`.
|
50 |
+
|
### Training Dataset

The dataset comprises a mixture of open, large-scale datasets available on the [HuggingFace Hub](https://huggingface.co/datasets) as well as an internal safety dataset:

1. SFT Datasets
- HuggingFaceH4/ultrachat_200k
- meta-math/MetaMathQA
- WizardLM/WizardLM_evol_instruct_V2_196k
- Open-Orca/SlimOrca
- openchat/openchat_sharegpt4_dataset
- LDJnr/Capybara
- hkust-nlp/deita-10k-v0
- teknium/OpenHermes-2.5
- glaiveai/glaive-function-calling-v2

2. Safety Datasets:
- Anthropic/hh-rlhf
- Internal Safety Dataset

3. Preference Datasets:
- argilla/dpo-mix-7k
## Performance

### MT-Bench

| Model | Parameters | MT Bench (Inflection-corrected) |
|---------------------------------------|------------|---------------------------------|
| mistralai/Mixtral-8x7B-Instruct-v0.1 | 13B/47B | 8.48 ± 0.06 |
| stabilityai/stablelm-2-12b-chat | 12B | 8.15 ± 0.08 |
| Qwen/Qwen1.5-14B-Chat | 14B | 7.95 ± 0.10 |
| HuggingFaceH4/zephyr-7b-gemma-v0.1 | 8.5B | 7.82 ± 0.03 |
| mistralai/Mistral-7B-Instruct-v0.2 | 7B | 7.48 ± 0.02 |
| meta-llama/Llama-2-70b-chat-hf | 70B | 7.29 ± 0.05 |
|
85 |
+
### OpenLLM Leaderboard
|
86 |
+
|
87 |
+
| Model | Parameters | Average | ARC Challenge (25-shot) | HellaSwag (10-shot) | MMLU (5-shot) | TruthfulQA (0-shot) | Winogrande (5-shot) | GSM8K (5-shot) |
|
88 |
+
| -------------------------------------- | ---------- | ------- | ---------------------- | ------------------- | ------------- | ------------------- | ------------------- | -------------- |
|
89 |
+
| mistralai/Mixtral-8x7B-Instruct-v0.1 | 13B/47B | 72.71 | 70.14 | 87.55 | 71.40 | 64.98 | 81.06 | 61.11 |
|
90 |
+
| stabilityai/stablelm-2-12b-chat | 12B | 68.45 | 65.02 | 86.06 | 61.14 | 62.00 | 78.77 | 57.70 |
|
91 |
+
| Qwen/Qwen1.5-14B | 14B | 66.70 | 56.57 | 81.08 | 69.36 | 52.06 | 73.48 | 67.63 |
|
92 |
+
| mistralai/Mistral-7B-Instruct-v0.2 | 7B | 65.71 | 63.14 | 84.88 | 60.78 | 60.26 | 77.19 | 40.03 |
|
93 |
+
| HuggingFaceH4/zephyr-7b-gemma-v0.1 | 8.5B | 62.41 | 58.45 | 83.48 | 60.68 | 52.07 | 74.19 | 45.56 |
|
94 |
+
| Qwen/Qwen1.5-14B-Chat | 14B | 62.37 | 58.79 | 82.33 | 68.52 | 60.38 | 73.32 | 30.86 |
|
95 |
+
| google/gemma-7b | 8.5B | 63.75 | 61.09 | 82.20 | 64.56 | 44.79 | 79.01 | 50.87 |
|
96 |
+
| stabilityai/stablelm-2-12b | 12B | 63.53 | 58.45 | 84.33 | 62.09 | 48.16 | 78.10 | 56.03 |
|
97 |
+
| mistralai/Mistral-7B-v0.1 | 7B | 60.97 | 59.98 | 83.31 | 64.16 | 42.15 | 78.37 | 37.83 |
|
98 |
+
| meta-llama/Llama-2-13b-hf | 13B | 55.69 | 59.39 | 82.13 | 55.77 | 37.38 | 76.64 | 22.82 |
|
99 |
+
| meta-llama/Llama-2-13b-chat-hf | 13B | 54.92 | 59.04 | 81.94 | 54.64 | 41.12 | 74.51 | 15.24 |
|
100 |
+
|
101 |
+
## Use and Limitations
|
102 |
+
|
103 |
+
### Intended Use
|
104 |
+
|
105 |
+
The model is intended to be used in chat-like applications. Developers must evaluate the model for safety performance in their specific use case. Read more about [safety and limitations](#limitations-and-bias) below.
|
106 |
+
|
107 |
+
### Limitations and Bias
|
108 |
+
|
109 |
+
We strongly recommend pairing this model with an input and output classifier to prevent harmful responses.
|
110 |
+
Using this model will require guardrails around your inputs and outputs to ensure that any outputs returned are not hallucinations.
|
111 |
+
Additionally, as each use case is unique, we recommend running your own suite of tests to ensure proper performance of this model.
|
112 |
+
Finally, do not use the models if they are unsuitable for your application, or for any applications that may cause deliberate or unintentional harm to others.
|
113 |
+
|
## How to Cite

```bibtex
@article{bellagente2024stable,
  title={Stable LM 2 1.6B Technical Report},
  author={Bellagente, Marco and Tow, Jonathan and Mahan, Dakota and Phung, Duy and Zhuravinskyi, Maksym and Adithyan, Reshinth and Baicoianu, James and Brooks, Ben and Cooper, Nathan and Datta, Ashish and others},
  journal={arXiv preprint arXiv:2402.17834},
  year={2024}
}
```