abhinavkulkarni committed
Commit 58fe905
1 Parent(s): 5da5f16

Update README.md

Files changed (1):
  1. README.md +36 -69
README.md CHANGED
@@ -1,27 +1,24 @@
  ---
- license: cc-by-nc-sa-4.0
  language:
  - en
- library_name: transformers
- pipeline_tag: text-generation
  tags:
- - Orca
  - AWQ
  inference: false
  ---
 
- # orca_mini_v2_13b
- An **Uncensored** LLaMA-13b model built in collaboration with [Eric Hartford](https://huggingface.co/ehartford), trained on explain-tuned datasets created using instructions and inputs from the WizardLM, Alpaca & Dolly-V2 datasets, applying the Orca Research Paper's dataset construction approaches.
 
  This model is a 4-bit 128 group size AWQ quantized model. For more information about AWQ quantization, please click [here](https://github.com/mit-han-lab/llm-awq).
 
  ## Model Date
 
- July 8, 2023
 
  ## Model License
 
- Please refer to the original Orca Mini v2 model license ([link](https://huggingface.co/psmathur/orca_mini_v2_13b)).
 
  Please refer to the AWQ quantization license ([link](https://github.com/mit-han-lab/llm-awq/blob/main/LICENSE)).
@@ -29,6 +26,8 @@ Please refer to the AWQ quantization license
 
  This model was successfully tested on CUDA driver v530.30.02 and runtime v11.7 with Python v3.10.11. Please note that AWQ requires NVIDIA GPUs with compute capability of 8.0 or higher.
 
  ## How to Use
 
  ```bash
@@ -47,7 +46,7 @@ from transformers import AutoModelForCausalLM, AutoConfig, AutoTokenizer
  from accelerate import init_empty_weights, load_checkpoint_and_dispatch
  from huggingface_hub import snapshot_download
 
- model_name = "psmathur/orca_mini_v2_13b"
 
  # Config
  config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)
@@ -62,10 +61,10 @@ q_config = {
      "q_group_size": 128,
  }
 
- load_quant = snapshot_download('abhinavkulkarni/psmathur-orca_mini_v2_13b-w4-g128-awq')
 
  with init_empty_weights():
-     model = AutoModelForCausalLM.from_pretrained(model_name, config=config,
          torch_dtype=torch.float16, trust_remote_code=True)
 
  real_quantize_model_weight(model, w_bit=w_bit, q_config=q_config, init_only=True)
@@ -93,81 +92,49 @@ print(tokenizer.decode(output[0], skip_special_tokens=True))
 
  This evaluation was done using [LM-Eval](https://github.com/EleutherAI/lm-evaluation-harness).
 
- [orca_mini_v2_13b](https://huggingface.co/psmathur/orca_mini_v2_13b)
 
  | Task   |Version| Metric        | Value |   |Stderr|
  |--------|------:|---------------|------:|---|------|
- |wikitext|      1|word_perplexity|23.8997|   |      |
- |        |       |byte_perplexity| 1.8104|   |      |
- |        |       |bits_per_byte  | 0.8563|   |      |
 
- [orca_mini_v2_13b (4-bit 128-group AWQ)](https://huggingface.co/abhinavkulkarni/psmathur-orca_mini_v2_13b-w4-g128-awq)
 
  | Task   |Version| Metric        | Value |   |Stderr|
  |--------|------:|---------------|------:|---|------|
- |wikitext|      1|word_perplexity|27.4695|   |      |
- |        |       |byte_perplexity| 1.8581|   |      |
- |        |       |bits_per_byte  | 0.8938|   |      |
 
  ## Acknowledgements
 
- If you found `orca_mini_v2_13b` useful in your research or applications, please kindly cite using the following BibTeX:
-
- ```
- @misc{orca_mini_v2_13b,
-   author = {Pankaj Mathur},
-   title = {orca_mini_v2_13b: An explain-tuned LLaMA-13b model on uncensored wizardlm, alpaca, & dolly datasets},
-   year = {2023},
-   publisher = {GitHub, HuggingFace},
-   journal = {GitHub repository, HuggingFace repository},
-   howpublished = {\url{https://huggingface.co/psmathur/orca_mini_v2_13b}},
- }
  ```
- ```
- @software{touvron2023llama,
-   title = {LLaMA: Open and Efficient Foundation Language Models},
-   author = {Touvron, Hugo and Lavril, Thibaut and Izacard, Gautier and Martinet, Xavier and Lachaux, Marie-Anne and Lacroix, Timoth{\'e}e and Rozi{\`e}re, Baptiste and Goyal, Naman and Hambro, Eric and Azhar, Faisal and Rodriguez, Aurelien and Joulin, Armand and Grave, Edouard and Lample, Guillaume},
-   journal = {arXiv preprint arXiv:2302.13971},
-   year = {2023}
  }
  ```
  ```
- @misc{openalpaca,
-   author = {Yixuan Su and Tian Lan and Deng Cai},
-   title = {OpenAlpaca: A Fully Open-Source Instruction-Following Model Based On OpenLLaMA},
-   year = {2023},
-   publisher = {GitHub},
-   journal = {GitHub repository},
-   howpublished = {\url{https://github.com/yxuansu/OpenAlpaca}},
  }
  ```
  ```
- @misc{alpaca,
-   author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
-   title = {Stanford Alpaca: An Instruction-following LLaMA model},
-   year = {2023},
-   publisher = {GitHub},
-   journal = {GitHub repository},
-   howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
- }
- ```
- ```
- @online{DatabricksBlog2023DollyV2,
-   author = {Mike Conover and Matt Hayes and Ankit Mathur and Jianwei Xie and Jun Wan and Sam Shah and Ali Ghodsi and Patrick Wendell and Matei Zaharia and Reynold Xin},
-   title = {Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM},
-   year = {2023},
-   url = {https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm},
-   urldate = {2023-06-30}
- }
- ```
- ```
- @misc{xu2023wizardlm,
-   title = {WizardLM: Empowering Large Language Models to Follow Complex Instructions},
-   author = {Can Xu and Qingfeng Sun and Kai Zheng and Xiubo Geng and Pu Zhao and Jiazhan Feng and Chongyang Tao and Daxin Jiang},
-   year = {2023},
-   eprint = {2304.12244},
-   archivePrefix = {arXiv},
-   primaryClass = {cs.CL}
  }
  ```
 
 
  ---
+ license: cc
  language:
  - en
  tags:
  - AWQ
  inference: false
  ---
 
+ # VMware/open-llama-13B-open-instruct (4-bit 128g AWQ Quantized)
+ [Instruction-tuned version](https://huggingface.co/VMware/open-llama-13b-open-instruct) of the fully trained [Open LLaMA 13B](https://huggingface.co/openlm-research/open_llama_13b) model.
 
  This model is a 4-bit 128 group size AWQ quantized model. For more information about AWQ quantization, please click [here](https://github.com/mit-han-lab/llm-awq).
 
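To make "4-bit, 128 group size" concrete, here is a small sketch of plain groupwise zero-point quantization. This is not AWQ itself (AWQ additionally computes activation-aware per-channel scales before rounding); the function name and tensor shapes below are illustrative assumptions.

```python
import torch

def pseudo_quantize(w: torch.Tensor, n_bit: int = 4, group_size: int = 128) -> torch.Tensor:
    # Split the weights into groups of `group_size` values
    shape = w.shape
    w = w.reshape(-1, group_size)
    # One scale and zero-point per group (asymmetric quantization)
    q_max = 2 ** n_bit - 1
    w_min = w.amin(dim=1, keepdim=True)
    w_max = w.amax(dim=1, keepdim=True)
    scale = (w_max - w_min).clamp(min=1e-5) / q_max
    zero = (-w_min / scale).round()
    # Round to 4-bit integers, then dequantize to expose the rounding error
    q = (w / scale + zero).round().clamp(0, q_max)
    return ((q - zero) * scale).reshape(shape)

w = torch.randn(4096, 4096)
print((w - pseudo_quantize(w)).abs().mean())  # typical reconstruction error
```

Storing one fp16 scale and zero-point per 128 weights is what keeps the overhead of the 4-bit format small.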
  ## Model Date
 
+ July 5, 2023
 
  ## Model License
 
+ Please refer to the original Open LLaMA model license ([link](https://huggingface.co/VMware/open-llama-13b-open-instruct)).
 
  Please refer to the AWQ quantization license ([link](https://github.com/mit-han-lab/llm-awq/blob/main/LICENSE)).
 
  This model was successfully tested on CUDA driver v530.30.02 and runtime v11.7 with Python v3.10.11. Please note that AWQ requires NVIDIA GPUs with compute capability of 8.0 or higher.
 
+ For Docker users, the `nvcr.io/nvidia/pytorch:23.06-py3` image ships CUDA runtime v12.1 but otherwise matches the configuration above and has also been verified to work; a sample invocation is sketched below.
+
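A minimal sketch of such an invocation, assuming you want GPU access and a reusable Hugging Face cache inside the container (the mount path and flags are assumptions, not part of the original card):

```bash
# Start the NVIDIA PyTorch container with all GPUs visible;
# the cache mount is optional but avoids re-downloading weights
docker run --gpus all --rm -it \
    -v $HOME/.cache/huggingface:/root/.cache/huggingface \
    nvcr.io/nvidia/pytorch:23.06-py3
```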
  ## How to Use
 
  ```bash
 
  from accelerate import init_empty_weights, load_checkpoint_and_dispatch
  from huggingface_hub import snapshot_download
 
+ model_name = "VMware/open-llama-13b-open-instruct"
 
  # Config
  config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)
 
      "q_group_size": 128,
  }
 
+ load_quant = snapshot_download('abhinavkulkarni/open-llama-13b-open-instruct-w4-g128-awq')
 
  with init_empty_weights():
+     model = AutoModelForCausalLM.from_config(config=config,
          torch_dtype=torch.float16, trust_remote_code=True)
 
  real_quantize_model_weight(model, w_bit=w_bit, q_config=q_config, init_only=True)
 
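Since the diff above only shows the changed fragments, here is a hedged end-to-end sketch of the load-and-generate flow this card describes. It assumes `real_quantize_model_weight` comes from the `llm-awq` repository's `awq.quantize.quantizer` module and that the downloaded snapshot directory can be passed straight to `load_checkpoint_and_dispatch`; the prompt and generation settings are illustrative.

```python
import torch
from accelerate import init_empty_weights, load_checkpoint_and_dispatch
from awq.quantize.quantizer import real_quantize_model_weight  # from mit-han-lab/llm-awq
from huggingface_hub import snapshot_download
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_name = "VMware/open-llama-13b-open-instruct"

# Build an empty fp16 model skeleton from the config (no weights materialized yet)
config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

w_bit = 4
q_config = {"zero_point": True, "q_group_size": 128}

load_quant = snapshot_download("abhinavkulkarni/open-llama-13b-open-instruct-w4-g128-awq")

with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config=config,
                                             torch_dtype=torch.float16,
                                             trust_remote_code=True)

# Swap in 4-bit buffers, then load the quantized checkpoint onto available GPUs
real_quantize_model_weight(model, w_bit=w_bit, q_config=q_config, init_only=True)
model = load_checkpoint_and_dispatch(model, load_quant, device_map="balanced")

# Inference (prompt is illustrative)
prompt = "What is the difference between nuclear fusion and fission?"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")
output = model.generate(input_ids, do_sample=True, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```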
  This evaluation was done using [LM-Eval](https://github.com/EleutherAI/lm-evaluation-harness).
 
+ [Open-LLaMA-13B-Instruct](https://huggingface.co/VMware/open-llama-13b-open-instruct)
 
  | Task   |Version| Metric        | Value |   |Stderr|
  |--------|------:|---------------|------:|---|------|
+ |wikitext|      1|word_perplexity|11.6564|   |      |
+ |        |       |byte_perplexity| 1.5829|   |      |
+ |        |       |bits_per_byte  | 0.6626|   |      |
 
+ [Open-LLaMA-13B-Instruct (4-bit 128-group AWQ)](https://huggingface.co/abhinavkulkarni/open-llama-13b-open-instruct-w4-g128-awq)
 
  | Task   |Version| Metric        | Value |   |Stderr|
  |--------|------:|---------------|------:|---|------|
+ |wikitext|      1|word_perplexity|11.9652|   |      |
+ |        |       |byte_perplexity| 1.5907|   |      |
+ |        |       |bits_per_byte  | 0.6696|   |      |
 
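One plausible way to reproduce the wikitext numbers, assuming the 2023-era `main.py` CLI of the harness (the model adapter name and flags are assumptions about that version):

```bash
# Hypothetical lm-evaluation-harness invocation (2023-era CLI)
python main.py \
    --model hf-causal-experimental \
    --model_args pretrained=VMware/open-llama-13b-open-instruct \
    --tasks wikitext \
    --batch_size 1
```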
  ## Acknowledgements
 
+ If you found OpenLLaMA useful in your research or applications, please cite using the following BibTeX:
 
  ```
+ @software{openlm2023openllama,
+   author = {Geng, Xinyang and Liu, Hao},
+   title = {OpenLLaMA: An Open Reproduction of LLaMA},
+   month = May,
+   year = 2023,
+   url = {https://github.com/openlm-research/open_llama}
  }
  ```
  ```
+ @software{together2023redpajama,
+   author = {Together Computer},
+   title = {RedPajama-Data: An Open Source Recipe to Reproduce LLaMA training dataset},
+   month = April,
+   year = 2023,
+   url = {https://github.com/togethercomputer/RedPajama-Data}
  }
  ```
  ```
+ @article{touvron2023llama,
+   title = {LLaMA: Open and Efficient Foundation Language Models},
+   author = {Touvron, Hugo and Lavril, Thibaut and Izacard, Gautier and Martinet, Xavier and Lachaux, Marie-Anne and Lacroix, Timoth{\'e}e and Rozi{\`e}re, Baptiste and Goyal, Naman and Hambro, Eric and Azhar, Faisal and others},
+   journal = {arXiv preprint arXiv:2302.13971},
+   year = {2023}
  }
  ```