daytoy-models committed · Commit 58b5ae6
1 Parent(s): b9b8cac
Update README.md

README.md CHANGED
@@ -1,185 +1,87 @@
---
license:
---

# NexusRaven-13B: Surpassing the state-of-the-art in open-source function calling LLMs.

<p align="center" width="100%">
<a href="https://huggingface.co/Nexusflow" target="_blank">Nexusflow HF</a> - <a href="http://nexusflow.ai/blog" target="_blank">NexusRaven blog post</a> - <a href="https://huggingface.co/Nexusflow/NexusRaven-13B" target="_blank">NexusRaven-13B</a> - <a href="https://x.com/NexusflowX/status/1707470614012035561?s=20" target="_blank">NexusRaven-13B Twitter Thread</a> - <a href="https://github.com/nexusflowai/NexusRaven/" target="_blank">NexusRaven-13B Github</a> - <a href="https://huggingface.co/datasets/Nexusflow/NexusRaven_API_evaluation" target="_blank">NexusRaven API evaluation dataset</a>
</p>

<p align="center" width="100%">
<a><img src="NexusRaven.png" alt="NexusRaven" style="width: 40%; min-width: 300px; display: block; margin: auto;"></a>
</p>

- [NexusRaven-13B: Surpassing the state-of-the-art in open-source function calling LLMs.](#nexusraven-13b-surpassing-the-state-of-the-art-in-open-source-function-calling-llms)
  - [Introducing NexusRaven-13B](#introducing-nexusraven-13b)
  - [NexusRaven model usage](#nexusraven-model-usage)
  - [Training procedure](#training-procedure)
    - [Training hyperparameters](#training-hyperparameters)
    - [Framework versions](#framework-versions)
  - [Limitations](#limitations)
  - [License](#license)
  - [References](#references)
  - [Citation](#citation)
  - [Contact](#contact)

## Introducing NexusRaven-13B

NexusRaven is an open-source and commercially viable function calling LLM that surpasses the state-of-the-art in function calling capabilities.

📊 Performance Highlights: With our demonstration retrieval system, NexusRaven-13B achieves a 95% success rate in using cybersecurity tools such as CVE/CPE Search and VirusTotal, while prompting GPT-4 achieves 64%. It also has a significantly lower cost and faster inference speed than GPT-4.

🔧 Generalization to the Unseen: NexusRaven-13B generalizes to tools never seen during model training, achieving a success rate comparable with GPT-3.5 in a zero-shot setting and significantly outperforming all other open-source LLMs of similar sizes.

🔥 Commercially Permissive: The training of NexusRaven-13B does not involve any data generated by proprietary LLMs such as GPT-4. You have full control of the model when it is deployed in commercial applications.

<p align="center" width="100%">
<a><img src="Single-Attempt_Function_Calling.png" alt="NexusRaven" style="width: 80%; min-width: 300px; display: block; margin: auto;"></a>
<a><img src="Zero-shot_Evaluation.png" alt="NexusRaven" style="width: 80%; min-width: 300px; display: block; margin: auto;"></a>
</p>

## NexusRaven model usage

NexusRaven accepts a list of python functions. These python functions can do anything (including sending GET/POST requests to external APIs!). The only two requirements are the python function signature and an appropriate docstring, which are used to generate the function call.

NexusRaven is highly compatible with langchain. See [langchain_example.py](https://huggingface.co/Nexusflow/NexusRaven-13B/blob/main/langchain_example.py). An example without langchain can be found in [non_langchain_example.py](https://huggingface.co/Nexusflow/NexusRaven-13B/blob/main/non_langchain_example.py).

Please note that the model will sometimes reflect on its answer, so we highly recommend stopping generation with the stopping criterion `["\nReflection:"]` to avoid spending unnecessary tokens during inference; the reflection might still help in some rare cases. This is reflected in our langchain example, and one way to implement it is sketched below.
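As an illustration only (not part of the original card), here is a minimal sketch of a Hugging Face `transformers` stopping criterion that halts generation once `"\nReflection:"` has been produced; the class name and the decoded tail length are assumptions made for this example.

```python
# Illustrative sketch: stop generation as soon as "\nReflection:" appears in the output.
# Assumes a transformers model/tokenizer are already loaded; names here are hypothetical.
from transformers import StoppingCriteria, StoppingCriteriaList


class StopOnReflection(StoppingCriteria):
    def __init__(self, tokenizer, stop_string="\nReflection:", tail_tokens=16):
        self.tokenizer = tokenizer
        self.stop_string = stop_string
        self.tail_tokens = tail_tokens

    def __call__(self, input_ids, scores, **kwargs):
        # Decode only the last few tokens and check whether the stop string has appeared.
        tail = self.tokenizer.decode(input_ids[0, -self.tail_tokens:], skip_special_tokens=True)
        return self.stop_string in tail


# Usage (model/tokenizer loading omitted):
# stopping = StoppingCriteriaList([StopOnReflection(tokenizer)])
# outputs = model.generate(**inputs, max_new_tokens=100, stopping_criteria=stopping)
```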

More information about how to prompt the model can be found in [prompting_readme.md](prompting_readme.md).

The "Initial Answer" can be executed to run the function.

### Quickstart

You can run the model on a GPU using the following code.
```python
# Please `pip install transformers accelerate`
from transformers import pipeline


pipeline = pipeline(
    "text-generation",
    model="Nexusflow/NexusRaven-13B",
    torch_dtype="auto",
    device_map="auto",
)

prompt_template = """
<human>:
OPTION:
<func_start>def hello_world(n : int)<func_end>
<docstring_start>
\"\"\"
Prints hello world to the user.

Args:
n (int) : Number of times to print hello world.
\"\"\"
<docstring_end>

OPTION:
<func_start>def hello_universe(n : int)<func_end>
<docstring_start>
\"\"\"
Prints hello universe to the user.

Args:
n (int) : Number of times to print hello universe.
\"\"\"
<docstring_end>

User Query: Question: {question}

Please pick a function from the above options that best answers the user query and fill in the appropriate arguments.<human_end>
"""
prompt = prompt_template.format(question="Please print hello world 10 times.")

result = pipeline(prompt, max_new_tokens=100, return_full_text=False, do_sample=False)[0]["generated_text"]

# Extract the "Initial Answer" only, dropping the optional reflection
start_str = "Initial Answer: "
end_str = "\nReflection: "
start_idx = result.find(start_str) + len(start_str)
end_idx = result.find(end_str)
function_call = result[start_idx:end_idx]

print(f"Generated Call: {function_call}")
```

The generated call can then be executed; a sketch of one way to do this follows.
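Purely as an illustration (not from the original README), here is a minimal sketch of executing the generated call string, assuming the candidate functions are defined in the local scope; the toy `hello_world` below mirrors the quickstart prompt, and the literal call string is a stand-in for the model output.

```python
# Illustrative sketch: run the call string produced by the model.
# Assumes the candidate functions are defined locally; `hello_world` mirrors the toy
# function from the quickstart, and the call string below is a stand-in for model output.
def hello_world(n: int):
    """Prints hello world to the user."""
    for _ in range(n):
        print("hello world")


function_call = "hello_world(n=10)"  # e.g. the extracted "Initial Answer"
exec(function_call)                  # prints "hello world" ten times
```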

## Training procedure

### Training hyperparameters

- learning_rate: 3e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-08
- lr_scheduler_type: constant
- num_epochs: 2.0

(The total train batch size of 128 is the per-device batch size of 1 × 8 devices × 16 gradient accumulation steps.)

### Framework versions

- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.13.3

## Limitations

## License

## References

We thank the CodeLlama team for their amazing models!

```
@misc{rozière2023code,
      title={Code Llama: Open Foundation Models for Code},
      author={Baptiste Rozière and Jonas Gehring and Fabian Gloeckle and Sten Sootla and Itai Gat and Xiaoqing Ellen Tan and Yossi Adi and Jingyu Liu and Tal Remez and Jérémy Rapin and Artyom Kozhevnikov and Ivan Evtimov and Joanna Bitton and Manish Bhatt and Cristian Canton Ferrer and Aaron Grattafiori and Wenhan Xiong and Alexandre Défossez and Jade Copet and Faisal Azhar and Hugo Touvron and Louis Martin and Nicolas Usunier and Thomas Scialom and Gabriel Synnaeve},
      year={2023},
      eprint={2308.12950},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

## Citation

```
@misc{nexusraven,
      title={NexusRaven: Surpassing the state-of-the-art in open-source function calling LLMs},
      author={Nexusflow.ai team},
      year={2023},
      url={http://nexusflow.ai/blog}
}
```

## Contact

Please reach out to [email protected] for any questions!

---
license: apache-2.0
pipeline_tag: text-generation
tags:
- finetuned
inference:
  parameters:
    temperature: 0.7
---

# Model Card for Mistral-7B-Instruct-v0.1

The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of the [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) generative text model, trained using a variety of publicly available conversation datasets.

For full details of this model, please read our [release blog post](https://mistral.ai/news/announcing-mistral-7b/).

## Instruction format

In order to leverage instruction fine-tuning, your prompt should be surrounded by `[INST]` and `[/INST]` tokens. The very first instruction should begin with a begin-of-sentence id, while subsequent instructions should not. The assistant generation will be ended by the end-of-sentence token id.

E.g.

```
text = "<s>[INST] What is your favourite condiment? [/INST]"
"Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s> "
"[INST] Do you have mayonnaise recipes? [/INST]"
```
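As a purely illustrative sketch (not part of the model card), the snippet below assembles the same `[INST]`-wrapped string programmatically; the helper name `build_prompt` and the assumption that the tokenizer prepends the leading `<s>` itself are ours, not Mistral's.

```python
# Illustrative helper (hypothetical name): assemble the [INST] ... [/INST] format by hand.
# Assumes the tokenizer prepends <s>, so only </s> is appended after each assistant turn.
def build_prompt(turns):
    """turns: list of (user_message, assistant_reply or None) tuples."""
    prompt = ""
    for user_msg, assistant_msg in turns:
        prompt += f"[INST] {user_msg} [/INST]"
        if assistant_msg is not None:
            prompt += f"{assistant_msg}</s> "
    return prompt


print(build_prompt([
    ("What is your favourite condiment?",
     "Well, I'm quite partial to a good squeeze of fresh lemon juice."),
    ("Do you have mayonnaise recipes?", None),
]))
```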

This format is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating) via the `apply_chat_template()` method:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]

encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

model_inputs = encodeds.to(device)
model.to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```
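As an optional check (our addition, not from the model card), `apply_chat_template` can also render the conversation to a plain string with `tokenize=False`, reusing `tokenizer` and `messages` from the snippet above, which makes it easy to confirm the `[INST]`/`[/INST]` formatting.

```python
# Continues from the example above; renders the template as text instead of token ids.
prompt_text = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt_text)  # roughly: "<s>[INST] ... [/INST] ... </s>[INST] ... [/INST]"
```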

## Model Architecture
This instruction model is based on Mistral-7B-v0.1, a transformer model with the following architecture choices:
- Grouped-Query Attention
- Sliding-Window Attention
- Byte-fallback BPE tokenizer

## Troubleshooting
- If you see the following error:
```
Traceback (most recent call last):
  File "", line 1, in
  File "/transformers/models/auto/auto_factory.py", line 482, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
  File "/transformers/models/auto/configuration_auto.py", line 1022, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/transformers/models/auto/configuration_auto.py", line 723, in __getitem__
    raise KeyError(key)
KeyError: 'mistral'
```

Installing transformers from source should solve the issue:

`pip install git+https://github.com/huggingface/transformers`

This should not be required after transformers-v4.33.4.

## Limitations

The Mistral 7B Instruct model is a quick demonstration that the base model can be easily fine-tuned to achieve compelling performance.
It does not have any moderation mechanisms. We're looking forward to engaging with the community on ways to make the model finely respect guardrails, allowing for deployment in environments requiring moderated outputs.

## The Mistral AI Team

Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.
|