victormiller committed on
Commit
67655c6
·
verified ·
1 Parent(s): d68cefa

Update README.md

  1. README.md +144 -0
README.md CHANGED
@@ -44,8 +44,152 @@ We utilized the following datasets:
  | Avg Score | 58.88 | 61.30 |

+ ## Function Calling

+ ### Chat Template
+ Our model uses the [K2-Chat](https://huggingface.co/LLM360/K2-Chat) prompt format and is specifically trained for function calling. Different system prompts enable different ways of interacting with the model. Note that the two modes (conversational chat and function calling) have so far been tested mainly in isolation; designing prompts that make them work together is possible but currently untested. It should also be possible to elicit function-call behavior by injecting the special token `<tool_call>` and letting the model complete it. This guide describes the intended basic usage of the model.

+ ## Conversational Chats
+
+ Here is an example prompt with a system instruction (use whatever system prompt you like; this is just an example):
+
+ Your name is K2, and you are named after K2, the second highest mountain on Earth. You are built by MBZUAI and LLM360. You are a highly advanced large language model with 65B parameters. You outperform all fully open source models and Llama 2 70B. You can answer in English only. You are a helpful, respectful and honest assistant.<|endofsystemprompt|><|beginofuser|>Hello, who are you?<|beginofsystem|>
+
+ ### Sample inference code
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Load the tokenizer and the model weights from the given path.
+ tokenizer = AutoTokenizer.from_pretrained("<path_to_model_weights>")
+ model = AutoModelForCausalLM.from_pretrained("<path_to_model_weights>")
+
+ # System prompt, then the user turn; the trailing <|beginofsystem|> cues the model's reply.
+ prompt = 'Your name is K2, and you are named after K2, the second highest mountain on Earth. You are built by MBZUAI and LLM360. You are a highly advanced large language model with 65B parameters. You outperform all fully open source models and Llama 2 70B. You can answer in English only. You are a helpful, respectful and honest assistant.<|endofsystemprompt|><|beginofuser|>Hello, who are you?<|beginofsystem|>'
+
+ input_ids = tokenizer(prompt, return_tensors="pt").input_ids
+ # Sample up to 128 new tokens; the result holds the prompt followed by the reply.
+ gen_tokens = model.generate(input_ids, do_sample=True, max_new_tokens=128)
+ print("-" * 20 + "Output for model" + "-" * 20)
+ print(tokenizer.batch_decode(gen_tokens)[0])
+ ```
+
+ Multi-turn conversations should be formatted like this:
+
+ {system_prompt}<|endofsystemprompt|><|beginofuser|>{user_content_1}<|beginofsystem|>{system_content_1}<|beginofuser|>{user_content_2}<|beginofsystem|>{system_content_2}<|beginofuser|>{user_content_3}<|beginofsystem|>
+
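The multi-turn layout above can be assembled programmatically. The following is a minimal sketch, not part of the model's tooling: the helper name `build_chat_prompt` and its arguments are hypothetical, and only the three special tokens are taken from the format shown above.

```python
from typing import List, Tuple

# Special tokens as they appear in the prompt format above.
END_OF_SYSTEM_PROMPT = "<|endofsystemprompt|>"
BEGIN_OF_USER = "<|beginofuser|>"
BEGIN_OF_SYSTEM = "<|beginofsystem|>"


def build_chat_prompt(
    system_prompt: str,
    turns: List[Tuple[str, str]],
    next_user_message: str,
) -> str:
    """Join completed (user, model) turns and a new user message into one prompt.

    Hypothetical helper: the prompt ends with <|beginofsystem|> so that the
    model generates the next reply.
    """
    parts = [system_prompt, END_OF_SYSTEM_PROMPT]
    for user_content, system_content in turns:
        parts += [BEGIN_OF_USER, user_content, BEGIN_OF_SYSTEM, system_content]
    parts += [BEGIN_OF_USER, next_user_message, BEGIN_OF_SYSTEM]
    return "".join(parts)


if __name__ == "__main__":
    history = [("Hello, who are you?", "I am K2.")]
    print(build_chat_prompt("You are a helpful assistant.", history, "What can you do?"))
```

The resulting string can be passed to the tokenizer in the same way as `prompt` in the sample inference code.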
+ ## Function Calling Format
+
+ For function calling, please use this fixed system prompt:
+
+ You are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions. Here are the available tools:
+
+ Next, list the tools you want to make available (use whatever tools you like; this is just an example):
+
+ <tools>
+ { "name": "get_news_headlines", "description": "Get the latest news headlines", "parameters": {"type": "object", "properties": { "country": { "type": "string", "description": "The country for which to fetch news"}}, "required": [ "country"]}}
+ </tools>
+
+ Then append the remaining instructions:
+
+ Use the following pydantic model json schema for each tool call you will make:
+ {"properties": {"arguments": {"title": "Arguments", "type": "object"}, "name": {"title": "Name", "type": "string"}}, "required": ["arguments", "name"], "title": "FunctionCall", "type": "object"}
+ For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:
+ <tool_call>
+ {"arguments": <args-dict>, "name": <function-name>}
+ </tool_call>
+ Please also summarize texts wrapped between <tool_response> and </tool_response> in bullet points. For example:
+ <tool_response>
+ {"fruits": [{"name": "Apple"}, {"name": "Pear"}]}
+ </tool_response> is summarized as:
+ Fruits:
+ - Apple
+ - Pear
+ <|endofsystemprompt|>
+
+ In summary, the following is the complete initial prompt to the model:
+
+ You are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions. Here are the available tools:
+ <tools>
+ { "name": "get_news_headlines", "description": "Get the latest news headlines", "parameters": {"type": "object", "properties": { "country": { "type": "string", "description": "The country for which to fetch news"}}, "required": [ "country"]}}
+ </tools>
+ Use the following pydantic model json schema for each tool call you will make:
+ {"properties": {"arguments": {"title": "Arguments", "type": "object"}, "name": {"title": "Name", "type": "string"}}, "required": ["arguments", "name"], "title": "FunctionCall", "type": "object"}
+ For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:
+ <tool_call>
+ {"arguments": <args-dict>, "name": <function-name>}
+ </tool_call>
+ Please also summarize texts wrapped between <tool_response> and </tool_response> in bullet points. For example:
+ <tool_response>
+ {"fruits": [{"name": "Apple"}, {"name": "Pear"}]}
+ </tool_response> is summarized as:
+ Fruits:
+ - Apple
+ - Pear
+ <|endofsystemprompt|>
+
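Since only the `<tools>` section changes between applications, the system prompt can be generated from a list of tool definitions. This is a rough sketch under stated assumptions: the function name `build_function_calling_prompt` is hypothetical, and writing one JSON object per line inside `<tools>` is an assumption based on the single-tool example above, not a documented requirement.

```python
import json

# Fixed opening of the system prompt, copied from the text above.
PROMPT_INTRO = (
    "You are a function calling AI model. You are provided with function "
    "signatures within <tools></tools> XML tags. You may call one or more "
    "functions to assist with the user query. Don't make assumptions about "
    "what values to plug into functions. Here are the available tools:"
)

# Fixed closing instructions, copied from the text above.
PROMPT_INSTRUCTIONS = """Use the following pydantic model json schema for each tool call you will make:
{"properties": {"arguments": {"title": "Arguments", "type": "object"}, "name": {"title": "Name", "type": "string"}}, "required": ["arguments", "name"], "title": "FunctionCall", "type": "object"}
For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:
<tool_call>
{"arguments": <args-dict>, "name": <function-name>}
</tool_call>
Please also summarize texts wrapped between <tool_response> and </tool_response> in bullet points. For example:
<tool_response>
{"fruits": [{"name": "Apple"}, {"name": "Pear"}]}
</tool_response> is summarized as:
Fruits:
- Apple
- Pear
<|endofsystemprompt|>"""


def build_function_calling_prompt(tools):
    """Return the function-calling system prompt for a list of tool dicts.

    Assumption: each tool is serialized as one JSON object per line.
    """
    tool_lines = "\n".join(json.dumps(tool) for tool in tools)
    return f"{PROMPT_INTRO}\n<tools>\n{tool_lines}\n</tools>\n{PROMPT_INSTRUCTIONS}"


if __name__ == "__main__":
    news_tool = {
        "name": "get_news_headlines",
        "description": "Get the latest news headlines",
        "parameters": {
            "type": "object",
            "properties": {
                "country": {
                    "type": "string",
                    "description": "The country for which to fetch news",
                }
            },
            "required": ["country"],
        },
    }
    print(build_function_calling_prompt([news_tool]))
```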
+ ### Example flow for multi-turn generation and how to incorporate function call output
+
+ When a user asks a question, the user query is appended to the prompt above (see also the sample inference code below):
+ <|beginofuser|>Can you tell me the latest news headlines for the United States?<|beginofsystem|>
+
+ When you run this, the model should respond in this turn with a tool call wrapped in `<tool_call>` tags:
+
+ <tool_call>
+ [{"name": "get_news_headlines", "arguments": {"country": "United States"}}]
+ </tool_call><|endoftext|>
+
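Before the tool can be executed, the call has to be extracted from the generated text. The sketch below uses only the Python standard library; the function name `extract_tool_calls` is hypothetical, and accepting either a single JSON object or a list is a defensive assumption, since the system prompt shows a single object while the example response above is a list.

```python
import json
import re

# Capture everything between <tool_call> and </tool_call>, across newlines.
TOOL_CALL_PATTERN = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(model_output: str) -> list:
    """Return all tool calls found in the model output as a flat list of dicts."""
    calls = []
    for payload in TOOL_CALL_PATTERN.findall(model_output):
        parsed = json.loads(payload)  # raises json.JSONDecodeError on malformed output
        # Normalize: a single object becomes a one-element list.
        calls.extend(parsed if isinstance(parsed, list) else [parsed])
    return calls


if __name__ == "__main__":
    output = (
        "<tool_call>\n"
        '[{"name": "get_news_headlines", "arguments": {"country": "United States"}}]\n'
        "</tool_call><|endoftext|>"
    )
    for call in extract_tool_calls(output):
        print(call["name"], call["arguments"])
```

Each returned dict carries the `name` and `arguments` fields described by the schema in the system prompt, so it can be dispatched to whatever function implements the tool.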
+ Now execute this tool call with the external tool to obtain its response. The model can then turn the tool response back into natural language. To do so, wrap the response in `<tool_response>` and `</tool_response>`, append it to the history, and ask the model to continue generating:
+
+ ...current model history (ends with </tool_call><|endoftext|>)...
+ <tool_response>
+ {"news": [{"title": "A great news headline"}, {"title": "Another great news headline"}]}
+ </tool_response>
+
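Appending the tool output to the history could look like the sketch below. The helper name `append_tool_response` is hypothetical, and the exact newline placement around the tags is an assumption read off the layout shown above rather than a documented requirement.

```python
import json


def append_tool_response(history: str, tool_result: dict) -> str:
    """Append a tool result, wrapped in <tool_response> tags, to the history.

    `history` is everything generated so far, ending with
    </tool_call><|endoftext|>. Newline placement is an assumption.
    """
    return f"{history}\n<tool_response>\n{json.dumps(tool_result)}\n</tool_response>\n"


if __name__ == "__main__":
    history = "...current model history...</tool_call><|endoftext|>"
    result = {
        "news": [
            {"title": "A great news headline"},
            {"title": "Another great news headline"},
        ]
    }
    print(append_tool_response(history, result))
```

The returned string is then tokenized and passed to `model.generate` again to obtain the natural-language summary.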
+ In this second turn, the model should respond with:
+
+ Suggested news headline:
+ - A great news headline
+ - Another great news headline<|endoftext|>
+
+ Sometimes the function call feature is used only as a JSON formatter, without calling any external function, which means the output in `<tool_call>` is essentially what you want. In that case, we recommend simply copying the content of `<tool_call>` and wrapping it in `<tool_response>`.
+
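For the JSON-formatter use case, the copy step can be done with a small string transformation. This is only an illustrative sketch: `echo_tool_call_as_response` is a hypothetical name, and it handles just the first `<tool_call>` block in the output.

```python
import re


def echo_tool_call_as_response(model_output: str) -> str:
    """Copy the content of the first <tool_call> block into a <tool_response> block."""
    match = re.search(r"<tool_call>\s*(.*?)\s*</tool_call>", model_output, re.DOTALL)
    if match is None:
        raise ValueError("no <tool_call> block found in model output")
    return f"<tool_response>\n{match.group(1)}\n</tool_response>"


if __name__ == "__main__":
    output = '<tool_call>\n[{"name": "format", "arguments": {"x": 1}}]\n</tool_call><|endoftext|>'
    print(echo_tool_call_as_response(output))
```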
+ ### Sample inference code
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Load the tokenizer and the model weights from the given path.
+ tokenizer = AutoTokenizer.from_pretrained("<path_to_model_weights>")
+ model = AutoModelForCausalLM.from_pretrained("<path_to_model_weights>")
+
+ # Fixed function-calling system prompt with one example tool, followed by the user query.
+ # The trailing <|beginofsystem|> cues the model to produce its tool call.
+ prompt = """You are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions. Here are the available tools:
+ <tools>
+ {"name": "get_news_headlines", "description": "Get the latest news headlines", "parameters": {"type": "object", "properties": {"country": {"type": "string", "description": "The country for which to fetch news"}}, "required": ["country"]}}
+ </tools>
+ Use the following pydantic model json schema for each tool call you will make:
+ {"properties": {"arguments": {"title": "Arguments", "type": "object"}, "name": {"title": "Name", "type": "string"}}, "required": ["arguments", "name"], "title": "FunctionCall", "type": "object"}
+ For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:
+ <tool_call>
+ {"arguments": <args-dict>, "name": <function-name>}
+ </tool_call>
+ Please also summarize texts wrapped between <tool_response> and </tool_response> in bullet points. For example:
+ <tool_response>
+ {"fruits": [{"name": "Apple"}, {"name": "Pear"}]}
+ </tool_response> is summarized as:
+ Fruits:
+ - Apple
+ - Pear
+ <|endofsystemprompt|><|beginofuser|>Can you tell me the latest news headlines for the United States?<|beginofsystem|>"""
+
+ input_ids = tokenizer(prompt, return_tensors="pt").input_ids
+ # Sample up to 128 new tokens; the result holds the prompt followed by the tool call.
+ gen_tokens = model.generate(input_ids, do_sample=True, max_new_tokens=128)
+
+ print("-" * 20 + "Output for model" + "-" * 20)
+ print(tokenizer.batch_decode(gen_tokens)[0])
+ ```

  ## K2-Chat-060124
  K2 Chat is finetuned from [K2-65B](https://huggingface.co/LLM360/K2). K2 Chat outperforms Llama 2-70B-Chat on all evaluations conducted. The model also outperforms Llama 3-70B-Instruct on coding tasks.