Safetensors · qwen2
linqq9 committed (verified) · Commit 9c80bb9 · 1 Parent(s): eb5a61f

Update README.md

Files changed (1)
  1. README.md +143 -7
README.md CHANGED
@@ -34,8 +34,6 @@ In addition, we evaluated the Hammer 2.1 models on other academic benchmarks to
 
 Hammer 2.1 models showcase highly stable performance, suggesting the robustness of the Hammer 2.1 series. In contrast, the baseline approaches display varying levels of effectiveness.
 
- ## Tuning Details
- Thanks so much for your attention, a report with all the technical details leading to our models will be published soon.
 
 
 
@@ -43,15 +41,153 @@ Thanks so much for your attention, a report with all the technical details leadi
 The code for the Hammer 2.1 models is included in the latest Hugging Face Transformers, and we advise you to install `transformers>=4.34.0`.
 
 ## How to Use
- Hammer2.1 models offer flexibility in deployment and usage, fully supporting both vLLM deployment and Hugging Face Transformers tool calling. For more detailed examples and use cases, please refer to the [examples/README_USING.md](https://github.com/MadeAgents/Hammer/tree/main/examples/README_USING.md) in our repository.
- This is a simple example of how to use our model.
- ~~~python
+ Hammer models offer flexibility in deployment and usage, fully supporting both **vLLM** deployment and **Hugging Face Transformers** tool calling. Below are the specifics on how to make use of these features:
+ 
+ ### Using vLLM
+ #### Option 1: Using the Hammer client
+ vLLM offers efficient serving with low latency. To serve the model with vLLM:
+ ```
+ vllm serve MadeAgents/Hammer2.1-1.5b --host 0.0.0.0 --port 8000 --tensor-parallel-size 1
+ ```
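+ If you want to confirm the server is reachable before wiring up the client, one quick sanity check is the OpenAI-compatible models endpoint (a minimal sketch, assuming the host and port used above):
+ ~~~
+ from openai import OpenAI
+ 
+ # vLLM's OpenAI-compatible server does not validate the API key, so any placeholder works
+ client = OpenAI(api_key="None", base_url="http://localhost:8000/v1")
+ print([m.id for m in client.models.list().data])  # should include "MadeAgents/Hammer2.1-1.5b"
+ ~~~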
+ Once the model is served, you can use the following Hammer client to interact with it for function calling:
+ ~~~
+ from client import HammerChatCompletion, HammerConfig
+ 
+ config = HammerConfig(base_url="http://localhost:8000/v1/", model="MadeAgents/Hammer2.1-1.5b")
+ llm = HammerChatCompletion.from_config(config)
+ 
+ # Example conversation
+ messages = [
+     {"role": "user", "content": "What's the weather like in New York?"},
+     {"role": "assistant", "content": '```\n{"name": "get_weather", "arguments": {"location": "New York, NY", "unit": "celsius"}}\n```'},
+     {"role": "tool", "name": "get_weather", "content": '{"temperature": 72, "description": "Partly cloudy"}'},
+     {"role": "user", "content": "Now, search for the weather in San Francisco."}
+ ]
+ 
+ # Example function definitions (optional)
+ tools = [
+     {
+         "name": "get_weather",
+         "description": "Get the current weather for a location",
+         "parameters": {
+             "type": "object",
+             "properties": {
+                 "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"},
+                 "unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature to return"}
+             },
+             "required": ["location"]
+         }
+     },
+     {
+         "name": "respond",
+         "description": "When you are ready to respond, use this function. It allows the assistant to formulate and deliver replies based on the input message and the context of the conversation. Generate a concise response for simple questions, and a more detailed response for complex questions.",
+         "parameters": {
+             "type": "object",
+             "properties": {
+                 "message": {"type": "string", "description": "The content of the message to respond with."}
+             },
+             "required": ["message"]
+         }
+     }
+ ]
+ 
+ response = llm.completion(messages, tools=tools)
+ print(response)
+ ~~~
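+ As the example conversation above shows, the model emits tool calls as a code-fenced JSON object with `name` and `arguments` fields. Assuming the client hands you that raw action string, a minimal parsing sketch (the `parse_tool_call` helper is a hypothetical illustration, not part of the client) is:
+ ~~~
+ import json
+ import re
+ 
+ def parse_tool_call(text):
+     # Extract the first fenced JSON object from the model output
+     match = re.search(r"```(?:json)?\s*(\{.*\})\s*```", text, re.DOTALL)
+     if match is None:
+         return None  # the model answered with plain text instead of a call
+     call = json.loads(match.group(1))
+     return call["name"], call.get("arguments", {})
+ 
+ print(parse_tool_call('```\n{"name": "get_weather", "arguments": {"location": "San Francisco, CA", "unit": "celsius"}}\n```'))
+ # -> ('get_weather', {'location': 'San Francisco, CA', 'unit': 'celsius'})
+ ~~~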
+ 
+ 
+ #### Option 2: Using vLLM's built-in tool calling
+ Hammer 2.1 supports vLLM's built-in tool calling, which requires vllm>=0.6. To enable it, start vLLM's OpenAI-compatible server with:
+ ~~~
+ vllm serve MadeAgents/Hammer2.1-1.5b --enable-auto-tool-choice --tool-call-parser hermes
+ ~~~
+ Then use it the same way you would use GPT's tool calling:
+ ~~~
+ tools = [
+     {
+         "type": "function",
+         "function": {
+             "name": "get_current_weather",
+             "description": "Get the current weather",
+             "parameters": {
+                 "type": "object",
+                 "properties": {
+                     "location": {
+                         "type": "string",
+                         "description": "The city and state, e.g. San Francisco, CA",
+                     },
+                     "format": {
+                         "type": "string",
+                         "enum": ["celsius", "fahrenheit"],
+                         "description": "The temperature unit to use. Infer this from the user's location.",
+                         "default": "celsius"
+                     },
+                 },
+                 "required": ["location", "format"],
+             },
+         }
+     },
+     {
+         "type": "function",
+         "function": {
+             "name": "get_n_day_weather_forecast",
+             "description": "Get an N-day weather forecast",
+             "parameters": {
+                 "type": "object",
+                 "properties": {
+                     "location": {
+                         "type": "string",
+                         "description": "The city and state, e.g. San Francisco, CA",
+                     },
+                     "format": {
+                         "type": "string",
+                         "enum": ["celsius", "fahrenheit"],
+                         "description": "The temperature unit to use. Infer this from the user's location.",
+                         "default": "celsius"
+                     },
+                     "num_days": {
+                         "type": "integer",
+                         "description": "The number of days to forecast",
+                         "default": 1
+                     }
+                 },
+                 "required": ["location", "format", "num_days"]
+             },
+         }
+     },
+ ]
+ 
+ from openai import OpenAI
+ 
+ # vLLM's server does not validate the API key, so any placeholder value works
+ openai_api_key = "None"
+ openai_api_base = "http://localhost:8000/v1"
+ 
+ client = OpenAI(
+     api_key=openai_api_key,
+     base_url=openai_api_base,
+ )
+ 
+ query = "What's the weather like today in San Francisco"
+ 
+ chat_response = client.chat.completions.create(
+     model="MadeAgents/Hammer2.1-1.5b",
+     messages=[{"role": "user", "content": query}],
+     tools=tools,
+     temperature=0
+ )
+ # With tool calling enabled, the parsed call arrives in message.tool_calls rather than message.content
+ print(chat_response.choices[0].message.tool_calls)
+ ~~~
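+ Continuing the example above: the OpenAI client surfaces each call in `message.tool_calls`, with the arguments serialized as a JSON string. A minimal dispatch sketch (the local `get_current_weather` implementation below is a hypothetical stand-in for a real weather lookup):
+ ~~~
+ import json
+ 
+ def get_current_weather(location, format="celsius"):
+     # Hypothetical stand-in; a real implementation would query a weather API
+     return {"location": location, "temperature": 22, "unit": format, "description": "Sunny"}
+ 
+ message = chat_response.choices[0].message
+ for call in message.tool_calls or []:
+     args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
+     if call.function.name == "get_current_weather":
+         print(get_current_weather(**args))
+ ~~~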
+ 
+ 
+ ### Using Hugging Face Transformers
+ Hammer2.1's chat template also includes a tool-calling template, meaning that you can use Hugging Face Transformers' tool-calling support. Here is a simple example of how to use our model with Transformers.
+ ~~~
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
 
- tokenizer = AutoTokenizer.from_pretrained("MadeAgents/Hammer2.1-0.5b")
- model = AutoModelForCausalLM.from_pretrained("MadeAgents/Hammer2.1-0.5b", torch_dtype=torch.bfloat16, device_map="auto")
+ tokenizer = AutoTokenizer.from_pretrained("MadeAgents/Hammer2.1-1.5b")
+ model = AutoModelForCausalLM.from_pretrained("MadeAgents/Hammer2.1-1.5b", torch_dtype=torch.bfloat16, device_map="auto")
 
 # Example conversation
 messages = [