---
library_name: transformers
tags:
- text-generation-inference
- transformers
- unsloth
- trl
- llama
language:
- en
base_model: hiieu/Meta-Llama-3-8B-Instruct-function-calling-json-mode
---

# QuantFactory/Meta-Llama-3-8B-Instruct-function-calling-json-mode-GGUF
This is a quantized version of [hiieu/Meta-Llama-3-8B-Instruct-function-calling-json-mode](https://huggingface.co/hiieu/Meta-Llama-3-8B-Instruct-function-calling-json-mode) created with llama.cpp.
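
The GGUF files in this repository can be run with llama.cpp-compatible tooling rather than `transformers`. Below is a minimal sketch using the `llama-cpp-python` bindings, assuming one of the GGUF files has already been downloaded locally; the filename is a placeholder, not a guaranteed artifact name.

```python
from llama_cpp import Llama

# Load a downloaded GGUF file from this repo (placeholder filename).
llm = Llama(
    model_path="Meta-Llama-3-8B-Instruct-function-calling-json-mode.Q4_K_M.gguf",
    n_ctx=8192,        # context window
    n_gpu_layers=-1,   # offload all layers to GPU if one is available
)

result = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": 'You are a helpful assistant, answer in JSON with key "message"'},
        {"role": "user", "content": "Who are you?"},
    ],
    max_tokens=256,
    temperature=0.6,
)
print(result["choices"][0]["message"]["content"])
```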

## Model Description

This model was fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct for function calling and JSON mode.

## Usage
### JSON Mode
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "hiieu/Meta-Llama-3-8B-Instruct-function-calling-json-mode"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant, answer in JSON with key \"message\""},
    {"role": "user", "content": "Who are you?"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
# >> {"message": "I am a helpful assistant, with access to a vast amount of information. I can help you with tasks such as answering questions, providing definitions, translating text, and more. Feel free to ask me anything!"}
```
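
Because the system prompt forces a JSON answer, the decoded text can be loaded straight into a Python object. A small follow-up sketch reusing the `response` tensor from above (it assumes the model actually returned valid JSON):

```python
import json

decoded = tokenizer.decode(response, skip_special_tokens=True)
data = json.loads(decoded)  # raises json.JSONDecodeError if the output is not valid JSON
print(data["message"])
```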

### Function Calling
Function calling requires two inference steps; the example below walks through both.

#### Step 1

```python
functions_metadata = [
    {
        "type": "function",
        "function": {
            "name": "get_temperature",
            "description": "get temperature of a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "name"
                    }
                },
                "required": [
                    "city"
                ]
            }
        }
    }
]

messages = [
    { "role": "system", "content": f"""You are a helpful assistant with access to the following functions: \n {str(functions_metadata)}\n\nTo use these functions respond with:\n<functioncall> {{ "name": "function_name", "arguments": {{ "arg_1": "value_1", "arg_1": "value_1", ... }} }} </functioncall>\n\nEdge cases you must handle:\n - If there are no functions that match the user request, you will respond politely that you cannot help."""},
    { "role": "user", "content": "What is the temperature in Tokyo right now?"}
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
# >> <functioncall> {"name": "get_temperature", "arguments": '{"city": "Tokyo"}'} </functioncall>
```
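
Between the two steps, the `<functioncall>` block emitted in Step 1 has to be extracted, the named function executed by your own code, and its result wrapped in a `<function_response>` message for the next turn. A minimal sketch of that glue code, assuming the output format shown above and a purely illustrative local `get_temperature` implementation:

```python
import ast

def get_temperature(city: str) -> str:
    # Illustrative stand-in for a real weather lookup.
    return '{"temperature": "30 C"}'

available_functions = {"get_temperature": get_temperature}

def run_function_call(model_output: str) -> str:
    """Extract the <functioncall> payload, execute the named function,
    and wrap its result in a <function_response> block."""
    start = model_output.find("<functioncall>")
    end = model_output.find("</functioncall>")
    if start == -1 or end == -1:
        return model_output  # the model answered directly, nothing to call
    payload = model_output[start + len("<functioncall>"):end].strip()
    call = ast.literal_eval(payload)                 # e.g. {"name": ..., "arguments": '...'}
    arguments = ast.literal_eval(call["arguments"])  # the arguments field is a JSON-like string
    result = available_functions[call["name"]](**arguments)
    return f"<function_response> {result} </function_response>"
```

The string this returns is what goes into the `<function_response>` user message shown in Step 2.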

#### Step 2

```python
messages = [
    { "role": "system", "content": f"""You are a helpful assistant with access to the following functions: \n {str(functions_metadata)}\n\nTo use these functions respond with:\n<functioncall> {{ "name": "function_name", "arguments": {{ "arg_1": "value_1", "arg_1": "value_1", ... }} }} </functioncall>\n\nEdge cases you must handle:\n - If there are no functions that match the user request, you will respond politely that you cannot help."""},
    { "role": "user", "content": "What is the temperature in Tokyo right now?"},
    # Take the prediction from Step 1, extract the <functioncall> payload,
    # execute the function, and append the exchange to the messages like below:
    { "role": "assistant", "content": """<functioncall> {"name": "get_temperature", "arguments": '{"city": "Tokyo"}'} </functioncall>"""},
    { "role": "user", "content": """<function_response> {"temperature":30 C} </function_response>"""}
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
# >> The current temperature in Tokyo is 30 degrees Celsius.
```

# Uploaded model

- **Developed by:** hiieu

This model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)