---
license: llama3.1
pipeline_tag: text-generation
---

**Llama3.1-Typhoon2-8B**: Thai Large Language Model (Instruct)

**Llama3.1-Typhoon2-8B-instruct** is an instruction-tuned Thai 🇹🇭 large language model with 8 billion parameters, based on Llama3.1-8B.

For the release post, please see our [blog](...).

*To acknowledge Meta's effort in creating the foundation model and to comply with the license, we explicitly include "llama-3.1" in the model name.

## **Performance**

**Instruction-Following & Function Call Performance**

<div align="center">
  <img src="https://storage.googleapis.com/typhoon-public/assets/typhoon2-text/llama7b_general.png" alt="Typhoon2 Llama 8B General Performance" width="100%" style="margin-left:auto; margin-right:auto; display:block"/>
</div>

**Specific Domain Performance (Math & Coding)**

<div align="center">
  <img src="https://storage.googleapis.com/typhoon-public/assets/typhoon2-text/llama7b_specific.png" alt="Typhoon2 Llama 8B Specific Domain Performance" width="100%" style="margin-left:auto; margin-right:auto; display:block"/>
</div>

**Long Context Performance**

<div align="center">
  <img src="https://storage.googleapis.com/typhoon-public/assets/typhoon2-text/llama7b_long.jpg" alt="Typhoon2 Llama 8B Long Context Performance" width="100%" style="margin-left:auto; margin-right:auto; display:block"/>
</div>

**Detailed Performance**

| Model | IFEval - TH | IFEval - EN | MT-Bench TH | MT-Bench EN | Thai Code-Switching (t=0.7) | Thai Code-Switching (t=1.0) | FunctionCall-TH | FunctionCall-EN | GSM8K-TH | GSM8K-EN | MATH-TH | MATH-EN | HumanEval-TH | HumanEval-EN | MBPP-TH | MBPP-EN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| **Llama3.1 8B Instruct** | 58.04% | **77.64%** | 5.109 | **8.118** | 93% | 11.2% | 36.92% | 66.06% | 45.18% | 62.4% | 24.42% | 48% | 51.8% | 67.7% | **64.6%** | **66.9%** |
| **Typhoon2 Llama3 8B Instruct** | **72.60%** | 76.43% | **5.7417** | 7.584 | **98.8%** | **98%** | **75.12%** | **79.08%** | **71.72%** | **81.0%** | **38.48%** | **49.04%** | **58.5%** | **68.9%** | 60.8% | 63.0% |

## **Model Description**

- **Model type**: An 8B instruction-tuned, decoder-only model based on the Llama architecture.
- **Requirement**: transformers 4.45.0 or newer.
- **Context length**: 90k
- **Primary Language(s)**: Thai 🇹🇭 and English 🇬🇧
- **License**: [Llama 3.1 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE)

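
The `transformers` requirement listed above can be checked before loading the model. The snippet below is a minimal sketch, not part of the official release: the helper names `version_tuple` and `meets_requirement` are hypothetical, and the parsing assumes plain dotted version strings such as `4.45.0` or `4.46.0.dev0`.

```python
import re
from importlib.metadata import PackageNotFoundError, version

MIN_TRANSFORMERS = "4.45.0"  # minimum version stated in the model description


def version_tuple(text):
    """Turn '4.45.0' or '4.46.0.dev0' into a comparable 3-tuple of ints."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:  # stop at suffixes such as 'dev0'
            break
        parts.append(int(match.group()))
    while len(parts) < 3:  # pad short versions such as '4.45'
        parts.append(0)
    return tuple(parts[:3])


def meets_requirement(installed, minimum=MIN_TRANSFORMERS):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


try:
    installed = version("transformers")
    status = "OK" if meets_requirement(installed) else "too old, please upgrade"
    print(f"transformers {installed}: {status}")
except PackageNotFoundError:
    print("transformers is not installed")
```

If the check reports an older version, upgrading with `pip install -U transformers` should be enough in most environments.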

## Usage Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "scb10x/llama3.1-typhoon2-8b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Typhoon, an AI assistant created by SCB 10X, designed to be helpful, harmless, and honest. Typhoon assists with analysis, answering questions, math, coding, creative writing, teaching, role-play, discussions, and more. Typhoon responds directly without affirmations or filler phrases (e.g., “Certainly,” “Of course”). Responses do not start with “Certainly” in any form. Typhoon adheres to these rules in all languages and always replies in the user's language or as requested. Communicate in fluid, conversational prose, showing genuine interest, empathy, and presenting information clearly and visually."},
    # Thai for "Please give me a grilled chicken recipe"
    {"role": "user", "content": "ขอสูตรไก่ย่าง"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Stop on either the tokenizer's EOS token or the Llama 3.1 end-of-turn token
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.4,
    top_p=0.9,
)
# Drop the prompt tokens and decode only the newly generated text
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```

## Inference Server Hosting Example

```bash
pip install vllm
vllm serve scb10x/llama3.1-typhoon2-8b-instruct
# see more information at https://docs.vllm.ai/
```
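
Once the server is running, it can be queried over vLLM's OpenAI-compatible HTTP API. The following is a minimal sketch using only the standard library. It assumes the server listens on vLLM's default address `http://localhost:8000`; the helper names `build_chat_request` and `chat` are hypothetical, and the sampling values simply mirror the usage example above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumption: vLLM default host and port
MODEL_ID = "scb10x/llama3.1-typhoon2-8b-instruct"


def build_chat_request(user_message, base_url=BASE_URL):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": 512,
        "temperature": 0.4,
        "top_p": 0.9,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_message):
    """Send one user message and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(user_message)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Requires the server started by `vllm serve` to be reachable:
# print(chat("ขอสูตรไก่ย่าง"))
```

Any OpenAI-compatible client library pointed at the same base URL should work equally well; the standard-library version is used here only to keep the sketch dependency-free.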

## Function-Call Example

```python
import ast

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "scb10x/llama3.1-typhoon2-8b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16
)

get_weather_api = {
    "name": "get_weather",
    "description": "Get the current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, New York",
            },
            "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "The unit of temperature to return",
            },
        },
        "required": ["location"],
    },
}

search_api = {
    "name": "search",
    "description": "Search for information on the internet",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The search query, e.g. 'latest news on AI'",
            }
        },
        "required": ["query"],
    },
}

get_stock = {
    "name": "get_stock_price",
    "description": "Get the stock price",
    "parameters": {
        "type": "object",
        "properties": {
            "symbol": {
                "type": "string",
                "description": "The stock symbol, e.g. AAPL, GOOG",
            }
        },
        "required": ["symbol"],
    },
}

# Tools use the same format as OpenAI tool definitions
openai_format_tools = [get_weather_api, search_api, get_stock]

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    # Thai for "Give me the stock prices of Tasla (TLS) and Amazon (AMZ)?"
    {"role": "user", "content": "ขอราคาหุ้น Tasla (TLS) และ Amazon (AMZ) ?"},
]

# Rendered prompt as plain text, useful for inspecting what the model sees
final_prompt = tokenizer.apply_chat_template(
    messages, tools=openai_format_tools, add_generation_prompt=True, tokenize=False
)

inputs = tokenizer.apply_chat_template(
    messages, tools=openai_format_tools, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    num_return_sequences=1,
    eos_token_id=[tokenizer.eos_token_id, 128009],  # 128009 is <|eot_id|>
)
# Drop the prompt tokens and keep only the newly generated ones
response = outputs[0][inputs.shape[-1]:]

print("Output:", tokenizer.decode(response, skip_special_tokens=True))


# Decoding utilities: turn the generated call syntax into Python data
def resolve_ast_by_type(value):
    if isinstance(value, ast.Constant):
        # Covers strings, numbers, booleans, None, and Ellipsis
        if value.value is Ellipsis:
            output = "..."
        else:
            output = value.value
    elif isinstance(value, ast.UnaryOp):
        # Negative numeric literals such as -1
        output = -value.operand.value
    elif isinstance(value, ast.List):
        output = [resolve_ast_by_type(v) for v in value.elts]
    elif isinstance(value, ast.Dict):
        output = {
            resolve_ast_by_type(k): resolve_ast_by_type(v)
            for k, v in zip(value.keys, value.values)
        }
    elif isinstance(value, ast.BinOp):
        # Arithmetic passed as an argument, e.g. 2 * 3; eval runs model output,
        # so only use this on output you trust
        output = eval(ast.unparse(value))
    elif isinstance(value, ast.Name):
        output = value.id
    elif isinstance(value, ast.Call):
        if len(value.keywords) == 0:
            output = ast.unparse(value)
        else:
            output = resolve_ast_call(value)
    elif isinstance(value, ast.Tuple):
        output = tuple(resolve_ast_by_type(v) for v in value.elts)
    elif isinstance(value, ast.Lambda):
        # Keep lambdas as source text rather than evaluating them
        output = ast.unparse(value)
    elif isinstance(value, ast.Subscript):
        output = ast.unparse(value.value) + "[" + ast.unparse(value.slice) + "]"
    else:
        raise Exception(f"Unsupported AST type: {type(value)}")
    return output


def resolve_ast_call(elem):
    # Rebuild a dotted function name such as "module.func"
    func_parts = []
    func_part = elem.func
    while isinstance(func_part, ast.Attribute):
        func_parts.append(func_part.attr)
        func_part = func_part.value
    if isinstance(func_part, ast.Name):
        func_parts.append(func_part.id)
    func_name = ".".join(reversed(func_parts))
    args_dict = {}
    for arg in elem.keywords:
        output = resolve_ast_by_type(arg.value)
        args_dict[arg.arg] = output
    return {func_name: args_dict}


def ast_parse(input_str, language="Python"):
    if language == "Python":
        cleaned_input = input_str.strip("[]'")
        parsed = ast.parse(cleaned_input, mode="eval")
        extracted = []
        if isinstance(parsed.body, ast.Call):
            extracted.append(resolve_ast_call(parsed.body))
        else:
            for elem in parsed.body.elts:
                assert isinstance(elem, ast.Call)
                extracted.append(resolve_ast_call(elem))
        return extracted
    else:
        raise NotImplementedError(f"Unsupported language: {language}")


def parse_nested_value(value):
    """
    Parse a potentially nested value from the AST output.

    Args:
        value: The value to parse: a nested dictionary (possibly another
            function call) or a simple value.

    Returns:
        str: A string representation of the value, handling nested function
            calls and nested dictionary arguments.
    """
    if isinstance(value, dict):
        # A dict whose values are all dicts represents a function call
        if all(isinstance(v, dict) for v in value.values()):
            func_name = list(value.keys())[0]
            args = value[func_name]
            args_str = ", ".join(
                f"{k}={parse_nested_value(v)}" for k, v in args.items()
            )
            return f"{func_name}({args_str})"
        else:
            # Otherwise treat it as plain key-value pairs
            return (
                "{"
                + ", ".join(f"'{k}': {parse_nested_value(v)}" for k, v in value.items())
                + "}"
            )
    return repr(value)


def decoded_output_to_execution_list(decoded_output):
    """
    Convert decoded output to a list of executable function calls.

    Args:
        decoded_output (list): A list of dictionaries representing function calls.

    Returns:
        list: A list of strings, each representing an executable function call.
    """
    execution_list = []
    for function_call in decoded_output:
        for key, value in function_call.items():
            args_str = ", ".join(
                f"{k}={parse_nested_value(v)}" for k, v in value.items()
            )
            execution_list.append(f"{key}({args_str})")
    return execution_list


def default_decode_ast_prompting(result, language="Python"):
    # Strip code-fence backticks and whitespace, then make sure the text is a list
    result = result.strip("`\n ")
    if not result.startswith("["):
        result = "[" + result
    if not result.endswith("]"):
        result = result + "]"
    decoded_output = ast_parse(result, language)
    return decoded_output


fc_result = default_decode_ast_prompting(tokenizer.decode(response, skip_special_tokens=True))
# Example output:
# [{'Function': {'arguments': '{"symbol": "TLS"}', 'name': 'get_stock_price'}}, {'Function': {'arguments': '{"symbol": "AMZ"}', 'name': 'get_stock_price'}}]
print(fc_result)
```
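
The parsed result is plain data, so the application still has to map each requested call onto real code. The sketch below shows one possible way to do that, assuming the output has the shape shown in the example comment above (a `Function` entry with a `name` and JSON-encoded `arguments`). The `get_stock_price` stub, `TOOL_REGISTRY`, and `dispatch_function_calls` are hypothetical names introduced for illustration only.

```python
import json


def get_stock_price(symbol):
    """Hypothetical stub; a real implementation would query a market-data API."""
    return {"symbol": symbol, "price": None}


# Map tool names from the tool definitions to local Python callables
TOOL_REGISTRY = {"get_stock_price": get_stock_price}


def dispatch_function_calls(fc_result, registry):
    """Run each parsed call and collect (tool name, result) pairs in order."""
    results = []
    for call in fc_result:
        for payload in call.values():
            name = payload["name"]
            # "arguments" arrives as a JSON-encoded string
            kwargs = json.loads(payload["arguments"])
            if name not in registry:
                raise KeyError(f"Model requested an unknown tool: {name}")
            results.append((name, registry[name](**kwargs)))
    return results


example_fc_result = [
    {"Function": {"arguments": '{"symbol": "TLS"}', "name": "get_stock_price"}},
    {"Function": {"arguments": '{"symbol": "AMZ"}', "name": "get_stock_price"}},
]
for name, result in dispatch_function_calls(example_fc_result, TOOL_REGISTRY):
    print(name, result)
```

Looking tools up in an explicit registry, rather than evaluating the generated call strings, keeps the model from invoking anything the application has not deliberately exposed.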

## **Intended Uses & Limitations**

This is an instruction-tuned model that is still under development. It incorporates some guardrails, but it may still produce answers that are inaccurate, biased, or otherwise objectionable in response to user prompts. We recommend that developers assess these risks in the context of their use case.

## **Follow us**

**https://twitter.com/opentyphoon**

## **Support**

**https://discord.gg/CqyBscMFpg**

## **Citation**

- If you find Typhoon2 useful for your work, please cite it using:

```
@article{pipatanakul2023typhoon,
    title={Typhoon: Thai Large Language Models},
    author={Kunat Pipatanakul and Phatrasek Jirabovonvisut and Potsawee Manakul and Sittipong Sripaisarnmongkol and Ruangsak Patomwong and Pathomporn Chokchainant and Kasima Tharnpipitchai},
    year={2023},
    journal={arXiv preprint arXiv:2312.13951},
    url={https://arxiv.org/abs/2312.13951}
}
```