🤖 Super AI Engineer Development Program Season 4 - Pangpuriye Table-based Question Answering Model

This model was fine-tuned from the original OpenThaiGPT-1.0.1-7b. It is released under the Apache License 2.0.

Example inference using Hugging Face Transformers

The following code is an example of how to run inference with our model.

from transformers import AutoModelForCausalLM, LlamaTokenizer

def get_prediction(raw_prediction):
    # Return only the text generated after the closing [/INST] tag.
    marker = "[/INST]"
    if marker in raw_prediction:
        index = raw_prediction.index(marker)
        return raw_prediction[index + len(marker):]

    # No tag found: return the decoded text unchanged.
    return raw_prediction

tokenizer = LlamaTokenizer.from_pretrained("AIAT/Pangpuriye-openthaigpt-1.0.0-7b-chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("AIAT/Pangpuriye-openthaigpt-1.0.0-7b-chat", trust_remote_code=True)

schema = """your SQL schema"""
query = "หาจำนวนลูกค้าที่เป็นเพศชาย"

prompt = f"""
    [INST] <<SYS>>
    You are a question answering assistant. Answer the question as truthful and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด
    <</SYS>>
    {schema}### (sql extract) {query} [/INST]
"""

tokens = tokenizer(prompt, return_tensors="pt")
output = model.generate(tokens["input_ids"], max_new_tokens=20, eos_token_id=tokenizer.eos_token_id)
print(get_prediction(tokenizer.decode(output[0], skip_special_tokens=True)))
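
The two string-handling steps in the example above (filling the prompt template and stripping the prompt from the decoded output) can be tried without downloading the model. The following is a minimal sketch under stated assumptions: `build_prompt`, the sample schema, and the fake SQL output are hypothetical and introduced only for illustration; they are not part of the released model or its API.

```python
SYSTEM_PROMPT = (
    "You are a question answering assistant. Answer the question as truthful "
    "and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด"
)

def build_prompt(schema: str, query: str) -> str:
    """Hypothetical helper: fills the prompt template used in the example above."""
    return f"""
    [INST] <<SYS>>
    {SYSTEM_PROMPT}
    <</SYS>>
    {schema}### (sql extract) {query} [/INST]
"""

def get_prediction(raw_prediction: str) -> str:
    """Return the text after the closing [/INST] tag, or the input unchanged."""
    marker = "[/INST]"
    if marker in raw_prediction:
        return raw_prediction[raw_prediction.index(marker) + len(marker):]
    return raw_prediction

# Made-up schema, for illustration only.
schema = "CREATE TABLE customers (id INT, gender TEXT)"
query = "หาจำนวนลูกค้าที่เป็นเพศชาย"
prompt = build_prompt(schema, query)

# Stand-in for a decoded model output: the prompt followed by a made-up answer.
fake_output = prompt + " SELECT COUNT(*) FROM customers WHERE gender = 'male'"
print(get_prediction(fake_output).strip())
```

Because a causal language model's decoded output begins with the prompt itself, cutting at the `[/INST]` tag leaves only the generated answer. The fake output here stands in for `tokenizer.decode(...)` so the slicing can be checked in isolation.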

Acknowledgements

The model was developed collaboratively by the members of the Pangpuriye house during the LLMs hackathon in Super AI Engineer Development Program Season 4.

We thank the organizers of this hackathon, OpenThaiGPT, AIAT, NECTEC, and ThaiSC, for this challenging task and for the opportunity to take part in developing a Thai large language model.

Citation Information

If our work is useful for future development, please cite our model as follows:

@misc {artificial_intelligence_association_of_thailand_2024,
    author       = { {Artificial Intelligence Association of Thailand} },
    title        = { Pangpuriye-openthaigpt-1.0.0-7b-chat (Revision 21f9a62) },
    year         = 2024,
    url          = { https://huggingface.co/AIAT/Pangpuriye-openthaigpt-1.0.0-7b-chat },
    doi          = { 10.57967/hf/2193 },
    publisher    = { Hugging Face }
}