Inferencing the model

#2
by saivineetha - opened

Hi,

I'm trying to run inference on the model with a pipeline, but I'm getting a blank response. I used the example prompt given in the paper for the Open Question Answering task.
Here is the code I used for inference:

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model = AutoModelForCausalLM.from_pretrained('abhinand/tamil-llama-7b-base-v0.1')
tokenizer = AutoTokenizer.from_pretrained('abhinand/tamil-llama-7b-base-v0.1')
txt = "ஐபிஎல் தொடரை சென்னை சூப்பர் கிங்ஸ் (சிஎஸ்கே) வென்றது என்ற தலைப்பில் ஒரு சிறு செய்திக் கட்டுரையை எழுதுங்கள்."  # Taken from paper

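# Build the text-generation pipeline from the model and tokenizer loaded above
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

sequences = generator(
    txt,
    do_sample=True,
    top_k=10,
    temperature=0.2,
    max_length=1024,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")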

The response is blank: only the prompt I passed in appears in the output. I've attached a screenshot of it.

image.png

Can anyone help me with this?

You mentioned the Open Question Answering task, but you are using the base model.

In simple terms, there are two types of LMs (in case you aren't aware):

  1. Base model: trained on huge amounts of text data and suited to causal language modeling (CLM), i.e. next-word prediction.
  2. Fine-tuned model: the base model fine-tuned on an instruction or chat dataset, making it suitable for interacting with humans.

A base model simply continues the text it is given rather than following an instruction, which is likely why your prompt comes back without an answer. So in your case you need to use the instruct model (unless you are willing to do massive domain adaptation or fine-tuning on diverse datasets).

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

tokenizer = AutoTokenizer.from_pretrained("abhinand/tamil-llama-7b-instruct-v0.1")
model = AutoModelForCausalLM.from_pretrained(
    "abhinand/tamil-llama-7b-instruct-v0.1",
    # OTHER MODEL ARGUMENTS HERE
)
model.eval()

generation_config = GenerationConfig(
    temperature=0.3,
    top_k=50,
    top_p=0.90,
    repetition_penalty=1.1,
    eos_token_id=tokenizer.eos_token_id,
    do_sample=True,
    # max_length=512 dropped: max_new_tokens takes precedence when both are set
    max_new_tokens=128,
)

def format_instruction(system_prompt, question, input=None):
    if input is not None:
        return f"""{system_prompt}

### Instruction:
{question}

### Input:
{input}

### Response:
"""
    else:
        return f"""{system_prompt}

### Instruction:
{question}

### Response:
"""

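device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)  # keep the model on the same device as the inputs; skip if you loaded it with device_map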

def run_inference(prompt):
    input_ids = tokenizer.encode(prompt, return_tensors="pt").to(device)
    output = model.generate(input_ids, generation_config=generation_config, pad_token_id=18610)

    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

    return generated_text


SYS_PROMPT1 = "நீங்கள் தமிழில் பதிலளிக்கும் AI உதவியாளர். பயனர் உங்களுக்கு ஒரு பணியை வழங்குவார். உங்களால் முடிந்தவரை உண்மையாக பணியை முடிப்பதே உங்கள் குறிக்கோள். பணியைச் செய்யும்போது, ​​படிப்படியாக சிந்தித்து, உங்கள் நடவடிக்கைகளை நியாயப்படுத்தவும்."

instruction = format_instruction(
    system_prompt=SYS_PROMPT1,
    question="""DNA மற்றும் RNA இடையே உள்ள வேறுபாட்டை ஒரு வரியில் விளக்கவும்"""
)

output = run_inference(instruction)

print(output)
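
Note that run_inference returns the prompt together with the generated text, because model.generate outputs the input tokens followed by the new ones. If only the answer is wanted, it can be cut out after decoding. The following is a minimal sketch, assuming the decoded text still contains the "### Response:" marker from format_instruction; extract_response is a hypothetical helper and not part of the reply above.

```python
RESPONSE_MARKER = "### Response:"


def extract_response(generated_text: str, marker: str = RESPONSE_MARKER) -> str:
    """Return only the text that follows the last response marker."""
    # str.rpartition returns ('', '', text) when the marker is absent,
    # so in that case the full text is returned unchanged (minus outer whitespace).
    _, _, answer = generated_text.rpartition(marker)
    return answer.strip()


# Stand-in for the string returned by run_inference (placeholder content)
decoded = "System prompt\n\n### Instruction:\nQuestion\n\n### Response:\nAnswer text"
print(extract_response(decoded))
```

With the code above this would be used as print(extract_response(output)).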

Thank you for the reply. It worked.

I want to run inference on the base model to see how it behaves, and later adapt the code to my own dataset, i.e. to do domain adaptation.

How can I run inference on the base model "abhinand/tamil-llama-7b-base-v0.1"?
How do I do text generation with the base model?
Can I use this code?
generator = pipeline(task="text-generation", model="abhinand/tamil-llama-7b-base-v0.1")

Sure! You can use the pipeline and test out the model for your needs.

Below is an example:

image.png
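
For readers who cannot see the screenshot, here is a rough sketch of what a pipeline call on the base model might look like. It is not a transcription of the image. The sampling values, the complete_text helper, and the example prompt are assumptions chosen for illustration; a base model continues text, so the prompt is the start of a sentence rather than an instruction.

```python
MODEL_ID = "abhinand/tamil-llama-7b-base-v0.1"

# Illustrative sampling settings, not values recommended by the model author
GENERATION_KWARGS = {
    "max_new_tokens": 64,
    "do_sample": True,
    "temperature": 0.7,
    "top_k": 50,
    "repetition_penalty": 1.1,
}


def complete_text(prompt: str, model_id: str = MODEL_ID) -> str:
    """Continue `prompt` with the base model; returns prompt plus continuation."""
    # Imported lazily so that defining this function does not load transformers
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    outputs = generator(prompt, **GENERATION_KWARGS)
    return outputs[0]["generated_text"]


# Example call (downloads the 7B checkpoint on first use):
# print(complete_text("தமிழ்நாட்டின் தலைநகரம்"))
```

The pipeline is built inside the function only to keep the sketch short; for repeated calls it would be better to build it once and reuse it.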

Thanks a lot!

saivineetha changed discussion status to closed
