Boring answer #6 — opened by foowaa
When using the demo code, the answer is not the same as the one produced by the Inference API on HuggingFace, which is disappointing.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = '/data0/LiteLlama'
model = AutoModelForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)
model.eval()

prompt = 'Q: Hello, what is your name?\nA:'
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
# No do_sample flag is passed, so generate() decodes greedily by default.
tokens = model.generate(input_ids, max_length=20)
print(tokenizer.decode(tokens[0].tolist(), skip_special_tokens=True))
The result is:
Q: What is the largest bird?
A: The largest bird is the largest bird.
I had the same result.
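A likely cause of the mismatch: calling model.generate with only max_length uses greedy decoding, whereas the hosted Inference API typically samples for text generation, so the two will rarely agree (and greedy decoding is also prone to loops like "The largest bird is the largest bird."). The difference can be sketched without loading the model at all, using a toy next-token distribution — the logits values below are made up purely for illustration:

```python
import math
import random

def softmax(logits, temperature=1.0):
    # Scale logits by temperature, then normalize to probabilities.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(logits):
    # Greedy decoding: always return the arg-max token.
    return max(range(len(logits)), key=lambda i: logits[i])

def sample_pick(logits, temperature=0.7, rng=random):
    # Sampling: draw a token index from the softmax distribution.
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Toy logits for a 4-token vocabulary (made-up numbers).
logits = [2.0, 1.8, 0.5, -1.0]

print([greedy_pick(logits) for _ in range(5)])    # greedy: the same token every time
random.seed(0)
print({sample_pick(logits) for _ in range(200)})  # sampling: several different tokens appear
```

With transformers, the analogous change would be passing do_sample=True (and, for example, temperature or top_p) to model.generate so the local run samples like the API, or adding a repetition_penalty to discourage the looping output — exact parameter values used by the hosted API for this model are not stated in the thread, so matching it byte-for-byte is not guaranteed.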