Can anyone get actual quality output from the model?

#3
by ClaudioCasellato - opened

I tried the model on my 1080 Ti (no flash attention), and for the simple prompt:
Create a for loop in python that prints numbers up to 10

I get the following garbage output:
'0 to 985 to 315 to 664 to 27645077550007 to 65005080006.5 to upto -1500-1445+20545345565 to -25%205 %5 0%2050%2 5 80+ 315 + 3+317-7631775- 727642+33190553319+225+275 85 to -15%-145-125-235% to -8.5%-13%8.0% 19.0 %15.0 to upto to-uptonot to0%5 to-25% 8,505+8' 25 05 Β±5 Γ·8, 500, 0 Β₯8$+$5+5 = $55%-22%+65+22%80%20 + 5,7=80%,50%=22 %,5%2 βˆ—2βˆ’3Γ— β‰ˆ βˆ’0Β·1555 ← β—€ βˆ’\u2009β†’ βˆ’\u2009↑ … … … ——…–—\xadβ€” – Β© Copyright by the authors, all rights reserved. This is an open access article distributed under terms and conditions of Creative Commons Attribution (CC BY) license deed which permits unrestricti\non use of such material in any commercial publication without reviseing this permission note.\nThis study was conducted on the basis o f a questionnaire survey, with a total respondet rate of sixty-five percent. The respondents were all from the public sector universities, except two of them who came fro m private institutions. The data analysed included the number of students enlist at each university, their gender, their academic backgrounds including their parents educational levels as well as their financial statuses before entering into tertianary education programmes; and then the students are categorized among five subgroups according ti p e r o n l y o f t h i s s c b :\n(A):\nStudying for one year or less at t he time when they entered tuition. They include those who did not graduate fr om high schools but had already ta

I am having the same problem: the output is completely random symbols. Even the inference demo on the Hugging Face page has the same issue.
[screenshot: the hosted inference widget producing the same garbled output]

I had the same problem. It seems to be caused by the tokenizer automatically adding special tokens. I set add_special_tokens=False, and now it works as expected.
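A quick way to see what the tokenizer is adding (a minimal sketch, assuming the stock TinyLlama/Llama tokenizer; the exact token ids shown in the comments are illustrative):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama_v1.1")
print(tok("Hello").input_ids)                            # BOS prepended, e.g. [1, 15043]
print(tok("Hello", add_special_tokens=False).input_ids)  # plain text tokens only, e.g. [15043]

Full working example: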

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "TinyLlama/TinyLlama_v1.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
device = "cuda" if torch.cuda.is_available() else "cpu"

def predict(input_text, model, tokenizer, device, max_length=64):
    model.to(device)
    # The fix: stop the tokenizer from automatically prepending special tokens
    inputs = tokenizer(input_text, add_special_tokens=False, return_tensors="pt").to(device)
    input_length = inputs.input_ids.shape[-1]
    total_length = input_length + max_length  # budget max_length new tokens
    output_ids = model.generate(
        **inputs,
        max_length=total_length,
        do_sample=False,  # greedy decoding
        eos_token_id=tokenizer.eos_token_id,
    )
    output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return output_text

# test1: 
example_inputs = "Create a for loop in python that prints numbers up to 10:"
predict(example_inputs, model, tokenizer, device) # output: `Create a for loop in python that prints numbers up to 10:\nfor i in range(1,11):\n    print(i)\n\n\nA: You can use a for loop to do this.\nfor i in range(1,11):\n    print(i)\n\n\nA: You can use a for loop to do this.`

# test2:
example_inputs = "My name is Julia and I am "
predict(example_inputs, model, tokenizer, device) # output :`My name is Julia and I am 21 years old. I am a student at the University of California, Berkeley. I am a member of the Berkeley chapter of the National Organization for Women (NOW). I am also a member of the Berkeley chapter of the Asian Pacific American Labor Alliance (APALA). I am a member`

Hi @Djunnn, thank you! That fixed the issue. Here is the same fix using the standard transformers pipeline:

import torch
import transformers
from transformers import AutoTokenizer

model = "TinyLlama/TinyLlama_v1.1"
tokenizer = AutoTokenizer.from_pretrained(model)
pipe = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
    tokenizer=tokenizer,
)
generation_kwargs = {
    "max_length": 64,
    "add_special_tokens": False,  # the fix: don't prepend special tokens
    "do_sample": False,
    "eos_token_id": tokenizer.eos_token_id,
}
pipe("Create a for loop in python that prints numbers up to 10:", **generation_kwargs)

One last thing: if I don't specify the max_length parameter, or if I set it to a large number (e.g. 600), the output repeats itself and the model seems to never generate the EOS token. Do you think this is a model issue? Do you have any suggestions on how to fix it?

Never mind, I found a related issue: sentence repetition is a common pitfall of maximum-likelihood training.
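For anyone landing here with the same repetition problem: decoding-side mitigations usually help with a base (non-chat) model like this one. A hedged sketch reusing the predict setup above; the parameter values are illustrative, not tuned:

output_ids = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,            # sample instead of greedy decoding
    temperature=0.7,           # illustrative value
    top_p=0.9,                 # nucleus sampling
    repetition_penalty=1.2,    # penalize tokens that were already generated
    no_repeat_ngram_size=3,    # never repeat the same 3-gram
    eos_token_id=tokenizer.eos_token_id,
)

These are all standard generate() arguments; repetition_penalty and no_repeat_ngram_size directly target loops like the one in test1 above.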

ClaudioCasellato changed discussion status to closed
