Can anyone get actual quality output from the model?

#3
by ClaudioCasellato - opened

I tried the model on my 1080 Ti (no flash attention), and for the simple prompt:
Create a for loop in python that prints numbers up to 10

I get the following garbage output:
'0 to 985 to 315 to 664 to 27645077550007 to 65005080006.5 to upto -1500-1445+20545345565 to -25%205 %5 0%2050%2 5 80+ 315 + 3+317-7631775- 727642+33190553319+225+275 85 to -15%-145-125-235% to -8.5%-13%8.0% 19.0 %15.0 to upto to-uptonot to0%5 to-25% 8,505+8' 25 05 Β±5 Γ·8, 500, 0 Β₯8$+$5+5 = $55%-22%+65+22%80%20 + 5,7=80%,50%=22 %,5%2 βˆ—2βˆ’3Γ— β‰ˆ βˆ’0Β·1555 ← β—€ βˆ’\u2009β†’ βˆ’\u2009↑ … … … ——…–—\xadβ€” – Β© Copyright by the authors, all rights reserved. This is an open access article distributed under terms and conditions of Creative Commons Attribution (CC BY) license deed which permits unrestricti\non use of such material in any commercial publication without reviseing this permission note.\nThis study was conducted on the basis o f a questionnaire survey, with a total respondet rate of sixty-five percent. The respondents were all from the public sector universities, except two of them who came fro m private institutions. The data analysed included the number of students enlist at each university, their gender, their academic backgrounds including their parents educational levels as well as their financial statuses before entering into tertianary education programmes; and then the students are categorized among five subgroups according ti p e r o n l y o f t h i s s c b :\n(A):\nStudying for one year or less at t he time when they entered tuition. They include those who did not graduate fr om high schools but had already ta

I am having the same problem: the output is completely random symbols. Even the inference demo on the Hugging Face page has the same issue.
[screenshot: the hosted inference widget producing the same garbled output]

I had the same problem. It seems to be caused by the tokenizer automatically adding special tokens. I set add_special_tokens=False, and now it works as expected.
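A quick way to see what the tokenizer is adding (a minimal sketch, assuming the stock TinyLlama/Llama tokenizer; the exact token ids shown in the comments are illustrative):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama_v1.1")
print(tok("Hello").input_ids)                            # BOS prepended, e.g. [1, 15043]
print(tok("Hello", add_special_tokens=False).input_ids)  # plain text tokens only, e.g. [15043]

Full working example: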

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "TinyLlama/TinyLlama_v1.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
device = "cuda" if torch.cuda.is_available() else "cpu"

def predict(input_text, model, tokenizer, device, max_length=64):
    model.to(device)
    # The fix: stop the tokenizer from automatically prepending special tokens
    inputs = tokenizer(input_text, add_special_tokens=False, return_tensors="pt").to(device)
    input_length = inputs.input_ids.shape[-1]
    total_length = input_length + max_length  # budget max_length new tokens
    output_ids = model.generate(
        **inputs,
        max_length=total_length,
        do_sample=False,  # greedy decoding
        eos_token_id=tokenizer.eos_token_id,
    )
    output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return output_text

# test1: 
example_inputs = "Create a for loop in python that prints numbers up to 10:"
predict(example_inputs, model, tokenizer, device) # output: `Create a for loop in python that prints numbers up to 10:\nfor i in range(1,11):\n    print(i)\n\n\nA: You can use a for loop to do this.\nfor i in range(1,11):\n    print(i)\n\n\nA: You can use a for loop to do this.`

# test2:
example_inputs = "My name is Julia and I am "
predict(example_inputs, model, tokenizer, device) # output :`My name is Julia and I am 21 years old. I am a student at the University of California, Berkeley. I am a member of the Berkeley chapter of the National Organization for Women (NOW). I am also a member of the Berkeley chapter of the Asian Pacific American Labor Alliance (APALA). I am a member`

Hi @Djunnn, thank you! That fixed the issue. Here is the same fix using the standard transformers pipeline:

import torch
import transformers
from transformers import AutoTokenizer

model = "TinyLlama/TinyLlama_v1.1"
tokenizer = AutoTokenizer.from_pretrained(model)
pipe = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
    tokenizer=tokenizer,
)
generation_kwargs = {
    "max_length": 64,
    "add_special_tokens": False,  # the fix: don't prepend special tokens
    "do_sample": False,
    "eos_token_id": tokenizer.eos_token_id,
}
pipe("Create a for loop in python that prints numbers up to 10:", **generation_kwargs)

One last thing: if I don't specify the max_length parameter, or if I set it to a large number (e.g. 600), the output repeats itself and the model seems to never generate the EOS token. Do you think this is a model issue? Do you have any suggestions on how to fix it?

Never mind, I found a related issue: sentence repetition is a common pitfall of maximum-likelihood training.
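For anyone landing here with the same repetition problem: decoding-side mitigations usually help with a base (non-chat) model like this one. A hedged sketch reusing the predict setup above; the parameter values are illustrative, not tuned:

output_ids = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,            # sample instead of greedy decoding
    temperature=0.7,           # illustrative value
    top_p=0.9,                 # nucleus sampling
    repetition_penalty=1.2,    # penalize tokens that were already generated
    no_repeat_ngram_size=3,    # never repeat the same 3-gram
    eos_token_id=tokenizer.eos_token_id,
)

These are all standard generate() arguments; repetition_penalty and no_repeat_ngram_size directly target loops like the one in test1 above.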

ClaudioCasellato changed discussion status to closed
