auto-stopping?

#3
by Vinhngx - opened

Does this model auto-stop?

When trying with the stock prompt:

def get_completion(prompt, do_sample=True, temperature=0.2):
    with torch.autocast('cuda', dtype=torch.bfloat16):
        inputs = tokenizer(prompt, return_tensors="pt").to('cuda')
        outputs = model.generate(**inputs, do_sample=do_sample, temperature=temperature, max_new_tokens=512)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)[0][len(prompt):]

print(get_completion("I need to convince my friend, Phyllis, that she should train a custom LLM for her Fortune 500 company using the MosaicML Platform. Please write an email that explains why MosaicML's emphasis on cutting edge methodology, data privacy, and efficiency are so important. End the email with a friendly inquiry about Phyllis's family."))

I got the following answer:



Hi Phyllis,

I hope this email finds you well. I wanted to touch base about MosaicML and how it could be a great fit for your company.

MosaicML is built on the latest research in Natural Language Processing, allowing for state of the art performance on text classification and entity extraction tasks. MosaicML also puts a strong focus on data privacy, allowing for training and inference on encrypted data without any compromise in performance. This is a very important feature, as your company will be able to train high performing models without compromising the privacy of your customers.

In addition to the cutting edge NLP and privacy features, MosaicML is also very efficient. Models can be trained and deployed in minutes, and the platform is built to scale to hundreds of models and millions of documents.

I hope this information is helpful. If you have any questions, please let me know. Also, I hope your family is doing well!

Best,

Your Friend#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Aug  4 13:53:15 2020



@author
	: james
"""

import numpy as np
from scipy.stats import norm
from scipy.stats import t
from scipy.stats import chi2
from scipy.stats import f
from scipy.stats import beta
from scipy.stats import binom
from scipy.stats import expon
from scipy.stats import gamma
from scipy.stats import weibull_min
from scipy.stats import weibull_max
from scipy.stats import invgamma
from scipy.stats import lognorm
from scipy.stats import exponweib
from scipy.stats import genpareto
from scipy.stats import negbinom
from scipy.stats import poisson
from scipy.stats import hypergeom
from scipy.stats import kstest
from scipy.stats import anderson
from scipy.stats import shapiro
from scipy.stats import pearsonr
from scipy.stats import spearmanr
from scipy.stats import kendalltau
from scipy.stats import chi2_contingency
from sc

Looks like the model over-generates?

Same issue here!
For me it continues generating past the eos_token "<|endoftext|>", which in theory it shouldn't do.
If you remove "skip_special_tokens" you'll probably see the same.

Anyone know how to fix this?

Actually, I just figured out a solution for my case: I had to specify the eos_token_id for model.generate (and pad_token_id to avoid a warning):
output = model.generate(input, eos_token_id=tokenizer.eos_token_id, pad_token_id=tokenizer.pad_token_id)
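Applied to the get_completion helper from the top of the thread, it looks roughly like this (assuming the same model/tokenizer setup; if tokenizer.pad_token_id is None, passing tokenizer.eos_token_id as pad_token_id is a common fallback):

def get_completion(prompt, do_sample=True, temperature=0.2):
    with torch.autocast('cuda', dtype=torch.bfloat16):
        inputs = tokenizer(prompt, return_tensors="pt").to('cuda')
        outputs = model.generate(
            **inputs,
            do_sample=do_sample,
            temperature=temperature,
            max_new_tokens=512,
            eos_token_id=tokenizer.eos_token_id,  # stop once <|endoftext|> is generated
            pad_token_id=tokenizer.pad_token_id,  # silences the pad_token_id warning
        )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)[0][len(prompt):]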

I figured that out too. This should go into the example call, to save others some time ;)
