IndexError caused by suppress_tokens = [] on transformers=4.43.3

#6
by AlienKevin - opened

Running the sample code in the model card results in an error on transformers=4.43.3:

Traceback (most recent call last):
  File "/workspace/cantonese_asr_eval/run.py", line 42, in <module>
    transcriptions = model.generate([
  File "/workspace/cantonese_asr_eval/asr_models/whisper_model.py", line 15, in generate
    results = self.pipe(input)
  File "/opt/conda/lib/python3.10/site-packages/transformers/pipelines/automatic_speech_recognition.py", line 284, in __call__
    return super().__call__(inputs, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/pipelines/base.py", line 1235, in __call__
    outputs = list(final_iterator)
  File "/opt/conda/lib/python3.10/site-packages/transformers/pipelines/pt_utils.py", line 124, in __next__
    item = next(self.iterator)
  File "/opt/conda/lib/python3.10/site-packages/transformers/pipelines/pt_utils.py", line 269, in __next__
    processed = self.infer(next(self.iterator), **self.params)
  File "/opt/conda/lib/python3.10/site-packages/transformers/pipelines/base.py", line 1161, in forward
    model_outputs = self._forward(model_inputs, **forward_params)
  File "/opt/conda/lib/python3.10/site-packages/transformers/pipelines/automatic_speech_recognition.py", line 504, in _forward
    tokens = self.model.generate(
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/whisper/generation_whisper.py", line 624, in generate
    decoder_input_ids, kwargs = self._prepare_decoder_input_ids(
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/whisper/generation_whisper.py", line 1604, in _prepare_decoder_input_ids
    prev_start_of_text = suppress_tokens[-2] if suppress_tokens is not None else None
IndexError: index -2 is out of bounds for dimension 0 with size 0

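It seems that transformers expects suppress_tokens to be None rather than an empty list. The failing line only checks for None before indexing suppress_tokens[-2], so an empty value raises the IndexError. Setting it to None on the generation config avoids the error: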

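import torch
from transformers import pipeline

MODEL_NAME = "alvanlii/whisper-small-cantonese"
lang = "zh"
# Assumption: device and file were not defined in the snippet as posted; adjust both for your setup.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
file = "audio.wav"  # placeholder path to the audio file to transcribe
pipe = pipeline(
    task="automatic-speech-recognition",
    model=MODEL_NAME,
    chunk_length_s=30,
    device=device,
)
pipe.model.config.forced_decoder_ids = pipe.tokenizer.get_decoder_prompt_ids(language=lang, task="transcribe")
### fix ###
# Use None instead of the empty list so the suppress_tokens[-2] lookup is skipped
pipe.model.generation_config.suppress_tokens = None
### fix ###
text = pipe(file)["text"]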
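
For anyone hitting the same thing, here is a minimal sketch of why the empty list fails while None does not. It does not import transformers: `prev_start_of_text` only mirrors the expression on the last traceback line, and `normalize_suppress_tokens` is a hypothetical helper name, not part of the library.

```python
from typing import List, Optional, Sequence


def prev_start_of_text(suppress_tokens: Optional[Sequence[int]]) -> Optional[int]:
    # Mirrors the expression from the traceback: only None is guarded,
    # so an empty sequence still reaches the [-2] lookup.
    return suppress_tokens[-2] if suppress_tokens is not None else None


def normalize_suppress_tokens(
    suppress_tokens: Optional[Sequence[int]],
) -> Optional[List[int]]:
    # Hypothetical helper: treat "nothing to suppress" uniformly as None.
    if suppress_tokens is None or len(suppress_tokens) == 0:
        return None
    return list(suppress_tokens)


if __name__ == "__main__":
    try:
        prev_start_of_text([])
    except IndexError as exc:
        # A plain list raises IndexError here too, with a different message
        # than the tensor-based one in the traceback.
        print("empty list fails:", exc)

    # After normalizing, the None guard applies and no lookup happens.
    print(prev_start_of_text(normalize_suppress_tokens([])))
```

Under the same assumption, something like `pipe.model.generation_config.suppress_tokens = normalize_suppress_tokens(pipe.model.generation_config.suppress_tokens)` could be used instead of hard-coding None, so a non-empty list would be left intact.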

thanks, changed it

alvanlii changed discussion status to closed
