Spaces: openai/whisper

Transcribed a random video I was watching using my phone mic. It didn't even try in the second half.

#50

by IDontKnowWhatToNameMyself - opened Nov 16, 2022

Discussion

IDontKnowWhatToNameMyself

Nov 16, 2022

Input audio:

Transcription:

I am 100% turning me to a shark. I'm very excited about this.

sanchit-gandhi

Nov 24, 2022

edited Nov 24, 2022

This is expected. In the recording, there is a long stretch of silence between the first sentence and the second. When the model reaches this gap, it predicts the "end of sequence" token, which stops the transcription.

The model is trained to predict the "end of sequence" token when it hears such gaps in speech.
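
One possible workaround, sketched here as an illustration rather than as anything this Space actually does, is to split the recording at long silences before transcription, so that each spoken region is passed to the model on its own. The function name `find_speech_segments`, the amplitude threshold, and the frame length below are all assumptions chosen for the example, and the detector is a simple energy check, not a proper voice activity detector.

```python
from typing import List, Tuple


def find_speech_segments(
    samples: List[float],
    sample_rate: int,
    threshold: float = 0.02,       # assumed amplitude cutoff for "silence"
    min_silence_sec: float = 1.0,  # gaps at least this long split the audio
    frame_sec: float = 0.02,       # analysis frame length (assumption)
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of regions separated by long silences."""
    frame_len = max(1, round(sample_rate * frame_sec))
    min_silent_frames = max(1, round(min_silence_sec / frame_sec))

    # Mark a frame as voiced when its mean absolute amplitude exceeds the threshold.
    voiced = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(abs(x) for x in frame) / len(frame)
        voiced.append(energy > threshold)

    segments: List[Tuple[int, int]] = []
    seg_start = None    # index of the first voiced frame in the open segment
    last_voiced = None  # index of the most recent voiced frame
    silent_run = 0      # consecutive silent frames seen inside the open segment

    def close_segment() -> None:
        end = min((last_voiced + 1) * frame_len, len(samples))
        segments.append((seg_start * frame_len, end))

    for i, is_voiced in enumerate(voiced):
        if is_voiced:
            if seg_start is None:
                seg_start = i
            last_voiced = i
            silent_run = 0
        elif seg_start is not None:
            silent_run += 1
            # A long enough gap ends the current segment.
            if silent_run >= min_silent_frames:
                close_segment()
                seg_start = None
                silent_run = 0

    # Close a segment that runs to the end of the audio.
    if seg_start is not None:
        close_segment()
    return segments
```

Each returned `(start, end)` pair could then be sliced out of the audio and transcribed separately, with the results joined afterwards. Short pauses below `min_silence_sec` stay inside a single segment, so ordinary sentence breaks are not split.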

IDontKnowWhatToNameMyself

Nov 24, 2022

Oh I see. That makes a lot more sense now

IDontKnowWhatToNameMyself changed discussion status to closed Nov 24, 2022
