May I ask how could I use pipeline and output the transcript as SRT with timestamp?

by ziweithunder - opened Nov 12, 2024

Nov 12, 2024

Thanks for fine-tuning the Cantonese model. May I ask how I could use the pipeline and output the transcript as SRT with timestamps? I can run the audio file and get a paragraph of Cantonese text, but I couldn't find a way to output something like SRT format using whisper.

alvanlii

Owner Nov 12, 2024

I usually use WhisperX to get word/sentence-level timestamps. I am not sure exactly how to export it to SRT files, but I have now uploaded the CTS version of this model so you don't need to convert it.
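
For the SRT export step itself, here is a minimal sketch in plain Python. It assumes the transcription result has already been reduced to a list of dicts with `start` and `end` times in seconds plus a `text` field, which is the shape WhisperX segments usually take; the helper names `format_srt_time` and `segments_to_srt` are made up for illustration and are not part of WhisperX or transformers.

```python
def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from dicts with 'start', 'end' (seconds) and 'text'.

    The dict layout is an assumption about the transcription output,
    not something confirmed in this thread.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_srt_time(seg["start"])
        end = format_srt_time(seg["end"])
        # Each SRT cue: running number, time range, text, trailing newline.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 1.5, "text": " 你好 "},
        {"start": 1.5, "end": 3.0, "text": "世界"},
    ]
    print(segments_to_srt(demo))
```

The returned string can be written to a `.srt` file opened with `encoding="utf-8"`. If the timestamps come from the transformers pipeline with `return_timestamps=True` instead, each chunk carries a `timestamp` tuple rather than separate `start`/`end` keys, so the chunks would need to be mapped into this shape first.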

ziweithunder

Nov 13, 2024

edited Nov 13, 2024

Thanks for your reply. I am a beginner with whisper and your model. May I ask what "CTS version" means and stands for?
Do you mean I can use whisperx.load_model("alvanlii/whisper-small-cantonese") to load this model directly instead of using the pipeline method?
Thanks.

alvanlii

Owner Nov 13, 2024

edited Nov 13, 2024

hmmm it might be easier for you to use this one
