How can I use pipeline and output the transcript as SRT with timestamps?

#9
by ziweithunder - opened

Thanks for fine-tuning the Cantonese model. How can I use pipeline and output the transcript as SRT with timestamps? I can run the audio file and get a paragraph of Cantonese text, but I couldn't find a way to output it in SRT format using whisper.

I usually use WhisperX to get word/sentence-level timestamps. I am not sure exactly how to export them to SRT files, but I have now uploaded the CTS version of this model so you don't need to convert it.
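For the SRT export part, here is a rough sketch rather than a tested recipe for this model. It assumes you already have timestamped chunks in the shape the transformers ASR pipeline returns with `return_timestamps=True`, i.e. `result["chunks"]` as a list of `{"timestamp": (start, end), "text": ...}`. The helper names `format_srt_time` and `chunks_to_srt` and the file names in the comments are made up for illustration.

```python
def format_srt_time(seconds):
    """Convert seconds (float) to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunks_to_srt(chunks):
    """Build SRT text from [{"timestamp": (start, end), "text": str}, ...]."""
    blocks = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        if end is None:
            # Assumption: a missing end time (can happen on the last chunk)
            # is replaced by the start time so the cue is still valid.
            end = start
        text = chunk["text"].strip()
        blocks.append(
            f"{index}\n{format_srt_time(start)} --> {format_srt_time(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)


# Hypothetical usage with the transformers pipeline (not run here):
# from transformers import pipeline
# asr = pipeline("automatic-speech-recognition",
#                model="alvanlii/whisper-small-cantonese")
# result = asr("audio.wav", return_timestamps=True)
# with open("audio.srt", "w", encoding="utf-8") as f:
#     f.write(chunks_to_srt(result["chunks"]))
```

If you go the WhisperX route instead, its segments carry `start`, `end` and `text` keys, so you would need to map those to the `timestamp` tuple first.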

Thanks for your reply. I am a beginner with whisper and your model. What does "CTS version" stand for?
Do you mean I can use whisperx.load_model("alvanlii/whisper-small-cantonese") to load this model directly instead of using the pipeline method?
Thanks.

Hmm, it might be easier for you to use this one.
