Computing timestamps is not supported for canary_model?

#5
by Nguyen667201 - opened
This comment has been hidden (marked as Spam)
NVIDIA org

Hi @Nguyen667201

Thanks for giving our models a try!

Please make sure that you use the latest NeMo main branch to run the above inference code. Timestamp support for Canary-Flash models was added in this PR.

Let us know if this doesn’t resolve the issue.

You may also refer to this discussion on the same issue resolved for another user.

Hi @ankitapasad

Thanks for the reply!

I have tried the latest version of NeMo, but as you can see, the timestamp issue still does not seem to be fixed. Every time I pass the 'timestamps' parameter to transcribe, I get the error: 'Computing timestamps is not supported for this model yet.' Please release a new version to resolve this issue.

image.png

NVIDIA org

Hi @Nguyen667201

Please use the NeMo main branch; it supports the timestamp feature for Canary-Flash models.

The error you are seeing occurs because v2.2.1, the version shown in the screenshot above, does not include the timestamp feature. We will include it in the next stable release, but until then, please use the main branch, which will resolve the issue.

Let us know if you are still facing issues.
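
For anyone checking which build they are actually running, here is a minimal sketch using only the Python standard library. It assumes NeMo was installed under the pip distribution name `nemo_toolkit`; the helper names are made up for illustration. It only tells you whether the installed version is newer than v2.2.1 (the release mentioned above as lacking timestamps), not whether a given build contains the feature.

```python
import re
from importlib import metadata

# v2.2.1 is the release named in this thread as lacking timestamp support.
KNOWN_UNSUPPORTED = (2, 2, 1)


def release_tuple(version: str) -> tuple:
    """Return the leading numeric part of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_newer_than_known_unsupported(version: str) -> bool:
    """True if `version` is strictly newer than v2.2.1."""
    return release_tuple(version) > KNOWN_UNSUPPORTED


def installed_nemo_version():
    """Return the installed version string, or None if the package is absent.

    Assumption: the distribution is named `nemo_toolkit`.
    """
    try:
        return metadata.version("nemo_toolkit")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_nemo_version()
    if version is None:
        print("nemo_toolkit is not installed in this environment")
    elif is_newer_than_known_unsupported(version):
        print(f"{version} is newer than 2.2.1")
    else:
        print(f"{version} is 2.2.1 or older; timestamps are not expected to work")
```

If the script reports 2.2.1 or older, the environment is still picking up the stable release rather than a main-branch install.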

Thanks @ankitapasad, I got it.
Could you explain the effect of "source_lang" and "target_lang" for me?
I think only "target_lang" is needed for inference: whatever "source_lang" is set to, it does not seem to affect the transcript as long as "target_lang" is "en".

When I trained from scratch, I got this error:
[rank0]: assert lang is not None, "Expected 'lang' to be set for AggregateTokenizer."
[rank0]: AssertionError: Expected 'lang' to be set for AggregateTokenizer.
How can I solve it?

image.png
Following this tutorial, one example entry in my manifest looks like:
{"audio_filepath": "datasets/LibriLight/librispeech_finetuning/1h/2/clean/5778/12761/5778-12761-0000.flac", "duration": 14.56, "text": "continuation of fremont's account of the passage through the mountains we had hard and doubtful labor yet before us as the snow appeared to be heavier where the timber began further down with few open spots", "target_lang": "en", "source_lang": "en", "pnc": "False"}
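
To rule out entries that are missing a language field, a quick sanity check over the manifest lines might look like the sketch below. The expected key set is an assumption copied from the example entry above, not an official schema, and the function names are hypothetical. It is not confirmed that a missing field is what triggers the AggregateTokenizer assertion.

```python
import json

# Assumption: these keys are taken from the example entry in this post,
# not from an official manifest specification.
EXPECTED_KEYS = ("audio_filepath", "duration", "text",
                 "source_lang", "target_lang", "pnc")


def missing_keys(manifest_line: str) -> list:
    """Return the expected keys that are absent or empty in one JSON line."""
    entry = json.loads(manifest_line)
    return [key for key in EXPECTED_KEYS if entry.get(key) in (None, "")]


def check_manifest(lines) -> dict:
    """Map 1-based line numbers to their missing keys; blank lines are skipped."""
    problems = {}
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        missing = missing_keys(line)
        if missing:
            problems[number] = missing
    return problems


if __name__ == "__main__":
    sample = [
        '{"audio_filepath": "a.flac", "duration": 1.0, "text": "hi", '
        '"target_lang": "en", "source_lang": "en", "pnc": "False"}',
        '{"audio_filepath": "b.flac", "duration": 2.0, "text": "yo"}',
    ]
    for number, missing in check_manifest(sample).items():
        print(f"line {number}: missing {missing}")
```

In practice the lines would come from iterating over the opened manifest file instead of the inline `sample` list.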
