Finetuning guide? Supported audio formats for finetuning?

#4 · opened by neurlang

Please tell me about finetuning this system. What is the VRAM requirement? In what format (preferably CSV or TSV) do we provide the audio paths and transcripts?
How do we set the language for the transcripts?

Hi @neurlang

The Canary training script takes a dataset manifest as input in JSONL format. Our tutorial has details on how to create the manifest file and how to finetune the canary-flash models.
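
For illustration, here is a minimal sketch of converting a TSV of audio-path/transcript pairs into a JSONL manifest. The field names follow the Canary tutorial's manifest format, but treat the exact set of keys as an assumption and defer to the tutorial; note that the transcript language is set per entry via the source_lang and target_lang fields, which also answers the language question above.

```python
# Minimal sketch (assumed field names, check the Canary tutorial): convert a TSV
# of "audio_path<TAB>transcript" rows into a NeMo-style JSONL manifest.
import json

import soundfile as sf


def tsv_to_manifest(tsv_path: str, manifest_path: str, lang: str = "en") -> None:
    with open(tsv_path) as tsv, open(manifest_path, "w") as out:
        for row in tsv:
            audio_path, transcript = row.rstrip("\n").split("\t", maxsplit=1)
            duration = sf.info(audio_path).duration  # training script needs durations
            entry = {
                "audio_filepath": audio_path,
                "duration": duration,
                "text": transcript,
                "source_lang": lang,  # language spoken in the audio
                "target_lang": lang,  # same as source_lang for ASR; differs for translation
                "taskname": "asr",
                "pnc": "yes",         # keep punctuation and capitalization
            }
            out.write(json.dumps(entry, ensure_ascii=False) + "\n")


tsv_to_manifest("train.tsv", "train_manifest.jsonl", lang="en")
```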

The Canary-180M-Flash was trained on 32 A100 80GB GPUs. Based on the size of your GPU, you can scale the batch size. The effective batch size can be controlled using trainer.accumulate_grad_batches and the number of GPUs. Be sure to tune the learning rate accordingly.
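
As a quick sanity check on that scaling, the arithmetic is just a product. The linear learning-rate rule below is a common heuristic, not an official Canary recipe, and the reference values are placeholders:

```python
# Effective batch size = per-GPU batch size x number of GPUs x gradient accumulation.
micro_batch_size = 16          # samples per GPU per step
num_gpus = 2                   # trainer.devices
accumulate_grad_batches = 4    # trainer.accumulate_grad_batches

effective_batch_size = micro_batch_size * num_gpus * accumulate_grad_batches
print(effective_batch_size)    # 128

# A common heuristic: scale the learning rate linearly with the effective batch size.
lr_ref, batch_ref = 3e-4, 1024  # placeholder reference recipe, not Canary's actual values
lr = lr_ref * effective_batch_size / batch_ref
print(f"{lr:.2e}")              # 3.75e-05
```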

Hope that helps, please feel free to reach out if you have more questions!

Is it possible to finetune just the vocabulary and language, comparable to training KenLM/n-gram language models for the older CTC models? It was quite neat to train on text only instead of audio plus text.

There is the class BeamSearchSequenceGeneratorWithLanguageModel, for example.
Could this be utilized to quickly fine-tune the transcriptions to an expert domain?

NVIDIA org

@halbefn you can try decoding with an n-gram LM.
It's available in the main branch.
For details on building and using the LM, please see the description of PR https://github.com/NVIDIA/NeMo/pull/12730
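
As a rough sketch of what LM-fused decoding could look like from Python (the config keys ngram_lm_model/ngram_lm_alpha and the beam settings below are assumptions based on the PR description, so check the merged code for the actual interface):

```python
# Hypothetical sketch: enable beam search with n-gram LM fusion on a Canary model.
# The decoding-config keys here are assumptions; see the PR for the real interface.
from copy import deepcopy

from omegaconf import open_dict

from nemo.collections.asr.models import ASRModel

model = ASRModel.from_pretrained("nvidia/canary-180m-flash")

decoding_cfg = deepcopy(model.cfg.decoding)
with open_dict(decoding_cfg):
    decoding_cfg.strategy = "beam"
    decoding_cfg.beam.beam_size = 4
    decoding_cfg.beam.ngram_lm_model = "domain_4gram.nemo"  # placeholder path to your LM
    decoding_cfg.beam.ngram_lm_alpha = 0.3                  # LM weight; tune on held-out data

model.change_decoding_strategy(decoding_cfg)
print(model.transcribe(["sample.wav"])[0])
```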

@artbataev Thank you, it works quite well.
For anyone reading this: changing e.g. multitask_decoding.strategy="beam" to decoding.strategy="beam" lets you use the KenLM models on longer audio files with https://github.com/NVIDIA/NeMo/blob/main/examples/asr/asr_chunked_inference/aed/speech_to_text_aed_chunked_infer.py

Edit: however, if you add "timestamps=True" to speech_to_text_aed_chunked_infer.py, you get nonsense transcripts.
