---
license: mit
language:
- mr
pipeline_tag: automatic-speech-recognition
library_name: nemo
---

## IndicConformer

IndicConformer is a hybrid CTC-RNNT Conformer model built for Marathi.

## AI4Bharat NeMo

To load, train, fine-tune, or experiment with the model, you need to install [AI4Bharat NeMo](https://github.com/AI4Bharat/NeMo). We recommend installing it with the command below:

```
git clone https://github.com/AI4Bharat/NeMo.git && cd NeMo && git checkout nemo-v2 && bash reinstall.sh
```

## Usage

```bash
$ python inference.py --help
usage: inference.py [-h] -c CHECKPOINT -f AUDIO_FILEPATH -d (cpu,cuda) -l LANGUAGE_CODE

options:
  -h, --help            show this help message and exit
  -c CHECKPOINT, --checkpoint CHECKPOINT
                        Path to .nemo file
  -f AUDIO_FILEPATH, --audio_filepath AUDIO_FILEPATH
                        Audio filepath
  -d (cpu,cuda), --device (cpu,cuda)
                        Device (cpu/gpu)
  -l LANGUAGE_CODE, --language_code LANGUAGE_CODE
                        Language Code (eg. hi)
```

## Example command

```
python inference.py -c indicconformer_stt_mr_hybrid_rnnt_large.nemo -f hindi-16khz.wav -d cuda -l hi
```

Expected output:

```
Loading model..
...
Transcibing..
----------
Transcript:
Took ** seconds.
----------
```

### Input

This model accepts 16 kHz (16,000 Hz) mono-channel audio (WAV files) as input.

### Output

This model returns the transcribed speech as a string for a given audio sample.

## Model Architecture

This model uses a Conformer-Large model, consisting of 120M parameters, as the encoder, with a hybrid CTC-RNNT decoder. The model has 17 conformer blocks with a model dimension of 512.
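
## Checking the input format

Since the model expects 16 kHz mono WAV input, it can help to verify a file before passing it to `inference.py`. The snippet below is a minimal sketch that uses only Python's standard `wave` module. The function name `check_wav` and the two constants are hypothetical helpers introduced for illustration; they are not part of AI4Bharat NeMo or `inference.py`.

```python
import wave

# Assumed from the Input section: 16 kHz sample rate, single channel.
EXPECTED_RATE_HZ = 16000
EXPECTED_CHANNELS = 1


def check_wav(path):
    """Return a list of format problems; an empty list means the file matches."""
    problems = []
    # wave.open reads only the header here; no audio frames are loaded.
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    if rate != EXPECTED_RATE_HZ:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE_HZ} Hz")
    if channels != EXPECTED_CHANNELS:
        problems.append(f"found {channels} channels, expected {EXPECTED_CHANNELS} (mono)")
    return problems
```

Calling `check_wav("hindi-16khz.wav")` returns an empty list when the file is already 16 kHz mono. Otherwise, each entry describes one mismatch, and the file would need to be resampled or downmixed with an audio tool of your choice before inference.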