reazonspeech-k2-v2

reazonspeech-k2-v2 is an automatic speech recognition (ASR) model trained on the ReazonSpeech v2.0 corpus.

This model provides end-to-end Japanese speech recognition based on Next-gen Kaldi.

Model Architecture

This is a character-based RNN-T model with 159.34M parameters in total.
It uses an enhanced Transformer architecture called Zipformer.
The training recipe is available on k2-fsa/icefall.

Note that this model can process Japanese audio clips up to ~30 seconds long.
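For recordings longer than that limit, one option is to cut the waveform into pieces before recognition. The sketch below is only an illustration under stated assumptions: `split_into_chunks` is a hypothetical helper (not part of the reazonspeech library), the 30-second default is taken from the approximate figure above, and the audio is assumed to be available as a plain sequence of samples with a known sample rate.

```python
from typing import List, Sequence


def split_into_chunks(
    samples: Sequence[float],
    sample_rate: int,
    max_seconds: float = 30.0,  # assumption: based on the ~30 second note above
) -> List[Sequence[float]]:
    """Split a waveform into consecutive chunks of at most `max_seconds` each.

    Hypothetical helper for illustration; it only slices the sample sequence
    and does not call any ASR library.
    """
    if sample_rate <= 0 or max_seconds <= 0:
        raise ValueError("sample_rate and max_seconds must be positive")

    # Number of samples that fit into one chunk.
    chunk_len = int(sample_rate * max_seconds)
    if chunk_len < 1:
        raise ValueError("max_seconds is too short for this sample rate")

    # Step through the waveform one chunk at a time; the last chunk may be shorter.
    return [samples[i:i + chunk_len] for i in range(0, len(samples), chunk_len)]


if __name__ == "__main__":
    # 70 seconds of silence at 16 kHz, used here only as dummy input.
    waveform = [0.0] * (16000 * 70)
    for index, chunk in enumerate(split_into_chunks(waveform, 16000)):
        print(index, len(chunk) / 16000, "seconds")
```

Fixed-length cuts can land in the middle of a word, so splitting at silences is likely to give better transcripts; this sketch deliberately leaves that refinement out.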

Usage

We recommend using this model through our reazonspeech library.

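from reazonspeech.k2.asr import load_model, transcribe, audio_from_path

# Load the audio file to transcribe
audio = audio_from_path("speech.wav")
# Load the ASR model
model = load_model()
# Run recognition and print the transcribed text
ret = transcribe(model, audio)
print(ret.text)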
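When several files need to be transcribed, the model can be loaded once and reused. The following is a minimal sketch, not an official API: `transcribe_many` and `main` are hypothetical names, the file names are placeholders, and the only library calls used are the three shown above (`load_model`, `transcribe`, `audio_from_path`). The loader and recognizer are passed in as arguments so the helper itself does not depend on the library.

```python
from typing import Any, Callable, Dict, Iterable


def transcribe_many(
    paths: Iterable[str],
    load_audio: Callable[[str], Any],
    recognize: Callable[[Any], Any],
) -> Dict[str, str]:
    """Return a {path: transcribed text} mapping for the given audio files.

    `load_audio` turns a path into an audio object, and `recognize` turns an
    audio object into a result that has a `.text` attribute.
    """
    results: Dict[str, str] = {}
    for path in paths:
        audio = load_audio(path)
        results[path] = recognize(audio).text
    return results


def main() -> None:
    # Imported here so the helper above can be used without the library installed.
    from reazonspeech.k2.asr import load_model, transcribe, audio_from_path

    model = load_model()  # load once, reuse for every file
    texts = transcribe_many(
        ["first.wav", "second.wav"],  # placeholder file names
        audio_from_path,
        lambda audio: transcribe(model, audio),
    )
    for path, text in texts.items():
        print(path, text)
```

Calling `main()` runs the loop against real files; each clip is still subject to the ~30 second limit noted earlier.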

License

Apache License 2.0
