nithinraok committed
Commit ee0a614
1 Parent(s): e7d95d9
Update README.md
README.md
CHANGED
@@ -176,7 +176,7 @@ img {
 | [![Language](https://img.shields.io/badge/Language-en-lightgrey#model-badge)](#datasets)
 
 
-`parakeet-
+`parakeet-tdt_ctc-1.1b` is an ASR model that transcribes speech with Punctuations and Capitalizations of English alphabet. This model is jointly developed by [NVIDIA NeMo](https://github.com/NVIDIA/NeMo) and [Suno.ai](https://www.suno.ai/) teams.
 It is an XXL version of Hybrid FastConformer [1] TDT-CTC [2] (around 1.1B parameters) model. This model has been trained with Local Attention and Global token hence this model can transcribe **11 hrs** of audio in one single pass. And for reference this model can transcibe 90mins of audio in <16 sec on A100.
 See the [model architecture](#model-architecture) section and [NeMo documentation](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#fast-conformer) for complete architecture details.
 
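
Reviewer note: the throughput claim in the unchanged context line (90 minutes of audio in under 16 seconds on an A100) can be restated as a real-time factor. The sketch below is plain arithmetic on the two numbers quoted in the README; the function name `realtime_factor` is hypothetical and not part of NeMo. Because the README gives 16 seconds as an upper bound on wall-clock time, the result is a lower bound on the speed-up.

```python
def realtime_factor(audio_seconds: float, wall_seconds: float) -> float:
    """Return seconds of audio processed per second of wall-clock time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return audio_seconds / wall_seconds


# Figures quoted in the README text: 90 minutes of audio, under 16 seconds.
audio_seconds = 90 * 60          # 5400 seconds of audio
wall_upper_bound = 16.0          # README states "<16 sec", so this is a ceiling

# Dividing by the ceiling on time gives a floor on the speed-up.
rtf_lower_bound = realtime_factor(audio_seconds, wall_upper_bound)
print(f"At least {rtf_lower_bound:.1f}x faster than real time")
```

This covers only the 90-minute figure; the README gives no timing for the 11-hour single-pass case, so nothing is extrapolated to it here.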