khanhld3 committed on
Commit
06f1304
·
1 Parent(s): 7e37373

[test] init

Files changed (2)
  1. .gitignore +0 -2
  2. README.md +38 -10
.gitignore CHANGED
@@ -1,2 +0,0 @@
1
- push_hf.py
2
- pytorch_model.pt
 
 
 
README.md CHANGED
@@ -67,26 +67,28 @@ model-index:
67
 
68
  # **ChunkFormer-Large-Vie: Large-Scale Pretrained ChunkFormer for Vietnamese Automatic Speech Recognition**
69
  [![License: CC BY-NC 4.0](https://img.shields.io/badge/License-CC%20BY--NC%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by-nc/4.0/)
70
- [![Hugging Face](https://img.shields.io/badge/HuggingFace-ChunkFormer-orange)](https://huggingface.co/your-username/chunkformer)
71
  [![Paper](https://img.shields.io/badge/Paper-ICASSP%202025-green)](https://your-paper-link)
72
 
73
- <!-- ### Table of contents
74
  1. [Model Description](#description)
75
- 2. [Implementation](#implementation)
76
- 3. [Benchmark Result](#benchmark)
77
- 4. [Example Usage](#example)
78
- 5. [Evaluation](#evaluation)
79
  6. [Citation](#citation)
80
- 7. [Contact](#contact) -->
 
81
 
82
  <a name = "description" ></a>
83
- ChunkFormer-Large-Vie is a large-scale Vietnamese Automatic Speech Recognition (ASR) model based on the innovative ChunkFormer architecture, introduced at ICASSP 2025. The model has been fine-tuned on approximately 2000 hours of Vietnamese speech data sourced from diverse datasets.
 
 
84
  <a name = "implementation" ></a>
85
  ### Documentation and Implementation
86
- We provide the documentation and implementation of ChunkFormer, check it out [HERE]().
87
 
88
  <a name = "benchmark" ></a>
89
- ### Benchmark WER Result
90
  | STT | Model | Vios | Common Voice | VLSP - Task 1 | Avg. |
91
  |-----|--------------|------|--------------|---------------|------|
92
  | 1 | ChunkFormer | x | x | x | x |
@@ -94,6 +96,9 @@ We provide the documentation and implementation of ChunkFormer, check it out [HE
94
  | 3 | X | x | x | x | x |
95
  | 4 | Y | x | x | x | x |
96
 
 
 
 
97
  <a name = "usage" ></a>
98
  ### Usage
99
 
@@ -125,4 +130,27 @@ python decode.py \
125
  --right_context_size 128
126
  ```
127
 
128
 
 
67
 
68
  # **ChunkFormer-Large-Vie: Large-Scale Pretrained ChunkFormer for Vietnamese Automatic Speech Recognition**
69
  [![License: CC BY-NC 4.0](https://img.shields.io/badge/License-CC%20BY--NC%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by-nc/4.0/)
70
+ [![GitHub](https://img.shields.io/badge/GitHub-ChunkFormer-blue)](https://github.com/khanld/chunkformer)
71
  [![Paper](https://img.shields.io/badge/Paper-ICASSP%202025-green)](https://your-paper-link)
72
 
73
+ ### Table of contents
74
  1. [Model Description](#description)
75
+ 2. [Documentation and Implementation](#implementation)
76
+ 3. [Benchmark Results](#benchmark)
77
+ 4. [Usage](#usage)
 
78
  6. [Citation](#citation)
79
+ 7. [Contact](#contact)
80
+ ---
81
 
82
  <a name = "description" ></a>
83
+ ### Model Description
84
+ **ChunkFormer-Large-Vie** is a large-scale Vietnamese Automatic Speech Recognition (ASR) model based on the innovative **ChunkFormer** architecture, introduced at **ICASSP 2025**. The model has been fine-tuned on approximately **2000 hours** of Vietnamese speech data sourced from diverse datasets.
85
+
86
  <a name = "implementation" ></a>
87
  ### Documentation and Implementation
88
+ The [documentation](#) and [implementation](#) of ChunkFormer are publicly available.
89
 
90
  <a name = "benchmark" ></a>
91
+ ### Benchmark Results
92
  | STT | Model | Vios | Common Voice | VLSP - Task 1 | Avg. |
93
  |-----|--------------|------|--------------|---------------|------|
94
  | 1 | ChunkFormer | x | x | x | x |
 
96
  | 3 | X | x | x | x | x |
97
  | 4 | Y | x | x | x | x |
98
 
99
+ ---
100
+
101
+
102
  <a name = "usage" ></a>
103
  ### Usage
104
 
 
130
  --right_context_size 128
131
  ```
132
 
133
+ ---
134
+
135
+ <a name = "citation" ></a>
136
+ ### Citation
137
+ If you use this work in your research, please cite:
138
+
139
+ ```bibtex
140
+ @inproceedings{your_paper,
141
+ title={ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription},
142
+ author={Khanh Le, Tuan Vu Ho, Dung Tran and Duc Thanh Chau},
143
+ booktitle={ICASSP},
144
+ year={2025}
145
+ }
146
+ ```
147
+
148
+ <a name = "contact"></a>
149
+ ### Contact
150
151
+ - [![GitHub](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](https://github.com/)
152
+ - [![LinkedIn](https://img.shields.io/badge/linkedin-%230077B5.svg?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/in/khanhld257/)
153
+
154
+
155
+
156