language: vi
datasets:
  - legacy-datasets/common_voice
  - vlsp2020_vinai_100h
  - AILAB-VNUHCM/vivos
  - doof-ferb/vlsp2020_vinai_100h
  - doof-ferb/fpt_fosd
  - doof-ferb/infore1_25hours
  - linhtran92/viet_bud500
  - doof-ferb/LSVSC
  - doof-ferb/vais1000
  - doof-ferb/VietMed_labeled
  - NhutP/VSV-1100
  - doof-ferb/Speech-MASSIVE_vie
  - doof-ferb/BibleMMS_vie
  - capleaf/viVoice
metrics:
  - wer
pipeline_tag: automatic-speech-recognition
tags:
  - transcription
  - audio
  - speech
  - chunkformer
  - asr
  - automatic-speech-recognition
  - long-form
license: cc-by-nc-4.0
model-index:
  - name: ChunkFormer Large Vietnamese
    results:
      - task:
          name: Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: common-voice-vietnamese
          type: common_voice
          args: vi
        metrics:
          - name: Test WER
            type: wer
            value: x
      - task:
          name: Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: VIVOS
          type: vivos
          args: vi
        metrics:
          - name: Test WER
            type: wer
            value: x
      - task:
          name: Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: VLSP - Task 1
          type: vlsp
          args: vi
        metrics:
          - name: Test WER
            type: wer
            value: x

ChunkFormer: Masked Chunking Conformer for Long-Form Speech Transcription

khanhld/chunkformer-large-vie

Introduction

Installation