---
license: apache-2.0
language:
- uk
pipeline_tag: automatic-speech-recognition
datasets:
- mozilla-foundation/common_voice_10_0
- Yehor/openstt-uk
metrics:
- wer
model-index:
  - name: w2v-bert-uk-v2.1
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: common_voice_10_0
          type: common_voice_10_0
          config: uk
          split: test
          args: uk
        metrics:
          - name: WER
            type: wer
            value: 9.0777
          - name: CER
            type: cer
            value: 1.9839
---

# Flashlight for Ukrainian

## Community

- Discord: https://bit.ly/discord-uds
- Speech Recognition: https://t.me/speech_recognition_uk
- Speech Synthesis: https://t.me/speech_synthesis_uk

See other Ukrainian models: https://github.com/egorsmkv/speech-recognition-uk

## Overview

This repository contains an acoustic model for Ukrainian trained with the Flashlight framework: https://github.com/flashlight/flashlight/tree/main/flashlight/app/asr

- Architecture: Conformer (300M parameters)
- Training data: Common Voice 10 and Voice of America
- Training epochs: 410
- Training time: about a week on an RTX A4000

## Quality

- WER: 9.0777% (i.e., a word accuracy of about 90.92%)
- CER: 1.9839%
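
For readers unfamiliar with these metrics, here is a minimal sketch of how WER and CER are commonly computed: the edit distance between the reference and the hypothesis, divided by the reference length, at word level and character level respectively. This is an illustration only, not the script used to produce the numbers above; the function names and the sample sentences are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent; the reference must not be empty."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent; the reference must not be empty."""
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # Hypothetical reference/hypothesis pair with one misrecognized word.
    reference = "добрий день усім"
    hypothesis = "добрий ден усім"
    print(f"WER: {wer(reference, hypothesis):.2f}%")
    print(f"CER: {cer(reference, hypothesis):.2f}%")
```

Corpus-level scores are usually obtained by summing edit distances and reference lengths over all utterances before dividing, rather than averaging per-utterance rates.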

## How to test?

### Run a container with Flashlight on CPU

```bash
docker-compose up

# and in another terminal
docker exec -it flashlight_cpu bash
```

### Run

With the acoustic model (AM) only:

```
/root/flashlight/build/bin/asr/fl_asr_test --am /models/uk_am.bin --datadir '' --emission_dir '' --uselexicon false \
  --test /data/rows.lst --tokens /models/tokens.txt --lexicon /models/lexicon.txt --show
```

With a language model (LM):

```
/root/flashlight/build/bin/asr/fl_asr_decode \
  --am=/models/uk_am.bin \
  --test=/data/labels_absolute.lst \
  --maxload=3477 \
  --nthread_decoder=2 \
  --show \
  --showletters \
  --lexicon=/models/lexicon.txt \
  --uselexicon=false \
  --lm=/models/lm_4gram_500k.binary \
  --lmtype=kenlm \
  --decodertype=wrd \
  --beamsize=200 \
  --beamsizetoken=200 \
  --beamthreshold=20 \
  --lmweight=0.75 \
  --wordscore=0 \
  --eosscore=0 \
  --silscore=0 \
  --unkscore=0 \
  --smearing=max
```

- **labels_absolute.lst** is from https://github.com/egorsmkv/cv10-uk-testset-clean
- **lm_4gram_500k.binary** is from https://huggingface.co/Yehor/kenlm-ukrainian/tree/main/news/lm-4gram-500k
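
To test on your own recordings, you need a list file like the ones passed to `--test` above. The sketch below assumes the list format described in the Flashlight ASR data preparation docs: one sample per line with a sample ID, an audio path, the duration in milliseconds, and the transcript, separated by spaces. Check that format against your Flashlight version before relying on it. The helper names, the `utt00000` ID scheme, and the file paths are hypothetical, and only WAV input is handled.

```python
import wave
from pathlib import Path


def wav_duration_ms(path):
    """Duration of a WAV file in milliseconds, read from its header."""
    with wave.open(str(path), "rb") as wav:
        return 1000.0 * wav.getnframes() / wav.getframerate()


def make_list_line(sample_id, audio_path, transcript):
    """Format one list-file line: id, path, duration (ms), transcript.

    Assumption: fields are space-separated, so the audio path itself
    must not contain spaces.
    """
    duration = wav_duration_ms(audio_path)
    return f"{sample_id} {audio_path} {duration:.2f} {transcript}"


def write_list_file(samples, out_path):
    """Write (audio_path, transcript) pairs to a list file."""
    lines = [
        make_list_line(f"utt{i:05d}", path, text)
        for i, (path, text) in enumerate(samples)
    ]
    Path(out_path).write_text("\n".join(lines) + "\n", encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical input: absolute WAV paths paired with lowercase transcripts.
    samples = [("/data/audio/sample_1.wav", "привіт світ")]
    write_list_file(samples, "/data/rows.lst")
```

Use absolute audio paths when `--datadir` is empty, as in the commands above.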

## How to fine-tune on your own data?

```
/root/flashlight/build/bin/asr/fl_asr_train continue /models/ --flagsfile /models/train.flags
```

The `/models/` directory must contain the model `.bin` files.

## Cite this work

```
@misc {smoliakov_2025,
	author       = { {Smoliakov} },
	title        = { flashlight-uk (Revision 1ac154b) },
	year         = 2025,
	url          = { https://huggingface.co/Yehor/flashlight-uk },
	doi          = { 10.57967/hf/4577 },
	publisher    = { Hugging Face }
}
```