Model Description

This model is OpenAI's whisper-base model trained on the three datasets below.

train_steps: 20000
warmup_steps: 2000
lr scheduler: linear warmup cosine decay
max learning rate: 1e-4
batch size: 256
max_grad_norm: 1.0
adamw_beta1: 0.9
adamw_beta2: 0.98
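
The schedule named above (linear warmup, then cosine decay) can be sketched in plain Python as follows. This is a minimal sketch, not the training code used for this model: it assumes the learning rate starts at 0, is updated once per step, and decays to a floor of 0 at the final step. The function name `learning_rate` and the `min_lr` parameter are hypothetical; only the three constants come from the list above.

```python
import math

# Values taken from the hyperparameter list above.
TRAIN_STEPS = 20_000
WARMUP_STEPS = 2_000
MAX_LR = 1e-4


def learning_rate(step: int, min_lr: float = 0.0) -> float:
    """Return the learning rate at a given step.

    Assumptions (not stated in the model card): warmup starts from 0,
    and the cosine decay ends at `min_lr` (default 0) at TRAIN_STEPS.
    """
    if step < WARMUP_STEPS:
        # Linear warmup: 0 -> MAX_LR over WARMUP_STEPS steps.
        return MAX_LR * step / WARMUP_STEPS
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - WARMUP_STEPS) / (TRAIN_STEPS - WARMUP_STEPS)
    progress = min(progress, 1.0)
    # Cosine decay: MAX_LR -> min_lr over the remaining steps.
    return min_lr + 0.5 * (MAX_LR - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 1_000, 2_000, 11_000, 20_000):
        print(step, learning_rate(step))
```

Under these assumptions the rate peaks at 1e-4 at step 2000 and is back to half of that at step 11000, the midpoint of the decay phase.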

Evaluation

https://github.com/rtzr/Awesome-Korean-Speech-Recognition

The results below are for the test sets from the repository above, excluding the "Meeting Speech by Major Domain" (주요 영역별 회의 음성) set. In the table, whisper_base_komix is this model.

+--------------------------+----------+-----------+---------------+------------+---------------+--------------+-------------+-------------+
|          Model           | cv_15_ko | fleurs_ko | kcall_testset | kconf_test | kcounsel_test | klec_testset | kspon_clean | kspon_other |
+--------------------------+----------+-----------+---------------+------------+---------------+--------------+-------------+-------------+
|       whisper_base       |  21.16   |   11.89   |     42.56     |   27.62    |     22.24     |    28.65     |    30.41    |    27.02    |
|    whisper_base_kspon    |  26.63   |   13.95   |     42.05     |   29.61    |     26.21     |    28.72     |    12.58    |    13.48    |
|    whisper_base_komix    |  15.42   |    7.16   |     20.86     |   14.24    |     12.64     |    13.44     |    12.26    |    12.12    |
|      whisper_turbo       |   5.38   |    3.95   |      5.89     |    9.77    |      4.21     |     9.27     |    16.49    |    13.54    |
+--------------------------+----------+-----------+---------------+------------+---------------+--------------+-------------+-------------+
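
To compare the rows of the table above at a glance, the scores can be averaged per model and turned into a relative reduction against whisper_base. The sketch below assumes the scores are error rates in percent where lower is better, which this card does not state explicitly. The helper names `mean_score` and `relative_reduction` are hypothetical; the numbers are copied from the table.

```python
# Column order follows the table header.
TEST_SETS = [
    "cv_15_ko", "fleurs_ko", "kcall_testset", "kconf_test",
    "kcounsel_test", "klec_testset", "kspon_clean", "kspon_other",
]

# Scores copied verbatim from the table above.
SCORES = {
    "whisper_base":       [21.16, 11.89, 42.56, 27.62, 22.24, 28.65, 30.41, 27.02],
    "whisper_base_kspon": [26.63, 13.95, 42.05, 29.61, 26.21, 28.72, 12.58, 13.48],
    "whisper_base_komix": [15.42, 7.16, 20.86, 14.24, 12.64, 13.44, 12.26, 12.12],
    "whisper_turbo":      [5.38, 3.95, 5.89, 9.77, 4.21, 9.27, 16.49, 13.54],
}


def mean_score(model: str) -> float:
    """Unweighted mean of a model's scores over the eight test sets."""
    values = SCORES[model]
    return sum(values) / len(values)


def relative_reduction(baseline: str, model: str) -> dict[str, float]:
    """Per-test-set fraction by which `model` lowers the baseline score."""
    return {
        name: (base - new) / base
        for name, base, new in zip(TEST_SETS, SCORES[baseline], SCORES[model])
    }


if __name__ == "__main__":
    for model in SCORES:
        print(f"{model:<20} mean = {mean_score(model):.2f}")
    for name, frac in relative_reduction("whisper_base", "whisper_base_komix").items():
        print(f"{name:<15} {frac:.1%} lower than whisper_base")
```

On these numbers whisper_base_komix scores lower than whisper_base on all eight test sets, and lower than whisper_turbo only on kspon_clean and kspon_other.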
model format: Safetensors
model size: 72.6M params
tensor type: F32