Kokoro-82M / VOICES.md

Voices

For each voice, the given grades are intended to be estimates of the quality and quantity of its associated training data, both of which impact overall inference quality.

Subjectively, voices will sound better or worse to different people.

Target Quality

  • How high quality is the reference voice? This grade may be impacted by audio quality, artifacts, compression, & sample rate.
  • How well do the text labels match the audio? Text/audio misalignment (e.g. from hallucinations) will lower this grade.

Training Duration

  • How much audio was seen during training? Smaller durations result in a lower overall grade.
  • 10 hours <= HH hours < 100 hours
  • 1 hour <= H hours < 10 hours
  • 10 minutes <= MM minutes < 100 minutes
  • 1 minute <= M minutes < 10 minutes
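The four duration labels above can be read as a simple bucketing rule. The sketch below is illustrative only: the function name `duration_bucket` is hypothetical, and because the listed ranges overlap between 60 and 100 minutes, it assumes the hour-based labels take precedence there.

```python
def duration_bucket(seconds: float) -> str:
    """Map a training duration in seconds to one of the table's labels.

    Assumption: hour-based ranges are checked first, so a duration such as
    90 minutes is labeled "H hours" rather than "MM minutes".
    """
    minutes = seconds / 60
    hours = minutes / 60
    if 10 <= hours < 100:
        return "HH hours"
    if 1 <= hours < 10:
        return "H hours"
    if 10 <= minutes < 100:
        return "MM minutes"
    if 1 <= minutes < 10:
        return "M minutes"
    # Durations under 1 minute or of 100 hours and more have no label here.
    raise ValueError(f"no duration label covers {seconds} seconds")


if __name__ == "__main__":
    for secs in (5 * 60, 30 * 60, 3 * 3600, 25 * 3600):
        print(secs, "->", duration_bucket(secs))
```

Durations outside the listed ranges raise an error instead of being forced into a neighboring bucket, since no label is defined for them.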

American English 🇺🇸

  • misaki[en] lang_code='a' with en-us espeak-ng fallback
| Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 |
|---|---|---|---|---|---|
| af_alloy | 🚺 | B | MM minutes | C | 6d877149 |
| af_aoede | 🚺 | B | H hours | C+ | c03bd1a4 |
| af_bella | 🚺🔥 | A | HH hours | A- | 8cb64e02 |
| af_jessica | 🚺 | C | MM minutes | D | cdfdccb8 |
| af_kore | 🚺 | B | H hours | C+ | 8bfbc512 |
| af_nicole | 🚺🎧 | B | HH hours | B- | c5561808 |
| af_nova | 🚺 | B | MM minutes | C | e0233676 |
| af_river | 🚺 | C | MM minutes | D | e149459b |
| af_sarah | 🚺 | B | H hours | C+ | 49bd364e |
| af_sky | 🚺 | B | M minutes | C- | c799548a |
| am_adam | 🚹 | D | H hours | F+ | ced7e284 |
| am_echo | 🚹 | C | MM minutes | D | 8bcfdc85 |
| am_eric | 🚹 | C | MM minutes | D | ada66f0e |
| am_fenrir | 🚹 | B | H hours | C+ | 98e507ec |
| am_liam | 🚹 | C | MM minutes | D | c8255075 |
| am_michael | 🚹 | B | H hours | C+ | 9a443b79 |
| am_onyx | 🚹 | C | MM minutes | D | e8452be1 |
| am_puck | 🚹 | B | H hours | C+ | dd1d8973 |
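The SHA256 column holds eight hexadecimal characters per voice. Assuming these are the leading characters of the SHA-256 digest of each voice file (the document does not say so explicitly), a downloaded file could be checked along the following lines. The helper names and the example path are hypothetical.

```python
import hashlib


def sha256_prefix(path: str, length: int = 8) -> str:
    """Return the first `length` hex characters of a file's SHA-256 digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()[:length]


def matches_listed_hash(path: str, listed: str) -> bool:
    """Compare a file against a (possibly truncated) hash from the table.

    Assumption: the table value is a prefix of the full hex digest.
    """
    return sha256_prefix(path, len(listed)) == listed.lower()


# Example with a hypothetical local path:
# matches_listed_hash("voices/af_bella.pt", "8cb64e02")
```

An eight-character prefix is enough to catch accidental corruption or a mixed-up file, but it is far too short to serve as a security guarantee.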

British English 🇬🇧

  • misaki[en] lang_code='b' with en-gb espeak-ng fallback
| Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 |
|---|---|---|---|---|---|
| bf_alice | 🚺 | C | MM minutes | D | d292651b |
| bf_emma | 🚺 | B | HH hours | B- | d0a423de |
| bf_isabella | 🚺 | B | MM minutes | C | cdd4c370 |
| bf_lily | 🚺 | C | MM minutes | D | 6e09c2e4 |
| bm_daniel | 🚹 | C | MM minutes | D | fc3fce4e |
| bm_fable | 🚹 | B | MM minutes | C | d44935f3 |
| bm_george | 🚹 | B | MM minutes | C | f1bc8122 |
| bm_lewis | 🚹 | C | H hours | D+ | b5204750 |

French 🇫🇷

  • espeak-ng fr-fr
  • Total French training data: <11 hours
| Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 | CC BY |
|---|---|---|---|---|---|---|
| ff_siwis | 🚺 | B | <11 hours | B- | 8073bf2d | SIWIS |

Hindi 🇮🇳

  • espeak-ng hi
  • Total Hindi training data: H hours
| Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 |
|---|---|---|---|---|---|
| hf_alpha | 🚺 | B | MM minutes | C | 06906fe0 |
| hf_beta | 🚺 | B | MM minutes | C | 63c0a1a6 |
| hm_omega | 🚹 | B | MM minutes | C | b55f02a8 |
| hm_psi | 🚹 | B | MM minutes | C | 2f0f055c |

Italian 🇮🇹

  • espeak-ng it
  • Total Italian training data: H hours
| Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 |
|---|---|---|---|---|---|
| if_sara | 🚺 | B | MM minutes | C | 6c0b253b |
| im_nicola | 🚹 | B | MM minutes | C | 234ed066 |

Japanese 🇯🇵

  • misaki[ja]
  • Total Japanese training data: H hours
| Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 | CC BY |
|---|---|---|---|---|---|---|
| jf_alpha | 🚺 | B | H hours | C+ | 1bf4c9dc | |
| jf_gongitsune | 🚺 | B | MM minutes | C | 1b171917 | gongitsune |
| jf_nezumi | 🚺 | B | M minutes | C- | d83f007a | nezuminoyomeiri |
| jf_tebukuro | 🚺 | B | MM minutes | C | 0d691790 | tebukurowokaini |
| jm_kumo | 🚹 | B | M minutes | C- | 98340afd | kumonoito |

Mandarin Chinese 🇨🇳

  • misaki[zh]
  • Total Mandarin Chinese training data: H hours
| Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 |
|---|---|---|---|---|---|
| zf_xiaobei | 🚺 | C | MM minutes | D | 9b76be63 |
| zf_xiaoni | 🚺 | C | MM minutes | D | 95b49f16 |
| zf_xiaoxiao | 🚺 | C | MM minutes | D | cfaf6f2d |
| zf_xiaoyi | 🚺 | C | MM minutes | D | b5235dba |
| zm_yunjian | 🚹 | C | MM minutes | D | 76cbf8ba |
| zm_yunxi | 🚹 | C | MM minutes | D | dbe6e1ce |
| zm_yunxia | 🚹 | C | MM minutes | D | bb2b03b0 |
| zm_yunyang | 🚹 | C | MM minutes | D | 5238ac22 |
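Across all the tables, voice names follow a visible pattern: the first letter tracks the language section and the second letter tracks the gender trait. The sketch below decodes a name on that basis. The pattern is inferred from the listings rather than stated as a rule, reading the two trait symbols as female and male is likewise an assumption, and `parse_voice` and `VoiceInfo` are hypothetical names.

```python
from typing import NamedTuple

# First prefix letter -> language section, as inferred from the tables.
LANGUAGES = {
    "a": "American English",
    "b": "British English",
    "f": "French",
    "h": "Hindi",
    "i": "Italian",
    "j": "Japanese",
    "z": "Mandarin Chinese",
}

# Second prefix letter -> gender trait (assumed reading of the trait symbols).
GENDERS = {"f": "female", "m": "male"}


class VoiceInfo(NamedTuple):
    name: str
    language: str
    gender: str


def parse_voice(name: str) -> VoiceInfo:
    """Split a voice name such as 'af_bella' into language and gender."""
    prefix, separator, speaker = name.partition("_")
    if len(prefix) != 2 or not separator or not speaker:
        raise ValueError(f"unexpected voice name format: {name!r}")
    language_letter, gender_letter = prefix
    if language_letter not in LANGUAGES or gender_letter not in GENDERS:
        raise ValueError(f"unknown prefix in voice name: {name!r}")
    return VoiceInfo(name, LANGUAGES[language_letter], GENDERS[gender_letter])


if __name__ == "__main__":
    print(parse_voice("af_bella"))
    print(parse_voice("zm_yunxi"))
```

Names that do not fit the pattern raise an error, so a voice added under a different naming scheme is flagged rather than silently mislabeled.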