Voices
For each voice, the given grades are intended to be estimates of the quality and quantity of its associated training data, both of which impact overall inference quality.
Subjectively, voices will sound better or worse to different people.
Target Quality
- How high quality is the reference voice? This grade may be impacted by audio quality, artifacts, compression, & sample rate.
- How well do the text labels match the audio? Text/audio misalignment (e.g. from hallucinations) will lower this grade.
Training Duration
- How much audio was seen during training? Smaller durations result in a lower overall grade.
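Because the overall grades are plain letter grades with optional `+`/`-` modifiers, they can be compared programmatically when choosing a voice. The sketch below is a hypothetical helper, not part of Kokoro: the numeric scale (A=4 down to F=0, with ±0.3 for modifiers) is an assumption chosen only so that grades sort in the expected order, and it says nothing about how the grades themselves were assigned.

```python
# Hypothetical helper for ordering the letter grades used in the tables.
# The numeric scale is an assumption made for sorting only; it is not
# how the grades in this document were derived.
_LETTER_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}
_MODIFIER_POINTS = {"+": 0.3, "-": -0.3, "": 0.0}


def grade_to_score(grade: str) -> float:
    """Map a grade such as 'C+' or 'A-' to a sortable number."""
    grade = grade.strip()
    letter, modifier = grade[:1].upper(), grade[1:]
    if letter not in _LETTER_POINTS or modifier not in _MODIFIER_POINTS:
        raise ValueError(f"unrecognized grade: {grade!r}")
    return _LETTER_POINTS[letter] + _MODIFIER_POINTS[modifier]


def rank_voices(grades: dict[str, str]) -> list[str]:
    """Return voice names ordered from best to worst overall grade."""
    return sorted(grades, key=lambda name: grade_to_score(grades[name]), reverse=True)


# A few overall grades copied from the tables in this document.
sample = {"af_bella": "A-", "af_sky": "C-", "am_adam": "F+", "af_nicole": "B-"}
ranked = rank_voices(sample)
```

Ranking by overall grade is only a starting point, since the same voice can sound better or worse to different listeners.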
American 🇺🇸
American G2P: misaki[en] with en-us espeak-ng fallback
Name | Traits | Target Quality | Training Duration | Overall Grade |
---|---|---|---|---|
af_alloy | 🚺 | B | MM minutes | C |
af_aoede | 🚺 | B | H hours | C+ |
af_bella | 🚺🔥 | A | HH hours | A- |
af_jessica | 🚺 | C | MM minutes | D |
af_kore | 🚺 | B | H hours | C+ |
af_nicole | 🚺🎧 | B | HH hours | B- |
af_nova | 🚺 | B | MM minutes | C |
af_river | 🚺 | C | MM minutes | D |
af_sarah | 🚺 | B | H hours | C+ |
af_sky | 🚺 | B | M minutes | C- |
am_adam | 🚹 | D | H hours | F+ |
am_echo | 🚹 | C | MM minutes | D |
am_eric | 🚹 | C | MM minutes | D |
am_fenrir | 🚹 | B | H hours | C+ |
am_liam | 🚹 | C | MM minutes | D |
am_michael | 🚹 | B | H hours | C+ |
am_onyx | 🚹 | C | MM minutes | D |
am_puck | 🚹 | B | H hours | C+ |
British 🇬🇧
British G2P: misaki[en] with en-gb espeak-ng fallback
Name | Traits | Target Quality | Training Duration | Overall Grade |
---|---|---|---|---|
bf_alice | 🚺 | C | MM minutes | D |
bf_emma | 🚺 | B | HH hours | B- |
bf_isabella | 🚺 | B | MM minutes | C |
bf_lily | 🚺 | C | MM minutes | D |
bm_daniel | 🚹 | C | MM minutes | D |
bm_fable | 🚹 | B | MM minutes | C |
bm_george | 🚹 | B | MM minutes | C |
bm_lewis | 🚹 | C | H hours | D+ |
French 🇫🇷
French G2P: espeak-ng fr-fr
Name | Traits | Target Quality | Training Duration | Overall Grade |
---|---|---|---|---|
ff_siwis | 🚺 | B | <11 hours | B- |
This table lists all French training data seen by Kokoro.
Hindi 🇮🇳
Hindi G2P: espeak-ng hi
Name | Traits | Target Quality | Training Duration | Overall Grade |
---|---|---|---|---|
hf_alpha | 🚺 | B | MM minutes | C |
hf_beta | 🚺 | B | MM minutes | C |
hm_omega | 🚹 | B | MM minutes | C |
hm_psi | 🚹 | B | MM minutes | C |
This table lists all Hindi training data seen by Kokoro, which totals about 6 hours.
Japanese 🇯🇵
Japanese G2P: misaki[ja]
Name | Traits | Target Quality | Training Duration | Overall Grade |
---|---|---|---|---|
jf_alpha | 🚺 | B | H hours | C+ |
Mandarin Chinese 🇨🇳
Mandarin Chinese G2P: misaki[zh]
Name | Traits | Target Quality | Training Duration | Overall Grade |
---|---|---|---|---|
zf_xiaobei | 🚺 | C | MM minutes | D |
zf_xiaoni | 🚺 | C | MM minutes | D |
zf_xiaoxiao | 🚺 | C | MM minutes | D |
zf_xiaoyi | 🚺 | C | MM minutes | D |
zm_yunjian | 🚹 | C | MM minutes | D |
zm_yunxi | 🚹 | C | MM minutes | D |
zm_yunxia | 🚹 | C | MM minutes | D |
zm_yunyang | 🚹 | C | MM minutes | D |
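
The voice names in the tables above appear to follow a two-letter prefix pattern: the first letter matches the language section (`a`, `b`, `f`, `h`, `j`, `z`) and the second matches the female or male marker in the Traits column (`f`, `m`). The following is a minimal sketch that decodes a name under that assumption. The convention is inferred from the tables rather than stated in this document, and `parse_voice_name`, `VoiceInfo`, and the two lookup dictionaries are hypothetical names introduced for illustration.

```python
from typing import NamedTuple

# Prefix meanings inferred from the tables in this document (an assumption,
# not a documented specification).
LANGUAGES = {
    "a": "American English",
    "b": "British English",
    "f": "French",
    "h": "Hindi",
    "j": "Japanese",
    "z": "Mandarin Chinese",
}
GENDERS = {"f": "female", "m": "male"}


class VoiceInfo(NamedTuple):
    name: str
    language: str
    gender: str


def parse_voice_name(name: str) -> VoiceInfo:
    """Split a name like 'af_bella' into its language and gender codes."""
    prefix, sep, rest = name.partition("_")
    if len(prefix) != 2 or not sep or not rest:
        raise ValueError(f"unexpected voice name format: {name!r}")
    lang_code, gender_code = prefix
    if lang_code not in LANGUAGES or gender_code not in GENDERS:
        raise ValueError(f"unknown prefix in voice name: {name!r}")
    return VoiceInfo(name, LANGUAGES[lang_code], GENDERS[gender_code])


bella = parse_voice_name("af_bella")
yunxi = parse_voice_name("zm_yunxi")
```

A lookup like this can help filter a voice list by language or gender, but it only covers the prefixes that occur in the tables here.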