xvapitch_expresso / README.md
metadata
license: cc-by-nc-4.0
language:
  - en
  - de
  - es
  - it
  - nl
  - pt
  - pl
  - ro
  - sv
  - da
  - fi
  - hu
  - el
  - fr
  - ru
  - uk
  - tr
  - ar
  - hi
  - ja
  - ko
  - zh
  - vi
  - la
  - ha
  - sw
  - yo
  - wo
thumbnail: https://raw.githubusercontent.com/DanRuta/xVA-Synth/master/assets/x-icon.png
library: xvasynth
tags:
  - emotion
  - audio
  - text-to-speech
  - tts
pipeline_tag: text-to-speech
datasets:
  - ylacombe/expresso
base_model: Pendrokar/xvapitch

These are xVASynth voice models of the xVAPitch (v3) type, fine-tuned on the Expresso dataset. The enunciated, laughing, whispering and singing styles were left out, and from the confused style only the questions were used.

These models can also emphasize a word when it is wrapped in colons (:) rather than the usual quotation marks ("), which the xVASynth text pre-processor skips:

  • What :exactly: is it?
  • Well :normally: we just let it run.
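
If existing scripts mark emphasis with quotation marks, they can be converted to the colon form before synthesis. The snippet below is a minimal sketch of that conversion; `quotes_to_colon_emphasis` is a hypothetical helper and not part of xVASynth, and it assumes emphasis applies to single words, as in the examples above.

```python
import re

# Hypothetical helper (not an xVASynth API): matches a single word in double quotes.
# Assumption: only single-word emphasis is converted, mirroring the examples above.
_QUOTED_WORD = re.compile(r'"(\w+)"')


def quotes_to_colon_emphasis(text: str) -> str:
    """Rewrite each "word" as :word: and leave all other text unchanged."""
    return _QUOTED_WORD.sub(r":\1:", text)


if __name__ == "__main__":
    print(quotes_to_colon_emphasis('What "exactly" is it?'))
```

Quoted phrases of more than one word are left unchanged by this sketch, since whether multi-word emphasis works is not stated here.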

ex01 male:

ex02 female:

ex03 male:

ex04 female:

Legal note: Although these datasets are licensed as CC BY 4.0, the base v3 model from which these models are fine-tuned was pre-trained on non-permissive data.

v3 base model: https://huggingface.co/Pendrokar/xvapitch