---
license: cc-by-nc-4.0
pipeline_tag: text-to-speech
library_name: f5-tts
datasets:
- amphion/Emilia-Dataset
language:
- de
tags:
- tts
- audio
- german
- mlx
---
Copied from https://huggingface.co/marduk-ra/F5-TTS-German, with an added duration model trained on the Emilia dataset using https://github.com/eamag/f5-tts-duration.
Run inference with https://github.com/lucasnewman/f5-tts-mlx:
```bash
python -m f5_tts_mlx.generate --model "eamag/f5-tts-mlx-german" \
--text "The quick brown fox jumped over the lazy dog." \
--ref-audio /path/to/audio.wav \
--ref-text "This is the caption for the reference audio."
```
GitHub: https://github.com/SWivid/F5-TTS
Paper: [F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching](https://huggingface.co/papers/2410.06885)
> **_NOTE:_** Setting the number of function evaluation (NFE) steps to 64 can produce better-quality audio.
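
As a rough sketch of how the NFE step count from the note might be passed on the command line: the snippet below assembles the same `f5_tts_mlx.generate` call as above and appends a `--steps` option. The `--steps` flag name is an assumption about the f5-tts-mlx CLI (check `python -m f5_tts_mlx.generate --help` to confirm it), and the German text and reference transcript are placeholder values. The command is only printed, not executed.

```bash
# Placeholder inputs; replace with your own text and reference audio.
MODEL="eamag/f5-tts-mlx-german"
TEXT="Guten Morgen, wie geht es dir heute?"
REF_AUDIO="/path/to/audio.wav"
REF_TEXT="Dies ist das Transkript der Referenzaufnahme."
STEPS=64  # NFE steps; assumes the CLI exposes this as --steps

# Build the command as an array so arguments with spaces stay intact.
cmd=(python -m f5_tts_mlx.generate
     --model "$MODEL"
     --text "$TEXT"
     --ref-audio "$REF_AUDIO"
     --ref-text "$REF_TEXT"
     --steps "$STEPS")

# Dry run: print the shell-quoted command for inspection.
printf '%q ' "${cmd[@]}"
printf '\n'

# To actually run it, uncomment the next line:
# "${cmd[@]}"
```

More steps generally mean more model evaluations per utterance, so expect generation to take longer than with the default setting.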