---
license: mit
language:
- en
base_model:
- coqui/XTTS-v2
---

# Fine-Tuned XTTS Model

This project fine-tunes the XTTS v2 text-to-speech (TTS) model on an MP3 file extracted from a YouTube video. Training was run in a Hugging Face Space running locally via Docker. A GPU is recommended for faster training.

### Training Data
- **Source Video**: [YouTube Video](https://www.youtube.com/watch?v=u6J20_Aem3Y)
- **Training Audio**: The MP3 file used for training is included in the `files` directory.

### Docker Image
The model was fine-tuned with this Docker image:
[FineTune Xtts Docker image](https://hub.docker.com/r/athomasson2/fine_tune_xtts)

### Notes
- Ensure you have a GPU available for optimal performance during training.
- The Docker image pulls the latest version each time it's run.
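
As a rough illustration of how the Docker image might be launched, the sketch below assembles a `docker run` command in Python. The helper name `build_docker_command`, the port `7860` (a common default for Hugging Face Spaces), and the use of `nvidia-smi` as a GPU check are all assumptions, not details confirmed by the image's documentation. The `--pull always` flag asks Docker itself to check for a newer image on every run.

```python
import shutil


def build_docker_command(image="athomasson2/fine_tune_xtts", port=7860, use_gpu=None):
    """Return the argument list for a `docker run` call (hypothetical helper)."""
    if use_gpu is None:
        # Heuristic only: treat nvidia-smi on PATH as a sign that a GPU is usable.
        use_gpu = shutil.which("nvidia-smi") is not None

    # --rm removes the container on exit; --pull always refreshes the image first.
    cmd = ["docker", "run", "--rm", "--pull", "always"]
    if use_gpu:
        # Expose all host GPUs to the container (needs the NVIDIA Container Toolkit).
        cmd += ["--gpus", "all"]
    # Assumption: the web UI listens on this port inside the container.
    cmd += ["-p", f"{port}:{port}", image]
    return cmd


if __name__ == "__main__":
    # Print the command so it can be inspected before running it by hand.
    print(" ".join(build_docker_command()))
```

Printing the command instead of executing it keeps the sketch safe to run anywhere; adjust the port mapping if the image's own instructions specify a different one.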

This model is based on XTTS v2, which cannot be used commercially under the [XTTS license, which is currently in limbo](https://github.com/coqui-ai/TTS/issues/3490).