Update README.md
README.md
@@ -17,23 +17,23 @@ Trained over 73+ hours of "train" split of ylacombe/cml-tts dataset
 with 8xRTX4090, still in progress, using gradio finetuning app using following settings:
 ```
 exp_name"F5TTS_Base"
-
-
-batch_size_type"frame"
-
-
-
-
-
-
-
-
-file_checkpoint_train""
-tokenizer_type"char"
-tokenizer_file""
-mixed_precision"fp16"
-logger"wandb"
-
+learning_rate=0.00001
+batch_size_per_gpu=10000
+batch_size_type="frame"
+max_samples=64
+grad_accumulation_steps=1
+max_grad_norm=1
+epochs=300
+num_warmup_updates=2000
+save_per_updates=600
+last_per_steps=300
+finetune=true
+file_checkpoint_train=""
+tokenizer_type="char"
+tokenizer_file=""
+mixed_precision="fp16"
+logger="wandb"
+bnb_optimizer=false
 ```
 
 # Pre processing
@@ -46,9 +46,16 @@ I'm only talking about Italian data on cml-tts, I don't know if other languages
 
 
 # Current most trained model
-
+model_159600.safetensors (~290 Epoch)
+
+## known problems
+- catastrophic forgetting (trained on Italian only, the model lost its English ability). A proper multilingual dataset should be used instead of a single-language one.
+- pronunciation is not perfect
+- numbers must be converted to words to be pronounced in Italian
+- a better dataset with more diverse voices would help improve zero-shot cloning
 
 
 ### checkpoints folder
 Contains the weight of the checkpoints at specific steps, the higher the number, the further it went into training.
-Weights in this folder can be used as starting point to continue training.
+Weights in this folder can be used as a starting point to continue training.
+Ping me back if you can further finetune it to reach a better result.
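
The known problem about numbers can be worked around with a text-normalization pass before synthesis. The following is only a minimal sketch and not part of this repo: `italian_number` and `spell_out_numbers` are hypothetical helper names, only whole numbers from 0 to 999999 are covered, and decimals, ordinals, dates and currencies are not handled correctly.

```python
import re

UNITS = ["zero", "uno", "due", "tre", "quattro", "cinque", "sei", "sette",
         "otto", "nove", "dieci", "undici", "dodici", "tredici",
         "quattordici", "quindici", "sedici", "diciassette", "diciotto",
         "diciannove"]
TENS = {2: "venti", 3: "trenta", 4: "quaranta", 5: "cinquanta",
        6: "sessanta", 7: "settanta", 8: "ottanta", 9: "novanta"}


def _below_1000(n):
    """Spell 1..999 (the final-accent rule is applied by the caller)."""
    if n < 20:
        return UNITS[n]
    if n < 100:
        tens, unit = divmod(n, 10)
        word = TENS[tens]
        if unit == 0:
            return word
        if unit in (1, 8):
            word = word[:-1]  # vowel elision: venti + uno -> ventuno
        return word + UNITS[unit]
    hundreds, rest = divmod(n, 100)
    prefix = "cento" if hundreds == 1 else UNITS[hundreds] + "cento"
    if rest // 10 == 8:
        prefix = prefix[:-1]  # cento + ottanta -> centottanta
    return prefix + (_below_1000(rest) if rest else "")


def italian_number(n):
    """Spell an integer in 0..999999 as one Italian word."""
    if not 0 <= n <= 999_999:
        raise ValueError("this sketch only supports 0..999999")
    if n == 0:
        return "zero"
    thousands, rest = divmod(n, 1000)
    word = ""
    if thousands == 1:
        word = "mille"
    elif thousands > 1:
        word = _below_1000(thousands) + "mila"
    if rest:
        word += _below_1000(rest)
    # Compounds ending in "tre" take an accent: ventitré, centotré.
    if n > 3 and n % 10 == 3 and n % 100 != 13:
        word = word[:-1] + "é"
    return word


def spell_out_numbers(text):
    """Replace each run of digits with its Italian spelling."""
    def repl(match):
        value = int(match.group())
        # Leave very large numbers untouched instead of failing.
        return italian_number(value) if value <= 999_999 else match.group()
    return re.sub(r"\d+", repl, text)
```

For example, `spell_out_numbers("Ho 21 gatti e 3 cani")` gives `"Ho ventuno gatti e tre cani"`, which can then be passed to the model as the generation text.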
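
As a rough sanity check on the figures in this README, the checkpoint name and the quoted epoch count can be combined with `save_per_updates`. This is back-of-the-envelope arithmetic only, assuming the number in `model_159600.safetensors` is the update count and that "~290" is close enough to divide by; the variable names are illustrative.

```python
total_updates = 159_600   # taken from the checkpoint file name (assumption)
approx_epochs = 290       # "~290 Epoch" as quoted in the README
save_per_updates = 600    # from the training settings

# Roughly how many optimizer updates make up one epoch.
updates_per_epoch = total_updates / approx_epochs

# How many regular checkpoints had been written by that point.
checkpoints_saved = total_updates // save_per_updates

print(f"about {updates_per_epoch:.0f} updates per epoch")
print(f"{checkpoints_saved} checkpoints saved at a {save_per_updates}-update interval")
```

Under those assumptions one epoch is about 550 updates, and the checkpoint number is an exact multiple of the save interval.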