Blog and Demo space/fine-tuning

#4
by srinivasbilla - opened

Hey! I had a lot of fun experimenting with this model. I wrote a blog post to showcase some of its capabilities and made a demo Space that's free to use, so I just wanted to share. I also have a question: I tried fine-tuning it with LoRA on 10k rows of data, but it didn't really work. The loss was stuck at around 3 and the fine-tune just ruined the model. Do you have any tips?

https://huggingface.co/blog/srinivasbilla/llasa-tts

https://huggingface.co/spaces/srinivasbilla/llasa-3b-tts

HKUST Audio org

Thank you so much for writing this blog post, it's amazing! Regarding your fine-tuning issue: I tried full-parameter fine-tuning with a learning rate of 5e-6 on LJSpeech for 5 epochs and it worked fine, although the loss stayed in the 5-6 range. Since your loss is stuck at 3, I suspect the problem is either a code error or overfitting.
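For a rough sense of scale when comparing these loss values, here is a minimal sketch. It assumes the reported numbers are mean per-token cross-entropy in nats, which the thread does not state explicitly, and the helper name `perplexity` is purely illustrative.

```python
import math


def perplexity(mean_ce_loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy loss (in nats) to perplexity."""
    return math.exp(mean_ce_loss_nats)


# Loss values mentioned in this thread; treating them as nats is an assumption.
runs = [
    ("LoRA run, 10k rows (stuck)", 3.0),
    ("full fine-tune on LJSpeech, low end", 5.0),
    ("full fine-tune on LJSpeech, high end", 6.0),
]

for label, loss in runs:
    print(f"{label}: loss={loss:.1f} -> perplexity ~ {perplexity(loss):.1f}")
```

A training loss well below that of a run known to work is consistent with the overfitting guess, but it does not prove it. Comparing against the loss on held-out clips would be a more direct check.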

Ah, I did LoRA fine-tuning for 10 epochs. What I'm trying to do is introduce a particular speaker, or make the model better for that speaker. I will try full fine-tuning and let you know how it goes.

srinivasbilla changed discussion status to closed
srinivasbilla changed discussion status to open
srinivasbilla changed discussion status to closed
