Technical question about the model
What steps would be necessary if I wanted to translate this model into my language?
By the way, you're doing a great job :)
Basically, rebuild WizardLM just like I did.
I have a guide on https://erichartford.com/uncensored-models
Step one is to download the dataset. You can download either my uncensored dataset or the original dataset:
https://huggingface.co/datasets/ehartford/WizardLM_alpaca_evol_instruct_70k_unfiltered
https://huggingface.co/datasets/victor123/evol_instruct_70k
Next, you need to translate the dataset into your language.
Finally, you need to fine-tune LLaMA on your translated dataset.
The official procedure is here:
https://github.com/nlpxucan/WizardLM
I followed it exactly.
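A rough sketch of the translation step, in case it helps. It assumes the dataset is a single JSON file containing a list of records with `instruction` and `output` text fields (check the file you downloaded, the field names here are an assumption). `translate_records`, `translate_file`, and the `translate` callable are hypothetical names for illustration; `translate` stands in for whatever translation tool or service you choose.

```python
import json
from typing import Callable, Dict, List, Sequence

Record = Dict[str, str]


def translate_records(
    records: List[Record],
    translate: Callable[[str], str],
    fields: Sequence[str] = ("instruction", "output"),  # assumed field names
) -> List[Record]:
    """Return copies of the records with the given text fields translated."""
    translated = []
    for record in records:
        new_record = dict(record)  # copy so the input list is left untouched
        for field in fields:
            if field in new_record:
                new_record[field] = translate(new_record[field])
        translated.append(new_record)
    return translated


def translate_file(src_path: str, dst_path: str, translate: Callable[[str], str]) -> None:
    """Read a JSON list of records, translate it, and write the result."""
    with open(src_path, encoding="utf-8") as f:
        records = json.load(f)
    with open(dst_path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-Latin characters readable in the output
        json.dump(translate_records(records, translate), f, ensure_ascii=False, indent=2)
```

You would call `translate_file("original.json", "translated.json", my_translate)` with your own paths and function, then point the fine-tuning step at the translated file. Spot-check the translations before training, since machine translation tends to mangle code snippets and formatting inside the answers.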
Thank you very much for your reply :)
https://huggingface.co/datasets/ehartford/WizardLM_alpaca_evol_instruct_70k_unfiltered
Is it recommended to run wizardlm_clean.py here?
You don't have to run the script if you are using my filtered dataset.
That's the script I used to produce my dataset from the original dataset.
But you can change it if you like and generate your own custom dataset that way.
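As an illustration of what a custom filter could look like, here is a minimal sketch. It is not the contents of wizardlm_clean.py. It assumes each record is a dict with an `output` field, and the marker phrases are made-up examples that you would replace with your own list (in your target language, if you filter after translating).

```python
from typing import Dict, List, Sequence

# Illustrative phrases only; not the list used by the real script.
REFUSAL_MARKERS = ("as an ai language model", "i'm sorry, but")


def is_refusal(text: str, markers: Sequence[str] = REFUSAL_MARKERS) -> bool:
    """True if the text contains any marker phrase (case-insensitive)."""
    lowered = text.lower()
    return any(marker in lowered for marker in markers)


def filter_records(
    records: List[Dict[str, str]],
    markers: Sequence[str] = REFUSAL_MARKERS,
) -> List[Dict[str, str]]:
    """Keep only the records whose 'output' field does not look like a refusal."""
    return [r for r in records if not is_refusal(r.get("output", ""), markers)]
```

Plain substring matching is crude and will drop some legitimate answers that happen to contain a marker phrase, so it is worth reviewing a sample of what gets removed.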