WelkinFang committed
Commit 8f4b4eb · 1 Parent(s): 408ce0f

add guidance of vocoder and contentvec

README.md CHANGED
To make these singers sing the songs you want to listen to, just run the following commands:

### Step1: Download the acoustic model checkpoint

```bash
git lfs install
git clone https://huggingface.co/amphion/singing_voice_conversion
```
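If `git-lfs` is missing, cloning an LFS-backed repo silently yields small pointer files instead of the actual weights. A quick post-clone sanity check (a sketch; it only confirms the model folder that Step5 links into `ckpts/svc` is present):

```shell
# Post-clone sanity check (a sketch): confirm the cloned repo contains the
# model folder that Step5 links into ckpts/svc.
check_svc_clone() {
  dir="$1"  # path to the cloned singing_voice_conversion repo
  if [ -d "$dir/vocalist_l1_contentvec+whisper" ]; then
    echo "ok"
  else
    echo "model folder missing: is git-lfs installed?" >&2
    return 1
  fi
}
```

Run it as `check_svc_clone singing_voice_conversion` from the directory you cloned into.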

### Step2: Download the vocoder checkpoint

```bash
git clone https://huggingface.co/amphion/BigVGAN_singing_bigdata
```

### Step3: Clone the Amphion source code from GitHub

```bash
git clone https://github.com/open-mmlab/Amphion.git
```

### Step4: Download the ContentVec checkpoint

Download the **ContentVec** checkpoint from [this repo](https://github.com/auspicious3000/contentvec). This pretrained model uses `checkpoint_best_legacy_500.pt`, the legacy ContentVec with 500 classes.

### Step5: Specify the checkpoints' paths

Use soft links to point Amphion at the downloaded checkpoints:

```bash
cd Amphion
mkdir -p ckpts/svc
ln -s "$(realpath ../singing_voice_conversion/vocalist_l1_contentvec+whisper)" ckpts/svc/vocalist_l1_contentvec+whisper
ln -s "$(realpath ../BigVGAN_singing_bigdata/bigvgan_singing)" pretrained/bigvgan_singing
```
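The `$(realpath …)` wrapper matters: `ln -s` records the target string literally, and a relative target is resolved against the link's own directory rather than the shell's current directory. A throwaway demonstration (all paths here are scratch directories, not Amphion's):

```shell
# Demonstrate why the commands above wrap symlink targets in "$(realpath ...)".
workdir=$(mktemp -d)
mkdir -p "$workdir/checkpoints/model" "$workdir/project/ckpts"
cd "$workdir"

# Dangling: "checkpoints/model" is resolved relative to project/ckpts/,
# i.e. as project/ckpts/checkpoints/model, which does not exist.
ln -s checkpoints/model project/ckpts/bad_link

# Correct: an absolute path resolves no matter where the link lives.
ln -s "$(realpath checkpoints/model)" project/ckpts/good_link

[ -d project/ckpts/good_link ] && echo "good_link resolves"
[ ! -e project/ckpts/bad_link ] && echo "bad_link dangles"
```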

Also, move the `checkpoint_best_legacy_500.pt` file you downloaded in **Step4** into `Amphion/pretrained/contentvec`.
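That move can be scripted defensively so the destination directory is created first (a sketch; the source argument is wherever you saved the file, which this guide does not specify):

```shell
# Sketch of the ContentVec checkpoint placement from Step4.
place_contentvec() {
  src="$1"       # your downloaded checkpoint_best_legacy_500.pt
  dest_dir="$2"  # e.g. Amphion/pretrained/contentvec
  mkdir -p "$dest_dir"
  mv "$src" "$dest_dir/"
}
```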

### Step6: Conversion

You can follow [this recipe](https://github.com/open-mmlab/Amphion/tree/main/egs/svc/MultipleContentsSVC#4-inferenceconversion) to conduct the conversion. For example, if you want to make Taylor Swift sing the songs in `[Your Audios Folder]`, just run:

```bash
sh egs/svc/MultipleContentsSVC/run.sh --stage 3 --gpu "0" \
    --infer_expt_dir "ckpts/svc/vocalist_l1_contentvec+whisper" \
    --infer_output_dir "ckpts/svc/vocalist_l1_contentvec+whisper/result" \
    --infer_source_audio_dir [Your Audios Folder] \
    --infer_vocoder_dir "pretrained/bigvgan_singing" \
    --infer_target_speaker "vocalist_l1_TaylorSwift" \
    --infer_key_shift "autoshift"
```
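Since stage 3 depends on every checkpoint from Steps 1–5 being in place, a pre-flight check before launching can save a failed run. A hedged sketch that only tests the paths named above, given the path to your Amphion checkout:

```shell
# Pre-flight check (a sketch): verify the three checkpoint locations the
# conversion command depends on, relative to the Amphion root.
preflight() {
  root="$1"  # path to the Amphion checkout
  ok=0
  [ -e "$root/ckpts/svc/vocalist_l1_contentvec+whisper" ] \
    || { echo "missing acoustic model link (Step5)" >&2; ok=1; }
  [ -e "$root/pretrained/bigvgan_singing" ] \
    || { echo "missing vocoder link (Step5)" >&2; ok=1; }
  [ -f "$root/pretrained/contentvec/checkpoint_best_legacy_500.pt" ] \
    || { echo "missing ContentVec checkpoint (Step4)" >&2; ok=1; }
  return "$ok"
}
```

Call `preflight Amphion` before the `run.sh` command; a non-zero exit names whatever is missing.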