# data_prep
This directory contains the following data preparation scripts:
1. MFA data preparation: code for extracting phone alignments with the Montréal Forced Aligner (MFA).
2. Style prompt data preparation: code for preparing synthetic annotations of style prompts.
## 0. Download LibriTTS_R
Before running any scripts, place the [LibriTTS-R](https://www.openslr.org/141/) dataset in `./LibriTTS_R`. The directory must have the following structure:
```
LibriTTS_R/
├── BOOKS.txt
├── CHAPTERS.txt
├── LICENSE.txt
├── NOTE.txt
├── README_librispeech.txt
├── README_libritts.txt
├── README_libritts_r.txt
├── SPEAKERS.txt
├── dev-clean
├── dev-other
├── reader_book.tsv
├── speakers.tsv
├── test-clean
├── test-other
├── train-clean-100
├── train-clean-360
└── train-other-500
```
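Before starting a long run, it can help to confirm that the dataset is where the scripts expect it. The following is a minimal sketch, not part of this repository: the helper name `missing_subsets` is hypothetical, and it only checks for the subset directories listed in the tree above.

```python
from pathlib import Path

# Subset directories expected directly under LibriTTS_R/ (from the tree above).
EXPECTED_SUBSETS = [
    "dev-clean", "dev-other", "test-clean", "test-other",
    "train-clean-100", "train-clean-360", "train-other-500",
]


def missing_subsets(root):
    """Return the expected subset directories that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_SUBSETS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_subsets("./LibriTTS_R")
    if missing:
        print("Missing subsets:", ", ".join(missing))
    else:
        print("LibriTTS_R layout looks complete.")
```

An empty list means every subset directory is present; the metadata files (`SPEAKERS.txt` and so on) are not checked here.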
## 1. MFA data preparation
### Setup for MFA
```
conda install -c conda-forge montreal-forced-aligner
```
```
mfa model download dictionary english_us_arpa
mfa model download acoustic english_us_arpa
```
### Usage
See `runall_mfa.sh` for usage.
Note that running MFA on every utterance in LibriTTS-R takes a long time (likely a few days).
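
For orientation, the sketch below builds a single `mfa align` invocation using the dictionary and acoustic model downloaded during setup. It is an illustration only and may differ from what `runall_mfa.sh` actually does: the function name, the `corpus/100` and `aligned/100` paths, and the job count are all hypothetical. MFA expects the corpus directory to contain each audio file alongside its transcript.

```python
import shlex


def build_mfa_align_command(corpus_dir, output_dir,
                            dictionary="english_us_arpa",
                            acoustic_model="english_us_arpa",
                            num_jobs=4):
    """Build the argument list for `mfa align`.

    Positional order: corpus directory, dictionary, acoustic model, output directory.
    """
    return [
        "mfa", "align",
        str(corpus_dir), dictionary, acoustic_model, str(output_dir),
        "--num_jobs", str(num_jobs),
    ]


if __name__ == "__main__":
    # Hypothetical per-speaker input and output directories.
    cmd = build_mfa_align_command("corpus/100", "aligned/100")
    print(shlex.join(cmd))
    # To actually run the alignment: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a string keeps paths with spaces intact when it is later passed to `subprocess.run`.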
### Directory structure
After all the data preparation steps have finished, the following directories will have been created:
- `libritts_r_per_spk_cleaned`
  - `${spk}`
    - `textgrid`: TextGrid files
    - `wav24k`: 24 kHz WAV files
```
├── 100
│   ├── textgrid
│   └── wav24k
├── 1001
│   ├── textgrid
│   └── wav24k
├── 1006
│   ├── textgrid
│   └── wav24k
...
```
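Downstream code typically needs each WAV file together with its alignment. The sketch below walks the per-speaker layout shown above and yields matching pairs. It assumes, without confirmation from this repository, that files are named `<utt>.wav` and `<utt>.TextGrid` with the same stem; the function name `iter_aligned_pairs` is hypothetical.

```python
from pathlib import Path


def iter_aligned_pairs(root):
    """Yield (speaker, wav_path, textgrid_path) for utterances that have both files."""
    root = Path(root)
    for spk_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        wav_dir = spk_dir / "wav24k"
        tg_dir = spk_dir / "textgrid"
        if not wav_dir.is_dir():
            continue  # skip speakers without extracted audio
        for wav_path in sorted(wav_dir.glob("*.wav")):
            # Assumed naming: same stem, ".TextGrid" extension.
            tg_path = tg_dir / (wav_path.stem + ".TextGrid")
            if tg_path.exists():
                yield spk_dir.name, wav_path, tg_path


if __name__ == "__main__":
    root = Path("libritts_r_per_spk_cleaned")
    if root.is_dir():
        for spk, wav, tg in iter_aligned_pairs(root):
            print(spk, wav.name, tg.name)
```

Utterances without a matching TextGrid file, for example ones MFA failed to align, are silently skipped.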
## 2. Style prompt data preparation
Code for estimating per-utterance style tags (e.g., low pitch, normal pitch, and high pitch) from the data statistics.
### Usage
See `runall_style_prompt_tags.sh` for usage.
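
To illustrate what deriving tags from data statistics can look like, here is a minimal sketch that splits a set of per-utterance values into three bins at the tertiles. This is an assumption for illustration, not the method in this repository: the actual scripts may use different thresholds or condition on speaker or gender. The function name `make_tagger` and the F0 values are hypothetical.

```python
import statistics


def make_tagger(values, labels=("low", "normal", "high")):
    """Return a function that maps a value to a label using tertile cut points."""
    # Two cut points that split the data into three roughly equal groups.
    lower, upper = statistics.quantiles(values, n=3)

    def tag(value):
        if value < lower:
            return labels[0]
        if value > upper:
            return labels[2]
        return labels[1]

    return tag


if __name__ == "__main__":
    # Hypothetical per-utterance mean F0 values in Hz.
    mean_f0 = [95.0, 110.0, 120.0, 135.0, 150.0, 180.0, 210.0, 230.0, 250.0]
    tag_pitch = make_tagger(mean_f0, labels=("low pitch", "normal pitch", "high pitch"))
    for f0 in (100.0, 150.0, 240.0):
        print(f0, "->", tag_pitch(f0))
```

The same helper could be reused for other scalar statistics, such as speaking rate or energy, by passing different labels.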