Local tool for recording yourself to train a Piper text to speech voice.
See a video tutorial by Thorsten Müller
docker run -it -p 8000:8000 -v '/path/to/output:/app/output' rhasspy/piper-recording-studio
Visit http://localhost:8000 to select a language and start recording.
Add --help
to see more options.
docker build . -t rhasspy/piper-recording-studio
git clone https://github.com/rhasspy/piper-recording-studio.git
cd piper-recording-studio/
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install -r requirements.txt
python3 -m piper_recording_studio
Visit http://localhost:8000 to select a language and start recording.
Prompts are in the prompts/
directory with the following format:
- Language directories are named
<language name>_<language code>
- Each
.txt
in a language directory contains lines with:<id>\t<text>
ortext
(id is automatically assigned based on line number)
Output audio is written to output/
See --debug
for more options.
Install ffmpeg:
sudo apt-get install ffmpeg
Install exporting dependencies:
python3 -m pip install -r requirements_export.txt
Export recordings for a language to a Piper-compatible dataset (LJSpeech format):
python3 -m export_dataset output/<language>/ /path/to/dataset
Requires a non-Docker install. If you used Docker to record your dataset, you may need to adjust the permissions of the output directory:
sudo chown -R "$(id -u):$(id -u)" output/
See --help
for more options. You may need to adjust the silence detection parameters to correctly remove button clicks and keypresses.
python3 -m piper_recording_studio --multi-user
Now a "login code" will be required to record. A directory output/user_<code>/<language>
must exist for each user and language.