## Overview
This document describes how to use the Text-to-Speech API, including how to make requests with the GET and POST methods. The API converts text into speech in the voice of a specified character and supports multiple languages and emotional expressions.
## Character and Emotion List
To obtain the supported characters and their corresponding emotions, request the following URL:
- URL: `http://127.0.0.1:5000/character_list`
- Returns: A JSON object that maps each character to its list of emotions
- Method: `GET`
```
{
    "Hanabi": [
        "default",
        "Normal",
        "Yandere"
    ],
    "Hutao": [
        "default"
    ]
}
```
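As a minimal sketch of how a client might consume this endpoint, the Python snippet below fetches the list and checks whether a character supports a given emotion. The helper names (`parse_character_list`, `fetch_character_list`, `supports_emotion`) are hypothetical and not part of the API; the sketch assumes the response has the shape shown above.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:5000"  # address used throughout this document


def parse_character_list(raw):
    """Parse the JSON text returned by /character_list into a dict.

    Assumes the response maps character names to lists of emotion names.
    """
    data = json.loads(raw)
    return {name: list(emotions) for name, emotions in data.items()}


def fetch_character_list(base_url=BASE_URL):
    """Request /character_list and return the parsed mapping."""
    with urlopen(f"{base_url}/character_list") as resp:
        return parse_character_list(resp.read().decode("utf-8"))


def supports_emotion(characters, character, emotion):
    """Return True if `character` lists `emotion` (exact, case-sensitive match)."""
    return emotion in characters.get(character, [])
```

With the server running, `supports_emotion(fetch_character_list(), "Hanabi", "Yandere")` would check an emotion before it is sent in a TTS request.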
## Regarding Aliases
An alias system was added in version 2.2.4. The allowed aliases are listed in `Inference/params_config.json`.
## Text-to-Speech
- URL: `http://127.0.0.1:5000/tts`
- Returns: Audio on success; an error message on failure.
- Method: `GET`/`POST`
### GET Method
#### Format
```
http://127.0.0.1:5000/tts?character={{characterName}}&text={{text}}
```
- Parameter explanation:
  - `character`: The name of the character folder. It must match exactly, including letter case, full-width versus half-width characters, and language (Chinese/English).
  - `text`: The text to convert. URL encoding is recommended.
  - Optional parameters: `text_language`, `format`, `top_k`, `top_p`, `batch_size`, `speed`, `temperature`, `emotion`, `save_temp`, and `stream`. Each is explained in the POST section below.
  - Aliases are supported from version 2.2.4; the allowed aliases are listed in `Inference/params_config.json`.
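Because `character` and `text` may contain spaces or non-ASCII characters, it is usually safer to let a library do the URL encoding. The sketch below builds a GET URL with Python's standard `urllib.parse`; `build_tts_url` is a hypothetical helper name, and optional values are passed through as given, since this document does not specify how the server parses them.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://127.0.0.1:5000"  # address used throughout this document


def build_tts_url(character, text, base_url=BASE_URL, **optional):
    """Return a percent-encoded GET URL for the /tts endpoint.

    `optional` may carry any of the optional parameters named above,
    for example emotion="Yandere" or speed=1.2. None values are skipped.
    """
    params = {"character": character, "text": text}
    params.update({k: v for k, v in optional.items() if v is not None})
    # quote_via=quote encodes spaces as %20 rather than "+".
    return f"{base_url}/tts?{urlencode(params, quote_via=quote)}"
```

For example, `build_tts_url("Hanabi", "Hello world")` yields `http://127.0.0.1:5000/tts?character=Hanabi&text=Hello%20world`.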
### POST Method
#### JSON Package Format
##### All Parameters
```
{
    "method": "POST",
    "body": {
        "character": "${chaName}",
        "emotion": "${Emotion}",
        "text": "${speakText}",
        "text_language": "${textLanguage}",
        "batch_size": ${batch_size},
        "speed": ${speed},
        "top_k": ${topK},
        "top_p": ${topP},
        "temperature": ${temperature},
        "stream": "${stream}",
        "format": "${Format}",
        "save_temp": "${saveTemp}"
    }
}
```
One or more of these items can be omitted. Aliases are supported from version 2.2.4; the allowed aliases are listed in `Inference/params_config.json`.
##### Minimal Data
```
{
    "method": "POST",
    "body": {
        "text": "${speakText}"
    }
}
```
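The following Python sketch shows one way to send such a request with the standard library. It rests on an assumption that this document does not confirm: that the fields shown under `"body"` are sent as the JSON payload of an HTTP POST with a `Content-Type: application/json` header. The helper names `build_tts_request` and `synthesize_to_file` are hypothetical.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "http://127.0.0.1:5000"  # address used throughout this document


def build_tts_request(payload, base_url=BASE_URL):
    """Build (but do not send) a POST request for the /tts endpoint.

    Assumption: the fields under "body" form the JSON payload.
    """
    data = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return Request(
        f"{base_url}/tts",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(payload, out_path, base_url=BASE_URL):
    """Send the request and write the returned bytes to `out_path`."""
    request = build_tts_request(payload, base_url)
    with urlopen(request) as resp, open(out_path, "wb") as out_file:
        out_file.write(resp.read())
```

With the server running, `synthesize_to_file({"text": "Hello, world."}, "output.wav")` would send the minimal request and save the response. The bytes are written as received, so on failure the file would contain whatever the server returned instead of audio.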
##### Parameter Explanation
- **text**: The text to convert. URL encoding is recommended.
- **character**: The character folder name. It must match exactly, including letter case, full-width versus half-width characters, and language.
- **emotion**: The character's emotion. It must be an emotion the character actually supports; otherwise the default emotion is used.
- **text_language**: The language of the text (`auto` / `zh` / `en` / `ja`). The default is mixed multilingual.
- **top_k**, **top_p**, **temperature**: GPT model sampling parameters. Leave them unchanged if you are unfamiliar with them.
- **batch_size**: An integer giving the number of batches processed at a time. It can be increased for faster processing on a powerful computer. The default is 1.
- **speed**: The speech speed. The default is 1.0.
- **save_temp**: Whether to save temporary files. When true, the backend saves the generated audio and returns it directly for subsequent identical requests. The default is false.
- **stream**: Whether to stream the output. When true, audio is returned sentence by sentence. The default is false.
- **format**: The audio format. MP3, WAV, and OGG are allowed; the default is WAV.
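The constraints listed above can be checked on the client before a request is sent. The sketch below is one possible approach and not part of the API: `build_tts_payload` is a hypothetical helper, and comparing the format case-insensitively is an assumption, since this document does not say which letter case the server expects. Parameters left as `None` are omitted so that the server applies its own defaults.

```python
ALLOWED_LANGUAGES = {"auto", "zh", "en", "ja"}
ALLOWED_FORMATS = {"wav", "mp3", "ogg"}


def build_tts_payload(text, *, character=None, emotion=None, text_language=None,
                      batch_size=None, speed=None, top_k=None, top_p=None,
                      temperature=None, stream=None, audio_format=None,
                      save_temp=None):
    """Validate the documented constraints and return a payload dict."""
    if not text:
        raise ValueError("text must be a non-empty string")
    if text_language is not None and text_language not in ALLOWED_LANGUAGES:
        raise ValueError(f"unsupported text_language: {text_language!r}")
    # Assumption: the format name is matched case-insensitively.
    if audio_format is not None and audio_format.lower() not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {audio_format!r}")
    if batch_size is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(batch_size, bool) or not isinstance(batch_size, int) or batch_size < 1:
            raise ValueError("batch_size must be an integer >= 1")
    candidates = {
        "text": text, "character": character, "emotion": emotion,
        "text_language": text_language, "batch_size": batch_size,
        "speed": speed, "top_k": top_k, "top_p": top_p,
        "temperature": temperature, "stream": stream,
        "format": audio_format, "save_temp": save_temp,
    }
    # Drop unset parameters so the server falls back to its defaults.
    return {key: value for key, value in candidates.items() if value is not None}
```

The emotion is not validated here, because the set of supported emotions depends on the character and has to be read from `/character_list`.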