[Feature request] Server api to return available model names and speaker id; load and unload downloaded model #37

Open
chigkim opened this issue May 30, 2024 · 11 comments
Labels: enhancement, good first issue

Comments

@chigkim

chigkim commented May 30, 2024

Right now, you can request audio from the server by fetching:

http://localhost:5002/api/tts?text={text}&speaker_id={speaker}
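For reference, the request above could be made from Python roughly as follows. This is a minimal sketch using only the standard library: the host, port, path, and parameter names come from the URL above, while the helper names `build_tts_url` and `fetch_tts_audio` are made up for illustration, and it assumes the server returns the audio bytes in the response body.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:5002/api/tts"  # endpoint quoted in the issue


def build_tts_url(text, speaker_id=None, base_url=BASE_URL):
    """Build the request URL, percent-encoding the query parameters."""
    params = {"text": text}
    if speaker_id is not None:
        params["speaker_id"] = speaker_id
    return f"{base_url}?{urlencode(params)}"


def fetch_tts_audio(text, speaker_id=None, timeout=60):
    """Request audio from a running server and return the raw response bytes.

    Assumption: the response body is the audio itself.
    """
    with urlopen(build_tts_url(text, speaker_id), timeout=timeout) as response:
        return response.read()
```

With a server listening on localhost:5002, `fetch_tts_audio("Hello world", speaker_id=...)` would return bytes that can be written to a file. Which speaker IDs are valid is exactly what this request is about, so none is hard-coded here.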

Can we also have API endpoints that return the available model names and speaker IDs (for multi-speaker models), as well as endpoints to load and unload downloaded models?

Thanks for your consideration!

eginhard added the enhancement and good first issue labels on May 30, 2024
@eginhard
Member

Seems sensible, at least listing models/languages/speakers should be straightforward. I won't implement this myself since we don't use this server, but would merge a PR.

@Roy6250

Roy6250 commented Jul 24, 2024

Hi @eginhard, can I work on this issue?
Thanks

@eginhard
Member

@Roy6250 Sure, thank you! I'd suggest leaving out the part about (un)loading models for now to keep it simple. We can discuss it at a later stage.

@Roy6250

Roy6250 commented Jul 24, 2024

Sure, thanks.

@Roy6250

Roy6250 commented Jul 27, 2024

Hi @eginhard, I went through the repo and set it up. Before proceeding, I'd like to verify that I'm on the right track.

Requirement: List the available model names, languages, and speaker IDs.

Solution: I will get the model names and languages from .models.json in the TTS directory, but I could not find where the speaker IDs come from. It would be helpful if you could point me in the right direction.

Thanks

@Roy6250

Roy6250 commented Jul 31, 2024

Hi @eginhard, please let me know your views. Thanks

@eginhard
Member

@Roy6250 You don't need to parse the .models.json file yourself. There are helper functions for this already in the ModelManager class. Also see how the CLI is implemented, e.g. to get speaker IDs for a model:

if args.list_speaker_idxs:

@Roy6250

Roy6250 commented Aug 2, 2024

Thanks for the help @eginhard. Using the helper functions, I was able to fetch all the models and languages.

Using this I can fetch the speaker names for a particular model.

speakers=synthesizer.tts_model.speaker_manager.name_to_id

For this, I have to download the model, so this approach doesn't seem viable. Should I preprocess the speaker names, store them in JSON format, and serve them from there on a GET request?

@eginhard
Member

eginhard commented Aug 2, 2024

@Roy6250 I would only return the speaker names for the currently loaded model and not for any arbitrary one to keep it simple.
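As an illustration of that suggestion, a helper along these lines could read the speaker names from whatever model is currently loaded. This is only a sketch: the attribute path `synthesizer.tts_model.speaker_manager.name_to_id` is taken from the comment above, but the function name `list_loaded_speakers` is hypothetical, and it assumes `name_to_id` is a mapping keyed by speaker name and that single-speaker models have no speaker manager (or an empty one). The stand-in object below is fake data, not a real model.

```python
from types import SimpleNamespace


def list_loaded_speakers(synthesizer):
    """Return the speaker names of the currently loaded model, or [] if it has none."""
    tts_model = getattr(synthesizer, "tts_model", None)
    speaker_manager = getattr(tts_model, "speaker_manager", None)
    # Assumption: name_to_id is a dict-like mapping of speaker name -> index.
    name_to_id = getattr(speaker_manager, "name_to_id", None)
    if not name_to_id:
        return []
    return list(name_to_id)


# Stand-in for a loaded multi-speaker model (hypothetical data for the example).
fake_synthesizer = SimpleNamespace(
    tts_model=SimpleNamespace(
        speaker_manager=SimpleNamespace(name_to_id={"alice": 0, "bob": 1})
    )
)
print(list_loaded_speakers(fake_synthesizer))  # ['alice', 'bob']
```

Because it only inspects the model that is already in memory, nothing has to be downloaded or preprocessed to answer the request.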

@Roy6250

Roy6250 commented Aug 2, 2024

@eginhard Sure, got it. Just one final question, about the API structure:

Request type: GET
Params: none

Response: {
    model_name: [...],   # list of all model names
    languages: [...],    # list of languages
    speaker_ids: [...]   # list of speaker IDs, if a model is loaded; the response will also name that model
}

@eginhard
Member

eginhard commented Aug 5, 2024

@Roy6250 I'd suggest creating separate endpoints for each of these. Also check what is already available, e.g. I see that there are

@app.route("/locales", methods=["GET"])
and
@app.route("/voices", methods=["GET"])
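One way to picture the "separate endpoints" layout is one small handler per resource, each returning its own JSON payload. The sketch below is deliberately framework-free so it runs on its own. In the actual server, each handler would presumably sit behind its own `@app.route(..., methods=["GET"])` like the two routes above. The paths `/models` and `/speakers`, the payload keys, and all function names are assumptions made for illustration, not existing routes.

```python
import json


def models_payload(model_names):
    """Payload for a hypothetical /models endpoint."""
    return {"models": list(model_names)}


def speakers_payload(name_to_id):
    """Payload for a hypothetical /speakers endpoint (loaded model only)."""
    return {"speakers": list(name_to_id or {})}


def make_routes(model_names, name_to_id):
    """Map each path to a zero-argument handler, one resource per endpoint."""
    return {
        "/models": lambda: models_payload(model_names),
        "/speakers": lambda: speakers_payload(name_to_id),
    }


def handle_get(routes, path):
    """Dispatch a GET request and return (status_code, json_body)."""
    handler = routes.get(path)
    if handler is None:
        return 404, json.dumps({"error": f"unknown endpoint: {path}"})
    return 200, json.dumps(handler())


# Placeholder data standing in for the model list and the loaded model's speakers.
routes = make_routes(["model_a", "model_b"], {"alice": 0})
print(handle_get(routes, "/speakers"))  # (200, '{"speakers": ["alice"]}')
```

Keeping each resource on its own path means a client that only needs speaker names never receives the full model list, and the existing `/locales` and `/voices` routes do not have to be duplicated.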
