Commit

Update TTS Tasks page. (#347)
Wee update to the models listed, just making it a bit more current.
Vaibhavs10 authored Nov 22, 2023
1 parent 7197754 commit 0f83213
Showing 2 changed files with 5 additions and 4 deletions.
3 changes: 2 additions & 1 deletion packages/tasks/src/text-to-speech/about.md
@@ -25,7 +25,7 @@ def query(payload):
     response = requests.post(API_URL, headers=headers, json=payload)
     return response

-output = query({"text_inputs": "This is a test"})
+output = query({"text_inputs": "Max is the best doggo."})
```

You can also use libraries such as [espnet](https://huggingface.co/models?library=espnet&pipeline_tag=text-to-speech&sort=downloads) or [transformers](https://huggingface.co/models?pipeline_tag=text-to-speech&library=transformers&sort=trending) if you want to handle the Inference directly.
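
For readers without the surrounding file, the `query` helper that this hunk touches can be sketched roughly as follows. This is a minimal sketch, not the code in `about.md`: it swaps `requests` for the standard-library `urllib.request` so it has no extra dependencies, and the `API_URL` (pointing at `suno/bark`) and the token placeholder are assumptions chosen for illustration. `build_request` is a hypothetical helper that constructs the request without sending it.

```python
import json
import urllib.request

# Assumptions: the endpoint, model id and token below are placeholders,
# not values taken from the commit.
API_URL = "https://api-inference.huggingface.co/models/suno/bark"
HEADERS = {
    "Authorization": "Bearer <your-token>",
    "Content-Type": "application/json",
}


def build_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a TTS payload."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(API_URL, data=body, headers=HEADERS, method="POST")


def query(payload: dict) -> bytes:
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(build_request(payload)) as response:
        return response.read()


# Building the request is safe offline; calling query() needs a valid token.
request = build_request({"text_inputs": "Max is the best doggo."})
```

Separating request construction from sending keeps the payload shape easy to inspect before any network call is made.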
@@ -56,6 +56,7 @@ await inference.textToSpeech({

## Useful Resources

+- [Hugging Face Audio Course](https://huggingface.co/learn/audio-course/chapter6/introduction)
- [ML for Audio Study Group - Text to Speech Deep Dive](https://www.youtube.com/watch?v=aLBedWj-5CQ)
- [An introduction to SpeechT5, a multi-purpose speech recognition and synthesis model](https://huggingface.co/blog/speecht5).
- [A guide on Fine-tuning Whisper For Multilingual ASR with 🤗Transformers](https://huggingface.co/blog/fine-tune-whisper)
6 changes: 3 additions & 3 deletions packages/tasks/src/text-to-speech/data.ts
@@ -52,8 +52,8 @@ const taskData: TaskDataCustom = {
             id: "suno/bark",
         },
         {
-            description: "An application that contains multiple speech synthesis models for various languages and accents.",
-            id: "coqui/CoquiTTS",
+            description: "XTTS is a Voice generation model that lets you clone voices into different languages.",
+            id: "coqui/xtts",
         },
         {
             description: "An application that synthesizes speech for various speaker types.",
@@ -62,7 +62,7 @@ const taskData: TaskDataCustom = {
     ],
     summary:
         "Text-to-Speech (TTS) is the task of generating natural sounding speech given text input. TTS models can be extended to have a single model that generates speech for multiple speakers and multiple languages.",
-    widgetModels: ["microsoft/speecht5_tts"],
+    widgetModels: ["suno/bark"],
     youtubeId: "NW62DpzJ274",
 };
