How to configure a text-to-speech model forced_token_ids? #210
Comments
Not possible yet, see #187. :)
@rhcarvalho you can customize
@jonatanklosko 👏 thanks for the pointer! I think the argument types have changed since then, as the original example in the comment throws an error. This is what worked for me, in case someone ends up checking this issue for a solution:
diff --git examples/phoenix/speech_to_text.exs examples/phoenix/speech_to_text.exs
index 99f72cb..94e8989 100644
--- examples/phoenix/speech_to_text.exs
+++ examples/phoenix/speech_to_text.exs
@@ -314,6 +314,15 @@ end
 {:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/whisper-tiny"})
 {:ok, generation_config} = Bumblebee.load_generation_config({:hf, "openai/whisper-tiny"})
+generation_config = %{
+  generation_config
+  | forced_token_ids: [
+      {1, Bumblebee.Tokenizer.token_to_id(tokenizer, "<|pt|>")},
+      {2, Bumblebee.Tokenizer.token_to_id(tokenizer, "<|transcribe|>")},
+      {3, Bumblebee.Tokenizer.token_to_id(tokenizer, "<|notimestamps|>")}
+    ]
+}
+
 serving =
   Bumblebee.Audio.speech_to_text(model_info, featurizer, tokenizer, generation_config,
     compile: [batch_size: 10],
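For anyone who wants to switch languages without editing three tuples by hand, the same idea can be wrapped in a small helper. This is only a sketch: `WhisperForcedTokens` and `build/3` are hypothetical names, not part of Bumblebee, and it assumes the `{position, token_id}` tuple format and the `forced_token_ids` field behave as in the diff above, which may differ in other Bumblebee versions.

```elixir
# Hypothetical helper (not a Bumblebee API): builds the forced_token_ids list
# used in the diff above for a given Whisper language code and task.
defmodule WhisperForcedTokens do
  def build(tokenizer, language, task \\ "transcribe") do
    # Same token order as in the diff: language, task, then no-timestamps.
    ["<|#{language}|>", "<|#{task}|>", "<|notimestamps|>"]
    |> Enum.with_index(1)
    |> Enum.map(fn {token, position} ->
      # Each entry is {position in the generated sequence, token id}.
      {position, Bumblebee.Tokenizer.token_to_id(tokenizer, token)}
    end)
  end
end

{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/whisper-tiny"})
{:ok, generation_config} = Bumblebee.load_generation_config({:hf, "openai/whisper-tiny"})

# Assumes forced_token_ids is a field on the loaded generation config,
# as in the diff above.
generation_config = %{
  generation_config
  | forced_token_ids: WhisperForcedTokens.build(tokenizer, "pt")
}
```

Passing the resulting `generation_config` to `Bumblebee.Audio.speech_to_text` as in the diff should then give the same behaviour as the hand-written list, with the language code as the only thing to change.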
Thanks for Bumblebee and the provided examples! I'm trying out the example at https://github.com/elixir-nx/bumblebee/blob/main/examples/phoenix/speech_to_text.exs.
It works well for audio input in English. For audio input in other languages, it seems to be automatically translating the output to English.
I read https://huggingface.co/openai/whisper-tiny#usage and, if I understood it correctly, I'd need to use forced_token_ids to specify the desired/input language and to set the task to transcribe rather than translate. Like in:
How to do that with Bumblebee?