Implementing `Dynamic Speed-UP Mode`? #1208

bobqianic · 2023-08-25T16:32:26Z

bobqianic
Aug 25, 2023
Collaborator

I've often noticed that background music can disrupt transcription even when the speaker is talking slowly. In those sections, our current speed-up feature tends to reduce transcription quality. What if we could detect the sections where background music interferes and keep them at normal speed, while speeding up the rest? It would also be useful to gauge the speaker's pace and avoid speeding up when they are already speaking quickly. What do you think? Any thoughts or suggestions?
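To make the proposal concrete, here is a minimal sketch of the per-segment decision described above. It is not whisper.cpp code: the `Segment` fields, the `music_score` and `syllables_per_sec` estimates, the thresholds, and the `max_speed=2.0` default are all hypothetical placeholders, and how those estimates would actually be computed is left open.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float              # segment start, in seconds
    end: float                # segment end, in seconds
    music_score: float        # hypothetical 0..1 estimate of background-music interference
    syllables_per_sec: float  # hypothetical estimate of the speaker's pace


def choose_speed(seg, max_speed=2.0, music_threshold=0.5, fast_rate=5.0):
    """Return the speed factor for one segment (all thresholds are assumed values)."""
    # Keep normal speed where background music is likely to interfere.
    if seg.music_score >= music_threshold:
        return 1.0
    # Keep normal speed where the speaker is already talking quickly.
    if seg.syllables_per_sec >= fast_rate:
        return 1.0
    # Otherwise the segment is considered safe to speed up.
    return max_speed


def plan_speeds(segments, **kwargs):
    """Map each segment to a (start, end, speed) tuple."""
    return [(s.start, s.end, choose_speed(s, **kwargs)) for s in segments]


segments = [
    Segment(0.0, 5.0, music_score=0.1, syllables_per_sec=3.0),    # clean, slow speech
    Segment(5.0, 10.0, music_score=0.8, syllables_per_sec=3.0),   # music interference
    Segment(10.0, 15.0, music_score=0.1, syllables_per_sec=6.5),  # fast speech
]
plan = plan_speeds(segments)
for start, end, speed in plan:
    print(f"{start:5.1f}-{end:5.1f}s  speed x{speed}")
```

Only the first segment would be sped up here; the other two stay at normal speed. The hard part, which this sketch does not address, is producing reliable interference and pace estimates in the first place.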

ggerganov · 2023-08-27T16:38:01Z

ggerganov
Aug 27, 2023
Maintainer

I feel like this type of rule-based algorithm is bound to fail in the general case. The beauty of Whisper is that it does not have any conditional statements. When we start adding ifs and elses, it will become messy.

In any case, examples of such approaches can be added for demonstration purposes. The only limitation is that they will likely not become part of the core whisper.cpp library.
