Is there any way to get rid of [Blank Audio] in transcript? #2420
Comments
|
Maybe I should've specified... I'm doing subtitles... and all these are going in the SRT file. I do not wanna have to go line by line and remove the group of 4 lines with this each time it's in the srt. Then also the index numbers are no longer sequential. But thanks for the option... |
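For the SRT case described above, a post-processing pass can drop the marker-only cues and renumber what is left. This is a minimal sketch, not a whisper.cpp feature: the function name `strip_blank_cues` is hypothetical, and the pattern assumes the marker appears as `[BLANK_AUDIO]`, `[Blank_Audio]`, or `[Blank Audio]`, the spellings seen in this thread.

```python
import re

# Assumption: the marker is one of the spellings quoted in this thread.
BLANK_RE = re.compile(r"\[\s*blank[ _]audio\s*\]", re.IGNORECASE)


def strip_blank_cues(srt_text: str) -> str:
    """Drop cues that contain only a blank-audio marker, then renumber."""
    # SRT cues are separated by blank lines; normalize Windows line endings first.
    blocks = re.split(r"\n\s*\n", srt_text.replace("\r\n", "\n").strip())
    kept = []
    for block in blocks:
        lines = block.split("\n")
        if len(lines) < 2:
            continue  # not a well-formed cue (index + timing line), skip it
        timing = lines[1]
        # Remove the marker from the cue text; keep any real speech around it.
        text = BLANK_RE.sub("", "\n".join(lines[2:])).strip()
        if text:
            kept.append((timing, text))
    # Rebuild the file with sequential indices starting at 1.
    cues = [f"{i}\n{timing}\n{text}" for i, (timing, text) in enumerate(kept, 1)]
    return "\n\n".join(cues) + "\n" if cues else ""
```

Reading the `.srt` file, passing its contents through `strip_blank_cues`, and writing the result back keeps the timestamps of the remaining cues untouched while restoring sequential index numbers.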
You know what the problem normally is with removing silence: it can be done by replacing it with another sound, but I think this will lead Whisper to recognize it as another sound. I think this is currently an unfixable issue with the Whisper models we have. You can try distil-whisper (English only). That, at least, should work well for you. Please close this issue. Here is my script, if you would like to try it out: `@echo off REM Delete all .srt files in the input folder REM Check if the "temp" folder exists REM Look for all media files in the input folder
) REM Check if the "temp" folder exists echo Process completed. Just put your audio or video file in the input folder, then go into the software folder and put the contents of this release there: https://github.com/ggerganov/whisper.cpp/releases/tag/v1.5.4 Then also put ffmpeg.exe from here in that folder: https://www.gyan.dev/ffmpeg/builds Then the script should work. THIS ONLY WORKS FOR WINDOWS. |
I 100% disagree. It appears in some audio, and in others it does not. This is not a problem that is “unfixable”. Unless you can tell me why some places have blank audio in transcript and others do not - and then fix it in whisper.cpp - then it is not “by definition” unfixable. Also, a windows-only solution is not a solution for all the platforms whisper.cpp runs on. And neither is requiring an old version of whisper.cpp. So, no. With all due respect, do not close this issue. Jann |
I am using this server with Home Assistant. I was also wondering whether it is possible to remove the [BLANK_AUDIO] outputs, because they mess up the Home Assistant command. |
It might be possible with a Python script that removes it from the output. |
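As a rough illustration of that idea for plain-text output (for example, text that is passed on as a voice command), the marker can be filtered out before the text is used. This is only a sketch under assumptions: `clean_transcript` is a hypothetical helper name, and the pattern covers just the marker spellings that appear in this thread.

```python
import re

# Assumption: the marker is "[BLANK_AUDIO]", "[Blank_Audio]" or "[Blank Audio]".
BLANK_RE = re.compile(r"\[\s*blank[ _]audio\s*\]", re.IGNORECASE)


def clean_transcript(text: str) -> str:
    """Remove blank-audio markers and collapse the leftover whitespace."""
    without_markers = BLANK_RE.sub(" ", text)
    # split() with no arguments drops leading/trailing and repeated whitespace.
    return " ".join(without_markers.split())
```

An empty result means the segment contained nothing but the marker, so the caller can skip it instead of forwarding an empty command.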
@janngobble I do not want to upset you, but when I say this is "unfixable", I mean it is "unfixable". Whisper.cpp could implement what you want, but that would ultimately lead to the problem below. You know what happens when there is silence: in some places it will be transcribed as "Thank You", "Subbed by xyz", or whatever. And do you know why this happens? It happens because OpenAI trained Whisper on a lot of videos that were transcribed by a lot of humans. You had the video and the subtitle files, and training tried to match what happens in the audio to the text. However, some people who transcribe their videos add something like "Thank you", "Thanks for watching", "Subscribe to my channel", or "This has been subbed by SubTitleFreak123" to the subtitle file at the end or at a silent place in the video. What does this mean? It means that OpenAI trained Whisper on a faulty dataset, and this behavior cannot be fixed by Whisper.cpp, because the model has been corrupted by those errors. However, there is still hope: there are people trying to fix this behavior by retraining the Whisper model on a good dataset with accurate subtitles, so that the model relearns that silence is actually silence. Those people are: https://github.com/huggingface/distil-whisper |
I used to use ffmpeg to remove silence but it didn't always work....this seems to work quite well though
|
I'm seeing this using medium.en:
I don't know why it would list "[Blank_Audio]" instead of just not putting a timestamp in the file...
Can I suppress this token?
thanks!
Jann