Incorrect timestamps #2279
Fixes ggerganov#2271
- Adds consecutive timestamps after the end of the last segment as the new starting ts
- Adds these timestamps to the output when "print-special" is enabled
- Fixes fflush usage in live reporting

I was not able to test this with the special "token_timestamps" option.
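To illustrate the first bullet, here is a toy model of the idea in Python. It is not the code from this PR and not whisper.cpp's actual decoder: the token representation (strings for text, floats for timestamp tokens in seconds), the function name `split_segments`, and the token stream in the usage note are all hypothetical. It only shows the difference between reusing the previous end time as the next start and taking the start from a timestamp token that immediately follows the end token.

```python
from typing import List, Tuple, Union

# Hypothetical token model: str = text piece, float = timestamp token (seconds).
Token = Union[str, float]
Segment = Tuple[float, float, str]  # (start, end, text)


def split_segments(tokens: List[Token], use_consecutive_start: bool) -> List[Segment]:
    """Split a token stream into timed segments.

    With use_consecutive_start=False, a timestamp that directly follows a
    segment-ending timestamp is ignored, so the next segment starts where the
    previous one ended. With True, that second timestamp becomes the start of
    the next segment.
    """
    segments: List[Segment] = []
    t0 = 0.0
    text: List[str] = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if isinstance(tok, str):
            text.append(tok)
        elif text:
            # A timestamp after pending text closes the current segment.
            segments.append((t0, tok, "".join(text)))
            text = []
            t0 = tok
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            if use_consecutive_start and isinstance(nxt, float):
                # Two timestamps in a row: the second one is the new start.
                t0 = nxt
                i += 1
        # A timestamp with no pending text is otherwise ignored.
        i += 1
    return segments
```

For a made-up stream such as `[0.0, "A", 6.0, 6.0, "B", 8.7, 14.08, "C", 18.08]`, the old behavior gives "C" the range 8.7 to 18.08, while the new behavior gives it 14.08 to 18.08, because the silence or music between 8.7 and 14.08 is no longer attributed to the segment.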
I tested this and it works.
Current whisper.cpp log:
cd /tmp
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
wget "https://github.com/ggerganov/whisper.cpp/assets/61390950/bbf9d9c4-3d60-4693-832d-e48135edf379" -O audio.wav
cmake -B build .
cmake --build build
ffmpeg -i audio.wav -ar 16000 -ac 1 -c:a pcm_s16le normal.wav
./build/bin/main -f ./normal.wav -m "/Users/user/Library/Application Support/github.com.thewh1teagle.vibe/ggml-medium.bin"
# Result
# [00:00:00.000 --> 00:00:06.000] I-I-I just wanna tell you how I'm feelin'
# [00:00:06.000 --> 00:00:08.700] Gotta make you understand that
# [00:00:08.700 --> 00:00:18.080] Never gonna give you up, never gonna let you down
# [00:00:18.080 --> 00:00:25.280] Never gonna run around and
PR log:
# Test new PR
cd /tmp
git clone https://github.com/bviksoe/whisper.cpp -b master whisper1.cpp
cd whisper1.cpp
cmake -B build .
cmake --build build
./build/bin/main -f ../whisper.cpp/normal.wav -m "/Users/user/Library/Application Support/github.com.thewh1teagle.vibe/ggml-medium.bin"
# Result
# [00:00:00.000 --> 00:00:06.000] I-I-I just wanna tell you how I'm feelin'
# [00:00:06.000 --> 00:00:08.700] Gotta make you understand that
# [00:00:14.080 --> 00:00:18.080] Never gonna give you up, never gonna let you down
# [00:00:22.600 --> 00:00:25.280] Never gonna run around and
Notice that the third timestamp is correct in the PR log.
If you uncomment the line at Line 142 in c118733, you compile with extended debug trace. You should then be able to see that the model actually produces extra timestamp tokens that this library was ignoring. I was actually looking into why
I am unable to build it on a Mac M1. It gives many errors, including things like: mmintrin.h:14:2: error: "This header is only meant to be used on x86 and x64 architecture"
@manumaan For general build problems, open a new Issue or ask in Discussions.
This really helped with the timing of my sentences. Before, a segment would start long before it was actually spoken, especially when music played between segments. However, my word-level timestamps still go out of sync.
@bviksoe
I'm thinking of trying to write a script that changes any subtitle end time that is later than the start time of the next line, so that it becomes that start time. Below are three examples from whisper.cpp 1.6.2. Having subs stay on screen for many seconds or even minutes after new subs come on is pretty distracting. Hope to try out this PR someday, or hope it gets merged.

06:22:32.254 --> 06:22:39.764
06:22:39.764 --> 06:22:42.254 (<-- for instance, this would become 06:22:41.260)
06:22:41.260 --> 06:22:46.660
06:22:46.660 --> 06:22:53.380
06:22:53.380 --> 06:22:58.940
06:22:58.940 --> 06:23:04.580

40:03:11.653 --> 40:03:16.213
40:03:16.213 --> 40:12:32.523 (<-- would become 40:03:24.933)
40:03:24.933 --> 40:03:29.573
40:03:30.373 --> 40:03:34.133
40:03:34.133 --> 40:03:34.773

57:02:18.560 --> 57:02:22.560
57:02:22.560 --> 57:09:52.750 (<-- would become 57:02:30.560)
57:02:30.560 --> 57:02:34.560
57:02:34.560 --> 57:02:38.560
57:02:38.560 --> 57:02:40.560
script fixes overlapping subs for vtt and srt
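For anyone who wants to experiment with the clamping idea described in the comment above, here is a minimal Python sketch. It is not the linked script; the names `to_ms` and `clamp_overlaps` are made up for illustration, and it assumes the cues have already been extracted as (start, end) timestamp strings in `HH:MM:SS.mmm` (VTT) or `HH:MM:SS,mmm` (SRT) form.

```python
import re
from typing import List, Tuple

# Matches HH:MM:SS.mmm (VTT) or HH:MM:SS,mmm (SRT); hours may exceed two digits.
_TS = re.compile(r"(\d+):(\d{2}):(\d{2})[.,](\d{3})")


def to_ms(ts: str) -> int:
    """Convert a subtitle timestamp string to milliseconds."""
    match = _TS.fullmatch(ts)
    if match is None:
        raise ValueError(f"unrecognized timestamp: {ts!r}")
    h, m, s, ms = (int(g) for g in match.groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def clamp_overlaps(cues: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Shorten any cue whose end time is later than the next cue's start."""
    fixed = []
    for i, (start, end) in enumerate(cues):
        if i + 1 < len(cues):
            next_start = cues[i + 1][0]
            if to_ms(end) > to_ms(next_start):
                # Overlap: end this cue exactly when the next one begins.
                end = next_start
        fixed.append((start, end))
    return fixed
```

Applied to the first example above, the cue `06:22:39.764 --> 06:22:42.254` would end at `06:22:41.260`, and cues that do not overlap are left unchanged. Reading and writing the actual .vtt or .srt files is left out of this sketch.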
NB: This is my first Github PR so go easy on me.