Windowing heuristic #161
tom-huntington
started this conversation in
General
The code you quoted belongs to what I call "processors". This functionality was requested by someone to split the audio into chunks and process the chunks separately using a single model in memory. The hope was that this approach would pay off on multi-core server machines. See the following PR for more info: #110

The actual sliding window logic that you are referring to is implemented here: Lines 2670 to 2723 in eab36eb

Basically, we sample the best token, and when the token is a timestamp, we remember it in |
Seems like you just stride by the window length to produce the segments
whisper.cpp/whisper.cpp
Lines 2917 to 2918 in 2065572
Seems like this won't handle words split across segments very well.
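To make the concern concrete, here is a minimal sketch of fixed-stride chunking and a check for whether a word crosses a chunk boundary. It is not the whisper.cpp code: `Chunk`, `split_fixed_stride`, and `straddles_boundary` are hypothetical names, and the equal-length split with the remainder going to the last chunk is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open sample range [begin, end) handled by one hypothetical worker.
struct Chunk {
    std::size_t begin;
    std::size_t end;
};

// Split n_samples into n_chunks contiguous ranges by striding a fixed length.
// Assumption for this sketch: the last chunk absorbs any remainder.
std::vector<Chunk> split_fixed_stride(std::size_t n_samples, std::size_t n_chunks) {
    assert(n_chunks > 0);
    const std::size_t stride = n_samples / n_chunks;
    std::vector<Chunk> chunks;
    chunks.reserve(n_chunks);
    for (std::size_t i = 0; i < n_chunks; ++i) {
        const std::size_t begin = i * stride;
        const std::size_t end = (i + 1 == n_chunks) ? n_samples : begin + stride;
        chunks.push_back({begin, end});
    }
    return chunks;
}

// Returns true if the word occupying samples [word_begin, word_end) has an
// internal chunk boundary strictly inside it, i.e. the word would be cut in two.
bool straddles_boundary(const std::vector<Chunk>& chunks,
                        std::size_t word_begin, std::size_t word_end) {
    for (std::size_t i = 0; i + 1 < chunks.size(); ++i) {
        const std::size_t boundary = chunks[i].end;
        if (word_begin < boundary && boundary < word_end) {
            return true;
        }
    }
    return false;
}
```

For example, splitting 160000 samples (10 s of audio, assuming a 16 kHz sample rate) into 4 chunks puts boundaries at 40000, 80000, and 120000, so a word spanning samples 39000 to 41000 would be cut at the first boundary, while a word starting exactly at 40000 would not.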