-
I tested it on an iPhone, focusing on processing speed. On an iPhone 12, I tested 3-second and 15-second inputs, and Core ML without Metal was the fastest on the test device. As the input length increases, the decode time with Metal increases rapidly. Below are my test results. (Sorry, I don't know how to mark up a table.)

hello how are you (3s) | with Metal
hello how are you (3s) | without Metal
mel 14 14 13 12 14 14 14 14 13 13.55555556

15s | with Metal
mel 27 28 27 27 27 27 27 27 28 27 27.2

15s | without Metal
mel 27 27 28 27 27 27 27 38 28 28 28 28.5
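For anyone who wants to re-check the averages, here is a minimal Python sketch. It assumes the last number in each "mel" row is the mean of the runs listed before it; the post does not say so, and it does not state the units either. The labels and the `mean` helper are only illustrative, and the 3s row is labelled without a Metal setting because the post does not make clear which one it belongs to.

```python
# Timings copied from the "mel" rows in the post; units are not stated there.
# Assumption: the final number in each row is the average of the preceding runs,
# so it is left out of the lists below.
runs = {
    "3s": [14, 14, 13, 12, 14, 14, 14, 14, 13],
    "15s, with Metal": [27, 28, 27, 27, 27, 27, 27, 27, 28, 27],
    "15s, without Metal": [27, 27, 28, 27, 27, 27, 27, 38, 28, 28, 28],
}


def mean(values):
    """Arithmetic mean of a non-empty list of timings."""
    return sum(values) / len(values)


# Recompute the per-row averages from the raw runs.
averages = {label: mean(values) for label, values in runs.items()}

for label, avg in averages.items():
    print(f"{label}: {avg:.2f} over {len(runs[label])} runs")
```

Under that assumption, the first two rows reproduce the posted 13.56 and 27.2. The third comes out at about 28.36 for the eleven values listed rather than the posted 28.5, so a value in that row may have been mistyped or the row may not follow the same layout.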
-
I've been busy, but I finally managed to download the latest updates to whisper.cpp, hoping for better performance. Honestly, I don't understand where the best performance is supposed to come from. At the moment I have GGML_USE_ACCELERATE and GGML_USE_METAL both enabled, but I don't see a big difference compared with older versions of whisper.cpp, especially with the bigger models. Metal in particular seems to make almost no difference. Am I overlooking something?