Will ever be possible in the future to run this on Vulkan via the KomputeProject/Kompute or alike? #1139
gabrielesilinic
started this conversation in
Ideas
Replies: 0 comments
It's not like you have to or anything; if you don't, I will figure it out at some point (it's just going to take a very long time because I am new to the field).
But doing GPU acceleration via Vulkan and Kompute could provide (reasonably) consistent cross-platform GPU acceleration, let Android phones run the model much faster, and make the project easier to maintain: some of the existing GPU backends could be cut and everything unified under Vulkan (probably even those pesky Macs via MoltenVK).
I am building an application at gabrielesilinic/VolMan that uses sandrohanea/whisper.net (bindings for whisper.cpp), and while my code isn't particularly stunning (yet; I am very much in the early stages of development), it's really not the main reason performance is poor. I know because the same workload, still on CPU alone, runs much faster on my laptop. (Yes, it works on Windows, but I don't yet know how to let people sideload a binary.)
I can see the strain my poor Galaxy M21 is under while transcribing on the CPU alone; a working Kompute backend would be a blessing for this use case, since it would let the same hardware use the component actually suited to the task: the GPU. Also, since memory between the GPU and CPU on Android phones appears to be shared, phones might be able to run much larger models with ease. Imagine having a virtual assistant that works everywhere, even without an internet connection...