Sparse-quantized model runs without VNNI acceleration #1
Comments
Dear Dr. Leprince, I'll have to check if they added support for TConv.
Ohhh, so this is a duplicate of clementpoiret#22, silly me. Anyway, thanks for the reply! In the meantime I will deploy the non-sparse models as the default at NeuroSpin.
Np :) It's always a pleasure to read a message from Dr. Leprince 😁 Anyway, in all apps, I think sparse/optimized networks should always be optional, as they rely on very recent hardware which most users do not have...
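To make that hardware check concrete, below is a minimal sketch of the kind of runtime detection an app could perform before defaulting to a sparse-quantized model. It assumes Linux (it reads /proc/cpuinfo for the avx512_vnni flag) and reuses the model names from the commands quoted later in this issue; it is not HSF's actual selection logic.

```python
# Minimal sketch, assuming Linux: /proc/cpuinfo lists the CPU feature
# flags, including avx512_vnni on VNNI-capable machines.
from pathlib import Path

def cpu_has_vnni() -> bool:
    """Return True if the CPU advertises AVX-512 VNNI support."""
    try:
        cpuinfo = Path("/proc/cpuinfo").read_text()
    except OSError:
        return False  # not Linux or /proc unavailable: assume no VNNI
    return "avx512_vnni" in cpuinfo.split()

# Only default to the sparse-quantized model when the hardware can
# actually accelerate it (model names taken from this thread).
segmentation = "bagging_sq" if cpu_has_vnni() else "bagging_accurate"
print(f"Selected segmentation model: {segmentation}")
```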
Little update on the issue.
Also, to quote ONNXRuntime:
Describe the bug
Hi Dr. @clementpoiret! Now that you have graduated 🎉, here is a technical issue to keep you busy 😉
On a workstation with AVX512 and VNNI CPU capabilities, I am getting the following message:
The performance is indeed worse than with the non-sparse model (although I am not sure how it is counting CPU-time here w.r.t. HyperThreading; see the timing sketch after the commands below):
segmentation=bagging_sq hardware=deepsparse
hardware=onnxruntime model=bagging_accurate hardware.engine_settings.execution_providers="['CPUExecutionProvider']"
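For what it's worth, the ONNX Runtime side of this comparison can also be reproduced directly from Python, which makes the wall-clock vs. CPU-time distinction behind the HyperThreading caveat explicit. Here is a minimal sketch using the public onnxruntime API, pinned to CPUExecutionProvider as in the command above; the model path and input shape are hypothetical placeholders:

```python
import time

import numpy as np
import onnxruntime as ort

# Pin inference to the plain CPU provider, as in the command above.
sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
inp = sess.get_inputs()[0]
x = np.random.rand(1, 1, 16, 16, 16).astype(np.float32)  # hypothetical shape

wall0, cpu0 = time.perf_counter(), time.process_time()
sess.run(None, {inp.name: x})
wall1, cpu1 = time.perf_counter(), time.process_time()

# process_time() sums CPU time over all threads, so with HyperThreading the
# "CPU time" can exceed wall time by up to the number of hardware threads,
# which is why raw CPU-time comparisons between engines are hard to read.
print(f"wall: {wall1 - wall0:.3f} s, cpu: {cpu1 - cpu0:.3f} s")
```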
Environment
segmentation=bagging_sq hardware=deepsparse