Add overflow protection for quantization bias to reduce quantization precision loss #21645
@yihonglyu @adrianlizarraga, could you take a look and start the CI pipelines?
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, Linux QNN CI Pipeline |
/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows ARM64 QNN CI Pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed, Windows x64 QNN CI Pipeline, Linux MIGraphX CI Pipeline, Big Models |
/azp run ONNX Runtime React Native CI Pipeline, orttraining-amd-gpu-ci-pipeline, Linux Android Emulator QNN CI Pipeline |
Azure Pipelines successfully started running 9 pipeline(s). |
/azp run Windows GPU CUDA CI Pipeline Windows GPU DML CI Pipeline Windows GPU Doc Gen CI Pipeline Linux Android Emulator QNN CI Pipeline |
No pipelines are associated with this pull request. |
/azp run Windows GPU CUDA CI Pipeline, Windows GPU DML CI Pipeline, Windows GPU Doc Gen CI Pipeline, Linux Android Emulator QNN CI Pipeline |
Azure Pipelines successfully started running 1 pipeline(s). |
23a739f to 3ff8d99
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, Linux QNN CI Pipeline |
/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows ARM64 QNN CI Pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed, Windows x64 QNN CI Pipeline, Linux MIGraphX CI Pipeline, Big Models |
/azp run ONNX Runtime React Native CI Pipeline, orttraining-amd-gpu-ci-pipeline, Linux Android Emulator QNN CI Pipeline |
/azp run Windows GPU CUDA CI Pipeline, Windows GPU DML CI Pipeline, Windows GPU Doc Gen CI Pipeline, Linux Android Emulator QNN CI Pipeline |
Azure Pipelines successfully started running 3 pipeline(s). |
Azure Pipelines successfully started running 4 pipeline(s). |
Azure Pipelines successfully started running 9 pipeline(s). |
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, Linux QNN CI Pipeline |
/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows ARM64 QNN CI Pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed, Windows x64 QNN CI Pipeline, Linux MIGraphX CI Pipeline, Big Models |
/azp run ONNX Runtime React Native CI Pipeline, orttraining-amd-gpu-ci-pipeline, Linux Android Emulator QNN CI Pipeline |
/azp run Windows GPU CUDA CI Pipeline, Windows GPU DML CI Pipeline, Windows GPU Doc Gen CI Pipeline, Linux Android Emulator QNN CI Pipeline |
Azure Pipelines successfully started running 3 pipeline(s). |
Azure Pipelines successfully started running 4 pipeline(s). |
Azure Pipelines successfully started running 9 pipeline(s). |
981ed28 to f810324
Commenter does not have sufficient privileges for PR 21645 in repo microsoft/onnxruntime |
@fajin-corp, could you help start the CI pipelines?
/azp run Big Models,Linux Android Emulator QNN CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline |
Azure Pipelines successfully started running 9 pipeline(s). |
/azp run ONNX Runtime Web CI Pipeline,Windows ARM64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows GPU TensorRT CI Pipeline,Windows x64 QNN CI Pipeline |
/azp run onnxruntime-binary-size-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed |
Azure Pipelines successfully started running 8 pipeline(s). |
Azure Pipelines successfully started running 4 pipeline(s). |
@yufenglee, @fajin-corp, @adrianlizarraga, the |
/azp run orttraining-ortmodule-distributed |
Azure Pipelines successfully started running 1 pipeline(s). |
@yufenglee, @fajin-corp, @adrianlizarraga, all the checks have passed. Could you help merge?
@duanshengliu What's the difference between clip + cast and cast only? Could you add a test for it?
@yihonglyu, for the values within the
If we do
Obviously, |
Description
When the scale of the bias is too small, the quantized bias may exceed the range of int32, leading to significant loss of precision. Therefore, before converting the quantized bias to int32, it needs to be clipped to the int32 range to reduce the loss of quantization precision.
Motivation and Context
Fix the issue #21000
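The difference between clip + cast and cast only, as described above, can be illustrated with a minimal NumPy sketch. This is not the code from this PR: the helper name `quantize_bias`, the sample bias values, and the scale are hypothetical and chosen only to force an overflow. The cast-only path goes through int64 first so that the wrap-around is reproducible; a direct out-of-range float-to-int32 cast is platform dependent.

```python
import numpy as np

INT32_MIN = np.iinfo(np.int32).min
INT32_MAX = np.iinfo(np.int32).max


def quantize_bias(bias, bias_scale, clip=True):
    """Hypothetical helper: quantize a float bias to int32.

    With clip=True, out-of-range values saturate at the int32 limits.
    With clip=False, they wrap around (modelled here via an int64 cast).
    """
    q = np.round(np.asarray(bias, dtype=np.float64) / bias_scale)
    if clip:
        q = np.clip(q, INT32_MIN, INT32_MAX)
        return q.astype(np.int32)
    # Cast only: int64 -> int32 keeps the low 32 bits, i.e. wraps around.
    return q.astype(np.int64).astype(np.int32)


# Assumed sample data: a very small scale pushes two values past int32.
bias = np.array([0.5, -0.25, 1e-3])
bias_scale = 1e-10

clipped = quantize_bias(bias, bias_scale, clip=True)
wrapped = quantize_bias(bias, bias_scale, clip=False)

# Dequantize and compare the worst-case error of each path.
err_clipped = np.abs(clipped * bias_scale - bias).max()
err_wrapped = np.abs(wrapped * bias_scale - bias).max()

print("clip + cast:", clipped, "max abs error:", err_clipped)
print("cast only  :", wrapped, "max abs error:", err_wrapped)
```

In this sketch the clipped values saturate at the int32 limits and keep their sign, while the wrapped values can land anywhere in the int32 range (here the negative bias even becomes positive), so the dequantized error is larger without clipping.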