
dense model using TensorRT inference on the A30 gets wrong results #86

Open
zhaohb opened this issue Sep 1, 2023 · 0 comments

Comments

zhaohb commented Sep 1, 2023

Environment

  • TensorRT 8.6.1

  • CUDA 12.1, cuBLAS 12.1.3.1

  • Container: tensorrt:23.07-py3

  • NVIDIA driver 510.47.03

Model: gs_concat.onnx

Reproduction Steps

  1. On an A30, run:
polygraphy run gs_concat.onnx --onnxrt --trt --tf32 --atol 1e-4 --pool-limit workspace:10G

output: [screenshot: Polygraphy comparison output showing the TensorRT results failing the atol=1e-4 check against ONNX Runtime]
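
For reference, the same comparison can be expressed through Polygraphy's Python API. This is a minimal sketch assuming the documented Polygraphy loaders and runners, not part of the original report:

```python
import tensorrt as trt

from polygraphy.backend.onnxrt import OnnxrtRunner, SessionFromOnnx
from polygraphy.backend.trt import (
    CreateConfig, EngineFromNetwork, NetworkFromOnnxPath, TrtRunner,
)
from polygraphy.comparator import Comparator, CompareFunc

MODEL = "gs_concat.onnx"

# Mirror --tf32 and --pool-limit workspace:10G from the CLI command.
build_engine = EngineFromNetwork(
    NetworkFromOnnxPath(MODEL),
    config=CreateConfig(
        tf32=True,
        memory_pool_limits={trt.MemoryPoolType.WORKSPACE: 10 << 30},
    ),
)

runners = [
    OnnxrtRunner(SessionFromOnnx(MODEL)),  # ONNX Runtime reference
    TrtRunner(build_engine),               # TensorRT under test
]

# Feed both backends the same random inputs and compare with atol=1e-4.
results = Comparator.run(runners)
passed = bool(Comparator.compare_accuracy(
    results, compare_func=CompareFunc.simple(atol=1e-4),
))
print("PASSED" if passed else "FAILED")
```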

Expected Behavior

  • TensorRT inference produces correct results, i.e. its outputs match ONNX Runtime within the 1e-4 tolerance.

Actual Behavior

  • There is a clear gap between the TensorRT output and the ONNX Runtime output; the TensorRT inference result is wrong.
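
One check worth trying (my assumption, not something verified in this report): with --tf32, Ampere GPUs such as the A30 route FP32 matmuls and convolutions through TF32 tensor cores, so rebuilding the engine with TF32 explicitly cleared would show whether the gap comes from TF32 rounding or from a genuine kernel bug. A minimal sketch with the standard TensorRT 8.6 Python API:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)
with open("gs_concat.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.clear_flag(trt.BuilderFlag.TF32)  # force full FP32 matmul/conv math
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 10 << 30)

serialized_engine = builder.build_serialized_network(network, config)
```

Note that the A6000 is also Ampere, so TF32 alone may not explain the difference between the two GPUs; this only isolates one variable.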

Additional Notes

  • We also ran the same comparison on an A6000, where the results were correct:
polygraphy run gs_concat.onnx --onnxrt --trt --tf32 --atol 1e-4 --pool-limit workspace:10G

output: [screenshot: Polygraphy comparison passing on the A6000]

So the problem may be hardware-related.
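
If the mismatch needs to be localized to a specific layer, Polygraphy can mark every intermediate tensor as an output on both backends and compare layer by layer. A sketch under the assumption that the ModifyOutputs / ModifyNetworkOutputs loaders behave as documented (marking all TRT tensors as outputs can change tactic selection, so results are indicative only):

```python
from polygraphy import constants
from polygraphy.backend.onnx import BytesFromOnnx, ModifyOutputs, OnnxFromPath
from polygraphy.backend.onnxrt import OnnxrtRunner, SessionFromOnnx
from polygraphy.backend.trt import (
    CreateConfig, EngineFromNetwork, ModifyNetworkOutputs,
    NetworkFromOnnxPath, TrtRunner,
)
from polygraphy.comparator import Comparator, CompareFunc

MODEL = "gs_concat.onnx"

# Mark every intermediate tensor as an output so the first layer whose
# outputs diverge between ONNX Runtime and TensorRT can be identified.
onnx_bytes = BytesFromOnnx(
    ModifyOutputs(OnnxFromPath(MODEL), outputs=constants.MARK_ALL)
)
trt_network = ModifyNetworkOutputs(
    NetworkFromOnnxPath(MODEL), outputs=constants.MARK_ALL
)

runners = [
    OnnxrtRunner(SessionFromOnnx(onnx_bytes)),
    TrtRunner(EngineFromNetwork(trt_network, config=CreateConfig(tf32=True))),
]

results = Comparator.run(runners)
Comparator.compare_accuracy(results, compare_func=CompareFunc.simple(atol=1e-4))
```

The CLI equivalent, if preferred, is to add --onnx-outputs mark all --trt-outputs mark all to the polygraphy run command above.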

The bug has been confirmed by my mentor; the internal NVIDIA bug ID is 4259240.
