[Bug] Error exporting the Yoloxpose model to TensorRT format #2500

Closed
3 tasks done
wenkaiH opened this issue Oct 17, 2023 · 8 comments

Comments

@wenkaiH

wenkaiH commented Oct 17, 2023

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. I have read the FAQ documentation but cannot get the expected help.
  • 3. The bug has not been fixed in the latest version.

Describe the bug

10/17 15:23:31 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
10/17 15:23:31 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "mmpose_tasks" registry tree. As a workaround, the current "mmpose_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
10/17 15:23:33 - mmengine - INFO - Start pipeline mmdeploy.apis.pytorch2onnx.torch2onnx in subprocess
10/17 15:23:34 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
10/17 15:23:34 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "mmpose_tasks" registry tree. As a workaround, the current "mmpose_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
/home/ccnu-train/anaconda3/envs/mmdeploy/lib/python3.8/site-packages/mmpose/datasets/datasets/utils.py:102: UserWarning: The metainfo config file "configs/base/datasets/coco.py" does not exist. A matched config file "/home/ccnu-train/anaconda3/envs/mmdeploy/lib/python3.8/site-packages/mmpose/.mim/configs/base/datasets/coco.py" will be used instead.
warnings.warn(
Loads checkpoint by http backend from path: https://download.openmmlab.com/mmpose/v1/body_2d_keypoint/yolox_pose/yoloxpose_m_8xb32-300e_coco-640-84e9a538_20230829.pth
10/17 15:23:42 - mmengine - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future.
10/17 15:23:42 - mmengine - INFO - Export PyTorch model to ONNX: mmdeploy-model/yoloxpose-trt/end2end.onnx.
10/17 15:23:43 - mmengine - WARNING - Can not find torch.nn.functional.scaled_dot_product_attention, function rewrite will not be applied
10/17 15:23:43 - mmengine - WARNING - Can not find torch._C._jit_pass_onnx_autograd_function_process, function rewrite will not be applied
10/17 15:23:43 - mmengine - WARNING - Can not find torch._C._jit_pass_onnx_deduplicate_initializers, function rewrite will not be applied
10/17 15:23:43 - mmengine - WARNING - Can not find mmdet.models.utils.transformer.PatchMerging.forward, function rewrite will not be applied
/home/ccnu-train/anaconda3/envs/mmdeploy/lib/python3.8/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2157.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
/home/ccnu-train/hwk/mmdeploy/mmdeploy/core/optimizers/function_marker.py:160: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
ys_shape = tuple(int(s) for s in ys.shape)
/home/ccnu-train/hwk/mmdeploy/mmdeploy/mmcv/ops/nms.py:475: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
int(scores.shape[-1]),
/home/ccnu-train/hwk/mmdeploy/mmdeploy/mmcv/ops/nms.py:149: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
out_boxes = min(num_boxes, after_topk)
WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
/home/ccnu-train/anaconda3/envs/mmdeploy/lib/python3.8/site-packages/torch/onnx/symbolic_opset9.py:2815: UserWarning: Exporting aten::index operator of advanced indexing in opset 11 is achieved by combination of multiple ONNX operators, including Reshape, Transpose, Concat, and Gather. If indices include negative values, the exported graph will produce incorrect results.
warnings.warn("Exporting aten::index operator of advanced indexing in opset " +
WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of mmdeploy::TRTBatchedNMS type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
10/17 15:23:58 - mmengine - INFO - Execute onnx optimize passes.
10/17 15:23:58 - mmengine - INFO - Finish pipeline mmdeploy.apis.pytorch2onnx.torch2onnx
10/17 15:24:01 - mmengine - INFO - Start pipeline mmdeploy.apis.utils.utils.to_backend in subprocess
10/17 15:24:01 - mmengine - WARNING - Could not load the library of tensorrt plugins. Because the file does not exist:
[10/17/2023-15:24:02] [TRT] [I] [MemUsageChange] Init CUDA: CPU +14, GPU +0, now: CPU 101, GPU 18991 (MiB)
[10/17/2023-15:24:05] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +546, GPU +118, now: CPU 702, GPU 19109 (MiB)
[10/17/2023-15:24:05] [TRT] [I] ----------------------------------------------------------------
[10/17/2023-15:24:05] [TRT] [I] Input filename: mmdeploy-model/yoloxpose-trt/end2end.onnx
[10/17/2023-15:24:05] [TRT] [I] ONNX IR version: 0.0.7
[10/17/2023-15:24:05] [TRT] [I] Opset version: 11
[10/17/2023-15:24:05] [TRT] [I] Producer name: pytorch
[10/17/2023-15:24:05] [TRT] [I] Producer version: 1.10
[10/17/2023-15:24:05] [TRT] [I] Domain:
[10/17/2023-15:24:05] [TRT] [I] Model version: 0
[10/17/2023-15:24:05] [TRT] [I] Doc string:
[10/17/2023-15:24:05] [TRT] [I] ----------------------------------------------------------------
[10/17/2023-15:24:05] [TRT] [W] onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[10/17/2023-15:24:05] [TRT] [W] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
[10/17/2023-15:24:06] [TRT] [I] No importer registered for op: TRTBatchedNMS. Attempting to import as plugin.
[10/17/2023-15:24:06] [TRT] [I] Searching for plugin: TRTBatchedNMS, plugin_version: 1, plugin_namespace:
[10/17/2023-15:24:06] [TRT] [E] ModelImporter.cpp:726: While parsing node number 656 [TRTBatchedNMS -> "1741"]:
[10/17/2023-15:24:06] [TRT] [E] ModelImporter.cpp:727: --- Begin node ---
[10/17/2023-15:24:06] [TRT] [E] ModelImporter.cpp:728: input: "1740"
input: "1722"
output: "1741"
output: "1742"
output: "1743"
name: "TRTBatchedNMS_656"
op_type: "TRTBatchedNMS"
attribute {
name: "background_label_id"
i: -1
type: INT
}
attribute {
name: "clip_boxes"
i: 0
type: INT
}
attribute {
name: "iou_threshold"
f: 0.65
type: FLOAT
}
attribute {
name: "is_normalized"
i: 0
type: INT
}
attribute {
name: "keep_topk"
i: 100
type: INT
}
attribute {
name: "num_classes"
i: 1
type: INT
}
attribute {
name: "return_index"
i: 1
type: INT
}
attribute {
name: "score_threshold"
f: 0.5
type: FLOAT
}
attribute {
name: "topk"
i: 5000
type: INT
}
domain: "mmdeploy"

[10/17/2023-15:24:06] [TRT] [E] ModelImporter.cpp:729: --- End node ---
[10/17/2023-15:24:06] [TRT] [E] ModelImporter.cpp:732: ERROR: builtin_op_importers.cpp:5428 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
Process Process-3:
Traceback (most recent call last):
File "/home/ccnu-train/anaconda3/envs/mmdeploy/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/home/ccnu-train/anaconda3/envs/mmdeploy/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/ccnu-train/hwk/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 107, in call
ret = func(*args, **kwargs)
File "/home/ccnu-train/hwk/mmdeploy/mmdeploy/apis/utils/utils.py", line 98, in to_backend
return backend_mgr.to_backend(
File "/home/ccnu-train/hwk/mmdeploy/mmdeploy/backend/tensorrt/backend_manager.py", line 127, in to_backend
onnx2tensorrt(
File "/home/ccnu-train/hwk/mmdeploy/mmdeploy/backend/tensorrt/onnx2tensorrt.py", line 79, in onnx2tensorrt
from_onnx(
File "/home/ccnu-train/hwk/mmdeploy/mmdeploy/backend/tensorrt/utils.py", line 185, in from_onnx
raise RuntimeError(f'Failed to parse onnx, {error_msgs}')
RuntimeError: Failed to parse onnx, In node 656 (importFallbackPluginImporter): UNSUPPORTED_NODE: Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
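
The final assertion means the TensorRT ONNX parser found no registered creator for the `TRTBatchedNMS` op, which is consistent with the earlier warning that the TensorRT plugin library could not be loaded. A minimal sketch for checking whether a compiled plugin library exists under an MMDeploy checkout is shown here. The file name `libmmdeploy_tensorrt_ops.so` and the helper `find_trt_plugin_libs` are assumptions for illustration (the name is what Linux builds typically produce), so adjust them to your setup.

```python
from pathlib import Path
from typing import List

# Assumed file name of the compiled TensorRT custom-op library on Linux.
PLUGIN_NAME = "libmmdeploy_tensorrt_ops.so"


def find_trt_plugin_libs(root: str, name: str = PLUGIN_NAME) -> List[Path]:
    """Return every file called `name` found anywhere below `root`, sorted."""
    root_path = Path(root)
    if not root_path.is_dir():
        # A missing or non-directory root simply yields no hits.
        return []
    return sorted(p for p in root_path.rglob(name) if p.is_file())
```

Calling `find_trt_plugin_libs("/home/ccnu-train/hwk/mmdeploy")` on the checkout from the traceback and getting an empty list would suggest the custom ops were never built, rather than a problem with the exported ONNX file.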

Reproduction

python tools/deploy.py \
    configs/mmpose/pose-detection_yolox-pose_tensorrt_dynamic-640x640.py \
    ../mmpose/configs/body_2d_keypoint/yoloxpose/coco/yoloxpose_m_8xb32-300e_coco-640.py \
    https://download.openmmlab.com/mmpose/v1/body_2d_keypoint/yolox_pose/yoloxpose_m_8xb32-300e_coco-640-84e9a538_20230829.pth \
    demo/resources/human-pose.jpg \
    --work-dir mmdeploy-model/yoloxpose-trt \
    --device cuda \
    --show \
    --dump-info

Environment

10/17 15:50:12 - mmengine - INFO -

10/17 15:50:12 - mmengine - INFO - **********Environmental information**********
10/17 15:50:13 - mmengine - INFO - sys.platform: linux
10/17 15:50:13 - mmengine - INFO - Python: 3.8.18 (default, Sep 11 2023, 13:40:15) [GCC 11.2.0]
10/17 15:50:13 - mmengine - INFO - CUDA available: True
10/17 15:50:13 - mmengine - INFO - numpy_random_seed: 2147483648
10/17 15:50:13 - mmengine - INFO - GPU 0,1: NVIDIA RTX A6000
10/17 15:50:13 - mmengine - INFO - CUDA_HOME: /usr
10/17 15:50:13 - mmengine - INFO - NVCC: Cuda compilation tools, release 11.5, V11.5.119
10/17 15:50:13 - mmengine - INFO - GCC: gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
10/17 15:50:13 - mmengine - INFO - PyTorch: 1.10.2+cu113
10/17 15:50:13 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX512
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.2
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.2, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

10/17 15:50:13 - mmengine - INFO - TorchVision: 0.11.3+cu113
10/17 15:50:13 - mmengine - INFO - OpenCV: 4.8.1
10/17 15:50:13 - mmengine - INFO - MMEngine: 0.8.4
10/17 15:50:13 - mmengine - INFO - MMCV: 2.0.1
10/17 15:50:13 - mmengine - INFO - MMCV Compiler: GCC 9.3
10/17 15:50:13 - mmengine - INFO - MMCV CUDA Compiler: 11.3
10/17 15:50:13 - mmengine - INFO - MMDeploy: 1.3.0+c4dc10d
10/17 15:50:13 - mmengine - INFO -

10/17 15:50:13 - mmengine - INFO - **********Backend information**********
10/17 15:50:13 - mmengine - INFO - tensorrt:    8.5.3.1
10/17 15:50:13 - mmengine - INFO - tensorrt custom ops: NotAvailable
10/17 15:50:13 - mmengine - INFO - ONNXRuntime: None
10/17 15:50:13 - mmengine - INFO - ONNXRuntime-gpu:     1.15.1
10/17 15:50:13 - mmengine - INFO - ONNXRuntime custom ops:      NotAvailable
10/17 15:50:13 - mmengine - INFO - pplnn:       None
10/17 15:50:13 - mmengine - INFO - ncnn:        None
10/17 15:50:13 - mmengine - INFO - snpe:        None
10/17 15:50:13 - mmengine - INFO - openvino:    None
10/17 15:50:13 - mmengine - INFO - torchscript: 1.10.2+cu113
10/17 15:50:13 - mmengine - INFO - torchscript custom ops:      NotAvailable
10/17 15:50:13 - mmengine - INFO - rknn-toolkit:        None
10/17 15:50:13 - mmengine - INFO - rknn-toolkit2:       None
10/17 15:50:13 - mmengine - INFO - ascend:      None
10/17 15:50:13 - mmengine - INFO - coreml:      None
10/17 15:50:13 - mmengine - INFO - tvm: None
10/17 15:50:13 - mmengine - INFO - vacc:        None
10/17 15:50:13 - mmengine - INFO -

10/17 15:50:13 - mmengine - INFO - **********Codebase information**********
10/17 15:50:13 - mmengine - INFO - mmdet:       3.0.0
10/17 15:50:13 - mmengine - INFO - mmseg:       None
10/17 15:50:13 - mmengine - INFO - mmpretrain:  None
10/17 15:50:13 - mmengine - INFO - mmocr:       None
10/17 15:50:13 - mmengine - INFO - mmagic:      None
10/17 15:50:13 - mmengine - INFO - mmdet3d:     None
10/17 15:50:13 - mmengine - INFO - mmpose:      1.2.0
10/17 15:50:13 - mmengine - INFO - mmrotate:    None
10/17 15:50:13 - mmengine - INFO - mmaction:    None
10/17 15:50:13 - mmengine - INFO - mmrazor:     None
10/17 15:50:13 - mmengine - INFO - mmyolo:      0.5.0
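
The `NotAvailable` entries for custom ops in the backend section are worth noting, since the failing `TRTBatchedNMS` node is a custom op from the `mmdeploy` domain. A small sketch for pulling those entries out of a report like this one is shown here. The helper `custom_ops_status` and its regular expression are illustrative assumptions, not part of MMDeploy.

```python
import re
from typing import Dict

# Matches lines such as "... - mmengine - INFO - tensorrt custom ops: NotAvailable".
_CUSTOM_OPS = re.compile(r"INFO - (?P<backend>.+?) custom ops:\s*(?P<status>\S+)\s*$")


def custom_ops_status(report: str) -> Dict[str, str]:
    """Map each backend named in a 'custom ops' line to its reported status."""
    status: Dict[str, str] = {}
    for line in report.splitlines():
        match = _CUSTOM_OPS.search(line)
        if match:
            status[match.group("backend").strip()] = match.group("status")
    return status


# Lines copied from the environment report in this issue.
sample = """\
10/17 15:50:13 - mmengine - INFO - tensorrt:    8.5.3.1
10/17 15:50:13 - mmengine - INFO - tensorrt custom ops: NotAvailable
10/17 15:50:13 - mmengine - INFO - ONNXRuntime custom ops:      NotAvailable
10/17 15:50:13 - mmengine - INFO - torchscript custom ops:      NotAvailable
"""

missing = [name for name, state in custom_ops_status(sample).items() if state == "NotAvailable"]
print(missing)  # prints ['tensorrt', 'ONNXRuntime', 'torchscript']
```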

Error traceback

No response

@wenkaiH
Author

wenkaiH commented Oct 17, 2023

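One more small issue: before this, I had already exported rtmpose to TensorRT format and tested inference with it successfully (so this is probably not an environment setup problem?). Exporting Yoloxpose to ONNX format succeeds, but I can only test inference through the API; SDK inference does not work.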

[2023-10-17 15:59:07.788] [mmdeploy] [info] [model.cpp:35] [DirectoryModel] Load model: "mmdeploy-model/yoloxpose-ort/"
[2023-10-17 15:59:08.195] [mmdeploy] [error] [compose.cpp:37] Unable to find Transform creator: BottomupResize. Available transforms: [("CenterCrop", 0), ("Collect", 0), ("Compose", 0), ("DefaultFormatBundle", 0), ("FormatShape", 0), ("ImageToTensor", 0), ("LetterResize", 0), ("Lift", 0), ("LoadImageFromFile", 0), ("Normalize", 0), ("Pad", 0), ("RescaleToHeight", 0), ("Resize", 0), ("ResizeOCR", 0), ("ShortScaleAspectJitter", 0), ("TenCrop", 0), ("ThreeCrop", 0), ("TopDownAffine", 0), ("TopDownGetBboxCenterScale", 0)]
[2023-10-17 15:59:08.196] [mmdeploy] [error] [task.cpp:99] error parsing config: {
"context": {
"device": "",
"model": "",
"stream": ""
},
"input": [
"img"
],
"module": "Transform",
"name": "Preprocess",
"output": [
"prep_output"
],
"transforms": [
{
"type": "LoadImageFromFile"
},
{
"input_size": [
640,
640
],
"pad_val": [
114,
114,
114
],
"type": "BottomupResize"
},
{
"mean": [
0,
0,
0
],
"std": [
1,
1,
1
],
"to_rgb": false,
"type": "Normalize"
},
{
"keys": [
"img"
],
"type": "ImageToTensor"
},
{
"keys": [
"img"
],
"meta_keys": [
"img_shape",
"pad_shape",
"ori_shape",
"img_norm_cfg",
"scale_factor",
"bbox_score",
"center",
"scale"
],
"type": "Collect"
}
],
"type": "Task"
}
[2023-10-17 15:59:08.732] [mmdeploy] [error] [common.h:50] Could not found entry 'UNKNOWN' in mmpose. Available components: [("DeepposeRegressionHeadDecode", 0), ("SimCCLabelDecode", 0), ("TopdownHeatmapBaseHeadDecode", 0), ("TopdownHeatmapMSMUHeadDecode", 0), ("TopdownHeatmapMultiStageHeadDecode", 0), ("TopdownHeatmapSimpleHeadDecode", 0), ("ViPNASHeatmapSimpleHeadDecode", 0)]
[2023-10-17 15:59:08.732] [mmdeploy] [error] [task.cpp:99] error parsing config: {
"component": "UNKNOWN",
"context": {
"device": "",
"model": "",
"stream": ""
},
"input": [
"prep_output",
"infer_output"
],
"module": "mmpose",
"name": "postprocess",
"output": [
"post_output"
],
"params": {
"flip_test": false,
"input_size": [
640,
640
],
"nms_thr": 0.65,
"score_thr": 0.5,
"type": "YOLOXPoseAnnotationProcessor"
},
"type": "Task"
}
Segmentation fault (core dumped)
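
The first error says the SDK's transform registry has no `BottomupResize` entry, and the second that no postprocessing component matches. A minimal sketch for comparing a pipeline's transform types against the list the SDK reports as available is shown here. The helper name `unsupported_transforms` is hypothetical, and both lists are copied from the log in this comment.

```python
from typing import Iterable, List, Mapping

# Transform names the SDK listed as available in the error message.
AVAILABLE = [
    "CenterCrop", "Collect", "Compose", "DefaultFormatBundle", "FormatShape",
    "ImageToTensor", "LetterResize", "Lift", "LoadImageFromFile", "Normalize",
    "Pad", "RescaleToHeight", "Resize", "ResizeOCR", "ShortScaleAspectJitter",
    "TenCrop", "ThreeCrop", "TopDownAffine", "TopDownGetBboxCenterScale",
]

# The "type" of each entry under "transforms" in the dumped Preprocess task.
PIPELINE = [
    {"type": "LoadImageFromFile"},
    {"type": "BottomupResize"},
    {"type": "Normalize"},
    {"type": "ImageToTensor"},
    {"type": "Collect"},
]


def unsupported_transforms(pipeline: Iterable[Mapping], available: Iterable[str]) -> List[str]:
    """Return the transform types in `pipeline` that are absent from `available`."""
    known = set(available)
    missing = []
    for step in pipeline:
        name = step.get("type")
        # Entries without a "type" key are skipped rather than reported.
        if name is not None and name not in known:
            missing.append(name)
    return missing


print(unsupported_transforms(PIPELINE, AVAILABLE))  # prints ['BottomupResize']
```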

@RunningLeon
Collaborator

@wenkaiH hi, yolox-pose is not supported in sdk. You can try this PR: #2240

@wenkaiH
Author

wenkaiH commented Oct 18, 2023

> @wenkaiH hi, yolox-pose is not supported in sdk. You can try this PR: #2240

I have already changed the twenty-odd files mentioned in that PR and ran pip install -e mmdeploy, but it still fails.

@RunningLeon
Collaborator

you need to rebuild mmdeploy.

@wenkaiH
Author

wenkaiH commented Oct 18, 2023

> you need to rebuild mmdeploy.

Isn't rebuilding done by running pip install -e {dir}/mmdeploy?

@RunningLeon
Collaborator

@github-actions

This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 5 days if the stale label is not removed or if there is no further response.

@github-actions github-actions bot added the Stale label Oct 26, 2023
@github-actions

This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or you have any new updates now.

@github-actions github-actions bot closed this as not planned Oct 31, 2023