
[CodeCamp2023-553] Translate the ort/trt backend build docs into Chinese #2310

Closed · wants to merge 1 commit
75 changes: 38 additions & 37 deletions docs/zh_cn/05-supported-backends/onnxruntime.md
@@ -1,42 +1,42 @@
# onnxruntime 支持情况

## Introduction of ONNX Runtime
## ONNX Runtime 介绍

**ONNX Runtime** is a cross-platform inference and training accelerator compatible with many popular ML/DNN frameworks. Check its [github](https://github.com/microsoft/onnxruntime) for more information.
**ONNX Runtime** 是一个跨平台的推理和训练加速器,与许多流行的ML/DNN框架兼容。查看其[github](https://github.com/microsoft/onnxruntime)以获取更多信息。

## Installation
## 安装

*Please note that only **onnxruntime>=1.8.1** on the Linux platform is supported for now.*
*请注意,目前 Linux 平台只支持 **onnxruntime>=1.8.1**。*

### Install ONNX Runtime python package
### 安装ONNX Runtime python包

- CPU Version
- CPU 版本

```bash
pip install onnxruntime==1.8.1 # if you want to use cpu version
pip install onnxruntime==1.8.1 # 如果你想用cpu版本
```

- GPU Version
- GPU 版本

```bash
pip install onnxruntime-gpu==1.8.1 # if you want to use gpu version
pip install onnxruntime-gpu==1.8.1 # 如果你想用gpu版本
```

### Install float16 conversion tool (optional)
### 安装float16转换工具(可选)

If you want to use float16 precision, install the tool by running the following script:
如果你想用float16精度,请执行以下脚本安装工具:

```bash
pip install onnx onnxconverter-common
```
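
As a reference, a minimal conversion sketch with `onnxconverter-common` might look like the following; the model filenames are placeholders, not part of the official docs:

```python
# float16_convert.py -- illustrative sketch; 'end2end.onnx' is a placeholder filename
import onnx
from onnxconverter_common import float16

model_fp32 = onnx.load('end2end.onnx')
# keep_io_types=True keeps graph inputs/outputs in float32 while internal
# tensors and initializers are converted to float16.
model_fp16 = float16.convert_float_to_float16(model_fp32, keep_io_types=True)
onnx.save(model_fp16, 'end2end_fp16.onnx')
```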

## Build custom ops
## 构建自定义算子

### Download ONNXRuntime Library
### 下载ONNXRuntime库

Download `onnxruntime-linux-*.tgz` library from ONNX Runtime [releases](https://github.com/microsoft/onnxruntime/releases/tag/v1.8.1), extract it, expose `ONNXRUNTIME_DIR` and finally add the lib path to `LD_LIBRARY_PATH` as below:
从ONNX Runtime[发布版本](https://github.com/microsoft/onnxruntime/releases/tag/v1.8.1)下载`onnxruntime-linux-*.tgz`库,并解压,将onnxruntime所在路径添加到`ONNXRUNTIME_DIR`环境变量,最后将lib路径添加到`LD_LIBRARY_PATH`环境变量中,操作如下:

- CPU Version
- CPU 版本

```bash
wget https://github.com/microsoft/onnxruntime/releases/download/v1.8.1/onnxruntime-linux-x64-1.8.1.tgz
@@ -47,7 +47,7 @@ export ONNXRUNTIME_DIR=$(pwd)
export LD_LIBRARY_PATH=$ONNXRUNTIME_DIR/lib:$LD_LIBRARY_PATH
```

- GPU Version
- GPU 版本

```bash
wget https://github.com/microsoft/onnxruntime/releases/download/v1.8.1/onnxruntime-linux-x64-gpu-1.8.1.tgz
@@ -58,49 +58,50 @@ export ONNXRUNTIME_DIR=$(pwd)
export LD_LIBRARY_PATH=$ONNXRUNTIME_DIR/lib:$LD_LIBRARY_PATH
```

### Build on Linux
### 在Linux上构建

- CPU Version
- CPU 版本

```bash
cd ${MMDEPLOY_DIR} # To MMDeploy root directory
cd ${MMDEPLOY_DIR} # 进入MMDeploy根目录
mkdir -p build && cd build
cmake -DMMDEPLOY_TARGET_DEVICES='cpu' -DMMDEPLOY_TARGET_BACKENDS=ort -DONNXRUNTIME_DIR=${ONNXRUNTIME_DIR} ..
make -j$(nproc) && make install
```

- GPU Version
- GPU 版本

```bash
cd ${MMDEPLOY_DIR} # To MMDeploy root directory
cd ${MMDEPLOY_DIR} # 进入MMDeploy根目录
mkdir -p build && cd build
cmake -DMMDEPLOY_TARGET_DEVICES='cuda' -DMMDEPLOY_TARGET_BACKENDS=ort -DONNXRUNTIME_DIR=${ONNXRUNTIME_DIR} ..
make -j$(nproc) && make install
```
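
After building, a quick way to check that ONNX Runtime can load the custom-op library is to register it in a session. The sketch below is illustrative: the library path and the `end2end.onnx` model name are assumptions, so adjust them to your actual build output.

```python
import onnxruntime as ort

# Assumed library location -- check ${MMDEPLOY_DIR}/build/lib for the real filename.
ops_library = 'build/lib/libmmdeploy_onnxruntime_ops.so'

session_options = ort.SessionOptions()
# Register the MMDeploy custom ops so the session can resolve them.
session_options.register_custom_ops_library(ops_library)

# 'end2end.onnx' stands in for a model exported by MMDeploy that uses custom ops.
session = ort.InferenceSession('end2end.onnx', session_options,
                               providers=['CPUExecutionProvider'])
print([inp.name for inp in session.get_inputs()])
```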

## How to convert a model
## 如何转换模型

- You could follow the instructions of tutorial [How to convert model](../02-how-to-run/convert_model.md)
- 你可以按照教程[如何转换模型](../02-how-to-run/convert_model.md)中的说明进行模型转换

## How to add a new custom op
## 如何添加新的自定义算子

## Reminder
## 提示

- The custom operator is not included in [supported operator list](https://github.com/microsoft/onnxruntime/blob/master/docs/OperatorKernels.md) in ONNX Runtime.
- The custom operator should be able to be exported to ONNX.
- 自定义算子不包含在ONNX Runtime[支持的算子列表](https://github.com/microsoft/onnxruntime/blob/master/docs/OperatorKernels.md)中。
- 自定义算子应该能够导出到ONNX。

#### Main procedures
#### 主要过程

Take custom operator `roi_align` for example.
以自定义算子`roi_align`为例。

1. Create a `roi_align` directory in ONNX Runtime source directory `${MMDEPLOY_DIR}/csrc/backend_ops/onnxruntime/`
2. Add header and source file into `roi_align` directory `${MMDEPLOY_DIR}/csrc/backend_ops/onnxruntime/roi_align/`
3. Add unit test into `tests/test_ops/test_ops.py`
Check [here](../../../tests/test_ops/test_ops.py) for examples.
1. 在ONNX Runtime源目录`${MMDEPLOY_DIR}/csrc/backend_ops/onnxruntime/`中创建一个`roi_align`目录
2. 添加头文件和源文件到`roi_align`目录`${MMDEPLOY_DIR}/csrc/backend_ops/onnxruntime/roi_align/`
3. 将单元测试添加到`tests/test_ops/test_ops.py`中。
查看[这里](../../../tests/test_ops/test_ops.py)的例子。

**Finally, welcome to send us PR of adding custom operators for ONNX Runtime in MMDeploy.** :nerd_face:
**最后,欢迎提交为 MMDeploy 添加 ONNX Runtime 自定义算子的 PR。** :nerd_face:

## References
## 参考

- [如何将具有自定义op的Pytorch模型导出为ONNX并在ONNX Runtime运行](https://github.com/onnx/tutorials/blob/master/PyTorchCustomOperator/README.md)
- [如何在ONNX Runtime添加自定义算子/内核](https://onnxruntime.ai/docs/reference/operators/add-custom-op.html)

- [How to export Pytorch model with custom op to ONNX and run it in ONNX Runtime](https://github.com/onnx/tutorials/blob/master/PyTorchCustomOperator/README.md)
- [How to add a custom operator/kernel in ONNX Runtime](https://onnxruntime.ai/docs/reference/operators/add-custom-op.html)
61 changes: 31 additions & 30 deletions docs/zh_cn/05-supported-backends/tensorrt.md
@@ -1,57 +1,57 @@
# TensorRT 支持情况

## Installation
## 安装

### Install TensorRT
### 安装TensorRT

Please install TensorRT 8 following the [install-guide](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html#installing).
请按照[安装指南](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html#installing)安装 TensorRT 8。

**Note**:
**注意**:

- `pip Wheel File Installation` is not supported yet in this repo.
- 本仓库暂不支持`pip Wheel File Installation`方式安装。

- We strongly suggest you install TensorRT through [tar file](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html#installing-tar)
- 我们强烈建议通过[tar包](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html#installing-tar)的方式安装TensorRT。

- After installation, you'd better add TensorRT environment variables to bashrc by:
- 安装完成后,最好通过以下方式将TensorRT环境变量添加到bashrc:

```bash
cd ${TENSORRT_DIR} # To TensorRT root directory
cd ${TENSORRT_DIR} # 进入TensorRT根目录
echo '# set env for TensorRT' >> ~/.bashrc
echo "export TENSORRT_DIR=${TENSORRT_DIR}" >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=$TENSORRT_DIR/lib:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
```

### Build custom ops
### 构建自定义算子

Some custom ops are created to support models in OpenMMLab, and the custom ops can be built as follows:
为了支持 OpenMMLab 中的模型,我们创建了一些自定义算子,这些算子可以按如下方式构建:

```bash
cd ${MMDEPLOY_DIR} # To MMDeploy root directory
cd ${MMDEPLOY_DIR} # 进入MMDeploy根目录
mkdir -p build && cd build
cmake -DMMDEPLOY_TARGET_BACKENDS=trt ..
make -j$(nproc)
```

If you haven't installed TensorRT in the default path, please add the `-DTENSORRT_DIR` flag in CMake.
如果你没有在默认路径下安装TensorRT,请在CMake中添加`-DTENSORRT_DIR`标志。

```bash
cmake -DMMDEPLOY_TARGET_BACKENDS=trt -DTENSORRT_DIR=${TENSORRT_DIR} ..
make -j$(nproc) && make install
```
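
To verify that the plugins were built and can be discovered, one option is to load the library and inspect TensorRT's plugin registry. This is only a sketch: the library name below is an assumption, so check your build directory for the actual file.

```python
import ctypes

import tensorrt as trt

# Assumed library location -- check ${MMDEPLOY_DIR}/build/lib for the real filename.
ctypes.CDLL('build/lib/libmmdeploy_tensorrt_ops.so')

logger = trt.Logger(trt.Logger.WARNING)
trt.init_libnvinfer_plugins(logger, namespace='')

# The MMDeploy plugins should now appear alongside TensorRT's built-in ones.
registry = trt.get_plugin_registry()
print([creator.name for creator in registry.plugin_creator_list])
```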

## Convert model
## 转换模型

Please follow the tutorial in [How to convert model](../02-how-to-run/convert_model.md). **Note** that the device must be `cuda` device.
请遵循[如何转换模型](../02-how-to-run/convert_model.md)中的教程。**注意**:设备必须是 `cuda` 设备。

### Int8 Support
### Int8 支持

Since TensorRT supports INT8 mode, a custom dataset config can be given to calibrate the model. Following is an example for MMDetection:
由于TensorRT支持INT8模式,因此可以提供自定义数据集配置来校准模型。MMDetection的示例如下:

```python
# calibration_dataset.py

# dataset settings, same format as the codebase in OpenMMLab
# 数据集设置,格式与OpenMMLab中的代码库相同
dataset_type = 'CalibrationDataset'
data_root = 'calibration/dataset/root'
img_norm_cfg = dict(
@@ -85,32 +85,32 @@ data = dict(
evaluation = dict(interval=1, metric='bbox')
```

Convert your model with this calibration dataset:
使用此校准数据集转换您的模型:

```python
python tools/deploy.py \
...
--calib-dataset-cfg calibration_dataset.py
```

If the calibration dataset is not given, the data will be calibrated with the dataset in model config.
如果没有提供校准数据集,则使用模型配置中的数据集进行校准。

## FAQs

- Error `Cannot found TensorRT headers` or `Cannot found TensorRT libs`
- 错误 `Cannot found TensorRT headers` 或 `Cannot found TensorRT libs`

Try cmake with flag `-DTENSORRT_DIR`:
可以尝试在cmake时使用`-DTENSORRT_DIR`标志:

```bash
cmake -DBUILD_TENSORRT_OPS=ON -DTENSORRT_DIR=${TENSORRT_DIR} ..
make -j$(nproc)
```

Please make sure there are libs and headers in `${TENSORRT_DIR}`.
请确保 `${TENSORRT_DIR}`中有库和头文件。

- Error `error: parameter check failed at: engine.cpp::setBindingDimensions::1046, condition: profileMinDims.d[i] <= dimensions.d[i]`
- 错误 `error: parameter check failed at: engine.cpp::setBindingDimensions::1046, condition: profileMinDims.d[i] <= dimensions.d[i]`

There is an input shape limit in deployment config:
在部署配置中有一个输入形状的限制:

```python
backend_config = dict(
@@ -126,14 +126,15 @@ If the calibration dataset is not given, the data will be calibrated with the da
# other configs
```

The shape of the tensor `input` must be limited between `input_shapes["input"]["min_shape"]` and `input_shapes["input"]["max_shape"]`.
`input` 张量的形状必须限制在 `input_shapes["input"]["min_shape"]` 和 `input_shapes["input"]["max_shape"]` 之间。
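
For illustration, a dynamic-shape TensorRT backend config typically bounds the `input` tensor as below; the shape values are just an example, not a recommendation:

```python
backend_config = dict(
    type='tensorrt',
    common_config=dict(fp16_mode=False, max_workspace_size=1 << 30),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 320, 320],     # smallest shape fed at runtime
                    opt_shape=[1, 3, 800, 1344],    # shape TensorRT optimizes for
                    max_shape=[1, 3, 1344, 1344]))) # largest shape fed at runtime
    ])
```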

- Error `error: [TensorRT] INTERNAL ERROR: Assertion failed: cublasStatus == CUBLAS_STATUS_SUCCESS`
- 错误 `error: [TensorRT] INTERNAL ERROR: Assertion failed: cublasStatus == CUBLAS_STATUS_SUCCESS`

TRT 7.2.1 switches to use cuBLASLt (previously it was cuBLAS). cuBLASLt is the default choice for SM version >= 7.0. However, you may need CUDA-10.2 Patch 1 (Released Aug 26, 2020) to resolve some cuBLASLt issues. Another option is to use the new TacticSource API and disable cuBLASLt tactics if you don't want to upgrade.
TRT 7.2.1切换到使用cuBLASLt(以前是cuBLAS)。cuBLASLt是SM版本>= 7.0的默认选择。但是,您可能需要CUDA-10.2补丁1(2020年8月26日发布)来解决一些cuBLASLt问题。如果不想升级,另一个选择是使用新的TacticSource API并禁用cuBLASLt策略。

Read [this](https://forums.developer.nvidia.com/t/matrixmultiply-failed-on-tensorrt-7-2-1/158187/4) for detail.
请阅读[本文](https://forums.developer.nvidia.com/t/matrixmultiply-failed-on-tensorrt-7-2-1/158187/4)了解详情。
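
If upgrading CUDA is not an option, a minimal sketch of the TacticSource workaround (assuming the TensorRT >= 7.2 Python API) looks like this:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()

# Enable only cuBLAS as a tactic source, which leaves cuBLASLt disabled.
tactic_sources = 1 << int(trt.TacticSource.CUBLAS)
config.set_tactic_sources(tactic_sources)
```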

- Install mmdeploy on Jetson
- 在Jetson上安装mmdeploy

我们在[这里](../01-how-to-build/jetsons.md)提供了一个Jetsons入门教程。

We provide a tutorial to get start on Jetsons [here](../01-how-to-build/jetsons.md).