can't get scale and zeropoint for onnx::Sigmoid_3384_Gemm #114019

Closed
nassimus26 opened this issue Oct 29, 2024 · 5 comments
Labels: invalid (Resolved as invalid, i.e. not a bug), mlir

Comments

@nassimus26

nassimus26 commented Oct 29, 2024

Hi, I am trying to convert the model attached to this bug report using the TPU-MLIR converter:

agrd_model.zip

Below is the TPU-MLIR conversion script:

#!/bin/bash

set -e

net_name=agrd
input_w=420
input_h=300

mkdir -p workspace
cd workspace
echo "1.### convert to mlir"
# convert to mlir
model_transform.py \
--model_name ${net_name} \
--model_def /home/mac/data/${net_name}_pt.onnx \
--mean "0.485, 0.456, 0.406" \
--scale "0.229, 0.224, 0.225" \
--keep_aspect_ratio \
--pixel_format rgb \
--channel_format nchw \
--tolerance 0.99,0.99 \
--mlir ${net_name}.mlir \
--input_shapes [[1,3,${input_h},${input_w}],[1,3,${input_h},${input_w}],[1,3,${input_h},${input_w}],[1,3,${input_h},${input_w}],[1,3,${input_h},${input_w}],[1,3,${input_h},${input_w}]] \
#--test_input ../data.npy \
#--test_result ${net_name}_top_outputs.npz \

echo "2.### export bf16 model"
# export bf16 model
#   not use --quant_input, use float32 for easy coding
model_deploy.py \
--mlir ${net_name}.mlir \
--quantize INT8 \
--processor cv181x \
--model ${net_name}_bf16.cvimodel

echo "3.### calibrate for int8 model"
# generate the calibration table used by the int8 export below
run_calibration.py ${net_name}.mlir \
--dataset ../images \
--input_num 200 \
-o ${net_name}_cali_table

echo "convert to int8 model"
# export int8 model
#    add --quant_input, use int8 for faster processing in maix.nn.NN.forward_image
model_deploy.py \
--mlir ${net_name}.mlir \
--quantize INT8 \
--quant_input \
--calibration_table ${net_name}_cali_table \
--processor cv181x \
--test_input ${net_name}_in_f32.npz \
--test_reference ${net_name}_top_outputs.npz \
--tolerance 0.9,0.6 \
--model ${net_name}_int8.cvimodel

I am getting this error trace:

%3475 = "top.MatMul"(%3472, %3473, %3474) {do_relu = false, hdim_is_batch = false, keep_dims = true, left_transpose = false, output_transpose = false, relu_limit = -1.000000e+00 : f64, right_transpose = false} : (tensor<1x120xf32>, tensor<120x24xf32>, tensor<24xf32>) -> tensor<1x24xf32> loc("onnx::Sigmoid_3384_Gemm")
can't get scale and zeropoint
UNREACHABLE executed at /__w/tpu-mlir/tpu-mlir/lib/Support/Module.cpp:1599!
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt fights.mlir --init "--processor-assign=chip=cv181x mode=INT8 num_device=1 num_core=1 addr_mode=auto" --processor-top-optimize "--convert-top-to-tpu= asymmetric=False doWinograd=False ignore_f16_overflow=False q_group_size=0 matmul_perchannel=False" --canonicalize --weight-fold --deinit --mlir-print-debuginfo -o fights_cv181x_int8_sym_tpu.mlir
 #0 0x00005ba3571d2e47 (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x848e47)
 #1 0x00005ba3571d0b6e (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x846b6e)
 #2 0x00005ba3571d37ca (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x8497ca)
 #3 0x000079605da3d520 (/usr/local/lib/python3.10/dist-packages/tpu_mlir/lib/third_party/libc.so.6+0x42520)
 #4 0x000079605da919fc pthread_kill (/usr/local/lib/python3.10/dist-packages/tpu_mlir/lib/third_party/libc.so.6+0x969fc)
 #5 0x000079605da3d476 gsignal (/usr/local/lib/python3.10/dist-packages/tpu_mlir/lib/third_party/libc.so.6+0x42476)
 #6 0x000079605da237f3 abort (/usr/local/lib/python3.10/dist-packages/tpu_mlir/lib/third_party/libc.so.6+0x287f3)
 #7 0x00005ba3571d0991 (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x846991)
 #8 0x00005ba358842981 (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x1eb8981)
 #9 0x00005ba358823692 (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x1e99692)
#10 0x00005ba35761d32d (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0xc9332d)
#11 0x00005ba3574986db (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0xb0e6db)
#12 0x00005ba357497f54 (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0xb0df54)
#13 0x00005ba35870fb07 (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x1d85b07)
#14 0x00005ba35870c41f (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x1d8241f)
#15 0x00005ba3586d54fc (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x1d4b4fc)
#16 0x00005ba3586d234c (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x1d4834c)
#17 0x00005ba35732e9b4 (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x9a49b4)
#18 0x00005ba358738df4 (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x1daedf4)
#19 0x00005ba358739421 (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x1daf421)
#20 0x00005ba35873b8c8 (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x1db18c8)
#21 0x00005ba3571c44fb (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x83a4fb)
#22 0x00005ba3571c38c4 (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x8398c4)
#23 0x00005ba35894ed88 (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x1fc4d88)
#24 0x00005ba3571bdbca (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x833bca)
#25 0x00005ba3571be094 (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x834094)
#26 0x00005ba3571bcada (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x832ada)
#27 0x000079605da24d90 (/usr/local/lib/python3.10/dist-packages/tpu_mlir/lib/third_party/libc.so.6+0x29d90)
#28 0x000079605da24e40 __libc_start_main (/usr/local/lib/python3.10/dist-packages/tpu_mlir/lib/third_party/libc.so.6+0x29e40)
#29 0x00005ba3571bbee5 (/usr/local/lib/python3.10/dist-packages/tpu_mlir/bin/tpuc-opt+0x831ee5)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/tpu_mlir/python/tools/model_deploy.py", line 467, in <module>
    lowering_patterns = tool.lowering()
  File "/usr/local/lib/python3.10/dist-packages/tpu_mlir/python/tools/model_deploy.py", line 167, in lowering
    patterns = mlir_lowering(self.mlir_file,
  File "/usr/local/lib/python3.10/dist-packages/tpu_mlir/python/utils/mlir_shell.py", line 156, in mlir_lowering
    _os_system(cmd, mute=mute,log_level=log_level)
  File "/usr/local/lib/python3.10/dist-packages/tpu_mlir/python/utils/mlir_shell.py", line 62, in _os_system
    raise RuntimeError("[!Error]: {}".format(cmd_str))
RuntimeError: [!Error]: tpuc-opt fights.mlir --processor-assign="chip=cv181x mode=INT8 num_device=1 num_core=1 addr_mode=auto" --processor-top-optimize --convert-top-to-tpu=" asymmetric=False doWinograd=False ignore_f16_overflow=False q_group_size=0 matmul_perchannel=False" --canonicalize --weight-fold -o fights_cv181x_int8_sym_tpu.mlir
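
Note that step 2 of the script above runs model_deploy.py with --quantize INT8 but without --calibration_table, and the calibration table is only generated afterwards in step 3. One possible explanation for the crash, not confirmed anywhere in this thread, is that the INT8 lowering has no calibrated range for the MatMul output. As a minimal sketch of a pre-flight check, the hypothetical helper below tests whether a calibration table contains an entry for a given op name. It assumes the table is a plain-text file with one entry per line and the op name in the first whitespace-separated column; that layout is an assumption and should be verified against the generated file.

```shell
#!/bin/bash
# Hypothetical helper (not part of TPU-MLIR): succeed if the calibration table
# contains an entry whose first column is exactly the given op name.
# Assumed layout: one entry per line, op name first, '#' lines are comments.
has_cali_entry() {
    local table="$1" op="$2"
    awk -v op="$op" '$1 == op { found = 1 } END { exit !found }' "$table"
}

# Example use before the INT8 deploy step (file and op names from the report):
#   has_cali_entry agrd_cali_table 'onnx::Sigmoid_3384_Gemm' \
#       || echo "no calibration entry for onnx::Sigmoid_3384_Gemm" >&2
```

If the check fails, or if no table is passed at all, the INT8 deploy step would be the first place to look; whether that is the actual cause here remains an open question.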

@llvmbot
Collaborator

llvmbot commented Oct 29, 2024

@llvm/issue-subscribers-mlir


@stellaraccident
Contributor

Seems like a bug in a vendor compiler. Report it to them. Maybe this one? https://github.com/sophgo/tpu-mlir

EugeneZelenko added the invalid label (Resolved as invalid, i.e. not a bug) on Oct 29, 2024
EugeneZelenko closed this as not planned (won't fix, can't repro, duplicate, stale) on Oct 29, 2024
@nassimus26
Author

Hi, I opened the issue in both repositories because I could not find the message "can't get scale and zeropoint for" in the code of either one.
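
The trace itself gives two leads for locating the message: it prints "can't get scale and zeropoint" on its own line, without the trailing "for" and the op name, and it names the source file as /__w/tpu-mlir/tpu-mlir/lib/Support/Module.cpp:1599. A minimal sketch of a fixed-string search over a local checkout follows; the helper name and the ./tpu-mlir path in the usage comment are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: list file:line matches of a literal string in a source tree.
# -r recurses, -n prints line numbers, -F treats the pattern as a fixed string.
find_message() {
    local root="$1" msg="$2"
    grep -rnF -- "$msg" "$root"
}

# Example use (the checkout path is an assumption):
#   find_message ./tpu-mlir/lib/Support "can't get scale and zeropoint"
```

Searching for the shorter string avoids a miss when the logged text and the op name are emitted by separate statements, which the two-line layout of the trace suggests.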

@stellaraccident
Contributor

Can't help you. This is the wrong project.

EugeneZelenko closed this as not planned (won't fix, can't repro, duplicate, stale) on Oct 29, 2024
@nassimus26
Author

Hi, after some deeper investigation, it seems related to this part, https://github.com/llvm/llvm-project/tree/main/mlir, which is part of this repo.
