Incorrect comparison result after converting the model to MNN. #350

Open
cwp-wind opened this issue Aug 20, 2024 · 6 comments

@cwp-wind

cwp-wind commented Aug 20, 2024

MNNConvert version: 2.9.3
Using ResNet34_LM, I exported the model to ONNX with:
python wespeaker/bin/export_onnx.py --config model/config.yaml --checkpoint model/model_5.pt --output_model model.onnx
I then converted the ONNX model to MNN with:
python wespeaker/bin/export_mnn.py --onnx_model model.onnx --output_model model.mnn
Output:
The device supports: i8sdot:0, fp16:0, i8mm: 0, sve2: 0
The device supports: i8sdot:0, fp16:0, i8mm: 0, sve2: 0
Start to Convert Other Model Format To MNN Model..., target version: 2.9
[10:50:32] /root/code/MNN-master/tools/converter/source/onnx/onnxConverter.cpp:46: ONNX Model ir version: 7
[10:50:32] /root/code/MNN-master/tools/converter/source/onnx/onnxConverter.cpp:47: ONNX Model opset version: 14
Start to Optimize the MNN Net...
inputTensors : [ feats, ]
outputTensors: [ embs, ]
Converted Success!
Exported MNN model to model.mnn
The device supports: i8sdot:0, fp16:0, i8mm: 0, sve2: 0
Model default dimensionFormat is NCHW
Model Inputs:
[ feats ]: dimensionFormat: NCHW, size: [ 1,-1,80 ], type is float
Model Outputs:
[ embs ]
Model Version: 2.9.3
MNN use high precision

I tested the generated model with the following command:

./asv_main \
--enroll_wav warmup.wav \
--test_wav warmup.wav \
--threshold 0.5 \
--speaker_model_path model.mnn \
--embedding_size 256
I0820 10:51:01.697679 3446718 asv_main.cc:38] model.mnn
I0820 10:51:01.697815 3446718 asv_main.cc:39] Init model ...
I0820 10:51:01.697841 3446718 speaker_engine.cc:35] Reading model model.mnn
I0820 10:51:01.697861 3446718 speaker_engine.cc:37] Embedding size: 256
I0820 10:51:01.697881 3446718 speaker_engine.cc:39] per_chunk_samples: 32000
I0820 10:51:01.697898 3446718 speaker_engine.cc:41] Sample rate: 16000
The device support i8sdot:0, support fp16:0, support i8mm: 0
I0820 10:51:01.765399 3446718 asv_main.cc:44] embedding size: 256
I0820 10:51:01.773716 3446718 asv_main.cc:53] 81750
I0820 10:51:02.185937 3446718 mnn_speaker_model.cc:65] dynamic shape.
I0820 10:51:02.231452 3446718 asv_main.cc:62] 81750
I0820 10:51:02.668128 3446718 asv_main.cc:65] compute score ...
I0820 10:51:02.668175 3446718 asv_main.cc:67] Cosine socre: 0
I0820 10:51:02.668211 3446718 asv_main.cc:71] Warning! It's a different speaker

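The enrollment and test audio are the same file, yet the comparison reports two different speakers. warmup.wav contains only one person speaking.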
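
Editorial note: the cosine score of any embedding with itself is 1 as long as the embedding is finite and non-zero, so a score of 0 on identical input suggests (but does not prove) that the MNN embedding is all zeros or otherwise degenerate. The sketch below illustrates this in plain Python. `cosine_score` is a hypothetical helper, not the code in asv_main.cc, and returning 0.0 for a zero-norm vector is a convention assumed here for illustration only.

```python
import math


def cosine_score(a, b):
    """Cosine similarity of two equal-length vectors.

    Returns 0.0 when either vector has zero norm (a convention assumed
    for this sketch, not necessarily what asv_main does).
    """
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up values, only to show the behaviour.
emb = [0.12, -0.5, 0.33, 0.08]
print(cosine_score(emb, emb))                 # approximately 1 for any non-zero vector
print(cosine_score([0.0] * 4, [0.0] * 4))     # 0.0 under the zero-norm convention above
```

If the MNN run really produces an all-zero embedding, the next things to check would be the input features and the model output on that machine, not the scoring step.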
Running the same audio through the ONNX model with the same command:
./asv_main --enroll_wav warmup.wav \
--test_wav warmup.wav \
--threshold 0.5 \
--speaker_model_path model.onnx \
--embedding_size 256
I0820 11:02:23.524395 3484515 asv_main.cc:38] model.onnx
I0820 11:02:23.524457 3484515 asv_main.cc:39] Init model ...
I0820 11:02:23.524466 3484515 speaker_engine.cc:35] Reading model model.onnx
I0820 11:02:23.524472 3484515 speaker_engine.cc:37] Embedding size: 256
I0820 11:02:23.524477 3484515 speaker_engine.cc:39] per_chunk_samples: 32000
I0820 11:02:23.524483 3484515 speaker_engine.cc:41] Sample rate: 16000
I0820 11:02:23.591621 3484515 onnx_speaker_model.cc:60] Ouput name: feats
I0820 11:02:23.591661 3484515 onnx_speaker_model.cc:68] Output name: embs
I0820 11:02:23.591672 3484515 asv_main.cc:44] embedding size: 256
I0820 11:02:23.594732 3484515 asv_main.cc:53] 43306
I0820 11:02:26.240854 3484515 asv_main.cc:62] 43306
I0820 11:02:28.911242 3484515 asv_main.cc:65] compute score ...
I0820 11:02:28.911299 3484515 asv_main.cc:67] Cosine socre: 1
I0820 11:02:28.911329 3484515 asv_main.cc:69] It's the same speaker!
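
Editorial note: one way to narrow down a mismatch like this is to dump the embedding produced by each backend for the same input and compare them directly. The sketch below assumes the two embeddings are already available as lists of floats. Nothing in this thread shows that asv_main can dump them, so both helpers (`describe_embedding`, `max_abs_diff`) and the sample values are hypothetical.

```python
import math


def describe_embedding(emb):
    """Summarise basic health indicators of an embedding vector."""
    finite = all(math.isfinite(x) for x in emb)
    return {
        "size": len(emb),
        "all_finite": finite,                      # False if any NaN or inf is present
        "all_zero": all(x == 0.0 for x in emb),    # True for a degenerate all-zero output
        "l2_norm": math.sqrt(sum(x * x for x in emb)) if finite else float("nan"),
    }


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two embeddings."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


# Made-up values standing in for embeddings dumped from each backend.
onnx_emb = [0.25, -0.5, 0.75]
mnn_emb = [0.0, 0.0, 0.0]
print(describe_embedding(mnn_emb))
print(max_abs_diff(onnx_emb, mnn_emb))
```

A large difference together with `all_zero` or a non-finite value on the MNN side would point at the conversion or the runtime, while near-identical embeddings would point at the scoring or input path instead.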

@cdliang11
Collaborator

> I0820 11:02:23.594732 3484515 asv_main.cc:53] 43306
> I0820 11:02:26.240854 3484515 asv_main.cc:62] 43306

@cwp-wind These two lines from asv_main.cc show the audio length. Did the ONNX and MNN tests use different input files?

@cwp-wind
Author

It is the same file.

@cdliang11
Collaborator

> It is the same file.

The audio lengths shown in the logs differ: 81750 for MNN and 43306 for ONNX.
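
Editorial note: a quick way to confirm that two runs used the same input is to read the WAV header of each file and compare the frame counts. The sketch below uses only Python's standard `wave` module. `wav_info` is a hypothetical helper, and the thread does not state the unit of the values logged by asv_main.cc, so the frame count printed here may not match those numbers one-to-one.

```python
import wave


def wav_info(path):
    """Return (sample_rate, num_channels, num_frames) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        return wf.getframerate(), wf.getnchannels(), wf.getnframes()


def describe_wav(path):
    """Format the header information, including the duration in seconds."""
    rate, channels, frames = wav_info(path)
    return f"{path}: {rate} Hz, {channels} channel(s), {frames} frames, {frames / rate:.3f} s"
```

For example, `print(describe_wav("warmup.wav"))` could be run before each test. If the reported frame count is identical across runs, the differing lengths in the logs would have to come from somewhere other than the file itself.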

@cwp-wind
Author

Sorry, I posted the wrong screenshot. I re-ran the test; please go by these results:
image
image

I have also uploaded my converted model and the test audio.

@cwp-wind
Author

Thanks for the support. The model is available on Baidu netdisk:
Link: https://pan.baidu.com/s/1sZJFuCgqIQh0mdCJAR637w
Extraction code: j2or

@cdliang11
Collaborator

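> Thanks for the support. The model is available on Baidu netdisk: Link: https://pan.baidu.com/s/1sZJFuCgqIQh0mdCJAR637w Extraction code: j2or

With the model you provided, I cannot reproduce the problem on my side.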

(base) ➜  mnn git:(master) ✗ ./build/bin/asv_main --enroll_wav ../../1.wav --test_wav ../../1.wav --speaker_model_path ../../model.mnn                     
I1031 18:51:49.387418 3852557824 asv_main.cc:38] ../../model.mnn
I1031 18:51:49.387877 3852557824 asv_main.cc:39] Init model ...
I1031 18:51:49.387885 3852557824 speaker_engine.cc:35] Reading model ../../model.mnn
I1031 18:51:49.387892 3852557824 speaker_engine.cc:37] Embedding size: 256
I1031 18:51:49.387897 3852557824 speaker_engine.cc:39] per_chunk_samples: 32000
I1031 18:51:49.387902 3852557824 speaker_engine.cc:41] Sample rate: 16000
hw.cpufamily: 3660830781 , size = 4
The device support i8sdot:1, support fp16:1, support i8mm: 0
I1031 18:51:49.469084 3852557824 asv_main.cc:44] embedding size: 256
I1031 18:51:49.470297 3852557824 asv_main.cc:53] 4200
I1031 18:51:49.485879 3852557824 mnn_speaker_model.cc:65] dynamic shape.
I1031 18:51:49.810196 3852557824 asv_main.cc:62] 4200
I1031 18:51:49.948745 3852557824 asv_main.cc:65] compute score ...
I1031 18:51:49.948784 3852557824 asv_main.cc:67] Cosine socre: 1
I1031 18:51:49.948794 3852557824 asv_main.cc:69] It's the same speaker!
