How can OnnxRuntime load multi onnx files? #10121
Replies: 2 comments · 1 reply
-
Use one InferenceSession per model if you want to load multiple ONNX files, e.g.
session1 = ort.InferenceSession('model1.onnx')
session2 = ort.InferenceSession('model2.onnx')
However, your actual issue seems to be that the model is invalid: there is an Einsum node whose inputs have different types. The ONNX spec requires that all inputs of an Einsum node have the same type (https://github.com/onnx/onnx/blob/main/docs/Operators.md#einsum). Note how the type constraints work: an input can be any type from the constraint list, but only one type is allowed per node. So that is an issue with the converter that created the model.
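A minimal sketch of that approach, assuming two placeholder files `model1.onnx` and `model2.onnx` and that the first model's output can be fed to the second; the example input shape and dtype are assumptions for illustration, and the input/output names are read from the sessions rather than hard-coded:

```python
import numpy as np
import onnxruntime as ort

# One InferenceSession per ONNX file (file names here are placeholders).
session1 = ort.InferenceSession("model1.onnx")
session2 = ort.InferenceSession("model2.onnx")

# Run the first model, then feed its first output into the second model.
x = np.random.rand(1, 8).astype(np.float32)
out1 = session1.run(None, {session1.get_inputs()[0].name: x})
out2 = session2.run(None, {session2.get_inputs()[0].name: out1[0]})
print(out2[0].shape)
```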
-
I had the same problem, but it wasn't a problem with loading multiple files. Instead, it was caused by the arguments of torch.einsum having different types (float and int64). At this line https://github.com/huggingface/transformers/blob/master/src/transformers/models/gptj/modeling_gptj.py#L55, change the call so that both einsum arguments have the same (float) type, then re-run the ONNX export.
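For illustration, a sketch of the kind of change meant here, assuming the line in question is the rotary-embedding einsum over `torch.arange(seq_len)` and `inv_freq`; the exact code in modeling_gptj.py may differ, and the shapes below are placeholders:

```python
import torch

seq_len, dim = 16, 64
inv_freq = 1.0 / (10000 ** (torch.arange(0, dim, 2) / dim))  # float32

# Before (reported to export an Einsum node with mixed input types,
# since torch.arange(seq_len) is int64 while inv_freq is float32):
# sinusoid_inp = torch.einsum("i , j -> i j", torch.arange(seq_len), inv_freq)

# After: cast the index tensor so both einsum inputs are float32 and the
# exported Einsum node has a single input type, as the ONNX spec requires.
sinusoid_inp = torch.einsum("i , j -> i j", torch.arange(seq_len, dtype=torch.float), inv_freq)
print(sinusoid_inp.shape)  # torch.Size([16, 32])
```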
-
Env Settings
Question
For example, the following files are the ONNX files exported for GPT-J-6B.
When onnxruntime loads model.onnx, an error occurs as follows:
How can I load such multiple ONNX files with onnxruntime?
This question is related to the following issue:
huggingface/transformers#14836 (comment)