
[torchlib] Add the identity nodes back #1703

Merged · 8 commits · Jun 25, 2024

Conversation

@xadupre (Member) commented Jun 24, 2024

In the modularization pass in the exporter, a single node, such as the one produced for `clone`, can be lifted into a function. If we remove its only Identity node, the lifted function has no nodes at all, which violates the ONNX standard.

Since removing Identity nodes is fast, it is safe to keep these Identity nodes in torchlib.
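For illustration, a minimal sketch of the pattern this change restores, written as a standalone onnxscript function (the name and signature are hypothetical; the real change touches torchlib ops such as `aten::clone` in `onnxscript/function_libs/torch_lib/ops/core.py`):

```python
from onnxscript import FLOAT, script
from onnxscript import opset18 as op

@script()
def trivial_clone(self: FLOAT[...]) -> FLOAT[...]:
    # "return self" alone would lift to an ONNX function with an empty
    # body, which is invalid. The Identity node keeps the body non-empty
    # and is cheap for a later optimization pass to strip.
    return op.Identity(self)
```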

onnxscript/tools/transformers_models/phi_test.py broke after #1613; this change fixes it.

Signed-off-by: Xavier Dupre <[email protected]>
codecov bot commented Jun 24, 2024

Codecov Report

Attention: Patch coverage is 92.30769% with 2 lines in your changes missing coverage. Please review.

Project coverage is 76.28%. Comparing base (1aa7a70) to head (0a18177).

Files                                            Patch %   Lines
onnxscript/function_libs/torch_lib/ops/core.py   91.66%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1703      +/-   ##
==========================================
+ Coverage   75.75%   76.28%   +0.53%     
==========================================
  Files         240      240              
  Lines       25495    25496       +1     
  Branches     4549     4550       +1     
==========================================
+ Hits        19313    19449     +136     
+ Misses       5281     5159     -122     
+ Partials      901      888      -13     


@xadupre xadupre marked this pull request as ready for review June 24, 2024 14:28
@xadupre xadupre changed the title [WIP] investigations Revert one change introduced by #1613 and breaking onnxscript/tools/transformers_models/phi_test.py Jun 24, 2024
@justinchuby (Collaborator) commented

Could you explain why the change is needed? Thanks!

@xadupre (Member, Author) commented Jun 24, 2024

> Could you explain why the change is needed? Thanks!

It is only a guess. One ONNX function was not added to the model. I suspect that when a function has no nodes, it is not added to the final model. I was hoping you could confirm.
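To make the guess concrete, this is the kind of degenerate function the exporter would end up with, sketched with `onnx.helper` (all names here are made up):

```python
from onnx import helper

# Sketch: a lifted no-op (e.g. a dropout in eval mode) whose only
# Identity node was removed. The body has zero nodes, which the ONNX
# standard does not allow.
empty_fn = helper.make_function(
    domain="pkg.example",            # hypothetical domain
    fname="dropout_Dropout_sketch",  # hypothetical name
    inputs=["x"],
    outputs=["x"],                   # output aliases the input directly
    nodes=[],                        # empty body
    opset_imports=[helper.make_opsetid("", 18)],
)
```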

@justinchuby (Collaborator) commented

When trying to reproduce, I see the error happens in the optimizer:

justinchu@justinchu-dev-linux2 ~/d/onnxscript (main)> pytest onnxscript/tools/transformers_models/phi_test.py
========================================== test session starts ==========================================
platform linux -- Python 3.11.9, pytest-8.2.1, pluggy-1.5.0
Using --randomly-seed=296786577
rootdir: /home/justinchu/dev/onnxscript
configfile: pyproject.toml
plugins: hypothesis-6.103.0, subtests-0.12.1, randomly-3.15.0, cov-5.0.0, xdist-3.6.1
collected 3 items                                                                                       

onnxscript/tools/transformers_models/phi_test.py Fs.                                              [100%]

=============================================== FAILURES ================================================
___________________________________ TestExportPhi.test_phi_export_cpu ___________________________________
onnxscript/tools/transformers_models/phi_test.py:28: in test_phi_export_cpu
    proto = onnxscript.tools.transformers_models.export_to_onnx(model, *input_tensors)
onnxscript/tools/transformers_models/__init__.py:28: in export_to_onnx
    model_proto = onnxscript.optimizer.optimize(
onnxscript/optimizer/__init__.py:80: in optimize
    inline_functions_with_unused_outputs(model)
onnxscript/optimizer/simple_function_folding.py:231: in inline_functions_with_unused_outputs
    inliner.visit_model(model)
onnxscript/optimizer/simple_function_folding.py:33: in visit_model
    super().visit_model(model)
onnxscript/_legacy_ir/visitor.py:792: in visit_model
    self.visit_graph(model.graph)
onnxscript/_legacy_ir/visitor.py:658: in visit_graph
    replacement = self.visit_node(node)
onnxscript/_legacy_ir/visitor.py:805: in visit_node
    replacement, _ = self.process_function_node(node)
onnxscript/optimizer/simple_function_folding.py:44: in process_function_node
    replacement, new_function = super().process_function_node(node)
onnxscript/_legacy_ir/visitor.py:892: in process_function_node
    replacement = self.visit_node(inner_node)
onnxscript/_legacy_ir/visitor.py:805: in visit_node
    replacement, _ = self.process_function_node(node)
onnxscript/optimizer/simple_function_folding.py:44: in process_function_node
    replacement, new_function = super().process_function_node(node)
onnxscript/_legacy_ir/visitor.py:846: in process_function_node
    actual_input_value_infos = [self.lookup(input) for input in node.input]
onnxscript/_legacy_ir/visitor.py:846: in <listcomp>
    actual_input_value_infos = [self.lookup(input) for input in node.input]
onnxscript/_legacy_ir/visitor.py:447: in lookup
    raise ValueError(
E   ValueError: Undefined variable model_embed_dropout_1.
E   Available variables: SubScope 0:
E     Function transformers_models_phi_modeling_phi_PhiModel_model_1:
E       ir.Values:
E         l_input_ids_: StaticValueInfo(l_input_ids_, shape:[13, 7], dtype:7, no const value.)
E         l_attention_mask_: StaticValueInfo(l_attention_mask_, shape:[13, 7], dtype:1, no const value.)
E         model.embed_tokens.weight: StaticValueInfo(model.embed_tokens.weight, shape:[99, 32], dtype:1, has const value.)
E         model.layers.0.input_layernorm.weight: StaticValueInfo(model.layers.0.input_layernorm.weight, shape:[32], dtype:1, has const value.)
E         model.layers.0.input_layernorm.bias: StaticValueInfo(model.layers.0.input_layernorm.bias, shape:[32], dtype:1, has const value.)
E         model.layers.0.self_attn.q_proj.weight: StaticValueInfo(model.layers.0.self_attn.q_proj.weight, shape:[32, 32], dtype:1, has const value.)
E         model.layers.0.self_attn.q_proj.bias: StaticValueInfo(model.layers.0.self_attn.q_proj.bias, shape:[32], dtype:1, has const value.)
E         model.layers.0.self_attn.k_proj.weight: StaticValueInfo(model.layers.0.self_attn.k_proj.weight, shape:[16, 32], dtype:1, has const value.)
E         model.layers.0.self_attn.k_proj.bias: StaticValueInfo(model.layers.0.self_attn.k_proj.bias, shape:[16], dtype:1, has const value.)
E         model.layers.0.self_attn.v_proj.weight: StaticValueInfo(model.layers.0.self_attn.v_proj.weight, shape:[16, 32], dtype:1, has const value.)
E         model.layers.0.self_attn.v_proj.bias: StaticValueInfo(model.layers.0.self_attn.v_proj.bias, shape:[16], dtype:1, has const value.)
E         model.layers.0.self_attn.rotary_emb.cos_cached: StaticValueInfo(model.layers.0.self_attn.rotary_emb.cos_cached, shape:[512, 4], dtype:1, has const value.)
E         model.layers.0.self_attn.rotary_emb.sin_cached: StaticValueInfo(model.layers.0.self_attn.rotary_emb.sin_cached, shape:[512, 4], dtype:1, has const value.)
E         model.layers.0.self_attn.dense.weight: StaticValueInfo(model.layers.0.self_attn.dense.weight, shape:[32, 32], dtype:1, has const value.)
E         model.layers.0.self_attn.dense.bias: StaticValueInfo(model.layers.0.self_attn.dense.bias, shape:[32], dtype:1, has const value.)
E         model.layers.0.mlp.fc1.weight: StaticValueInfo(model.layers.0.mlp.fc1.weight, shape:[16, 32], dtype:1, has const value.)
E         model.layers.0.mlp.fc1.bias: StaticValueInfo(model.layers.0.mlp.fc1.bias, shape:[16], dtype:1, has const value.)
E         model.layers.0.mlp.fc2.weight: StaticValueInfo(model.layers.0.mlp.fc2.weight, shape:[32, 16], dtype:1, has const value.)
E         model.layers.0.mlp.fc2.bias: StaticValueInfo(model.layers.0.mlp.fc2.bias, shape:[32], dtype:1, has const value.)
E         model.layers.1.input_layernorm.weight: StaticValueInfo(model.layers.1.input_layernorm.weight, shape:[32], dtype:1, has const value.)
E         model.layers.1.input_layernorm.bias: StaticValueInfo(model.layers.1.input_layernorm.bias, shape:[32], dtype:1, has const value.)
E         model.layers.1.self_attn.q_proj.weight: StaticValueInfo(model.layers.1.self_attn.q_proj.weight, shape:[32, 32], dtype:1, has const value.)
E         model.layers.1.self_attn.q_proj.bias: StaticValueInfo(model.layers.1.self_attn.q_proj.bias, shape:[32], dtype:1, has const value.)
E         model.layers.1.self_attn.k_proj.weight: StaticValueInfo(model.layers.1.self_attn.k_proj.weight, shape:[16, 32], dtype:1, has const value.)
E         model.layers.1.self_attn.k_proj.bias: StaticValueInfo(model.layers.1.self_attn.k_proj.bias, shape:[16], dtype:1, has const value.)
E         model.layers.1.self_attn.v_proj.weight: StaticValueInfo(model.layers.1.self_attn.v_proj.weight, shape:[16, 32], dtype:1, has const value.)
E         model.layers.1.self_attn.v_proj.bias: StaticValueInfo(model.layers.1.self_attn.v_proj.bias, shape:[16], dtype:1, has const value.)
E         model.layers.1.self_attn.rotary_emb.cos_cached: StaticValueInfo(model.layers.1.self_attn.rotary_emb.cos_cached, shape:[512, 4], dtype:1, has const value.)
E         model.layers.1.self_attn.rotary_emb.sin_cached: StaticValueInfo(model.layers.1.self_attn.rotary_emb.sin_cached, shape:[512, 4], dtype:1, has const value.)
E         model.layers.1.self_attn.dense.weight: StaticValueInfo(model.layers.1.self_attn.dense.weight, shape:[32, 32], dtype:1, has const value.)
E         model.layers.1.self_attn.dense.bias: StaticValueInfo(model.layers.1.self_attn.dense.bias, shape:[32], dtype:1, has const value.)
E         model.layers.1.mlp.fc1.weight: StaticValueInfo(model.layers.1.mlp.fc1.weight, shape:[16, 32], dtype:1, has const value.)
E         model.layers.1.mlp.fc1.bias: StaticValueInfo(model.layers.1.mlp.fc1.bias, shape:[16], dtype:1, has const value.)
E         model.layers.1.mlp.fc2.weight: StaticValueInfo(model.layers.1.mlp.fc2.weight, shape:[32, 16], dtype:1, has const value.)
E         model.layers.1.mlp.fc2.bias: StaticValueInfo(model.layers.1.mlp.fc2.bias, shape:[32], dtype:1, has const value.)
E         model.final_layernorm.weight: StaticValueInfo(model.final_layernorm.weight, shape:[32], dtype:1, has const value.)
E         model.final_layernorm.bias: StaticValueInfo(model.final_layernorm.bias, shape:[32], dtype:1, has const value.)
E         unsqueeze: StaticValueInfo(unsqueeze, shape:[1, 7], dtype:7, no const value.)
E         _val_39: StaticValueInfo(_val_39, shape:[1], dtype:7, no const value.)
E         _val_43: StaticValueInfo(_val_43, shape:[1], dtype:7, no const value.)
E         _val_47: StaticValueInfo(_val_47, shape:[1], dtype:7, no const value.)
E         _val_51: StaticValueInfo(_val_51, shape:[1], dtype:7, no const value.)
E         slice_3: StaticValueInfo(slice_3, shape:[13, 7], dtype:1, no const value.)
E         aten_unsqueeze_84_dim_0: StaticValueInfo(aten_unsqueeze_84_dim_0, shape:[], dtype:7, no const value.)
E         unsqueeze_3: StaticValueInfo(unsqueeze_3, shape:[13, 1, 7], dtype:1, no const value.)
E         aten_unsqueeze_85_dim_0: StaticValueInfo(aten_unsqueeze_85_dim_0, shape:[], dtype:7, no const value.)
E         unsqueeze_4: StaticValueInfo(unsqueeze_4, shape:[13, 1, 1, 7], dtype:1, no const value.)
E         _val_58: StaticValueInfo(_val_58, shape:[1], dtype:7, no const value.)
E         _val_62: StaticValueInfo(_val_62, shape:[1], dtype:7, no const value.)
E         _val_66: StaticValueInfo(_val_66, shape:[1], dtype:7, no const value.)
E         _val_70: StaticValueInfo(_val_70, shape:[1], dtype:7, no const value.)
E         slice_4: StaticValueInfo(slice_4, shape:[13, 1, 1, 7], dtype:1, no const value.)
E         _val_72: StaticValueInfo(_val_72, shape:[4], dtype:7, no const value.)
E         aten_expand_104_size_1: StaticValueInfo(aten_expand_104_size_1, shape:[4], dtype:7, no const value.)
E         expand_1: StaticValueInfo(expand_1, shape:[13, 1, 7, 7], dtype:1, no const value.)
E         _val_74: StaticValueInfo(_val_74, shape:[], dtype:1, no const value.)
E         rsub: StaticValueInfo(rsub, shape:[13, 1, 7, 7], dtype:1, no const value.)
E         _to_copy: StaticValueInfo(_to_copy, shape:[13, 1, 7, 7], dtype:9, no const value.)
E         _val_77: StaticValueInfo(_val_77, shape:[], dtype:1, no const value.)
E         masked_fill_1: StaticValueInfo(masked_fill_1, shape:[13, 1, 7, 7], dtype:1, no const value.)
E         _to_copy_1: StaticValueInfo(_to_copy_1, shape:[13, 1, 7, 7], dtype:9, no const value.)
E         expand_2: StaticValueInfo(expand_2, shape:[13, 1, 7, 7], dtype:1, no const value.)
E         _val_119: StaticValueInfo(_val_119, shape:[], dtype:1, no const value.)
E         masked_fill_2: StaticValueInfo(masked_fill_2, shape:[13, 1, 7, 7], dtype:1, no const value.)
E         model_layers_0_1: StaticValueInfo(model_layers_0_1, shape:[13, 2, 7, 8], dtype:1, no const value.)
E         model_layers_0_1_1: StaticValueInfo(model_layers_0_1_1, shape:[13, 2, 7, 8], dtype:1, no const value.)
E         model_layers_0_1_2: StaticValueInfo(model_layers_0_1_2, shape:[13, 7, 32], dtype:1, no const value.)
E         model_layers_1_1: StaticValueInfo(model_layers_1_1, shape:[13, 2, 7, 8], dtype:1, no const value.)
E         model_layers_1_1_1: StaticValueInfo(model_layers_1_1_1, shape:[13, 2, 7, 8], dtype:1, no const value.)
E         model_layers_1_1_2: StaticValueInfo(model_layers_1_1_2, shape:[13, 7, 32], dtype:1, no const value.)
E         model_final_layernorm_1: StaticValueInfo(model_final_layernorm_1, shape:[13, 7, 32], dtype:1, no const value.)
E       RefAttributes:

@xadupre (Member, Author) commented Jun 24, 2024

It is, but if you try to load the model with onnxruntime, it fails. The problem arises before the optimizer, because the model is already invalid; neither check_model nor infer_shapes detects it. If you look at the ONNX model, you'll see a missing function.
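One way to spot such a gap, sketched with the plain onnx API (the file name is hypothetical, and control-flow subgraphs are ignored for brevity): compare the functions referenced by nodes against the functions actually present in the proto.

```python
import onnx

model = onnx.load("exported_phi.onnx")  # hypothetical path

# Functions actually present in the model.
declared = {(f.domain, f.name) for f in model.functions}

# Ops referenced by the main graph and by every function body.
referenced = {(n.domain, n.op_type) for n in model.graph.node}
for f in model.functions:
    referenced.update((n.domain, n.op_type) for n in f.node)

# Heuristic: anything outside the standard domains that is referenced
# but never declared points at a missing function.
standard = {"", "ai.onnx", "ai.onnx.ml"}
missing = {r for r in referenced if r[0] not in standard and r not in declared}
print(missing)
```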

@justinchuby (Collaborator) commented

Thanks for the info, I will look deeper. Do you have the name of the missing function and the error stack from onnxruntime?

@justinchuby (Collaborator) commented

E   onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Failed to load model with error: ONNX Schema torch_nn_modules_dropout_Dropout_model_embed_dropout_1: failed validating the check: !(it.GetName().empty())

@xadupre (Member, Author) commented Jun 24, 2024

Something with dropout in its name. But I don't think this matters. I just took the file you modified and replaced `return self` with `return op.Identity(self)`. We can leave it that way, or replace only the instances where the function may end up empty; either way I think we may get more Identity nodes than expected. They are easy and fast to remove anyway.
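For reference, a simplified sketch of such a removal pass over a plain GraphProto (an assumed helper, not the optimizer's actual implementation; graph outputs and control-flow subgraphs are ignored):

```python
import onnx

def remove_identity_nodes(graph: onnx.GraphProto) -> None:
    # Map each Identity output back to the node's input.
    rename = {n.output[0]: n.input[0] for n in graph.node if n.op_type == "Identity"}

    def resolve(name: str) -> str:
        # Follow chains of Identity nodes (a -> b -> c).
        while name in rename:
            name = rename[name]
        return name

    # Rewire every consumer to bypass the Identity nodes.
    for node in graph.node:
        for i, name in enumerate(node.input):
            node.input[i] = resolve(name)

    # Drop the Identity nodes. Copy survivors out first, because protobuf
    # repeated fields own their elements.
    survivors = [
        onnx.NodeProto.FromString(n.SerializeToString())
        for n in graph.node
        if n.op_type != "Identity"
    ]
    del graph.node[:]
    graph.node.extend(survivors)
```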

@justinchuby (Collaborator) commented

Sounds good. The changes look good to me. Thanks!

@justinchuby (Collaborator) left a review comment

LGTM. Please revert the CI changes, thanks!

Review thread on onnxscript/rewriter/llama_rule_sets_test.py (outdated, resolved)
@justinchuby justinchuby changed the title Revert one change introduced by #1613 and breaking onnxscript/tools/transformers_models/phi_test.py [torchlib] Add the identity nodes back Jun 24, 2024
@justinchuby justinchuby added the topic: torch_lib Related to the torch/aten function lib in development label Jun 24, 2024
@justinchuby justinchuby merged commit be00339 into microsoft:main Jun 25, 2024
37 of 46 checks passed