
[DO NOT MERGE][Reproducing issue]: View converter, bs > 1 #585

Draft
wants to merge 436 commits into master
Conversation

SrivastavaKshitij (Contributor)

Hi @jaybdub

I recently came across a use case where conversion fails for the view converter when batch size > 1.

Command: python -m torch2trt.test --name=view

Some pointers:

  1. I have disabled all other tests except one, to reproduce the error that I am seeing.
  2. One interesting thing I noticed: even though I was using an input tensor of size [2, 3, 3, 3], the view converter was still seeing the input size as [1, 3, 3, 3] (line 15 in the view converter). I couldn't debug it, though.
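
For reference, a minimal standalone repro along the same lines (a sketch, assuming torch2trt's implicit-batch API and a CUDA device; the View module here is a stand-in mirroring the wrapper used by the view tests):

import torch
from torch2trt import torch2trt

class View(torch.nn.Module):
    # stand-in for the View wrapper used by the view tests
    def __init__(self, *dims):
        super().__init__()
        self.dims = dims

    def forward(self, x):
        return x.view(*self.dims)

model = View(1, -1).cuda().eval()
x = torch.randn(2, 3, 3, 3).cuda()  # batch size 2 triggers the failure
model_trt = torch2trt(model, [x], max_batch_size=3)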

jaybdub and others added 30 commits September 10, 2019 19:38
If ceil_mode is False, the default value of layer.padding_mode is
PaddingMode.EXPLICIT_ROUND_DOWN. If ceil_mode is True, padding_mode
should be trt.PaddingMode.EXPLICIT_ROUND_UP.
…actor

- adds avg_pool2d
- adds max_pool2d
- removes AvgPool2d
- removes MaxPool2d
- adds get_arg(...)
- adds torch_dim_to_trt_axes(...)
- adds add_trt_constant(...)
- adds ``torch.chunk`` and ``torch.Tensor.chunk``
- adds ``torch.split`` and ``torch.Tensor.split``
- adds tests for ``squeezenet*`` models
jaybdub and others added 25 commits November 17, 2020 15:06
* Remove duplicate filenames which do not work on Windows by merging files

* Fix

* relu tests

Co-authored-by: Koen van de Sande <[email protected]>
…ations (NVIDIA-AI-IOT#505)

* Initial version of ne, floordiv, mod and tensor converters. Extend ops for relu and sigmoid.

* Converters for floordiv, mod, ne, and torch::tensor() operations. Extend relu and sigmoid converters to Tensor methods.

* Update CHANGELOG.md
…T#482)

* added passing of torch2trt_kwargs to conversion context

* added passing of torch2trt_kwargs to conversion context
…OT#511)

* added filter to floordiv to only enable for pytorch 1.6+

* enabled soft failure for missing torch method
* increment version to 0.2.0

* release push docs tagfix
* added conv_functional

* add Tensor flatten

* update changelog for functional conv / flatten

* add site to gitignore
… in the file CLA.md of this project. Signed, John Welsh
jaybdub (Contributor) commented Jul 6, 2021

Hi @SrivastavaKshitij,

Thanks for pointing this out.

This likely has to do with this line:

inputs = [tensor.clone()[0:1] for tensor in inputs] # only run single entry

When we pass the example data into the model, we do it with batch size 1.

From my understanding, converters shouldn't modify the batch dimension, so this would be acceptable, but perhaps I'm missing something.
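
For context, the shape the converters see comes from TensorRT's legacy implicit-batch mode (removed in recent TensorRT releases), where network inputs are declared without the batch dimension. A rough sketch of that behavior (illustrative only, not torch2trt's actual code):

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network()  # no EXPLICIT_BATCH flag -> implicit batch
inp = network.add_input('input_0', trt.float32, (3, 3, 3))  # batch dim omitted
print(inp.shape)  # (3, 3, 3); the batch dim only appears at execution time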

Let me know if this helps.

Best,
John

SrivastavaKshitij (Contributor, Author) commented Jul 7, 2021

Hey John,

Did some more experiments.

When I changed line 517 from

inputs = [tensor.clone()[0:1] for tensor in inputs]

to

inputs = [tensor.clone() for tensor in inputs]

I was able to see the batch dimension. So that's good. However,

trt_tensor = t._trt

removes the batch dimension again, and I don't know why that is. What do you think?

I added a print statement of the tensor shape before and after line 153, and this is what I got:

torch.Size([2, 3, 3, 3]) 
(3, 3, 3)

Somehow, the batch dim is removed.
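
For illustration, a converter-style probe shows the same discrepancy (a sketch only; it assumes torch2trt's @tensorrt_converter convention, and the body is debug printing rather than a real converter):

from torch2trt import tensorrt_converter

@tensorrt_converter('torch.Tensor.view')
def convert_view_debug(ctx):
    input = ctx.method_args[0]
    print(tuple(input.shape))       # torch shape, e.g. (2, 3, 3, 3)
    print(tuple(input._trt.shape))  # TRT ITensor shape, e.g. (3, 3, 3)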

Now, this may not have impacted other ops, but it will impact ops such as view, because the tensor volumes will not match: the view target (1, -1) expects all 2*3*3*3 = 54 elements, while the TRT tensor (with the batch dim stripped) only has 3*3*3 = 27.

[TensorRT] ERROR: [SHUFFLE #1] torch.Tensor.view(tensor(shape=[2, 3, 3, 3], dtype=torch.float32), 1, -1): volume mismatch. Input dimensions [3,3,3] have volume 27 and output dimensions [54] have volume 54.

@add_module_test(torch.float32, torch.device('cuda'), [(1, 3, 3, 3)])
# @add_module_test(torch.float32, torch.device('cuda'), [(1, 3)])
# @add_module_test(torch.float32, torch.device('cuda'), [(1, 3, 3)])
@add_module_test(torch.float32, torch.device('cuda'), [(2, 3, 3, 3)], max_batch_size=3)
def test_view_1d():
    return View(1, -1)

jaybdub (Contributor) commented Jul 9, 2021

It looks like the test case is hard-fixing the batch dimension to 1 here.

In general, TensorRT engines are broadcast across the batch dimension, so operations that change the batch dimension aren't permitted.

Perhaps adding a test case with View(2, ...) would work for batch size 2. Or maybe even View(-1, ...) with other dimensions specified explicitly.
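
For instance, something along these lines (a sketch; the test name is hypothetical, and spelling out 27 = 3*3*3 leaves only the batch dim for -1 to absorb):

@add_module_test(torch.float32, torch.device('cuda'), [(2, 3, 3, 3)], max_batch_size=3)
def test_view_batch_preserving():
    return View(-1, 27)  # -1 absorbs the batch dim; non-batch volume is explicit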
