
virtualMemoryBuffer.cpp error #771

Closed
michael-novitsky opened this issue Jul 25, 2022 · 6 comments

michael-novitsky commented Jul 25, 2022

Hi, when I execute the following line in my code:
model_trt = torch2trt(net, [example_input], fp16_mode=True, max_batch_size=1)

it takes some time and then I receive the following output:
"
[07/25/2022-14:14:51] [TRT] [E] 2: [virtualMemoryBuffer.cpp::resizePhysical::161] Error Code 2: OutOfMemory (no further information)
[07/25/2022-14:14:51] [TRT] [E] 2: [virtualMemoryBuffer.cpp::resizePhysical::161] Error Code 2: OutOfMemory (no further information)
[07/25/2022-14:14:51] [TRT] [E] 2: [virtualMemoryBuffer.cpp::resizePhysical::161] Error Code 2: OutOfMemory (no further information)
[07/25/2022-14:14:51] [TRT] [E] 2: [virtualMemoryBuffer.cpp::resizePhysical::161] Error Code 2: OutOfMemory (no further information)
[07/25/2022-14:14:51] [TRT] [E] 2: [virtualMemoryBuffer.cpp::resizePhysical::161] Error Code 2: OutOfMemory (no further information)
Segmentation fault (core dumped)
"
I can see my GPU utilization climb to almost 100% before the crash, so it looks like a GPU memory issue.
I added torch.cuda.empty_cache() before the torch2trt call, but the problem still occurs.
My model is about 1.78 GB and my GPU has 6 GB of memory.
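
For reference, the failing sequence looks like this:

import torch
from torch2trt import torch2trt

torch.cuda.empty_cache()  # attempted workaround; the OOM still occurs
model_trt = torch2trt(net, [example_input], fp16_mode=True, max_batch_size=1)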

System info:
GPU - NVIDIA GeForce RTX 3060 Laptop
Operating System - Ubuntu 22.04
CUDA version - V11.1.105
NVIDIA driver version - 510.73.05
torch - 1.9.1+cu111
torch2trt - 0.3.0

Thank you in advance!

jaybdub (Contributor) commented Jul 25, 2022

Hi @michael-novitsky ,

Thanks for reaching out!

I haven't tested these, but there may be ways we could improve the memory consumption of the conversion process:

  1. Release references to the original model after tracing, but before engine build
  2. Trace the model with the torch.no_grad() context manager to avoid allocating grad buffers
  3. Allow tracing the model with the CPU, rather than GPU

These would require some work and testing, so I can't guarantee anything, but you may be able to try some of these changes by modifying the project's source code (a rough user-level sketch of ideas 1 and 2 follows below).
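
For illustration, a user-level approximation of ideas 1 and 2 might look like this. This is only a sketch, not how torch2trt implements the conversion internally:

import torch
from torch2trt import torch2trt

# Sketch: trace under no_grad() so autograd buffers are never allocated,
# then drop the reference to the original model and free PyTorch's cache.
with torch.no_grad():
    model_trt = torch2trt(net, [example_input], fp16_mode=True, max_batch_size=1)

del net                   # roughly idea 1: release the original model
torch.cuda.empty_cache()  # return cached blocks to the driver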

Another option (without torch2trt) would be to export the model to ONNX and then use the TensorRT Python API directly, in a separate process, to build the TensorRT engine.
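
For example, a minimal sketch of that ONNX-to-TensorRT path (assuming a TensorRT 8.x-style Python API; 'model.onnx' and 'model.engine' are placeholder file names):

import tensorrt as trt

# Build a TensorRT engine from an ONNX file in a standalone process,
# so PyTorch is not holding GPU memory during the build.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open('model.onnx', 'rb') as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError('failed to parse model.onnx')

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # mirrors fp16_mode=True

engine_bytes = builder.build_serialized_network(network, config)
with open('model.engine', 'wb') as f:
    f.write(engine_bytes)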

Hope this helps, please let me know if you have any questions.

Best,
John

michael-novitsky (Author) commented Jul 26, 2022

Hi @jaybdub ,

Thank you for your reply!
I tried tracing the model inside torch.no_grad(), and ran the following commands to trace the model on the CPU instead of the GPU:

net = net.cpu()                      # move the model to host memory
example_input = example_input.cpu()  # move the example input to host memory
model_trt = torch2trt(net, [example_input], fp16_mode=True, max_batch_size=1)

But got the following error:

[07/26/2022-15:04:35] [TRT] [E] 4: Tensor: output_0 trying to set to TensorLocation::kHOST but only kDEVICE is supported (only network inputs may be on host)
[07/26/2022-15:04:35] [TRT] [E] 4: [network.cpp::validate::2738] Error Code 4: Internal Error (Tensor: input_0 set to TensorLocation::kHOST but only kDEVICE is supported (only RNNv2 allows host input))
!!!!!!!

Should I do anything else to trace the model on the CPU?

jaybdub (Contributor) commented Jul 26, 2022

Hi @michael-novitsky ,

Apologies, tracing on the CPU is something we don't currently support; I was suggesting it as a potential feature. I'll have to investigate what's needed to support it, but it may be straightforward.

In the meantime, are you able to export your model with ONNX?

torch.onnx.export(net, example_input, 'model.onnx')
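
Since the export only needs a forward trace, it can be run entirely on the CPU to avoid GPU memory pressure. A sketch (the opset version is an assumption; pick one your TensorRT build supports):

import torch

net_cpu = net.cpu().eval()           # keep the trace in host memory
example_cpu = example_input.cpu()
torch.onnx.export(net_cpu, example_cpu, 'model.onnx',
                  opset_version=13)  # assumed opset, not from this thread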

Best,
John

jaybdub (Contributor) commented Jul 26, 2022

@michael-novitsky ,

I've added the ability to trace models on the CPU in this PR:

#773

You can pull that branch and see if it works for you. That said, on my Jetson platform I didn't see significant memory improvements.

michael-novitsky (Author) commented

Hi @jaybdub ,

I've pulled your PR and the model now converts to TensorRT successfully (it still prints the same error messages, but the conversion completes).

Thank you for your help!

Michael

jaybdub (Contributor) commented Aug 1, 2022

@michael-novitsky awesome to hear! I'm going to close this issue; please feel free to re-open it if the problem persists, or create a new issue.

Best,
John

jaybdub closed this as completed Aug 1, 2022