Multiple GPU #520

MinaGabriel · 2020-06-03T18:02:28Z

I am trying to run training on two GPUs:

StreamExecutor device (0): GeForce RTX 2080 Ti, Compute Capability 7.5
StreamExecutor device (1): GeForce RTX 2080 Ti, Compute Capability 7.5

I keep getting the following error. I assume it occurs because the weights are on the CPU while the input is on the GPU. Is that correct?


Traceback (most recent call last):
  File "/home/lambda/PyTorch-Yolov3/train.py", line 115, in <module>
    loss, outputs = model(imgs, targets)
  File "/home/lambda/anaconda3/envs/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lambda/PyTorch-Yolov3/models.py", line 252, in forward
    x = module(x)
  File "/home/lambda/anaconda3/envs/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lambda/anaconda3/envs/venv/lib/python3.8/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/home/lambda/anaconda3/envs/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lambda/anaconda3/envs/venv/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 349, in forward
    return self._conv_forward(input, self.weight)
  File "/home/lambda/anaconda3/envs/venv/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 345, in _conv_forward
    return F.conv2d(input, weight, self.bias, self.stride,
RuntimeError: Expected tensor for argument #1 'input' to have the same device as tensor for argument #2 'weight'; but device 0 does not equal 1 (while checking arguments for cudnn_convolution)
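The error message names two CUDA device indices (0 and 1), which suggests that both tensors are on GPUs, just not on the same one. A minimal sketch for checking this before the forward pass follows. The helper name `find_device_mismatches` is hypothetical and not part of this repository; it relies only on `named_parameters()` and `.device`, which `torch.nn.Module` and `torch.Tensor` provide.

```python
def find_device_mismatches(model, inputs):
    """List (name, device) for every parameter not on the same device as `inputs`.

    Duck-typed: `model` needs `named_parameters()` and `inputs` needs `.device`,
    which torch.nn.Module and torch.Tensor both provide.
    """
    input_device = str(inputs.device)
    return [
        (name, str(param.device))
        for name, param in model.named_parameters()
        # Compare as strings, e.g. "cuda:0" vs "cuda:1".
        if str(param.device) != input_device
    ]
```

In `train.py` this could be called as `print(find_device_mismatches(model, imgs))` just before `model(imgs, targets)`. An empty list would mean every parameter shares the input's device; any entries show which layers sit elsewhere.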


ScottHoang · 2020-06-11T15:37:23Z

Are you using the PyTorch distributed package? If so, did you correctly set the default CUDA device for each process's local rank? If not, this error occurs.
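Setting the default device per local rank could look roughly like the sketch below. It assumes the launcher exports a `LOCAL_RANK` environment variable (as `torchrun` does); the helper names `local_rank_from_env` and `pin_process_to_gpu` are hypothetical and not part of this repository.

```python
import os


def local_rank_from_env(default=0):
    """Read this process's local rank; assumes the launcher sets LOCAL_RANK."""
    return int(os.environ.get("LOCAL_RANK", default))


def pin_process_to_gpu(local_rank):
    """Make `local_rank` the default CUDA device for this process."""
    import torch  # imported here so the helper above works without torch

    torch.cuda.set_device(local_rank)
    return torch.device("cuda", local_rank)


# Hypothetical use in a training script, one process per GPU:
#   device = pin_process_to_gpu(local_rank_from_env())
#   model = model.to(device)
#   imgs, targets = imgs.to(device), targets.to(device)
```

With the model and every input batch moved to the same `device`, the weights and inputs of each process stay on one GPU. For gradient synchronisation the model would additionally be wrapped in `torch.nn.parallel.DistributedDataParallel(model, device_ids=[local_rank])` after initialising the process group.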

jxhno1 · 2020-06-17T08:38:17Z

Can you give some pipeline advice for multi-GPU training? Thanks a lot! @voodoopotato

genqiaolynn · 2021-01-21T14:08:10Z

Did you succeed with multi-GPU training? Thanks!

Flova · 2021-08-02T13:32:39Z

I will not add multi-GPU training in the near future. If anybody wants to make a PR, feel free.

Flova mentioned this issue Jan 22, 2021

Can we use this repo with multi_gpu? #615 (closed)

This was referenced Aug 2, 2021

mutil gpus #507 (closed)
problem #391 (closed)
how should I train models with GPUS #342 (closed)
RuntimeError: grad can be implicitly created only for scalar outputs #331 (closed)
How to train with mult gus #290 (closed)
