RuntimeError: cublas runtime error : an access to GPU memory space failed at #153

WNingup · 2021-09-27T08:11:55Z

Thanks to the author for releasing the code. I am new to machine learning and recently tried to train this model on my own dataset. Training runs for a while and then crashes with the error below, right after the latest model is saved at iteration 5000. Has anyone else run into this problem? Any help is much appreciated.
`(Pytorch3.6) D:\1A\TEST\SPADE-master\SPADE-master>python train.py --name Anime --dataset_mode custom --label_dir datasets\Anime\label --image_dir datasets\Anime\train --label_nc 5 --no_instance
----------------- Options ---------------
D_steps_per_G: 1
aspect_ratio: 1.0
batchSize: 1
beta1: 0.0
beta2: 0.9
cache_filelist_read: False
cache_filelist_write: False
checkpoints_dir: ./checkpoints
contain_dontcare_label: False
continue_train: False
crop_size: 256
dataroot: ./datasets/cityscapes/
dataset_mode: custom [default: coco]
debug: False
display_freq: 100
display_winsize: 256
gan_mode: hinge
gpu_ids: 0
image_dir: datasets\Anime\train [default: None]
init_type: xavier
init_variance: 0.02
instance_dir:
isTrain: True [default: None]
label_dir: datasets\Anime\label [default: None]
label_nc: 5 [default: 13]
lambda_feat: 10.0
lambda_kld: 0.05
lambda_vgg: 10.0
load_from_opt_file: False
load_size: 286
lr: 0.0002
max_dataset_size: 9223372036854775807
model: pix2pix
nThreads: 0
n_layers_D: 4
name: Anime [default: label2coco]
ndf: 64
nef: 16
netD: multiscale
netD_subarch: n_layer
netG: spade
ngf: 64
niter: 50
niter_decay: 0
no_TTUR: False
no_flip: False
no_ganFeat_loss: False
no_html: False
no_instance: True [default: False]
no_pairing_check: False
no_vgg_loss: False
norm_D: spectralinstance
norm_E: spectralinstance
norm_G: spectralspadesyncbatch3x3
num_D: 2
num_upsampling_layers: normal
optimizer: adam
output_nc: 3
phase: train
preprocess_mode: resize_and_crop
print_freq: 100
save_epoch_freq: 10
save_latest_freq: 5000
serial_batches: False
tf_log: False
use_vae: False
which_epoch: latest
z_dim: 256
----------------- End -------------------
train.py --name Anime --dataset_mode custom --label_dir datasets\Anime\label --image_dir datasets\Anime\train --label_nc 5 --no_instance
dataset [CustomDataset] of size 6078 was created
Network [SPADEGenerator] was created. Total number of parameters: 92.1 million. To see the architecture, do print(network).
Network [MultiscaleDiscriminator] was created. Total number of parameters: 5.5 million. To see the architecture, do print(network).
create web directory ./checkpoints\Anime\web...
C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\modules\upsampling.py:129: UserWarning: nn.Upsample is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.{} is deprecated. Use nn.functional.interpolate instead.".format(self.name))
C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\functional.py:1320: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
(epoch: 1, iters: 100, time: 0.356) GAN: 0.590 GAN_Feat: 13.989 VGG: 0.132 D_Fake: 0.920 D_real: 0.794
(epoch: 1, iters: 200, time: 0.360) GAN: 1.309 GAN_Feat: 21.084 VGG: 41.610 D_Fake: 0.609 D_real: 0.100
(epoch: 1, iters: 300, time: 0.364) GAN: 1.374 GAN_Feat: 21.665 VGG: 3.995 D_Fake: 0.471 D_real: 0.459
(epoch: 1, iters: 400, time: 0.360) GAN: -0.204 GAN_Feat: 20.713 VGG: 0.708 D_Fake: 0.633 D_real: 0.532
(epoch: 1, iters: 500, time: 0.362) GAN: 1.410 GAN_Feat: 21.667 VGG: 0.132 D_Fake: 0.464 D_real: 0.544
(epoch: 1, iters: 600, time: 0.371) GAN: 1.970 GAN_Feat: 13.373 VGG: 0.131 D_Fake: 0.572 D_real: 0.538
(epoch: 1, iters: 700, time: 0.364) GAN: 1.761 GAN_Feat: 34.138 VGG: 0.132 D_Fake: 0.562 D_real: 0.549
(epoch: 1, iters: 800, time: 0.368) GAN: 0.355 GAN_Feat: 29.647 VGG: 0.112 D_Fake: 0.450 D_real: 0.566
(epoch: 1, iters: 900, time: 0.365) GAN: 2.103 GAN_Feat: 21.477 VGG: 7.289 D_Fake: 0.458 D_real: 0.575
(epoch: 1, iters: 1000, time: 0.361) GAN: 2.142 GAN_Feat: 21.897 VGG: 5.843 D_Fake: 1.034 D_real: 0.550
(epoch: 1, iters: 1100, time: 0.363) GAN: 0.164 GAN_Feat: 21.554 VGG: 0.710 D_Fake: 1.296 D_real: 0.575
(epoch: 1, iters: 1200, time: 0.361) GAN: 2.531 GAN_Feat: 21.417 VGG: 0.132 D_Fake: 1.005 D_real: 0.563
(epoch: 1, iters: 1300, time: 0.360) GAN: 2.437 GAN_Feat: 12.851 VGG: 0.131 D_Fake: 0.474 D_real: 2.082
(epoch: 1, iters: 1400, time: 0.366) GAN: 1.888 GAN_Feat: 47.128 VGG: 0.132 D_Fake: 0.454 D_real: 0.131
(epoch: 1, iters: 1500, time: 0.370) GAN: 1.848 GAN_Feat: 35.804 VGG: 0.116 D_Fake: 0.492 D_real: 0.617
(epoch: 1, iters: 1600, time: 0.362) GAN: 3.135 GAN_Feat: 22.408 VGG: 10.954 D_Fake: 0.377 D_real: 0.622
(epoch: 1, iters: 1700, time: 0.362) GAN: 2.834 GAN_Feat: 21.910 VGG: 5.201 D_Fake: 0.456 D_real: 0.213
(epoch: 1, iters: 1800, time: 0.364) GAN: 2.337 GAN_Feat: 22.842 VGG: 0.707 D_Fake: 0.933 D_real: 0.644
(epoch: 1, iters: 1900, time: 0.379) GAN: 2.430 GAN_Feat: 21.782 VGG: 0.132 D_Fake: 0.346 D_real: 0.646
(epoch: 1, iters: 2000, time: 0.366) GAN: 2.908 GAN_Feat: 13.164 VGG: 0.131 D_Fake: 0.958 D_real: 1.294
(epoch: 1, iters: 2100, time: 0.362) GAN: 2.578 GAN_Feat: 66.972 VGG: 0.132 D_Fake: 0.429 D_real: 0.125
(epoch: 1, iters: 2200, time: 0.363) GAN: 2.664 GAN_Feat: 51.285 VGG: 0.117 D_Fake: 0.323 D_real: 0.673
(epoch: 1, iters: 2300, time: 0.364) GAN: 2.826 GAN_Feat: 24.286 VGG: 27.855 D_Fake: 0.409 D_real: 0.231
(epoch: 1, iters: 2400, time: 0.366) GAN: -140.364 GAN_Feat: 24.303 VGG: 8.292 D_Fake: 0.296 D_real: 0.715
(epoch: 1, iters: 2500, time: 0.371) GAN: 2.844 GAN_Feat: 23.004 VGG: 0.710 D_Fake: 0.505 D_real: 4.202
(epoch: 1, iters: 2600, time: 0.365) GAN: 2.341 GAN_Feat: 22.577 VGG: 0.132 D_Fake: 0.301 D_real: 0.723
(epoch: 1, iters: 2700, time: 0.364) GAN: 2.723 GAN_Feat: 12.803 VGG: 0.131 D_Fake: 0.637 D_real: 0.710
(epoch: 1, iters: 2800, time: 0.364) GAN: 3.091 GAN_Feat: 79.301 VGG: 0.132 D_Fake: 0.350 D_real: 0.701
(epoch: 1, iters: 2900, time: 0.371) GAN: 1.558 GAN_Feat: 66.336 VGG: 0.117 D_Fake: 0.262 D_real: 0.749
(epoch: 1, iters: 3000, time: 0.367) GAN: 3.037 GAN_Feat: 25.476 VGG: 42.269 D_Fake: 0.258 D_real: 0.743
(epoch: 1, iters: 3100, time: 0.364) GAN: 2.946 GAN_Feat: 24.450 VGG: 7.939 D_Fake: 0.361 D_real: 0.743
(epoch: 1, iters: 3200, time: 0.367) GAN: 2.855 GAN_Feat: 25.554 VGG: 0.710 D_Fake: 0.899 D_real: 0.990
(epoch: 1, iters: 3300, time: 0.366) GAN: 2.983 GAN_Feat: 24.364 VGG: 0.132 D_Fake: 0.195 D_real: 1.252
(epoch: 1, iters: 3400, time: 0.366) GAN: 3.234 GAN_Feat: 13.203 VGG: 0.131 D_Fake: 0.331 D_real: 0.807
(epoch: 1, iters: 3500, time: 0.365) GAN: 3.182 GAN_Feat: 95.970 VGG: 0.132 D_Fake: 0.320 D_real: 0.708
(epoch: 1, iters: 3600, time: 0.360) GAN: 2.633 GAN_Feat: 73.374 VGG: 0.117 D_Fake: 0.206 D_real: 0.791
(epoch: 1, iters: 3700, time: 0.365) GAN: 2.849 GAN_Feat: 27.089 VGG: 98.349 D_Fake: 0.314 D_real: 0.118
(epoch: 1, iters: 3800, time: 0.363) GAN: 3.041 GAN_Feat: 26.040 VGG: 33.939 D_Fake: 0.866 D_real: 0.793
(epoch: 1, iters: 3900, time: 0.364) GAN: 2.897 GAN_Feat: 27.133 VGG: 0.711 D_Fake: 0.294 D_real: 2.774
(epoch: 1, iters: 4000, time: 0.362) GAN: 2.359 GAN_Feat: 24.519 VGG: 0.132 D_Fake: 0.297 D_real: 0.283
(epoch: 1, iters: 4100, time: 0.363) GAN: 2.644 GAN_Feat: 14.003 VGG: 0.131 D_Fake: 0.196 D_real: 0.827
(epoch: 1, iters: 4200, time: 0.362) GAN: 2.373 GAN_Feat: 110.309 VGG: 0.132 D_Fake: 0.284 D_real: 0.831
(epoch: 1, iters: 4300, time: 0.363) GAN: 2.907 GAN_Feat: 82.215 VGG: 0.117 D_Fake: 0.167 D_real: 0.838
(epoch: 1, iters: 4400, time: 0.370) GAN: 178.683 GAN_Feat: 28.263 VGG: 339.884 D_Fake: 0.275 D_real: 0.797
(epoch: 1, iters: 4500, time: 0.364) GAN: 2.094 GAN_Feat: 27.223 VGG: 89.174 D_Fake: 0.190 D_real: 0.841
(epoch: 1, iters: 4600, time: 0.364) GAN: -6.547 GAN_Feat: 29.395 VGG: 0.712 D_Fake: 0.805 D_real: 0.807
(epoch: 1, iters: 4700, time: 0.365) GAN: 2.897 GAN_Feat: 27.192 VGG: 0.132 D_Fake: 0.819 D_real: 0.795
(epoch: 1, iters: 4800, time: 0.363) GAN: 2.807 GAN_Feat: 14.091 VGG: 0.131 D_Fake: 23.561 D_real: 106.816
(epoch: 1, iters: 4900, time: 0.363) GAN: 3.097 GAN_Feat: 116.728 VGG: 0.132 D_Fake: 0.305 D_real: 0.380
(epoch: 1, iters: 5000, time: 0.361) GAN: 3.865 GAN_Feat: 85.325 VGG: 0.117 D_Fake: 0.184 D_real: 0.628
saving the latest model (epoch 1, total_steps 5000)
Saved current iteration count at ./checkpoints\Anime\iter.txt.
Traceback (most recent call last):
File "train.py", line 43, in <module>
trainer.run_discriminator_one_step(data_i)
File "D:\1A\TEST\SPADE-master\SPADE-master\trainers\pix2pix_trainer.py", line 44, in run_discriminator_one_step
d_losses = self.pix2pix_model(data, mode='discriminator')
File "C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\parallel\data_parallel.py", line 141, in forward
return self.module(*inputs[0], **kwargs[0])
File "C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "D:\1A\TEST\SPADE-master\SPADE-master\models\pix2pix_model.py", line 50, in forward
input_semantics, real_image)
File "D:\1A\TEST\SPADE-master\SPADE-master\models\pix2pix_model.py", line 168, in compute_discriminator_loss
fake_image, _ = self.generate_fake(input_semantics, real_image)
File "D:\1A\TEST\SPADE-master\SPADE-master\models\pix2pix_model.py", line 195, in generate_fake
fake_image = self.netG(input_semantics, z=z)
File "C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "D:\1A\TEST\SPADE-master\SPADE-master\models\networks\generator.py", line 91, in forward
x = self.head_0(x, seg)
File "C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "D:\1A\TEST\SPADE-master\SPADE-master\models\networks\architecture.py", line 53, in forward
dx = self.conv_0(self.actvn(self.norm_0(x, seg)))
File "C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\modules\module.py", line 485, in __call__
hook(self, input)
File "C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\utils\spectral_norm.py", line 100, in __call__
setattr(module, self.name, self.compute_weight(module, do_power_iteration=module.training))
File "C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\utils\spectral_norm.py", line 86, in compute_weight
sigma = torch.dot(u, torch.mv(weight_mat, v))
RuntimeError: cublas runtime error : an access to GPU memory space failed at C:/a/w/1/s/windows/pytorch/aten/src/THC/THCBlas.cu:21

(Pytorch3.6) D:\1A\TEST\SPADE-master\SPADE-master>`
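
One thing that may be worth ruling out with a custom dataset is a label map that contains IDs outside the range implied by `--label_nc 5`. With `contain_dontcare_label: False`, every pixel in the label images would need to be in 0..4. The log above does not confirm that this is the cause, so the following is only a minimal diagnostic sketch. The function names and the demo data are hypothetical; in practice the pixel values would be read from the PNG files under `datasets\Anime\label`.

```python
from typing import Dict, Iterable, List, Set


def out_of_range_labels(pixels: Iterable[int], label_nc: int) -> Set[int]:
    """Return the distinct label IDs that fall outside [0, label_nc)."""
    return {p for p in set(pixels) if not 0 <= p < label_nc}


def check_label_maps(label_maps: Dict[str, Iterable[int]],
                     label_nc: int) -> Dict[str, List[int]]:
    """Map each file name to its sorted out-of-range IDs.

    Files whose labels are all valid are left out of the report.
    """
    report: Dict[str, List[int]] = {}
    for name, pixels in label_maps.items():
        bad = out_of_range_labels(pixels, label_nc)
        if bad:
            report[name] = sorted(bad)
    return report


if __name__ == "__main__":
    # Hypothetical flattened label maps standing in for real label images.
    demo = {
        "a.png": [0, 1, 4, 4],   # all IDs valid for label_nc=5
        "b.png": [0, 5, 255],    # 5 and 255 are out of range
    }
    print(check_label_maps(demo, label_nc=5))
```

An empty report would suggest the label IDs are not the problem, in which case the sharply rising loss values late in the log (for example at iterations 4400 and 4800) may be the more useful lead.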
