RuntimeError: cublas runtime error : an access to GPU memory space failed at #153

Open
WNingup opened this issue Sep 27, 2021 · 0 comments

WNingup commented Sep 27, 2021

Thanks to the author for releasing the code. I am new to machine learning and recently tried to train this model on my own dataset. Training ran for 5000 iterations, then crashed with the cuBLAS error below right after the latest model was saved. Has anyone else encountered this problem? Any help is much appreciated.
`(Pytorch3.6) D:\1A\TEST\SPADE-master\SPADE-master>python train.py --name Anime --dataset_mode custom --label_dir datasets\Anime\label --image_dir datasets\Anime\train --label_nc 5 --no_instance
----------------- Options ---------------
D_steps_per_G: 1
aspect_ratio: 1.0
batchSize: 1
beta1: 0.0
beta2: 0.9
cache_filelist_read: False
cache_filelist_write: False
checkpoints_dir: ./checkpoints
contain_dontcare_label: False
continue_train: False
crop_size: 256
dataroot: ./datasets/cityscapes/
dataset_mode: custom [default: coco]
debug: False
display_freq: 100
display_winsize: 256
gan_mode: hinge
gpu_ids: 0
image_dir: datasets\Anime\train [default: None]
init_type: xavier
init_variance: 0.02
instance_dir:
isTrain: True [default: None]
label_dir: datasets\Anime\label [default: None]
label_nc: 5 [default: 13]
lambda_feat: 10.0
lambda_kld: 0.05
lambda_vgg: 10.0
load_from_opt_file: False
load_size: 286
lr: 0.0002
max_dataset_size: 9223372036854775807
model: pix2pix
nThreads: 0
n_layers_D: 4
name: Anime [default: label2coco]
ndf: 64
nef: 16
netD: multiscale
netD_subarch: n_layer
netG: spade
ngf: 64
niter: 50
niter_decay: 0
no_TTUR: False
no_flip: False
no_ganFeat_loss: False
no_html: False
no_instance: True [default: False]
no_pairing_check: False
no_vgg_loss: False
norm_D: spectralinstance
norm_E: spectralinstance
norm_G: spectralspadesyncbatch3x3
num_D: 2
num_upsampling_layers: normal
optimizer: adam
output_nc: 3
phase: train
preprocess_mode: resize_and_crop
print_freq: 100
save_epoch_freq: 10
save_latest_freq: 5000
serial_batches: False
tf_log: False
use_vae: False
which_epoch: latest
z_dim: 256
----------------- End -------------------
train.py --name Anime --dataset_mode custom --label_dir datasets\Anime\label --image_dir datasets\Anime\train --label_nc 5 --no_instance
dataset [CustomDataset] of size 6078 was created
Network [SPADEGenerator] was created. Total number of parameters: 92.1 million. To see the architecture, do print(network).
Network [MultiscaleDiscriminator] was created. Total number of parameters: 5.5 million. To see the architecture, do print(network).
create web directory ./checkpoints\Anime\web...
C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\modules\upsampling.py:129: UserWarning: nn.Upsample is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.{} is deprecated. Use nn.functional.interpolate instead.".format(self.name))
C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\functional.py:1320: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
(epoch: 1, iters: 100, time: 0.356) GAN: 0.590 GAN_Feat: 13.989 VGG: 0.132 D_Fake: 0.920 D_real: 0.794
(epoch: 1, iters: 200, time: 0.360) GAN: 1.309 GAN_Feat: 21.084 VGG: 41.610 D_Fake: 0.609 D_real: 0.100
(epoch: 1, iters: 300, time: 0.364) GAN: 1.374 GAN_Feat: 21.665 VGG: 3.995 D_Fake: 0.471 D_real: 0.459
(epoch: 1, iters: 400, time: 0.360) GAN: -0.204 GAN_Feat: 20.713 VGG: 0.708 D_Fake: 0.633 D_real: 0.532
(epoch: 1, iters: 500, time: 0.362) GAN: 1.410 GAN_Feat: 21.667 VGG: 0.132 D_Fake: 0.464 D_real: 0.544
(epoch: 1, iters: 600, time: 0.371) GAN: 1.970 GAN_Feat: 13.373 VGG: 0.131 D_Fake: 0.572 D_real: 0.538
(epoch: 1, iters: 700, time: 0.364) GAN: 1.761 GAN_Feat: 34.138 VGG: 0.132 D_Fake: 0.562 D_real: 0.549
(epoch: 1, iters: 800, time: 0.368) GAN: 0.355 GAN_Feat: 29.647 VGG: 0.112 D_Fake: 0.450 D_real: 0.566
(epoch: 1, iters: 900, time: 0.365) GAN: 2.103 GAN_Feat: 21.477 VGG: 7.289 D_Fake: 0.458 D_real: 0.575
(epoch: 1, iters: 1000, time: 0.361) GAN: 2.142 GAN_Feat: 21.897 VGG: 5.843 D_Fake: 1.034 D_real: 0.550
(epoch: 1, iters: 1100, time: 0.363) GAN: 0.164 GAN_Feat: 21.554 VGG: 0.710 D_Fake: 1.296 D_real: 0.575
(epoch: 1, iters: 1200, time: 0.361) GAN: 2.531 GAN_Feat: 21.417 VGG: 0.132 D_Fake: 1.005 D_real: 0.563
(epoch: 1, iters: 1300, time: 0.360) GAN: 2.437 GAN_Feat: 12.851 VGG: 0.131 D_Fake: 0.474 D_real: 2.082
(epoch: 1, iters: 1400, time: 0.366) GAN: 1.888 GAN_Feat: 47.128 VGG: 0.132 D_Fake: 0.454 D_real: 0.131
(epoch: 1, iters: 1500, time: 0.370) GAN: 1.848 GAN_Feat: 35.804 VGG: 0.116 D_Fake: 0.492 D_real: 0.617
(epoch: 1, iters: 1600, time: 0.362) GAN: 3.135 GAN_Feat: 22.408 VGG: 10.954 D_Fake: 0.377 D_real: 0.622
(epoch: 1, iters: 1700, time: 0.362) GAN: 2.834 GAN_Feat: 21.910 VGG: 5.201 D_Fake: 0.456 D_real: 0.213
(epoch: 1, iters: 1800, time: 0.364) GAN: 2.337 GAN_Feat: 22.842 VGG: 0.707 D_Fake: 0.933 D_real: 0.644
(epoch: 1, iters: 1900, time: 0.379) GAN: 2.430 GAN_Feat: 21.782 VGG: 0.132 D_Fake: 0.346 D_real: 0.646
(epoch: 1, iters: 2000, time: 0.366) GAN: 2.908 GAN_Feat: 13.164 VGG: 0.131 D_Fake: 0.958 D_real: 1.294
(epoch: 1, iters: 2100, time: 0.362) GAN: 2.578 GAN_Feat: 66.972 VGG: 0.132 D_Fake: 0.429 D_real: 0.125
(epoch: 1, iters: 2200, time: 0.363) GAN: 2.664 GAN_Feat: 51.285 VGG: 0.117 D_Fake: 0.323 D_real: 0.673
(epoch: 1, iters: 2300, time: 0.364) GAN: 2.826 GAN_Feat: 24.286 VGG: 27.855 D_Fake: 0.409 D_real: 0.231
(epoch: 1, iters: 2400, time: 0.366) GAN: -140.364 GAN_Feat: 24.303 VGG: 8.292 D_Fake: 0.296 D_real: 0.715
(epoch: 1, iters: 2500, time: 0.371) GAN: 2.844 GAN_Feat: 23.004 VGG: 0.710 D_Fake: 0.505 D_real: 4.202
(epoch: 1, iters: 2600, time: 0.365) GAN: 2.341 GAN_Feat: 22.577 VGG: 0.132 D_Fake: 0.301 D_real: 0.723
(epoch: 1, iters: 2700, time: 0.364) GAN: 2.723 GAN_Feat: 12.803 VGG: 0.131 D_Fake: 0.637 D_real: 0.710
(epoch: 1, iters: 2800, time: 0.364) GAN: 3.091 GAN_Feat: 79.301 VGG: 0.132 D_Fake: 0.350 D_real: 0.701
(epoch: 1, iters: 2900, time: 0.371) GAN: 1.558 GAN_Feat: 66.336 VGG: 0.117 D_Fake: 0.262 D_real: 0.749
(epoch: 1, iters: 3000, time: 0.367) GAN: 3.037 GAN_Feat: 25.476 VGG: 42.269 D_Fake: 0.258 D_real: 0.743
(epoch: 1, iters: 3100, time: 0.364) GAN: 2.946 GAN_Feat: 24.450 VGG: 7.939 D_Fake: 0.361 D_real: 0.743
(epoch: 1, iters: 3200, time: 0.367) GAN: 2.855 GAN_Feat: 25.554 VGG: 0.710 D_Fake: 0.899 D_real: 0.990
(epoch: 1, iters: 3300, time: 0.366) GAN: 2.983 GAN_Feat: 24.364 VGG: 0.132 D_Fake: 0.195 D_real: 1.252
(epoch: 1, iters: 3400, time: 0.366) GAN: 3.234 GAN_Feat: 13.203 VGG: 0.131 D_Fake: 0.331 D_real: 0.807
(epoch: 1, iters: 3500, time: 0.365) GAN: 3.182 GAN_Feat: 95.970 VGG: 0.132 D_Fake: 0.320 D_real: 0.708
(epoch: 1, iters: 3600, time: 0.360) GAN: 2.633 GAN_Feat: 73.374 VGG: 0.117 D_Fake: 0.206 D_real: 0.791
(epoch: 1, iters: 3700, time: 0.365) GAN: 2.849 GAN_Feat: 27.089 VGG: 98.349 D_Fake: 0.314 D_real: 0.118
(epoch: 1, iters: 3800, time: 0.363) GAN: 3.041 GAN_Feat: 26.040 VGG: 33.939 D_Fake: 0.866 D_real: 0.793
(epoch: 1, iters: 3900, time: 0.364) GAN: 2.897 GAN_Feat: 27.133 VGG: 0.711 D_Fake: 0.294 D_real: 2.774
(epoch: 1, iters: 4000, time: 0.362) GAN: 2.359 GAN_Feat: 24.519 VGG: 0.132 D_Fake: 0.297 D_real: 0.283
(epoch: 1, iters: 4100, time: 0.363) GAN: 2.644 GAN_Feat: 14.003 VGG: 0.131 D_Fake: 0.196 D_real: 0.827
(epoch: 1, iters: 4200, time: 0.362) GAN: 2.373 GAN_Feat: 110.309 VGG: 0.132 D_Fake: 0.284 D_real: 0.831
(epoch: 1, iters: 4300, time: 0.363) GAN: 2.907 GAN_Feat: 82.215 VGG: 0.117 D_Fake: 0.167 D_real: 0.838
(epoch: 1, iters: 4400, time: 0.370) GAN: 178.683 GAN_Feat: 28.263 VGG: 339.884 D_Fake: 0.275 D_real: 0.797
(epoch: 1, iters: 4500, time: 0.364) GAN: 2.094 GAN_Feat: 27.223 VGG: 89.174 D_Fake: 0.190 D_real: 0.841
(epoch: 1, iters: 4600, time: 0.364) GAN: -6.547 GAN_Feat: 29.395 VGG: 0.712 D_Fake: 0.805 D_real: 0.807
(epoch: 1, iters: 4700, time: 0.365) GAN: 2.897 GAN_Feat: 27.192 VGG: 0.132 D_Fake: 0.819 D_real: 0.795
(epoch: 1, iters: 4800, time: 0.363) GAN: 2.807 GAN_Feat: 14.091 VGG: 0.131 D_Fake: 23.561 D_real: 106.816
(epoch: 1, iters: 4900, time: 0.363) GAN: 3.097 GAN_Feat: 116.728 VGG: 0.132 D_Fake: 0.305 D_real: 0.380
(epoch: 1, iters: 5000, time: 0.361) GAN: 3.865 GAN_Feat: 85.325 VGG: 0.117 D_Fake: 0.184 D_real: 0.628
saving the latest model (epoch 1, total_steps 5000)
Saved current iteration count at ./checkpoints\Anime\iter.txt.
Traceback (most recent call last):
  File "train.py", line 43, in <module>
    trainer.run_discriminator_one_step(data_i)
  File "D:\1A\TEST\SPADE-master\SPADE-master\trainers\pix2pix_trainer.py", line 44, in run_discriminator_one_step
    d_losses = self.pix2pix_model(data, mode='discriminator')
  File "C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\parallel\data_parallel.py", line 141, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\1A\TEST\SPADE-master\SPADE-master\models\pix2pix_model.py", line 50, in forward
    input_semantics, real_image)
  File "D:\1A\TEST\SPADE-master\SPADE-master\models\pix2pix_model.py", line 168, in compute_discriminator_loss
    fake_image, _ = self.generate_fake(input_semantics, real_image)
  File "D:\1A\TEST\SPADE-master\SPADE-master\models\pix2pix_model.py", line 195, in generate_fake
    fake_image = self.netG(input_semantics, z=z)
  File "C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\1A\TEST\SPADE-master\SPADE-master\models\networks\generator.py", line 91, in forward
    x = self.head_0(x, seg)
  File "C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\1A\TEST\SPADE-master\SPADE-master\models\networks\architecture.py", line 53, in forward
    dx = self.conv_0(self.actvn(self.norm_0(x, seg)))
  File "C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\modules\module.py", line 485, in __call__
    hook(self, input)
  File "C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\utils\spectral_norm.py", line 100, in __call__
    setattr(module, self.name, self.compute_weight(module, do_power_iteration=module.training))
  File "C:\ProgramData\Anaconda3\envs\Pytorch3.6\lib\site-packages\torch\nn\utils\spectral_norm.py", line 86, in compute_weight
    sigma = torch.dot(u, torch.mv(weight_mat, v))
RuntimeError: cublas runtime error : an access to GPU memory space failed at C:/a/w/1/s/windows/pytorch/aten/src/THC/THCBlas.cu:21

(Pytorch3.6) D:\1A\TEST\SPADE-master\SPADE-master>`
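
In case it helps with reading the log above: most loss values stay in a narrow range, but a few iterations show isolated spikes (for example `GAN: -140.364` at iteration 2400 and `D_real: 106.816` at iteration 4800). Below is a minimal sketch for pulling such lines out of the console output, assuming the exact line format printed above. The helper names `parse_loss_line` and `find_spikes` are hypothetical and not part of SPADE, and the threshold of 100 is an arbitrary choice for illustration. I do not know whether these spikes are related to the crash.

```python
import re

# Matches the header of a loss line, e.g. "(epoch: 1, iters: 100, time: 0.356)".
HEADER = re.compile(r"\(epoch: (\d+), iters: (\d+), time: ([\d.]+)\)")
# Matches each "NAME: value" pair after the header, e.g. "GAN_Feat: 13.989".
PAIR = re.compile(r"(\w+): (-?\d+(?:\.\d+)?)")


def parse_loss_line(line):
    """Return epoch, iteration and losses for one log line, or None if it is not a loss line."""
    m = HEADER.search(line)
    if m is None:
        return None
    # Only scan the text after the header so "epoch"/"iters"/"time" are not treated as losses.
    losses = {name: float(value) for name, value in PAIR.findall(line[m.end():])}
    return {"epoch": int(m.group(1)), "iters": int(m.group(2)), "losses": losses}


def find_spikes(lines, threshold=100.0):
    """Collect (iteration, loss name, value) for every loss whose magnitude exceeds threshold."""
    spikes = []
    for line in lines:
        rec = parse_loss_line(line)
        if rec is None:
            continue
        for name, value in rec["losses"].items():
            if abs(value) > threshold:
                spikes.append((rec["iters"], name, value))
    return spikes


# Three lines copied from the log above.
sample = [
    "(epoch: 1, iters: 2300, time: 0.364) GAN: 2.826 GAN_Feat: 24.286 VGG: 27.855 D_Fake: 0.409 D_real: 0.231",
    "(epoch: 1, iters: 2400, time: 0.366) GAN: -140.364 GAN_Feat: 24.303 VGG: 8.292 D_Fake: 0.296 D_real: 0.715",
    "(epoch: 1, iters: 4800, time: 0.363) GAN: 2.807 GAN_Feat: 14.091 VGG: 0.131 D_Fake: 23.561 D_real: 106.816",
]
for iters, name, value in find_spikes(sample):
    print(iters, name, value)
```

To run it on a full session, the console output could be saved to a text file and its lines passed to `find_spikes` in place of `sample`.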
