--label_nc and --contain_dontcare_label #137

Open · zerolatnc opened this issue Mar 25, 2021 · 2 comments

@zerolatnc
Hi,

I have a dataset with labels 1 to 121 and an unknown label 0.

I am a little confused about what value to set --label_nc to and whether I should use the --contain_dontcare_label flag. I tried --label_nc 121 --contain_dontcare_label and it gives me a CUDA error like this:

/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [688,0,0], thread: [55,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [688,0,0], thread: [56,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [688,0,0], thread: [57,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [50,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [51,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [52,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [53,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [54,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [55,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [21,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [22,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [23,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [688,0,0], thread: [29,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
Traceback (most recent call last):
  File "train.py", line 43, in <module>
    trainer.run_generator_one_step(data_i)
  File "/home/user/work/anom-seg-eval/SynthCP/spade-caos/trainers/pix2pix_trainer.py", line 36, in run_generator_one_step
    g_losses, generated, pred_seg = self.pix2pix_model(data, mode='generator')
  File "/home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 159, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/user/work/anom-seg-eval/SynthCP/spade-caos/models/pix2pix_model.py", line 59, in forward
    generator_input, real_image)
  File "/home/user/work/anom-seg-eval/SynthCP/spade-caos/models/pix2pix_model.py", line 183, in compute_generator_loss
    input_semantics, real_image, compute_kld_loss=self.opt.use_vae)
  File "/home/user/work/anom-seg-eval/SynthCP/spade-caos/models/pix2pix_model.py", line 247, in generate_fake
    fake_image = self.netG(input_semantics, z=z)
  File "/home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/user/work/anom-seg-eval/SynthCP/spade-caos/models/networks/generator.py", line 92, in forward
    x = F.interpolate(seg, size=(self.sh, self.sw))
  File "/home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/nn/functional.py", line 3132, in interpolate
    return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
RuntimeError: CUDA error: device-side assert triggered
terminate called after throwing an instance of 'c10::Error'
  what():  CUDA error: device-side assert triggered
Exception raised from create_event_internal at /pytorch/c10/cuda/CUDACachingAllocator.cpp:687 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f21021568b2 in /home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0xad2 (0x7f21023a8952 in /home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #2: c10::TensorImpl::release_resources() + 0x4d (0x7f2102141b7d in /home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #3: <unknown function> + 0x60246a (0x7f2150b3e46a in /home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x602516 (0x7f2150b3e516 in /home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #20: __libc_start_main + 0xe7 (0x7f2153662b97 in /lib/x86_64-linux-gnu/libc.so.6)

exp1_train_spade.sh: line 17: 10074 Aborted (core dumped) python3 train.py --name caos --dataset_mode caos --dataroot ../anomaly/exp1_train_spade --label_nc 121 --contain_dontcare_label --no_instance --niter 1 --batchSize 2 --nThread 15 --gpu_ids 0 --no_html --tf_log --dataroot_source ../anomaly/exp1_train_spade --dataroot_target ../anomaly/exp1_train_spade --image_dir ../anomaly/exp1_train_spade/train/imgs --image_dir_source ../anomaly/exp1_train_spade --image_dir_target ../anomaly/exp1_train_spade --label_dir ../anomaly/exp1_train_spade/train/segs_gs --label_dir_source ../anomaly/exp1_train_spade --label_dir_target ../anomaly/exp1_train_spade --checkpoints_dir ./checkpoints_spade

I tried other combinations of --label_nc and --contain_dontcare_label, but I haven't been able to get training to run successfully. If you could provide some clarity on how these parameters should be set for a custom dataset, it would help me a lot!
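
For reference, my understanding of what the assert means (a rough sketch of the SPADE-style preprocessing, not the exact code in this fork): the label map is one-hot encoded with scatter_ into label_nc channels (plus one extra channel when --contain_dontcare_label is set), so any label value outside [0, nc) trips exactly this index-out-of-bounds check on the GPU.

```python
# Rough sketch of the one-hot step, under the assumptions above.
import torch

label_nc = 121
contain_dontcare_label = True
nc = label_nc + 1 if contain_dontcare_label else label_nc  # 122 channels here

# Toy 1x1x4x4 label map standing in for one segmentation map in the batch.
label_map = torch.randint(0, nc, (1, 1, 4, 4))

# Every value must lie in [0, nc); anything else makes the scatter_ below fail
# with the same "idx_dim >= 0 && idx_dim < index_size" device-side assert.
assert label_map.min() >= 0 and label_map.max() < nc, \
    f"label values must be in [0, {nc}), got max {label_map.max().item()}"

one_hot = torch.zeros(1, nc, 4, 4)
one_hot.scatter_(1, label_map, 1.0)   # one channel per label id
print(one_hot.sum(dim=1).unique())    # all ones: every pixel got exactly one channel
```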

@HaotianWang6897

I have faced a similar issue. Organize your label IDs as 0 to N, where N is the total number of your classes, and set all don't-care labels to 255.
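
If that convention applies here (upstream SPADE remaps the value 255 to label_nc internally before the one-hot step, though the SynthCP fork may differ), a rough remapping sketch for the dataset above (classes 1..121, unknown 0) could look like this. remap_labels is just an illustrative helper, not something from the repo:

```python
# Sketch: shift class ids down by one and mark the unknown label as 255.
import numpy as np
from PIL import Image

def remap_labels(path_in: str, path_out: str) -> None:
    """Shift class ids 1..121 down to 0..120 and mark unknown (0) as 255."""
    seg = np.array(Image.open(path_in), dtype=np.uint8)
    out = np.full_like(seg, 255)   # start with don't-care everywhere
    known = seg > 0                # original class ids 1..121
    out[known] = seg[known] - 1    # shift down to 0..120
    Image.fromarray(out).save(path_out)

# After remapping every label image, train with:
#   --label_nc 121 --contain_dontcare_label
```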

@ruibo5

ruibo5 commented Aug 2, 2023

I already reset the label pixel values: the class pixels are 16 or 05 and the background pixels are 255, but I still get the same error.
