'GaussianDiffusion' object has no attribute 'cond' when training with multi-GPU #18

Open
WillQuCD opened this issue Mar 30, 2023 · 8 comments

Comments

@WillQuCD

  File "train.py", line 320, in <module>
    main(args, configs)
  File "train.py", line 196, in main
    figs, wav_reconstruction, wav_prediction, tag = synth_one_sample(
  File "/data/workspace/liukaiyang/TTS/DiffGAN-TTS-main/utils/tools.py", line 227, in synth_one_sample
    mels = [mel_pred[0, :mel_len].float().detach().transpose(0, 1) for mel_pred in diffusion.sampling()]
  File "/root/anaconda3/envs/LKYBase/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/data/workspace/liukaiyang/TTS/DiffGAN-TTS-main/model/diffusion.py", line 157, in sampling
    b, *_, device = *self.cond.shape, self.cond.device
  File "/root/anaconda3/envs/LKYBase/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1177, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'GaussianDiffusion' object has no attribute 'cond'

Hi,
thank you very much for your great work!

The model runs well on a single GPU but hits the error above when training on multiple GPUs.

The problem only arises in the validation phase: sampling() reads self.cond, but self.cond is not defined at that point.

The same problem was reported in an earlier issue, but no solution was given.

Could you please give some hints?
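
A possible explanation, offered as a hedged sketch rather than a confirmed diagnosis: PyTorch's `nn.DataParallel` runs `forward()` on per-device replicas, and attributes assigned on a replica are not written back to the original module. Assuming `self.cond` is assigned inside `forward()` (the thread does not confirm this), the original module would never receive it on multiple GPUs. The toy below simulates that with plain Python copies; `ToyDiffusion` and `forward_on_replicas` are hypothetical names, not code from the repository. It also shows one defensive option, which is to pass the conditioning to `sampling()` explicitly.

```python
import copy


class ToyDiffusion:
    """Hypothetical stand-in for GaussianDiffusion; not the repository's code."""

    def forward(self, cond):
        # Assumption: the real forward() caches the conditioning like this.
        self.cond = cond
        return [c * 2 for c in cond]

    def sampling(self, cond=None):
        # Prefer an explicitly passed cond; fall back to a cached one if present.
        if cond is None:
            cond = getattr(self, "cond", None)
        if cond is None:
            raise AttributeError("no conditioning available: pass cond explicitly")
        return [c + 1 for c in cond]


def forward_on_replicas(module, chunks):
    # Simulates multi-device execution: each chunk is processed by a copy,
    # so attributes assigned inside forward() never reach `module` itself.
    return [copy.copy(module).forward(chunk) for chunk in chunks]


model = ToyDiffusion()
forward_on_replicas(model, [[1.0, 2.0], [3.0, 4.0]])

cached_on_original = hasattr(model, "cond")  # the copies got it, the original did not
samples = model.sampling(cond=[1.0, 2.0])    # an explicit argument still works
```

With a single device there is no replica, which would be consistent with the error not appearing on one GPU.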

@iooops

iooops commented Apr 5, 2023

Encountered the same issue

@WillQuCD
Author

@iooops any progress?

@yyh565655555

@WillQuCD Did you train the naive model? Is there any way to solve it?

@WillQuCD
Author

@yyh565655555 Yes, I trained the naive model. The error only happens during the validation phase.
One crude workaround is to run only the training phase and skip validation.
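
A minimal sketch of that workaround, assuming the training loop calls a validation routine every few steps. All names here (`run_training`, `val_step`, `run_validation`) are hypothetical and do not come from the repository's `train.py`; the idea is only to put the validation call behind a flag.

```python
def run_training(total_steps, val_step, train_one_step, validate, run_validation=True):
    """Hypothetical loop skeleton; names do not come from the repository's train.py."""
    for step in range(1, total_steps + 1):
        train_one_step(step)
        # Validation (where sampling() would be called) runs only when enabled.
        if run_validation and step % val_step == 0:
            validate(step)


# Record which steps were trained and which were validated.
trained, validated = [], []
run_training(4, 2, trained.append, validated.append, run_validation=False)
```

With the flag off, every step is still trained but the validation callback is never reached, so the failing `sampling()` call is avoided at the cost of losing validation output.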

@Joyful-Buffalo

@WillQuCD Hi, I encountered the same issue. How do I skip validation?

@yyh565655555

yyh565655555 commented Sep 14, 2023 via email

The problem comes from running this code on more than one GPU. Either select a single GPU, or add nn.DataParallel in the code.

@Joyful-Buffalo

Thank you for your response. Where in the code should it be added? This is very important to me, and I look forward to your reply.
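
For the "select a single GPU" option, one common approach is to limit which devices CUDA can see through the `CUDA_VISIBLE_DEVICES` environment variable. The sketch below is a suggestion, not something confirmed in this thread; the helper name `restrict_to_single_gpu` is hypothetical, and the variable has to be set before CUDA is initialized, so in practice at the very top of the entry script.

```python
import os


def restrict_to_single_gpu(gpu_index=0):
    """Hypothetical helper: expose only one GPU to this process."""
    # Must run before CUDA is initialized, i.e. before any CUDA call is made.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


visible = restrict_to_single_gpu(0)
```

Setting the same variable in the shell when launching the training script has the same effect and needs no code change.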

@wangxuanji

Has anyone solved it? What should be done?
