
A question about training #73

Open · Jamie-Cheung opened this issue Mar 13, 2022 · 3 comments

@Jamie-Cheung
[Epoch 20/2558] [Batch 750/782] [D loss: 1.229028] [G loss: -36.288643] [ema: 0.999577]
100%|██████████████████████████████████████████████████████████| 782/782 [04:56<00:00, 2.64it/s]
100%|██████████████████████████████████████████████████████████| 782/782 [04:47<00:00, 2.72it/s]
INFO:functions:=> calculate inception score
=> calculate inception score
Inception score: 0
=> calculate fid score
0%| | 0/6250 [00:00<?, ?it/s]
I resumed training the experiment on 2 × 3090 GPUs, but it has been stuck at this point for several hours, and the two 3090s are running at only half utilization. I want to know whether this is normal. Thank you for your contribution; it has helped me a lot.

@yifanjiang19
Contributor

Sorry, I cannot understand the description. Do you mean your program has been stuck for several hours?

@Jamie-Cheung
Author

Yes, the code seems to be stuck at "=> calculate fid score / 0%| | 0/6250 [00:00<?, ?it/s]", but the GPU and CPU are still in use. Is it because 2 × 3090 is not enough to calculate FID?

@yifanjiang19
Contributor

I would suggest disabling the FID calculation during training and launching a separate job for evaluation only.
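
If it helps, here is a minimal sketch of what such a separate evaluation job could look like, using the standalone pytorch-fid package rather than this repository's own FID code. The sample directories (./samples/real and ./samples/fake) are placeholders and assume the generated images have already been saved to disk by the training run:

```python
# Standalone FID evaluation sketch using the pytorch-fid package
# (https://github.com/mseitzer/pytorch-fid), run as a separate job
# after training instead of inside the training loop.
# Assumptions: ./samples/real holds real images and ./samples/fake
# holds generated images; both paths are placeholders, not paths
# taken from this repository.
import torch
from pytorch_fid.fid_score import calculate_fid_given_paths

def main():
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    fid = calculate_fid_given_paths(
        paths=["./samples/real", "./samples/fake"],  # [real dir, generated dir]
        batch_size=50,   # smaller batches keep GPU memory pressure low
        device=device,
        dims=2048,       # default InceptionV3 pool3 feature dimension
    )
    print(f"FID: {fid:.3f}")

if __name__ == "__main__":
    main()
```

This way the evaluation can be launched independently of the training job (e.g. python eval_fid.py), so the training loop only logs losses and never blocks on FID.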
