Training becomes very slow with these transforms. #426
Comments
Hi, could you include a snippet of the code showing how you are using Augraphy in your training?
Sure, the flow is as follows: importing the transforms, creating a wrapper around them, and then composing the transforms.
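A minimal sketch of such an import-wrap-compose flow, assuming a PyTorch/torchvision pipeline; the `AugraphyWrapper` class and the particular augmentations are illustrative assumptions, not the commenter's actual code:

```python
import numpy as np
from PIL import Image
from torchvision import transforms
from augraphy import InkBleed, DirtyDrum  # example Augraphy augmentations

class AugraphyWrapper:
    """Adapts an Augraphy augmentation to a torchvision-style transform.

    Augraphy augmentations operate on numpy arrays, while torchvision
    pipelines usually pass PIL images, so convert on the way in and out.
    """

    def __init__(self, augmentation):
        self.augmentation = augmentation

    def __call__(self, img):
        arr = np.asarray(img)         # PIL -> numpy
        out = self.augmentation(arr)  # apply the Augraphy augmentation
        if out is None:               # some Augraphy versions return None when the p-check skips
            return img
        return Image.fromarray(out)   # numpy -> PIL

# Compose the wrapped Augraphy augmentations with the rest of the pipeline.
train_transform = transforms.Compose([
    AugraphyWrapper(InkBleed(p=0.2)),
    AugraphyWrapper(DirtyDrum(p=0.1)),
    transforms.ToTensor(),
])
```

Keeping the conversion inside the wrapper means every Augraphy call runs inside the dataloader workers, which is relevant to the threading discussion later in the thread.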
By looking at the benchmark results:
I did that; I only considered the augmentation, for which
Could you let me know roughly what your image size is? Then I can try to reproduce this with the code above on my end too.
Thanks. The image size is around
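A sketch of how one might time a single Augraphy augmentation in isolation, to separate augmentation cost from dataloader and GPU overhead; the augmentation choice and the 1200x900 size are placeholder assumptions:

```python
import time
import numpy as np
from augraphy import InkBleed

# Placeholder document-like image; substitute the real training image size.
image = np.random.randint(0, 256, size=(900, 1200, 3), dtype=np.uint8)
aug = InkBleed(p=1)  # p=1 forces the augmentation to run on every call

runs = 10
start = time.perf_counter()
for _ in range(runs):
    aug(image)
print(f"average per image: {(time.perf_counter() - start) / runs:.3f}s")
```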
I have narrowed down the issue. As I mentioned, I limit the threads, but the thread limiting is somehow not working for the subprocesses. Hence, fetching data from the dataloader is still slow, as it spawns multiple workers.
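One common way to make thread limits stick inside DataLoader worker subprocesses is a `worker_init_fn` that re-applies the limits in each worker; this is a sketch of that pattern, not necessarily what was used here:

```python
import cv2
import torch
from torch.utils.data import DataLoader, TensorDataset

def limit_worker_threads(worker_id):
    # Thread caps set in the parent process do not automatically carry over
    # to forked/spawned DataLoader workers, so re-apply them per worker.
    cv2.setNumThreads(0)      # keep OpenCV (used heavily by Augraphy) single-threaded
    torch.set_num_threads(1)  # one intra-op thread per worker process

dataset = TensorDataset(torch.randn(128, 3, 224, 224))  # stand-in dataset

loader = DataLoader(
    dataset,
    batch_size=32,
    num_workers=4,
    worker_init_fn=limit_worker_threads,
)
```

Each worker process otherwise brings up its own OpenCV/OpenMP thread pool, and `num_workers` times that quickly oversubscribes the cores, which would match the very high load average seen in `htop`.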
Okay, and it looks like the code you provided above is not complete; what would this be? Also, on your 20-core machine, do you use multiple GPUs too?
A minor correction to the code. This is how the transforms are defined:
This is how the transform config looks:
We use 2 GPUs to train.
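The definitions and config themselves are not included above; the following is a hedged reconstruction of what a config-driven setup could look like, reusing the hypothetical `AugraphyWrapper` from the earlier sketch (all names and probabilities are assumptions):

```python
from torchvision import transforms
from augraphy import DirtyDrum, InkBleed, LowInkRandomLines

# Hypothetical transform config: augmentation name -> application probability.
transform_config = {
    "InkBleed": 0.2,
    "LowInkRandomLines": 0.1,
    "DirtyDrum": 0.1,
}

AUGMENTATION_CLASSES = {
    "InkBleed": InkBleed,
    "LowInkRandomLines": LowInkRandomLines,
    "DirtyDrum": DirtyDrum,
}

def build_transforms(config):
    # Instantiate each configured Augraphy augmentation, wrap it so it fits
    # a torchvision pipeline, and compose with the rest of the transforms.
    augs = [
        AugraphyWrapper(AUGMENTATION_CLASSES[name](p=prob))
        for name, prob in config.items()
    ]
    return transforms.Compose(augs + [transforms.ToTensor()])

train_transform = build_transforms(transform_config)
```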
I tried it on Colab with an image size of
Here are the notebooks:
Probably there's some overhead in the custom augmentation functions with multiple GPUs or multiple cores. Have you tried other custom augmentation functions instead of Augraphy?
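One way to act on that suggestion is to swap in a trivial numpy-only augmentation as a control: if training speed recovers with it, the overhead lies in the Augraphy (OpenCV-heavy) augmentations rather than in the pipeline itself. This control class is purely illustrative, not something from the thread:

```python
import numpy as np

class RandomBrightness:
    """Trivial numpy-only augmentation used as a speed control."""

    def __init__(self, p=0.2, delta=30):
        self.p = p
        self.delta = delta

    def __call__(self, image):
        # Apply a random brightness shift with probability p.
        if np.random.rand() >= self.p:
            return image
        shift = np.random.randint(-self.delta, self.delta + 1)
        out = np.asarray(image, dtype=np.int16) + shift
        return np.clip(out, 0, 255).astype(np.uint8)
```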
I am training a CNN model with Augraphy and some other transforms. When I include just 4-5 Augraphy transforms with 10-20% probability, my training becomes ~10 times slower. When I checked `htop`, I noticed that the load average was shooting up very high when using these transforms. I tried a few things, such as reducing `num_workers`, but nothing helped speed up the training. Please guide me on how I can overcome this issue.