Is this library parallelized? #165
Hi @ch21d012, yes. Whilst FTorch itself has no MPI components, we have run several applications that are large MPI codebases on HPC systems. To do this we create a net and tensors on each process, then run inference on each process. The most common cause of slowdown is reading in the net at every iteration/timestep of your code, as this is expensive. I'm hoping to write a worked example around this when I get time. Please let us know if this helps, and ask any further questions.
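The per-process, load-once pattern described above might look roughly like the sketch below. This is only an illustration: the exact FTorch names (`torch_model_load`, `torch_tensor_from_array`, `torch_model_forward`, `torch_delete`, `torch_kCPU`) should be checked against the version of the library you are using (older releases used `torch_module_*` names), and `model.pt`, `sol` and `src` are placeholder names.

```fortran
! Sketch only: per-rank, load-once inference with FTorch under MPI.
! API names are assumptions -- check them against your FTorch version.
program mpi_inference
   use mpi
   use ftorch
   implicit none

   type(torch_model)   :: model
   type(torch_tensor)  :: in_tensors(1), out_tensors(1)
   real, dimension(10) :: sol, src   ! placeholder arrays
   integer :: ierr, step

   call MPI_Init(ierr)

   ! Load the TorchScript model ONCE per rank, at initialisation,
   ! never inside the timestepping loop.
   call torch_model_load(model, "model.pt")

   do step = 1, 1000
      ! Wrap the existing Fortran arrays as Torch tensors (cheap; the
      ! expensive model load happens only once, above).
      call torch_tensor_from_array(in_tensors(1), sol, [1], torch_kCPU)
      call torch_tensor_from_array(out_tensors(1), src, [1], torch_kCPU)
      call torch_model_forward(model, in_tensors, out_tensors)
      call torch_delete(in_tensors(1))
      call torch_delete(out_tensors(1))
   end do

   ! Delete the model ONCE, at finalisation.
   call torch_delete(model)
   call MPI_Finalize(ierr)
end program mpi_inference
```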
This is my module. I am loading the model only once, at the start of the simulation in the initialization, by calling read_pt_file, and deleting it at the end of the simulation with delete_pt_file.
I am calling ann_get_src_pt in the above loop, where it gives me the sol values. This code is parallelized and each loop iteration runs on a different processor.
Dear @ch21d012, in order for us to help more, we would need some timing information. Could you please provide timings for the original code on 1 and 4 processes, and then similar timings for running the code with the PyTorch model. You can time the whole code, but it would also be handy to know how long the inference step itself takes.
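To collect the per-step timings being asked for, the inference call could be wrapped with the standard MPI wall-clock timer, along these lines (a sketch; `ann_get_src_pt`, `sol` and `dt` are taken from the code discussed above, and `MPI_Wtime` is the standard MPI timing routine):

```fortran
! Time just the inference call on each rank with MPI's wall clock.
real(kind=8) :: t_start, t_end

t_start = MPI_Wtime()
call ann_get_src_pt(sol, dt)
t_end = MPI_Wtime()

! Print the elapsed time on every rank; compare across 1 vs 4 processes.
print *, "time for ann_get_src_pt (s): ", t_end - t_start
```

Printing this from every rank makes it easy to spot whether the slowdown is uniform across processes or concentrated on one of them.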
Dear @TomMelt, thanks for the reply. My original code takes the same time on 1 and 4 processors, i.e. 1e-3, when calling "call ann_get_src_pt(sol,dt)". But when I run with the PyTorch model, the time on 1 processor is 1e-3, while on 4 processors it increases to 5e-3, and the increase is at the ann_get_src_pt step only.
Hi @ch21d012, I think it might be easiest if we meet virtually so we can discuss this problem in more detail.
Hi, I am using the FTorch library to perform tensor operations in my Fortran code to improve its speed. My code is parallelized using MPI, and when I run it on multiple processors using this library it is slow. So I want to know whether this library can be used in a parallel environment or not?
Thanks in advance