Discussed in https://github.com/orgs/LuxDL/discussions/863
Originally posted by jakubMitura14 August 30, 2024
Hello, is it possible to profile a model per layer, i.e. to monitor how long each layer takes to execute and, preferably, the maximum GPU memory consumption of each layer?
Originally posted by avik-pal August 30, 2024
I have wanted to build something like this for quite some time, and it isn't going to be very hard. I like how easy it is to generate flamegraphs in Julia (especially in VS Code), but I agree that they don't present the data at the higher level of granularity that is helpful for end users.
The general sketch would be similar to how DebugLayer is implemented. Essentially, @profile_mode model would wrap each "leaf" layer in a ProfiledLayer(...) that stores a shared timer_output (from TimerOutputs.jl); this gives a tree view of where all the time went. A minimal sketch is shown below.
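Here is a minimal sketch of that idea, assuming Lux v1's AbstractLuxLayer interface and TimerOutputs.jl. ProfiledLayer follows the naming above, and profile_layers is a hypothetical stand-in for what @profile_mode would generate; neither is an existing Lux API:

```julia
using Lux, TimerOutputs, Random

# Hypothetical wrapper layer: forwards to the wrapped layer while timing the
# call. All instances share one TimerOutput, so the results form a single tree.
struct ProfiledLayer{L} <: Lux.AbstractLuxLayer
    layer::L
    label::String
    to::TimerOutput
end

# Delegate parameter/state initialization to the wrapped layer.
Lux.initialparameters(rng::Random.AbstractRNG, p::ProfiledLayer) =
    Lux.initialparameters(rng, p.layer)
Lux.initialstates(rng::Random.AbstractRNG, p::ProfiledLayer) =
    Lux.initialstates(rng, p.layer)

# Time the wrapped layer's forward pass under its label.
function (p::ProfiledLayer)(x, ps, st)
    y_st = @timeit p.to p.label Lux.apply(p.layer, x, ps, st)
    return y_st
end

# Hypothetical helper standing in for @profile_mode: wrap every leaf of a Chain.
function profile_layers(model::Chain, to::TimerOutput)
    wrapped = map(enumerate(values(model.layers))) do (i, l)
        ProfiledLayer(l, "layer_$(i)_$(nameof(typeof(l)))", to)
    end
    return Chain(wrapped...)
end
```

Usage would then look like:

```julia
to = TimerOutput()
model = profile_layers(Chain(Dense(784 => 256, relu), Dense(256 => 10)), to)
ps, st = Lux.setup(Random.default_rng(), model)
y, _ = model(randn(Float32, 784, 32), ps, st)
show(to)  # prints the per-layer timing tree
```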
This can be further augmented with other profiling hooks, e.g. memory usage (CPU / GPU); a rough sketch follows. A good source of inspiration would be how the PyTorch profiler works.
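For the GPU side, one rough approach is to sample CUDA.jl's free-memory counter around each forward pass. timed_apply below is a hypothetical helper, and the numbers are only indicative, since CUDA.jl's caching pool can satisfy allocations without changing device free memory:

```julia
using Lux, CUDA, TimerOutputs

# Hypothetical extension of the timing wrapper: also record the drop in free
# GPU memory across the layer's forward pass. Approximate only, because the
# caching allocator may reuse pooled memory.
function timed_apply(to::TimerOutput, label::String, layer, x, ps, st)
    free_before = CUDA.available_memory()
    y_st = @timeit to label Lux.apply(layer, x, ps, st)
    used = free_before - CUDA.available_memory()
    used > 0 && @info "approximate GPU memory delta" label bytes=used
    return y_st
end
```

A real implementation would more likely hook the allocator itself (as the PyTorch profiler does) to obtain true per-layer peaks.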