Poor scaling with many calls to add_prefetch #756
Comments
Passing |
Also, if the workload comes from Mirge-Com, it might be useful to evaluate whether such large batched einsums are relevant. See illinois-ceesd/mirgecom#777 for context.
Good point. I'll dump out the kernels for the current y3 driver to see if anything has changed in terms of batch sizes. In any case, just setting
|
Even with #755, attempting to prefetch many arrays scales poorly. By the 19th add_prefetch operation, a single add_prefetch call takes around 5 seconds to complete on one fused Mirgecom kernel with 100+ einsums. Profiling shows that a lot of time is spent in get_grid_sizes_for_insn_ids_as_dicts.
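For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of a harness that times each successive transformation call and profiles a single call with cProfile. It is not the script used for the numbers above. The helper names (`time_successive_calls`, `profile_call`) and the stand-in `fake_transform` are hypothetical; in a real run the transform would wrap the actual add_prefetch call with whatever arguments the kernel needs.

```python
import cProfile
import io
import pstats
import time


def time_successive_calls(transform, state, names):
    """Apply transform(state, name) for each name, recording wall time per call."""
    timings = []
    for name in names:
        start = time.perf_counter()
        state = transform(state, name)
        timings.append((name, time.perf_counter() - start))
    return state, timings


def profile_call(func, *args, top=10):
    """Run func(*args) under cProfile and return (result, report text).

    The report lists the `top` entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buf = io.StringIO()
    pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(top)
    return result, buf.getvalue()


# Hypothetical stand-in for a kernel transformation (NOT a loopy API).
# It re-scans everything accumulated so far, so each call costs more
# than the previous one, mimicking the scaling pattern described above.
def fake_transform(state, name):
    _ = sorted(state)
    return state + [name]


if __name__ == "__main__":
    names = [f"array_{i}" for i in range(20)]
    state, timings = time_successive_calls(fake_transform, [], names)
    for name, seconds in timings:
        print(f"{name}: {seconds:.6f} s")

    # Profile one additional call to see where the time goes.
    _, report = profile_call(fake_transform, state, "array_20")
    print(report)
```

If the per-call times grow with the call index, the profile of a late call is the most informative one to inspect, since that is where the dominant function shows up in the cumulative-time column.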