Transform FD broadcast objs to use MArrays #1763

Draft
wants to merge 2 commits into
base: main


charleskawczynski
Member

Similar to #1735, this PR transforms our broadcast expressions by replacing array-backed DataLayouts with MArray-backed DataLayouts in each column of the stencil kernels.

The main benefit is that each field we transform incurs at most one heap memory read. One downside is that those reads become unconditional. I'll update #1746 with more details on these trade-offs and on the design choices we can make to ensure that we only see performance gains, without regressions.
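To make the read-count argument concrete, here is a minimal toy model in Python. It is not ClimaCore or StaticArrays code: `CountingColumn`, `stencil_direct`, and `stencil_local` are hypothetical names, and a counter on element access stands in for heap memory reads. It only sketches why copying a column into a local buffer first caps the reads at one per element, whereas a 3-point stencil applied directly reads most elements three times.

```python
class CountingColumn:
    """Hypothetical stand-in for a heap-backed column; counts element reads."""

    def __init__(self, values):
        self._values = list(values)
        self.reads = 0

    def __len__(self):
        return len(self._values)

    def __getitem__(self, i):
        self.reads += 1  # every element access models one heap read
        return self._values[i]


def stencil_direct(col):
    # 3-point second difference on interior points, reading the column each time.
    return [col[i + 1] - 2 * col[i] + col[i - 1] for i in range(1, len(col) - 1)]


def stencil_local(col):
    # Copy the column once into a local buffer (the role an MArray would play),
    # then apply the same stencil to the local copy.
    local = [col[i] for i in range(len(col))]
    return [local[i + 1] - 2 * local[i] + local[i - 1] for i in range(1, len(local) - 1)]


direct_col = CountingColumn(i * i for i in range(10))
local_col = CountingColumn(i * i for i in range(10))

# Both variants compute the same result; only the read counts differ.
assert stencil_direct(direct_col) == stencil_local(local_col)
print("direct reads:", direct_col.reads)  # 8 interior points x 3 reads = 24
print("local reads:", local_col.reads)    # one read per element = 10
```

In this model the saving grows with the stencil width, since the direct version's read count scales with the number of stencil points while the local-copy version stays at one read per element.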

Closes #1746.

@charleskawczynski
Member Author

To recap why this stalled: transforming the broadcasted object into an MArray-backed data structure forces reads of all field variables in the broadcasted object. That means that even if a variable is not used in the broadcast expression, it still incurs a read, which will tank performance.
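The cost described above can be sketched with another toy model in Python. Again, this is not the ClimaCore implementation: `CountingField`, `eager_transform`, and `kernel` are hypothetical names, and the access counter is an assumed stand-in for heap reads. The kernel only uses `u`, yet the eager transform also reads `v` and `w`.

```python
class CountingField:
    """Hypothetical heap-backed field that counts element reads."""

    def __init__(self, values):
        self._values = list(values)
        self.reads = 0

    def __len__(self):
        return len(self._values)

    def __getitem__(self, i):
        self.reads += 1
        return self._values[i]


def eager_transform(fields):
    # Copy every field into a local buffer, whether or not the kernel uses it.
    return {name: [f[i] for i in range(len(f))] for name, f in fields.items()}


def kernel(data):
    # The expression only touches "u"; "v" and "w" are never used.
    u = data["u"]
    return [u[i + 1] - u[i] for i in range(len(u) - 1)]


def total_reads(fields):
    return sum(f.reads for f in fields.values())


def make_fields():
    return {name: CountingField([1, 2, 4, 8]) for name in ("u", "v", "w")}


lazy_fields = make_fields()
lazy_result = kernel(lazy_fields)  # reads only what the expression touches

eager_fields = make_fields()
eager_result = kernel(eager_transform(eager_fields))  # reads all three fields

assert lazy_result == eager_result
print("lazy reads:", total_reads(lazy_fields))    # 2 reads x 3 points = 6, all from u
print("eager reads:", total_reads(eager_fields))  # 4 elements x 3 fields = 12
```

Here the eager version reads `u` fewer times than the lazy one (4 versus 6), but the unconditional reads of the two unused fields more than cancel that saving.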

If there were some way to determine which fields are read, and to transform only those variables, then this would be feasible, but I'm not sure that's possible.

Development

Successfully merging this pull request may close these issues.

Use shared/local memory in FD stencils kernels
1 participant