Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement SplitPytatoArrayContext that attempts a trivial parallelization strategy #216

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

kaushikcfd
Copy link
Collaborator

@kaushikcfd kaushikcfd commented Jan 17, 2023

  • CI should pass once Fix codegen for temporary variables declaration loopy#738 is in.
  • ⚠️ Concurrency across reductions is not targeted i.e. results in poor performance for computing error norms, etc. (Maybe we should fix this before merging?)
    • In the meshmode version, we desperately froze such reductions eagerly
    • Update: Added a transformation specifically for "reduce-to-scalar" operations to parallelize the reduction operation using loopy transformations.
  • We need a better name than SplitPytatoArrayContext.

@kaushikcfd kaushikcfd force-pushed the split_pytato_actx branch 5 times, most recently from 615e2b5 to 5249a38 Compare January 17, 2023 20:56
@kaushikcfd
Copy link
Collaborator Author

Making this a draft until the CI is fixed.

@kaushikcfd kaushikcfd marked this pull request as draft January 19, 2023 02:53
@kaushikcfd kaushikcfd force-pushed the split_pytato_actx branch 3 times, most recently from adf8d2b to b736f3a Compare January 19, 2023 03:25
@kaushikcfd kaushikcfd force-pushed the split_pytato_actx branch 3 times, most recently from 11583d7 to fe7517e Compare January 24, 2023 04:59
@kaushikcfd kaushikcfd force-pushed the split_pytato_actx branch 3 times, most recently from be0878b to ac39271 Compare January 28, 2023 20:46
@kaushikcfd kaushikcfd marked this pull request as ready for review January 28, 2023 21:09
@kaushikcfd kaushikcfd force-pushed the split_pytato_actx branch 2 times, most recently from 41ecb8d to bb03c86 Compare January 30, 2023 21:43
Copy link
Owner

@inducer inducer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this. LGTM other than these (fairly minor) issues.

@@ -714,7 +716,7 @@ def test_array_equal(actx_factory):
def test_array_context_einsum_array_manipulation(actx_factory, spec):
actx = actx_factory()

mat = actx.from_numpy(np.random.randn(10, 10))
mat = actx.from_numpy(np.random.randn(16, 16))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific reason why these have to be a nice power of two now?

import pytato


class SplitPytatoPyOpenCLArrayContext(PytatoPyOpenCLArrayContext):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why this thing is named "split". Maybe "generic parallelizing"/"basic parallelizing"?

Comment on lines +3 to +5
Copyright (C) 2023 Andreas Kloeckner
Copyright (C) 2022 Matthias Diener
Copyright (C) 2022 Matt Smith
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these (other than @kaushikcfd) should be U of I BOT. None of us hold copyright to our work.

@@ -0,0 +1,143 @@
"""
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add to docs somewhere?

Comment on lines +91 to +93
We deliberately avoid using :class:`pytato.transform.CombineMapper` since
the mapper's caching structure would still lead to recomputing
the union of sets for the results of a revisited node.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand this comment. Could you explain? And: shouldn't we fix this in CombineMapper (if we can)? (or at least file an issue in pytato?) (also above)


def _split_reduce_to_scalar_across_work_items(
kernel: lp.LoopKernel,
callables: Mapping[str, InKernelCallable],
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit weird that callables is needed here/by precompute_for_single_kernel.

device: "pyopencl.Device",
) -> lp.LoopKernel:

assert len({kernel.id_to_insn[insn_id].reduction_inames()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could elevate reduction inames to a property of the _LoopNest?


# collect loop nests of instructions that assign to scalars in the array
# program.
insn_id_to_loop_nest: Mapping[str, _LoopNest] = {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If insn_id_to_loop_nest isn't used, why not go straight to all_loop_nests?

Comment on lines +443 to +444
This routine **assumes** that the entrypoint in *t_unit* global
barriers inserted as per :func:`_get_call_kernel_insn_ids`.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grammar? I'm not sure I understand.

Comment on lines +497 to +499
# a mapping from shape to the available base storages from temp variables
# that were dead.
shape_to_available_base_storage: Dict[int, Set[str]] = defaultdict(set)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# a mapping from shape to the available base storages from temp variables
# that were dead.
shape_to_available_base_storage: Dict[int, Set[str]] = defaultdict(set)
# a mapping from size in bytes to the available base storages from temp variables
# that were dead.
nbytes_to_available_base_storage: Dict[int, Set[str]] = defaultdict(set)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants