New and Improved MapFusion #1629

philip-paul-mueller · 2024-08-22T13:54:43Z

A new and improved version of the map fusion transformation.
The transformation is implemented in a class named MapFusionSerial. In addition, the MapFusionParallel transformation is added, which allows fusing parallel maps together.
The new transformation analyses the graph more carefully when it checks if and how it should perform the fusing.
Special consideration was given to the correction of memlets.
However, there are still some aspects that should be improved and some cases that it should be able to handle.

The new transformation produces graphs that are slightly different from before, and certain (other) transformations can not handle the resulting SDFG. For that reason a compatibility flag strict_dataflow was introduced. However, by default this flag is disabled. The only place where it is activated is inside the auto optimization function.

Furthermore, the SDFGState._read_and_write_sets() function has been rewritten to handle the new SDFGs, because it contained some bugs. However, one bug has been kept because other transformations would fail otherwise.
It is nevertheless a bug, and tests were written to demonstrate this.

Collection of known issues in other transformations:

Now using the 3.9 type hints.

But it is too restrictive.

When the function was fixing the interior of the second map, it did not remove the reading.

It passes almost all functions. However, the ones that need renaming are not yet done.

…t in the input and output set. However, it is very simple.

Before it was going to look for the memlet of the consumer or producer. However, one should actually only look at the memlets that are adjacent to the scope node. At least this is how the original worked. I noticed this because of the `buffer_tiling_test.py::test_basic()` test. I was not yet focused on maps that were nested and not multidimensional. It seems that the transformation has some problems there.

When it now checks for covering (i.e. whether the information to exchange is sufficient), it no longer descends into the maps, but only inspects the first outgoing/incoming edges of the map entry and exit. I noticed that the other way was too restrictive, especially for map tiling.
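To make the notion of covering concrete, here is a minimal sketch in plain Python. It does not use the DaCe API: `covers` and the tuple-based ranges are hypothetical stand-ins for the subsets found on the edges adjacent to the scope nodes, and the numbers are made up for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: one inclusive (start, stop) pair per dimension.
Range = Tuple[int, int]


def covers(produced: List[Range], consumed: List[Range]) -> bool:
    """Return True if `produced` contains `consumed` in every dimension."""
    if len(produced) != len(consumed):
        return False
    return all(p_lo <= c_lo and c_hi <= p_hi
               for (p_lo, p_hi), (c_lo, c_hi) in zip(produced, consumed))


# Subset written by the first map (illustrative values).
written = [(0, 9), (0, 19)]

print(covers(written, [(0, 9), (0, 19)]))   # exact match
print(covers(written, [(2, 5), (0, 19)]))   # strict sub-range
print(covers(written, [(0, 10), (0, 19)]))  # reads one element too many
```

Only the subsets on the edges directly attached to the map entry and exit are compared here, which mirrors the less restrictive check described above. The real transformation works on symbolic subsets, which this sketch does not model.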

Otherwise we can end up in recursion.

Before, it was replacing the eliminated variables by zero, which actually worked pretty well, but I have now changed that such that `offset()` is used. I am not sure why I used `replace` in the first place; I think that there was an issue, but I am not sure.

…ck is taken.

…and_write_sets()`.

…so Issue spcl#1634 for more details. As written in the issue, I cannot simply remove the check; I also have to adapt the tests. The most important one is `tests/transformations/move_loop_into_map_test.py::MoveLoopIntoMapTest::test_more_than_a_map`, where the behaviour has changed. However, after careful examination I am sure that the test is still correct, or rather that it now works correctly, as there is no dependency.

…sabled in 5c49eee. The reason is that `tests/numpy/ufunc_support_test.py::test_ufunc_add_accumulate_simple` fails (in auto optimizer mode). I remember that now. Also, the issue is 1643.

Before it was a DaCe Property, but I realized now that it should actually be a plain data member. This also solves lots of issues I had with serialization.

I will now add a very complicated test to ensure that it really does what I want.

Let's see if the CI can handle it.

acalotoiu · 2024-09-25T14:35:44Z

I have some trouble understanding the difference between -Serial and -Parallel and why both are necessary?

acalotoiu

I agree that MapFusion has issues, but I am slightly confused by the current approach of providing multiple versions. It seems as though fixes to the original should be preferable. Why should we maintain many different implementations instead of having one maintained implementation and deprecating the rest? @tbennun - what is your view?

acalotoiu · 2024-09-25T14:13:43Z

dace/sdfg/state.py

            # Union all subgraphs, so an array that was excluded from the read
            # set because it was written first is still included if it is read
            # in another subgraph
            for data, accesses in rs.items():
                read_set[data] += accesses
            for data, accesses in ws.items():
                write_set[data] += accesses
-        return read_set, write_set
+        return copy.deepcopy((read_set, write_set))


Why is it necessary to make a copy here?

Because the subsets are still linked to the Memlets, so if you modify them, you also change them at the Memlets.
This might be useful but since it is not possible to determine which subset belongs to which memlet, it does not make sense to maintain this link.
You could potentially copy them above, because some of them are constructed on demand anyway, however:

The on demand construction is the minority case.

Doing it here allows copying everything in one big sweep.
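The aliasing described in this reply can be reproduced with a small, self-contained sketch. `Subset`, `Memlet`, `read_set_shared` and `read_set_copied` below are simplified, hypothetical stand-ins written for this illustration. They are not the DaCe classes, and the real `_read_and_write_sets()` does considerably more.

```python
import copy
from collections import defaultdict


class Subset:
    """Hypothetical stand-in for a mutable memlet subset."""

    def __init__(self, ranges):
        self.ranges = list(ranges)

    def offset(self, amount):
        # Modifies the subset in place.
        self.ranges = [(lo + amount, hi + amount) for lo, hi in self.ranges]


class Memlet:
    """Hypothetical stand-in for a memlet that owns a subset."""

    def __init__(self, data, subset):
        self.data = data
        self.subset = subset


def read_set_shared(memlets):
    # Collects the subset objects themselves, so they stay linked to the memlets.
    result = defaultdict(list)
    for m in memlets:
        result[m.data].append(m.subset)
    return result


def read_set_copied(memlets):
    # A single deep copy at the end detaches every subset in one sweep.
    return copy.deepcopy(read_set_shared(memlets))


memlet = Memlet("A", Subset([(0, 9)]))
read_set_shared([memlet])["A"][0].offset(5)
print(memlet.subset.ranges)  # the memlet changed as well

memlet = Memlet("A", Subset([(0, 9)]))
read_set_copied([memlet])["A"][0].offset(5)
print(memlet.subset.ranges)  # the memlet is untouched
```

Copying once at the return site, rather than at every place where a subset is collected, is the design choice argued for above: it costs one traversal and cannot miss a code path.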

acalotoiu · 2024-09-25T14:18:29Z

dace/sdfg/state.py

-                    for e in out_edges:
-                        # skip empty memlets
-                        if e.data.is_empty():
+                if not isinstance(n, nd.AccessNode):


Can you please add more comments to make it easier to follow what is being done?

I added more comments.

acalotoiu · 2024-09-25T14:20:31Z

dace/transformation/dataflow/__init__.py

I am not 100% sure about the naming solution. Having 3/5 different kinds of MapFusion seems like a poor solution that will lead to confusion: between Serial, Parallel, OTF and the original, MapFusion versions are proliferating. Would it not be preferable to fix/choose one?

I agree, but the original map fusion was unable to do parallel fusing. The available solutions are:

1. Combining serial and parallel map fusion into one.

2. Doing the above, but making parallel an opt-in option.

The best solution is probably 2, because then everything will work as before. But I do not mind either way. What do you think, @tbennun @acalotoiu?

Now only the serial version is there. Also integrated the helper into the serial file.

This reverts commit 259d17c.

…f data can be removed. This is because the function is much less strict.

Started with a first version of the map fusion stuff.

aa433fe

philip-paul-mueller changed the title ~~Started with a first version of the map fusion stuff.~~ New and Improved MapFusion Aug 22, 2024

philip-paul-mueller marked this pull request as draft August 22, 2024 13:55

philip-paul-mueller added 27 commits August 23, 2024 08:32

Made some stylistic modifications to the code.

71a88a1

Now using the 3.9 type hints.

Added a function for estimating if something is pointwise.

bc87ddb

But it is too restrictive.

Now there is an error in the actual rewiring stuff.

497a2d6

Fixed a bug in the map fusion.

9e36447

When the function was fixing the interior of the second map, it did not remove the reading.

Made some formatting changes.

7a48e0d

Updated the tests of the map fusion.

d609045

It passes almost all functions. However, the ones that need renaming are not yet done.

WIP: Started with a renamer function.

52c4542

Continued with the parallel fusion stuff.

3b758bf

The fusion transformation now also checks if there is a write conflic…

377b428

…t in the input and output set. However, it is very simple.

Updated some tests.

db4864b

Fixed an error. I should refactor that damn loop.

f395acd

Some improvements to the tests.

b1ab95e

Removed some debugging stuff.

945ca8f

Fixed some typing stuff.

940b9b6

Started with a better implementation for the data dependency test.

ecae361

First version of the pointwise checker in the map fusion.

64d07fd

Updated some test cases.

33a0edf

The shared data cache can not be dumped.

ff018f4

Otherwise we can end up in recursion.

Buffer tiling now finally works.

9267ea9

The Mapreduce now also works.

fc2db8a

Added a test to the map fusion stuff that ensures that the shared blo…

4d9f11d

…ck is taken.

Added a test for the indirect accesses case.

2b91465

Updated the heat 3d test. It now ensures that the fusion is done.

73f4415

Fixed an error in the parallel map fusion.

94ecd19

philip-paul-mueller marked this pull request as ready for review September 6, 2024 13:42

philip-paul-mueller requested a review from tbennun September 6, 2024 14:13

philip-paul-mueller added 4 commits September 9, 2024 07:50

Merge branch 'master' into new-map-fusion

a444992

Updated the comment about the wrong filter check in `SDFGState._read_…

a023f7c

…and_write_sets()`.

Had to reenable the check in SDFGState._read_and_write_sets() is di…

63e78c9

…sabled in 5c49eee. The reason is that `tests/numpy/ufunc_support_test.py::test_ufunc_add_accumulate_simple` fails (in auto optimizer mode). I remember that now. Also, the issue is 1643.

philip-paul-mueller mentioned this pull request Sep 10, 2024

Fixed a bug in the map fusion transformation. #1535

Closed

philip-paul-mueller added 7 commits September 11, 2024 08:48

Modified the shared_data attribute of the MapFusionHelper.

896ac68

Before it was a DaCe Property, but I realized now that it should actually be a plain data member. This also solves lots of issues I had with serialization.

Merge remote-tracking branch 'spcl/master' into new-map-fusion

33f9fdd

This compute offset function seems to solve all my problems.

fcffb22

I will now add a very complicated test to ensure that it really does what I want.

Added a test for the special case.

0ddb3c2

Let's see if the CI can handle it.

Did some cleanup.

05ffee4

Merge remote-tracking branch 'spcl/master' into new-map-fusion

6de85c7

Specified how the corrector function of the offsets works.

8c86662

philip-paul-mueller mentioned this pull request Sep 23, 2024

feat[dace]: Updated DaCe Transformations GridTools/gt4py#1639

Draft

6 tasks

Merge branch 'master' into new-map-fusion

dfc92e7

acalotoiu self-requested a review September 25, 2024 14:36

acalotoiu requested changes Sep 25, 2024

philip-paul-mueller added 2 commits September 26, 2024 08:17

Updated some comments.

11a3167

Added more comments.

44cf6ad

phschaad self-requested a review September 26, 2024 15:22

philip-paul-mueller mentioned this pull request Oct 16, 2024

[DO NOT REVIEW] Fixing a (likely) bug in MapFusion. #1673

Draft

philip-paul-mueller added 7 commits October 31, 2024 09:07

Merge branch 'main' into new-map-fusion

914d67b

Removed the parallel map fusion transformation.

5e25816

Now only the serial version is there. Also integrated the helper into the serial file.

Added a new test.

259d17c

Fixed a missing include.

db26320

Revert "Added a new test."

fa67492

This reverts commit 259d17c.

It seems that I have removed a test.

3453c6c

Realized that I can not use SDFG.shared_transient() for detection i…

90731af

…f data can be removed. This is because the function is much less strict.
