Dyno: add a type to represent iterators, implement loop expressions #25994

DanilaFe · 2024-09-25T19:27:08Z

This PR adds a new type sub-hierarchy for iterable things. It creates some scaffolding for promoted expressions, though it does not currently implement them. It also matches production's implementation of iterators, which, as we have discussed with @bradcray, @mppf and @benharsh may not be the direction we want to take in the long term.

What?

This PR adds a new IteratorType type, as well as subclasses for it for various situations that produce iterables (promotion, loop expressions, calls to iter procedures). This way, we can unify the logic for resolving iterators on records/classes and "iterables", invoking a these method. The these methods for user records/classes are, as before, provided explicitly by the user; the these methods for iterators are compiled-provided (production generates them, in Dyno we simply have special logic to mimic their resolution).

The three subclasses of IteratorType are as follows:

FnIteratorType: the class of iterators created from invocations to iter procedures. Just like in production, this type contains a reference to the function that created it. When the iterator is traversed (using these), the compiler resolves a call to a function with the same name as the original iter procedures, and with the same formal types. This matches the production compiler's implementation, down to the ability to provide additional overloads for serial/parallel iterators outside of where they are originally defined (for better or for worse).
LoopExprIteratorType: the class of iterators created from loop expressions. In production, there are some tricky cases that make the "yield type" of a loop expression be fluid, since it can depend on the instantiation of the follower iterator it uses, which can change depending on the leader. Thus, it's possible that in forall i in (loopExpr) and forall (_, i) in zip(.., loopExpr), the index variable i is a different type. To make this work, we could need to capture the outer variables of the loop expression and re-resolve it with different leaders. This seems undesirable as per discussions with Brad et. al., and as a result, I instead make the assumption that loop expressions always yield the same type, and that they do not re-resolve their leaders and followers. There's some future work to make this more principled.
PromotionIteratorType: the class of iterators created from promoted expressions. Since promotion is not implemented, this is just left as a scaffold.

This PR removes some speculative resolution logic originally introduced by @dlongnecke-cray in #24915. Specifically, calls to iterators like foo() are no longer re-resolved with tags if original resolution fails. This has several motivators:

Production doesn't support this, and instead requires a serial iterator to be present for foo() to successfully invoke foo(standalone) etc.
In David's PR, the re-resolution only happens at the site of zips / loop iterands. However, if we truly wanted to implement this feature, we would need to handle invocations of iterators in arbitrary contexts (e.g., as sub-expressions of promoted expressions), and thus would likely need to hoist the "insert tags if needed" logic to the more general call resolution world.
The need to speculatively resolve an expression before inserting tags (and re-emit errors later if needed) made it difficult to resolve arrays types early (instead of resolving them as bracket loops).
- Resolving array types late no longer works under this PR, but I think that's fine. Previously, we would resolve a call to _domain.these with a generic receiver (normally not allowed), and fail to infer its return type despite succeeding in call resolution. This PR requires knowing the return type (to make use of the fact that it's iterable), so this no longer flies. Anyway, relying on resolving call with an accidentally generic receiver seems unfortunate.
- Resolving array types early also means we avoid unnecessary invocations of 'these' which is a nice bonus.

Future Work

Following some discussion, we'd like to change the following properties of the loop expressions (at least):

The return type may not depend on the type of the leader, or on whether the serial or parallel overload was called. This way, loop expressions will always have a consistent yield type, no matter the context in which they are used.
Implementing (1) would still allow a different follower iterator to be resolved depending on the leader type (it would just need to return a value of the right type). We would like to rule out this "context dependence", and make it so the same follower function is invoked for any given iterator expression.

This would likely require eagerly resolving the leader/follower of the iterator in the place where it is found, and then re-using these resolve calls whereever needed.
Loop expressons that need one follower type but are used with a leader type yielding other things will be rejected.

We do find it desirable for foo() to be able to invoke foo(standalone) even if foo() doesn't exist. This PR removes David's original form of this, but does not yet provide a replacement.

Reviewed by @dlongnecke-cray and @mppf -- thanks!

Testing

dyno tests

Signed-off-by: Danila Fedorin <[email protected]>

dlongnecke-cray · 2024-09-27T18:28:19Z