Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dyno: add a type to represent iterators, implement loop expressions #25994

Merged
merged 22 commits into from
Sep 30, 2024

Conversation

DanilaFe
Copy link
Contributor

@DanilaFe DanilaFe commented Sep 25, 2024

This PR adds a new type sub-hierarchy for iterable things. It creates some scaffolding for promoted expressions, though it does not currently implement them. It also matches production's implementation of iterators, which, as we have discussed with @bradcray, @mppf and @benharsh may not be the direction we want to take in the long term.

What?

This PR adds a new IteratorType type, as well as subclasses for it for various situations that produce iterables (promotion, loop expressions, calls to iter procedures). This way, we can unify the logic for resolving iterators on records/classes and "iterables", invoking a these method. The these methods for user records/classes are, as before, provided explicitly by the user; the these methods for iterators are compiled-provided (production generates them, in Dyno we simply have special logic to mimic their resolution).

The three subclasses of IteratorType are as follows:

  • FnIteratorType: the class of iterators created from invocations to iter procedures. Just like in production, this type contains a reference to the function that created it. When the iterator is traversed (using these), the compiler resolves a call to a function with the same name as the original iter procedures, and with the same formal types. This matches the production compiler's implementation, down to the ability to provide additional overloads for serial/parallel iterators outside of where they are originally defined (for better or for worse).
  • LoopExprIteratorType: the class of iterators created from loop expressions. In production, there are some tricky cases that make the "yield type" of a loop expression be fluid, since it can depend on the instantiation of the follower iterator it uses, which can change depending on the leader. Thus, it's possible that in forall i in (loopExpr) and forall (_, i) in zip(.., loopExpr), the index variable i is a different type. To make this work, we could need to capture the outer variables of the loop expression and re-resolve it with different leaders. This seems undesirable as per discussions with Brad et. al., and as a result, I instead make the assumption that loop expressions always yield the same type, and that they do not re-resolve their leaders and followers. There's some future work to make this more principled.
  • PromotionIteratorType: the class of iterators created from promoted expressions. Since promotion is not implemented, this is just left as a scaffold.

This PR removes some speculative resolution logic originally introduced by @dlongnecke-cray in #24915. Specifically, calls to iterators like foo() are no longer re-resolved with tags if original resolution fails. This has several motivators:

  1. Production doesn't support this, and instead requires a serial iterator to be present for foo() to successfully invoke foo(standalone) etc.
  2. In David's PR, the re-resolution only happens at the site of zips / loop iterands. However, if we truly wanted to implement this feature, we would need to handle invocations of iterators in arbitrary contexts (e.g., as sub-expressions of promoted expressions), and thus would likely need to hoist the "insert tags if needed" logic to the more general call resolution world.
  3. The need to speculatively resolve an expression before inserting tags (and re-emit errors later if needed) made it difficult to resolve arrays types early (instead of resolving them as bracket loops).
    • Resolving array types late no longer works under this PR, but I think that's fine. Previously, we would resolve a call to _domain.these with a generic receiver (normally not allowed), and fail to infer its return type despite succeeding in call resolution. This PR requires knowing the return type (to make use of the fact that it's iterable), so this no longer flies. Anyway, relying on resolving call with an accidentally generic receiver seems unfortunate.
    • Resolving array types early also means we avoid unnecessary invocations of 'these' which is a nice bonus.

Future Work

Following some discussion, we'd like to change the following properties of the loop expressions (at least):

  1. The return type may not depend on the type of the leader, or on whether the serial or parallel overload was called. This way, loop expressions will always have a consistent yield type, no matter the context in which they are used.
  2. Implementing (1) would still allow a different follower iterator to be resolved depending on the leader type (it would just need to return a value of the right type). We would like to rule out this "context dependence", and make it so the same follower function is invoked for any given iterator expression.
  • This would likely require eagerly resolving the leader/follower of the iterator in the place where it is found, and then re-using these resolve calls whereever needed.
  • Loop expressons that need one follower type but are used with a leader type yielding other things will be rejected.
  1. We do find it desirable for foo() to be able to invoke foo(standalone) even if foo() doesn't exist. This PR removes David's original form of this, but does not yet provide a replacement.

Reviewed by @dlongnecke-cray and @mppf -- thanks!

Testing

  • dyno tests

Signed-off-by: Danila Fedorin <[email protected]>
Signed-off-by: Danila Fedorin <[email protected]>
Signed-off-by: Danila Fedorin <[email protected]>
Signed-off-by: Danila Fedorin <[email protected]>
bool wasCallInjected = false;
CallResolutionResult crr;
const TypedFnSignature* sig = crr.mostSpecific().only().fn();
const IteratorType* iterType = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably just remove this type now I think 😃.

frontend/lib/resolution/Resolver.cpp Outdated Show resolved Hide resolved
frontend/lib/resolution/Resolver.cpp Outdated Show resolved Hide resolved

rv.enterScope(loop);
loop->body()->traverse(rv);
// Scope is exited in exit(IndexableLoop)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way to make this happen explicitly?

E.g., separate this into two parts: isShapedLikeArrayTypeExpr and then the type construction part, and then have the parent always responsible for enterScope and loop->body()->traverse(...)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave it a shot, let me know what you think.

@@ -4830,11 +4904,11 @@ bool Resolver::enter(const IndexableLoop* loop) {
with->traverse(*this);
}

loop->body()->traverse(*this);
if (!bodyResolved) {
loop->body()->traverse(*this);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we missing an enterScope call here? Hard to tell from the diff.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not (it's above, before the other stuff), but there was a subtle bug that had to do with entering and exiting scopes. The handleArrayExpr enters the scope, if it’s an array, we just return early. But if it turns out that it’s not an array but a loop expression, then we go on to handle the iterator overload selection -- without exiting the loop body scope. so a tricky thing you could do is define an iterator overload inside the body of the loop, and invoke that iterator from the iterand.

iter foo() { ... }

forall i in foo() { // will invoke the foo(standalone)!!
  iter foo(param tag) where tag == iterKind.standalone { ... }
}

I fixed this as part of my changes to your comment above.

@@ -5052,6 +5126,7 @@ static QualifiedType resolveReduceScanOp(Resolver& resolver,
const AstNode* reduceOrScan,
const AstNode* op,
const AstNode* iterand) {
iterand->traverse(resolver);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a shame that we can't have the resolveIterDetails take care of resolving the iterand anymore. But maybe that's the correct thing to do in the long run. I've gone back and forth on it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually like it this way, particularly for loops, since for array declarations, we need to "separately" resolve the iterand, and doing so in resolveIterDetails would cause either duplicate resolution or tiptoeing.

Copy link
Member

@mppf mppf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked primarily at the header file changes and to some degree at the .cpp changes, to consider the broad approach.

Here, FnIteratorType, LoopExprIteratorType, and PromotionIteratorType include the return type by nature of IteratorType containing the yielded type. This makes sense to me, provided we don't need to compute this type in order to do candidate selection or disambiguation.

It is a little bit related that TypedFnSignature does not include the return type. I have discussed with @dlongnecke-cray in the past somewhere (maybe he can find / recall the details) about how we ought to have a FunctionType in the types hierarchy. Here, I think one of the key questions is whether or not TypedFnSignature has a return type.

Currently, the TypedFnSignature has pretty much the stuff that we need during the process of resolution and nothing else. Since the return type doesn't come up in candidate selection, it's not present. I think it's important that we can continue to do candidate selection without computing the return type.

One way to handle this is to make a types::FunctionType that is related, but not the same as TypedFnSignature. Another way is to rename TypedFnSignature to FunctionType and then add an optional return type. Note that we already have a precedent for a different TypedFnSignature being created as we are resolving since out formals are computed like return types, so not computed initially during resolution. So, we could imagine having some TypedFnSignatures that include a computed return type and others where it is not computed yet. (And a way to compute the "full" TypedFnSignature). I think that, today, we are using ResolvedFunction for the "full" TypedFnSignature, which could probably be improved upon with this kind of approach.

Adjusting TypedFnSignature / FunctionType is not really necessary right now, but it will be when we add FCF support.

Anyway, I bring this up because this PR is adding new types which probably have similar design tradeoffs in terms of whether or not the return/yield type is included. I am happy with the approach here (with the caveat, that if we discover cases where we need IteratorType during candidate selection where the yielded type is not relevant, then we should be forming versions of these without the return type).

bool isZippered_;
bool supportsParallel_;
QualifiedType iterand_;
ID sourceLocation_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is interesting because historically we have not been able to have IDs for compiler-generated stuff. So if there came a time when we needed to resolve a compiler-generated loop expression, we would need to either change this or fabricate an ID.

Copy link
Contributor Author

@DanilaFe DanilaFe Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add a TODO / note, and I agree with your comment, but I am not personally too worried by this field while we do not have compiler-generated loop expressions.

Signed-off-by: Danila Fedorin <[email protected]>
@DanilaFe DanilaFe merged commit 1453dbf into chapel-lang:main Sep 30, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants