Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dyno: eagerly build iterator groups (in the scope where they're defined) #26039

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

DanilaFe
Copy link
Contributor

@DanilaFe DanilaFe commented Oct 4, 2024

This PR changes the way that iterators are resolved to stop considering the scope in which the iterator is iterated over, and only consider the scope in which it was defined / created. This came out of a discussion with @benharsh , @bradcray and @mppf about the expected semantics of iterators. I wanted to get these changes (which are largely architectural in nature) out of the way before proceeding with other iterator-related tasks such as promotion.

In production, each time an iterator is used, the compiler performs a search for matching overloads. E.g,, if a serial iterator is defined in one place, then a standalone overload is defined in the scope of a loop, the standalone loop will be preferred.

iter myIter() {
  yield 1;
}

{
  iter myIter(param tag) where tag == iterKind.standalone {
    yield "hello";
  }

  [i in myIter()] {
    writeln(i); // prints "Hello"
  }
}

However:

iter myIter() {
  yield 1;
}

{
  iter myIter(param tag) where tag == iterKind.standalone {
    yield "hello";
  }
}
[i in myIter()] {
  writeln(i); // prints 1
}

There are all sorts of other fun cases in which the type of the iterator can change depending on what is being zippered with, since the follower is re-resolved using the new leader's type:

iter myIter() do yield 1;
iter myIter(param tag) where tag == iterKind.leader do yield (1..1,);
iter myIter(param tag, followThis) where tag == iterKind.follower do
    yield if followThis.type.size == 1 then 1 else "hello";

iter otherIter() do yield 1;
iter otherIter(param tag) where tag == iterKind.leader do yield (1..1, 1..1);
iter otherIter(param tag, followThis) where tag == iterKind.follower do yield 1;

forall i in myIter() do
    writeln(i); // prints 1

forall (_, i) in zip(otherIter(), myIter()) do
    writeln(i); // prints "hello"

This PR changes this process in Dyno, making a switch to determining the types of the leader, follower, and standalone iterators associated with a particular function using the scope in which the function is declared, as opposed to the scope in which it's being used. This is somewhat complicated by the existence of point-of-instantiation scopes; generic iterators do include their isntantiation information, so that other overloads are resolved with the same instantiation scope in mind. However, this is determined at the point where the iterator is created, as opposed to the point where the iterator is used in a loop.

As of this PR, both version of the bracket loop programs print i, and the program in which the leader type affects resolution fails to resolve (because only once follower type is allowed per iterator).

Changes in This PR

Bundling PoI scope with iterators

The main trick with this PR was to store enough information in an IteratorType so that we can reconstruct the context in which it was created. This way, given an iterator type, we have all the information we need to find its various overloads. The most complicated situation in this case is when the iterator was constructed from a call to a generic iter procedure; in this case, the body of the procedure -- and presumably the other overloads for the iterator -- can use functions from the context of the call, via the point-of-instantiation mechanism. This allows programs like the following to resolve:

iter myIter(arg) {
  yield computeReturn(arg);
}
iter myIter(arg, param tag) where tag == iterKind.standalone {
  yield computeReturn(arg);
}

{
  proc computeReturn(arg: int) {
    return arg + 1;
  }

  forall i1 in myIter(13) {}
  for j1 in myIter(13) {}
}

{
  proc computeReturn(arg: int) {
    return arg == 1;
  }

  forall i2 in myIter(13) {}
  for j2 in myIter(13) {}
}

In the first block, myIter is instantiated in a scope in which computeReturn returns int, which means myIter -- and all of its versions, including the parallel one -- should return int as well. In the second block, another computeReturn produces bool, so myIter invocations there should create iterators that yield bool. Iterator values can be returned and passed around to different contexts (changing PoI), which to me suggests that the iterator type itself should include the point-of-instantiation information.

This PR takes that approach, storing a PoI scope with the iterator.

Circumventing CallInfo etc. for invoking these overloads

I noticed that we use "CallInfo" and friends as an odd intermediate representation. The resolver knows that it's handling a special case of resolving these using a tag and optionally a followThis argument; the resolution queries need to specially handle calls to these when they are made, so that they can provide compiler-backed implementations for iterator types. However, to get from the resolver to the resolution queries, prior to this PR, we went through resolveGeneratedCall, which required the construction of actual types etc.. Then, resolution queries had to unpack the CallInfo, inspect the formals, and call in to the special case.

This PR circumvents this by introducing a new IteratorKind enum, which encodes the type of these overload we're looking for (serial, standalone, leader, etc.). A new query in resolution-queries is then defined that handles the special case right away, before constructing a CallInfo. Thus, special overloads (like calls to these on a loop expression) are handled without the intermediate packing and unpacking of actual types into a CallInfo object. This considerably simplifies the resolution process in my opinion.

Issuing Errors for Incompatible Iterator Types

When a call to an iterator is made, the resolver immediately (and eagerly) searches for all other overloads (leader, follower, standalone, etc.). This is possible since the types of these overloads are determined when they are created. By doing this eager resolution, we can also report incompatible yield types. Now -- as discussed with @bradcray et al. -- the serial, standalone, and parallel iterators must all have the same type.

Testing

  • dyno tests, including new ones of the programs discussed in this PR OP

Their context is entirely determined by where they are created.

Signed-off-by: Danila Fedorin <[email protected]>
Signed-off-by: Danila Fedorin <[email protected]>
The _created_ PoI scope after instantiation is what is fed
to return type inference, as opposed to the "original" PoI
scope in which the function was resolved.

Signed-off-by: Danila Fedorin <[email protected]>
@DanilaFe DanilaFe marked this pull request as ready for review October 4, 2024 22:17
Signed-off-by: Danila Fedorin <[email protected]>
@mppf
Copy link
Member

mppf commented Oct 7, 2024

Iterator values can be returned and passed around to different contexts (changing PoI), which to me suggests that the iterator type itself should include the point-of-instantiation information.

In the event that the iterators are using the same functions from POI even though the PoiScope is different, will we end up resolving just one function? (I think it will work this way (at least if it works that way for other Function resolution) because the iterator logic will use the regular Function resolution code.

@DanilaFe
Copy link
Contributor Author

DanilaFe commented Oct 7, 2024

In the event that the iterators are using the same functions from POI even though the PoiScope is different, will we end up resolving just one function?

Yes, with the caveat that to determine used functions from PoI we need to resolve the function body. So, the body resolution process happens again, but the results of running it are re-used.

@@ -493,6 +500,22 @@ const uast::Decl* findFieldByName(Context* context,
const types::CompositeType* ct,
UniqueString name);

const MostSpecificCandidate&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these new functions need docs comments

@@ -408,7 +408,7 @@ CallResolutionResult resolveCallInMethod(ResolutionContext* rc,
as the point-of-instantiation scopes that were used when resolving them.
*/
CallResolutionResult resolveGeneratedCall(Context* context,
const uast::AstNode* astForErr,
const uast::AstNode* astForErrAndPoi,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO if it's used for multiple things it's better to have a short name like astContext and then describe what it's for in the docs comment.

/* rejectedPossibleIteratorCandidates */ false,
types::QualifiedType(),
PoiInfo());
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would think the zero-args constructor should do the same thing?

@@ -73,6 +74,7 @@ class LoopExprIteratorType final : public IteratorType {
bool contentsMatchInner(const Type* other) const override {
auto rhs = (LoopExprIteratorType*) other;
return yieldType_ == rhs->yieldType_ &&
poiScope_ == rhs->poiScope_ &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it's better to have a function for IteratorType to call that handles the fields there. An example from uAST is simpleBlockLikeContentsMatchInner.

scalarFn_(scalarFn),
promotedFormals_(std::move(promotedFormals)) {}

bool contentsMatchInner(const Type* other) const override {
auto rhs = (PromotionIteratorType*) other;
return this->yieldType_ == rhs->yieldType_ &&
this->poiScope_ == rhs->poiScope_ &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto here

@@ -3403,7 +3403,7 @@ static const Type* getManagedClassType(Context* context,
}

static const Type* getNumericType(Context* context,
const AstNode* astForErr,
const AstNode* astForErrAndPoi,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

@@ -3594,7 +3594,7 @@ convertClassTypeToNilable(Context* context, const Type* t) {
// Resolving compiler-supported type-returning patterns
// 'call' and 'inPoiScope' are used for the location for error reporting.
static const Type* resolveBuiltinTypeCtor(Context* context,
const AstNode* astForErr,
const AstNode* astForErrAndPoi,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

// Resolving calls for certain compiler-supported patterns
// without requiring module implementations exist at all.
static bool resolveFnCallSpecial(Context* context,
const AstNode* astForErr,
const AstNode* astForErrAndPoi,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

@@ -4137,7 +3990,7 @@ collectGenericFormals(Context* context, const TypedFnSignature* tfs) {

static void
considerCompilerGeneratedCandidates(Context* context,
const AstNode* astForErr,
const AstNode* astForErrAndPoi,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

QualifiedType ret;
if (!c.exprType().isUnknownOrErroneous()) {
if (auto it = c.exprType().type()->toIteratorType()) {
ret = it->yieldType();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing is necessarily wrong but I'm missing something. When is the IteratorType including the yield type computed? I thought that here we were going to be resolving the iterator Function through normal means (and so we would need to separately compute the yield/return type after we have found the most specific candidate TypedFnSignature).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants