Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(compiler): preflight expression lifting #3125

Closed
wants to merge 8 commits into from

Conversation

eladb
Copy link
Contributor

@eladb eladb commented Jun 28, 2023

TODO

  • Handle base classes lifts
  • Verify CDK lift
  • Handle unknown operations

Description

The lifting system (previously known as the FindReferenceVisitor) was not robust enough because it tried to identify lifting by looking for references (e.g. bucket.put()). This resulted in a million edge cases (e.g. what about foo.bar.baz.foo().put()?). I realized that we basically worked at the wrong abstraction level.

So this change refactors the lifting system to work on expressions rather than on references. During type checking, we used to calculate the type for each expression. Now we are also calculating the phase of each expression. This is the key to lifting.

Now, during jsification, when we visit an expression we ask whether its phase it aligned with the scope's phase. If it's a preflight expression inside an inflight phase, it means we need to lift it.

This means, for example, that now we can lift complete expressions. For example, the expression foo.bar.myArray.get(0) will be evaluated in preflight and only the resulting value will be lifted into inflight (this is because get() is a phase-independent method and inherits its phase from the object). In the expression foo.bar.myBucket.list() only foo.bar.myBucket is a preflight value and will be lifted into inflight as a single object ($foo_bar_myBucket).

Implementation Notes

To support this, I added the lift results into the JSifyContext struct as well as the current method name. These are required in jsify_expression to record the lift. Additionally, in order to identify the lift operations, I am propagating the property of object references into the jsify_expression(obj).

The mangling module is responsible to take a multi-component expression (e.g. a.b.c, turn it into a_b_c and then deconstruct it back to a.b.c by emitting a JavaScript object with the same structure. This is primarily used to capture nested type named (e.g. std.Json).

Testing

Added a bunch of Rust unit tests for mangling and for capturing/lifting. These are implemented as snapshot tests which capture some internal capturing details as well as the emitted javascript output. Ideally we should be able to run all of these tests as full tests, but since there are dozens of new tests, we decided to first implement more parallelism into our build so that this won't add too much time to our CI builds. Not ideal. This required extracting a partial_compile

Resolves

Misc

  • Change the Debug Wing Compiler launch configuration to simply debug the current file. This allows basically just pressing F5 inside any .w file to kick off a debugging session. Sweet!
  • Moved the capturing logic from jsify.rs into captures.rs.
  • Change the variable portion of StmtKind::Assignment to an Expr instead of a Reference to make sure it's type and phase is captured during type checking.
  • Add () to jsified closures and improve output of functions.
  • Do not emit $inflight_init if it doesn't have any statements.
  • Do not allow reassigning to preflight variables from inflight
  • In the jsii importer, mark structs (and struct fields) as phase independent (because they are).

Checklist

  • Title matches Winglang's style guide
  • Description explains motivation and solution
  • Tests added (always)
  • Docs updated (only required for features)
  • Added pr/e2e-full label if this feature requires end-to-end testing

By submitting this pull request, I confirm that my contribution is made under the terms of the Monada Contribution License.

## TODO

- [ ] Handle base classes lifts
- [ ] Verify CDK lift
- [ ] Handle unknown operations

## Description

The lifting system (previously known as the `FindReferenceVisitor`) was not robust enough because it tried to identify lifting by looking for references (e.g. `bucket.put()`). This resulted in a million edge cases (e.g. what about `foo.bar.baz.foo().put()`?). I realized that we basically worked at the wrong abstraction level.

So this change refactors the lifting system to work on *expressions* rather than on *references*.
During type checking, we used to calculate the type for each expression. Now we are also calculating the *phase* of each expression. This is the key to lifting.

Now, during jsification, when we visit an expression we ask whether its phase it aligned with the scope's phase. If it's a preflight expression inside an inflight phase, it means we need to lift it.

This means, for example, that now we can lift complete expressions. For example, the expression `foo.bar.myArray.get(0)` will be evaluated in *preflight* and only the resulting value will be lifted into inflight (this is because `get()` is a phase-independent method and inherits its phase from the object). In the expression `foo.bar.myBucket.list()` only `foo.bar.myBucket` is a preflight value and will be lifted into inflight as a single object (`$foo_bar_myBucket`).

## Implementation Notes

To support this, I added the lift results into the `JSifyContext` struct as well as the current method name. These are required in `jsify_expression` to record the lift. Additionally, in order to identify the lift operations, I am propagating the `property` of object references into the `jsify_expression(obj)`.

The `mangling` module is responsible to take a multi-component expression (e.g. `a.b.c`, turn it into `a_b_c` and then deconstruct it back to `a.b.c` by emitting a JavaScript object with the same structure. This is primarily used to capture nested type named (e.g. `std.Json`).

## Testing

Added a bunch of Rust unit tests for mangling and for capturing/lifting. These are implemented as snapshot tests which capture some internal capturing details as well as the emitted javascript output. Ideally we should be able to run all of these tests as full tests, but since there are dozens of new tests, we decided to first implement more parallelism into our build so that this won't add too much time to our CI builds. Not ideal. This required extracting a `partial_compile`

## Resolves

* Fixes #1878

## Misc

* Change the `Debug Wing Compiler` launch configuration to simply debug the current file. This allows basically just pressing F5 inside any `.w` file to kick off a debugging session. Sweet!
* Moved the capturing logic from `jsify.rs` into `captures.rs`.
@eladb eladb requested a review from a team as a code owner June 28, 2023 12:30
@Chriscbr
Copy link
Contributor

Moved the capturing logic from jsify.rs into captures.rs.

👏 👏

Copy link
Contributor

@Chriscbr Chriscbr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only got through a few files so far before I needed to get some fresh air 😄 but I hope some of the initial comments are useful. Really exciting to see this PR converge to a working state

Comment on lines 5 to 6
x.copy().at(y);
// ^ Cannot reference an inflight object from a preflight expression
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error is a bit on the confusing side to me. Could you break down what's happening?

x is something that can be lifted into inflight (so for all intents and purposes, we could pretend x is a Array<str>), so calling copy() (phase-dependent function) on it inflight and then calling at (another phase independent function) seems like it ought to be okay, at least on my initial read.

Comment on lines 21 to 24
types: IndexMap<String, TypeRef>,

/// captured variables per method
vars: BTreeMap<String, BTreeMap<String, BTreeSet<String>>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are two different map types used? (Maybe add a comment?)

Comment on lines +27 to +28
/// Visitor that finds all the types and variables used within an inflight method but defined in its
/// parent environment
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"within an inflight method but defined in its parent environment"

Sanity check that I understand this description - when it says "any parent environment" does this include inflight scopes as well? e.g.

inflight () => {
  let x = 5;
  inflight class Bar {
    inflight getX(): num {
      return x; // <-- captured from a parent (inflight) scope
    }
  }
}

Comment on lines +30 to +31
/// The name of the method
method_name: &'a str,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// The name of the method
method_name: &'a str,
/// The name of the method we're analyzing
method_name: &'a str,

Comment on lines +338 to +339
#[cfg(test)]
mod tests;
Copy link
Contributor

@Chriscbr Chriscbr Jun 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO / remove?

Comment on lines +42 to +47
impl Display for MangleDemangle {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
writeln!(f, "keys: {}", self.keys.join(","))?;
writeln!(f, "\n{}", self.javascript)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this only needs to be compiled for unit testing, you could consider slapping this label in front

Suggested change
impl Display for MangleDemangle {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
writeln!(f, "keys: {}", self.keys.join(","))?;
writeln!(f, "\n{}", self.javascript)
}
}
#[cfg(test)]
impl Display for MangleDemangle {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
writeln!(f, "keys: {}", self.keys.join(","))?;
writeln!(f, "\n{}", self.javascript)
}
}

Comment on lines +5 to +20
keys: MyClass,a_b_c,a_b_d,a_b_e,b_myBucket,hello,hello_world,std_Json,std_Number

const a = {
b: {
c: a_b_c,
d: a_b_d,
e: a_b_e,
},
};
const b = {
myBucket: b_myBucket,
};
const std = {
Json: std_Json,
Number: std_Number,
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty neat algorithm 👍

Comment on lines -142 to -146
"error: Cannot capture symbol \\"y\\" because it is shadowed by another symbol with the same name
--> ../../../examples/tests/invalid/capture_redefinition.w:4:7
|
4 | log(y);
| ^ Cannot capture symbol \\"y\\" because it is shadowed by another symbol with the same name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regression perhaps?


Tests 1 failed (1)
"fail ┌ inflight_ref_explicit_ops.wsim » root/env0/test:test1
│ Error: Missing environment variable: BUCKET_HANDLE_5982bab0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh oh

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add an E2E test for the fixed issue? (#1878)

* Change the `variable` portion of `StmtKind::Assignment` to an `Expr` instead of a `Reference` to make sure it's type and phase is captured during type checking.
* Add `()` to jsified closures and improve output of functions.
* Do not emit `$inflight_init` if it doesn't have any statements.
* Do not allow reassigning to preflight variables from inflight
* In the jsii importer, mark structs (and struct fields) as phase independent (because they are).
* Fix #2916
- emit test results to stderr
- simplify capture (no need to distinguish between types and objects)

we still have some issues with types.
eladb added a commit that referenced this pull request Jul 3, 2023
## TODO

- [ ] Still failing tests
- [ ] Determine how to pass lifting tokens to super classes

Refactor the lifting system to work on expressions instead of references and merge lifting and capturing into a unified model.

*Lifting* is the concept of referencing a preflight expression from an inflight scope, while *capturing* is the concept of referencing an object defined in a parent scope (albeit not another phase).

The mechanism works like this: we now have another "lifting" compilation phase which runs immediately after type checking. This is a transform ("fold") which does the following:

1. Lifting: for all inflight methods, it scans all the expressions. If it finds and expression marked as a preflight expression, it transforms it into an `Expr::Lift` which is a placeholder that can be used later to replace it with a lifted token.
2. Capturing: for each reference (both type refs and object refs), it checks looks up in which symbol environment it is defined. It then checks if that environment is a parent of the current method environment. If it is, is transforms it into a `Reference::Capture` to indicate downstream that this is a captured reference.

Now we move to the jsification phase. When the jsifier hits a *lifted expression* or a *captured reference*, to jsifies it as if it was in a preflight scope and registers it in a centralized token registry maintained at the class level during jsification. This registry maintains a mapping between *inflight tokens* and *preflight expressions* (both for lifting and for capturing purposes).

The token registry is consulted when the class is jsified in order to emit the code that lifts expressions from preflight to inflight (both at the object level and at the type level).

## Fixes

* #1878 is fixed because now we are able to lift complete expressions. So `bucket.bucketArn` is a simple string and there is no need to lift the bucket.

This is the 3rd attempt at this refactor, and hopefully the last for now. It supersedes #3125 and #2909.
@eladb eladb mentioned this pull request Jul 3, 2023
5 tasks
@eladb
Copy link
Contributor Author

eladb commented Jul 3, 2023

Superseded by #3213

@eladb eladb closed this Jul 3, 2023
mergify bot pushed a commit that referenced this pull request Jul 6, 2023
Refactor the lifting system to work on expressions instead of references and merge lifting and capturing into a unified model.

*Lifting* is the concept of referencing a preflight expression from an inflight scope, while *capturing* is the concept of referencing an object defined in a parent scope (albeit not another phase).

The mechanism works like this: we now have another "lifting" phase which runs immediately after type checking. This phase does the following:

1. **Lifting**: for inflight methods, it scans all the expressions. If it finds an expression marked as a preflight expression, it uses the jsifier to render it's preflight code and obtains a *token* from a lifts registry that's maintained at the class *type* level.

2. **Capturing**: for each expression that is a reference (both type refs and object refs), it checks looks up in which symbol environment it is defined. It then checks if that environment is a parent of the current method environment. If it is, is adds it to the lifts registry as well.

For each class, the lifting phase returns a new class object with the lift registry updated. So `class.lifts` has all the information about lifting and capturing for this class, later to be used downstream.

Now we move to the jsification phase. For each expression, the jsifier checks the `lifts` registry and if there's a token for this expression (by id), it emits the token instead (only in inflight of course).

The token registry is also consulted when the class machinery is jsified in order to emit the code that lifts expressions from preflight to inflight (both at the object level and at the type level) and binds them for access management.

When jsifying inflight base classes, we need to resolve the base class type and extract the lifting tokens from it.

## Misc

* The simulator failed to start certain test apps due to unresolvable attributes. This is because there was no mechanism to handle implicit dependencies on startup. A popular solution to this problem is to use an *init queue*. We basically pop a resource from the queue and try to start it. If we can't resolve a reference to other resources, we retry by returning it to the queue. An attempt count is decremented to avoid an infinite loop in case of a dependency cycle or unresolvable attribute (cc @Chriscbr).
* Additionally, the simulator did not call `cleanup()` during the shutdown sequence. This resulted in leaking ports in the `cloud.Api` resource that prevented the program from exiting (not exactly sure how this have worked). @tsuf239, any idea?


## Fixes

* #1878 is fixed because now we are able to lift complete expressions. So `bucket.bucketArn` is a simple string and there is no need to lift the bucket.
* #2729 is fixed with the following error: `Unable to reassign a captured variable`
* #3253

This is the 3rd attempt at this refactor, and hopefully the last for now. It supersedes #3125 and #2909.

## Follow ups

* #3244
* #3249
* #3189

## Checklist

- [x] Title matches [Winglang's style guide](https://docs.winglang.io/contributing/pull_requests#how-are-pull-request-titles-formatted)
- [x] Description explains motivation and solution
- [x] Tests added (always)
- [x] Docs updated (only required for features)
- [x] Added `pr/e2e-full` label if this feature requires end-to-end testing

*By submitting this pull request, I confirm that my contribution is made under the terms of the [Monada Contribution License](https://docs.winglang.io/terms-and-policies/contribution-license.html)*.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wing test returns 0 exit code Unable to capture non-Wing CDK constructs
3 participants