-
Notifications
You must be signed in to change notification settings - Fork 196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(compiler): preflight expression lifting #3125
Conversation
## TODO - [ ] Handle base classes lifts - [ ] Verify CDK lift - [ ] Handle unknown operations ## Description The lifting system (previously known as the `FindReferenceVisitor`) was not robust enough because it tried to identify lifting by looking for references (e.g. `bucket.put()`). This resulted in a million edge cases (e.g. what about `foo.bar.baz.foo().put()`?). I realized that we basically worked at the wrong abstraction level. So this change refactors the lifting system to work on *expressions* rather than on *references*. During type checking, we used to calculate the type for each expression. Now we are also calculating the *phase* of each expression. This is the key to lifting. Now, during jsification, when we visit an expression we ask whether its phase it aligned with the scope's phase. If it's a preflight expression inside an inflight phase, it means we need to lift it. This means, for example, that now we can lift complete expressions. For example, the expression `foo.bar.myArray.get(0)` will be evaluated in *preflight* and only the resulting value will be lifted into inflight (this is because `get()` is a phase-independent method and inherits its phase from the object). In the expression `foo.bar.myBucket.list()` only `foo.bar.myBucket` is a preflight value and will be lifted into inflight as a single object (`$foo_bar_myBucket`). ## Implementation Notes To support this, I added the lift results into the `JSifyContext` struct as well as the current method name. These are required in `jsify_expression` to record the lift. Additionally, in order to identify the lift operations, I am propagating the `property` of object references into the `jsify_expression(obj)`. The `mangling` module is responsible to take a multi-component expression (e.g. `a.b.c`, turn it into `a_b_c` and then deconstruct it back to `a.b.c` by emitting a JavaScript object with the same structure. This is primarily used to capture nested type named (e.g. `std.Json`). ## Testing Added a bunch of Rust unit tests for mangling and for capturing/lifting. These are implemented as snapshot tests which capture some internal capturing details as well as the emitted javascript output. Ideally we should be able to run all of these tests as full tests, but since there are dozens of new tests, we decided to first implement more parallelism into our build so that this won't add too much time to our CI builds. Not ideal. This required extracting a `partial_compile` ## Resolves * Fixes #1878 ## Misc * Change the `Debug Wing Compiler` launch configuration to simply debug the current file. This allows basically just pressing F5 inside any `.w` file to kick off a debugging session. Sweet! * Moved the capturing logic from `jsify.rs` into `captures.rs`.
👏 👏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only got through a few files so far before I needed to get some fresh air 😄 but I hope some of the initial comments are useful. Really exciting to see this PR converge to a working state
x.copy().at(y); | ||
// ^ Cannot reference an inflight object from a preflight expression |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error is a bit on the confusing side to me. Could you break down what's happening?
x
is something that can be lifted into inflight (so for all intents and purposes, we could pretend x
is a Array<str>
), so calling copy()
(phase-dependent function) on it inflight and then calling at
(another phase independent function) seems like it ought to be okay, at least on my initial read.
libs/wingc/src/jsify/captures.rs
Outdated
types: IndexMap<String, TypeRef>, | ||
|
||
/// captured variables per method | ||
vars: BTreeMap<String, BTreeMap<String, BTreeSet<String>>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are two different map types used? (Maybe add a comment?)
/// Visitor that finds all the types and variables used within an inflight method but defined in its | ||
/// parent environment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"within an inflight method but defined in its parent environment"
Sanity check that I understand this description - when it says "any parent environment" does this include inflight scopes as well? e.g.
inflight () => {
let x = 5;
inflight class Bar {
inflight getX(): num {
return x; // <-- captured from a parent (inflight) scope
}
}
}
/// The name of the method | ||
method_name: &'a str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// The name of the method | |
method_name: &'a str, | |
/// The name of the method we're analyzing | |
method_name: &'a str, |
#[cfg(test)] | ||
mod tests; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO / remove?
impl Display for MangleDemangle { | ||
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { | ||
writeln!(f, "keys: {}", self.keys.join(","))?; | ||
writeln!(f, "\n{}", self.javascript) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this only needs to be compiled for unit testing, you could consider slapping this label in front
impl Display for MangleDemangle { | |
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { | |
writeln!(f, "keys: {}", self.keys.join(","))?; | |
writeln!(f, "\n{}", self.javascript) | |
} | |
} | |
#[cfg(test)] | |
impl Display for MangleDemangle { | |
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { | |
writeln!(f, "keys: {}", self.keys.join(","))?; | |
writeln!(f, "\n{}", self.javascript) | |
} | |
} |
keys: MyClass,a_b_c,a_b_d,a_b_e,b_myBucket,hello,hello_world,std_Json,std_Number | ||
|
||
const a = { | ||
b: { | ||
c: a_b_c, | ||
d: a_b_d, | ||
e: a_b_e, | ||
}, | ||
}; | ||
const b = { | ||
myBucket: b_myBucket, | ||
}; | ||
const std = { | ||
Json: std_Json, | ||
Number: std_Number, | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty neat algorithm 👍
"error: Cannot capture symbol \\"y\\" because it is shadowed by another symbol with the same name | ||
--> ../../../examples/tests/invalid/capture_redefinition.w:4:7 | ||
| | ||
4 | log(y); | ||
| ^ Cannot capture symbol \\"y\\" because it is shadowed by another symbol with the same name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regression perhaps?
|
||
Tests 1 failed (1) | ||
"fail ┌ inflight_ref_explicit_ops.wsim » root/env0/test:test1 | ||
│ Error: Missing environment variable: BUCKET_HANDLE_5982bab0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh oh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add an E2E test for the fixed issue? (#1878)
* Change the `variable` portion of `StmtKind::Assignment` to an `Expr` instead of a `Reference` to make sure it's type and phase is captured during type checking. * Add `()` to jsified closures and improve output of functions. * Do not emit `$inflight_init` if it doesn't have any statements. * Do not allow reassigning to preflight variables from inflight * In the jsii importer, mark structs (and struct fields) as phase independent (because they are). * Fix #2916
## TODO - [ ] Still failing tests - [ ] Determine how to pass lifting tokens to super classes Refactor the lifting system to work on expressions instead of references and merge lifting and capturing into a unified model. *Lifting* is the concept of referencing a preflight expression from an inflight scope, while *capturing* is the concept of referencing an object defined in a parent scope (albeit not another phase). The mechanism works like this: we now have another "lifting" compilation phase which runs immediately after type checking. This is a transform ("fold") which does the following: 1. Lifting: for all inflight methods, it scans all the expressions. If it finds and expression marked as a preflight expression, it transforms it into an `Expr::Lift` which is a placeholder that can be used later to replace it with a lifted token. 2. Capturing: for each reference (both type refs and object refs), it checks looks up in which symbol environment it is defined. It then checks if that environment is a parent of the current method environment. If it is, is transforms it into a `Reference::Capture` to indicate downstream that this is a captured reference. Now we move to the jsification phase. When the jsifier hits a *lifted expression* or a *captured reference*, to jsifies it as if it was in a preflight scope and registers it in a centralized token registry maintained at the class level during jsification. This registry maintains a mapping between *inflight tokens* and *preflight expressions* (both for lifting and for capturing purposes). The token registry is consulted when the class is jsified in order to emit the code that lifts expressions from preflight to inflight (both at the object level and at the type level). ## Fixes * #1878 is fixed because now we are able to lift complete expressions. So `bucket.bucketArn` is a simple string and there is no need to lift the bucket. This is the 3rd attempt at this refactor, and hopefully the last for now. It supersedes #3125 and #2909.
Superseded by #3213 |
Refactor the lifting system to work on expressions instead of references and merge lifting and capturing into a unified model. *Lifting* is the concept of referencing a preflight expression from an inflight scope, while *capturing* is the concept of referencing an object defined in a parent scope (albeit not another phase). The mechanism works like this: we now have another "lifting" phase which runs immediately after type checking. This phase does the following: 1. **Lifting**: for inflight methods, it scans all the expressions. If it finds an expression marked as a preflight expression, it uses the jsifier to render it's preflight code and obtains a *token* from a lifts registry that's maintained at the class *type* level. 2. **Capturing**: for each expression that is a reference (both type refs and object refs), it checks looks up in which symbol environment it is defined. It then checks if that environment is a parent of the current method environment. If it is, is adds it to the lifts registry as well. For each class, the lifting phase returns a new class object with the lift registry updated. So `class.lifts` has all the information about lifting and capturing for this class, later to be used downstream. Now we move to the jsification phase. For each expression, the jsifier checks the `lifts` registry and if there's a token for this expression (by id), it emits the token instead (only in inflight of course). The token registry is also consulted when the class machinery is jsified in order to emit the code that lifts expressions from preflight to inflight (both at the object level and at the type level) and binds them for access management. When jsifying inflight base classes, we need to resolve the base class type and extract the lifting tokens from it. ## Misc * The simulator failed to start certain test apps due to unresolvable attributes. This is because there was no mechanism to handle implicit dependencies on startup. A popular solution to this problem is to use an *init queue*. We basically pop a resource from the queue and try to start it. If we can't resolve a reference to other resources, we retry by returning it to the queue. An attempt count is decremented to avoid an infinite loop in case of a dependency cycle or unresolvable attribute (cc @Chriscbr). * Additionally, the simulator did not call `cleanup()` during the shutdown sequence. This resulted in leaking ports in the `cloud.Api` resource that prevented the program from exiting (not exactly sure how this have worked). @tsuf239, any idea? ## Fixes * #1878 is fixed because now we are able to lift complete expressions. So `bucket.bucketArn` is a simple string and there is no need to lift the bucket. * #2729 is fixed with the following error: `Unable to reassign a captured variable` * #3253 This is the 3rd attempt at this refactor, and hopefully the last for now. It supersedes #3125 and #2909. ## Follow ups * #3244 * #3249 * #3189 ## Checklist - [x] Title matches [Winglang's style guide](https://docs.winglang.io/contributing/pull_requests#how-are-pull-request-titles-formatted) - [x] Description explains motivation and solution - [x] Tests added (always) - [x] Docs updated (only required for features) - [x] Added `pr/e2e-full` label if this feature requires end-to-end testing *By submitting this pull request, I confirm that my contribution is made under the terms of the [Monada Contribution License](https://docs.winglang.io/terms-and-policies/contribution-license.html)*.
TODO
Description
The lifting system (previously known as the
FindReferenceVisitor
) was not robust enough because it tried to identify lifting by looking for references (e.g.bucket.put()
). This resulted in a million edge cases (e.g. what aboutfoo.bar.baz.foo().put()
?). I realized that we basically worked at the wrong abstraction level.So this change refactors the lifting system to work on expressions rather than on references. During type checking, we used to calculate the type for each expression. Now we are also calculating the phase of each expression. This is the key to lifting.
Now, during jsification, when we visit an expression we ask whether its phase it aligned with the scope's phase. If it's a preflight expression inside an inflight phase, it means we need to lift it.
This means, for example, that now we can lift complete expressions. For example, the expression
foo.bar.myArray.get(0)
will be evaluated in preflight and only the resulting value will be lifted into inflight (this is becauseget()
is a phase-independent method and inherits its phase from the object). In the expressionfoo.bar.myBucket.list()
onlyfoo.bar.myBucket
is a preflight value and will be lifted into inflight as a single object ($foo_bar_myBucket
).Implementation Notes
To support this, I added the lift results into the
JSifyContext
struct as well as the current method name. These are required injsify_expression
to record the lift. Additionally, in order to identify the lift operations, I am propagating theproperty
of object references into thejsify_expression(obj)
.The
mangling
module is responsible to take a multi-component expression (e.g.a.b.c
, turn it intoa_b_c
and then deconstruct it back toa.b.c
by emitting a JavaScript object with the same structure. This is primarily used to capture nested type named (e.g.std.Json
).Testing
Added a bunch of Rust unit tests for mangling and for capturing/lifting. These are implemented as snapshot tests which capture some internal capturing details as well as the emitted javascript output. Ideally we should be able to run all of these tests as full tests, but since there are dozens of new tests, we decided to first implement more parallelism into our build so that this won't add too much time to our CI builds. Not ideal. This required extracting a
partial_compile
Resolves
Misc
Debug Wing Compiler
launch configuration to simply debug the current file. This allows basically just pressing F5 inside any.w
file to kick off a debugging session. Sweet!jsify.rs
intocaptures.rs
.variable
portion ofStmtKind::Assignment
to anExpr
instead of aReference
to make sure it's type and phase is captured during type checking.()
to jsified closures and improve output of functions.$inflight_init
if it doesn't have any statements.Checklist
pr/e2e-full
label if this feature requires end-to-end testingBy submitting this pull request, I confirm that my contribution is made under the terms of the Monada Contribution License.