Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update GC-05-16.md #1279

Merged
merged 3 commits into from
Jun 12, 2023
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
173 changes: 173 additions & 0 deletions gc/2023/GC-05-16.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,3 +33,176 @@ Installation is required, see the calendar invite.
## Meeting Notes

### Introduction of attendees

- Thomas Lively
- Jérôme Vouillon
- Conrad Watt
- Jun Xu
- Ashley Nelson
- Alon Zakai
- Ryan Hunt
- Bruce He
- Emanuel Ziegler
- Jakob Kummerow
- Ben Titzer
- Nick Fitzgerald
- Adam Klein
- Andreas Rossberg
- Petr Penzin
- Sergey Rubanov
- Zalim Bashorov
- Igor Iakovlev

### Status checks

TL: I did a PR adding the bulk_array instructions to the MVP doc and I saw some movement on Conrad’s PR adding to the spec interpreter as well. AR, I know you’ve been hard at work writing the formal spec, great to see movement there. Anyone else have any status they want to share? Ryan, how’s the FF impl?

RH: Quite a bit of perf work this half, also some of the final features we missed in our first impl pass. So we have i31 wrapped up, should be landing soon. I think just a couple random edge cases, we don’t handle element segments correctly, maybe something else small. Hoping to finish this soon, and then some performance work. Should be fully standards compliant pretty soon.

### Presentation: TypeScript on WasmGC (Jun Xu, 40 minutes)

JX prexenting [slides](presentations/2023-05-16-jun-typescript.pdf)

AR: In what sense are you lowering to nominal types? Because the Wasm types aren’t nominal either.

JX: In the compiler we check to make sure two different types with the same structure cannot be assigned to each other.

AR: But what about dynamic typing? I guess there’s no need to go too deep into it here, but right now you’re talking only about the static type system?

JX: Right, the static type system is nominal.

AR: Question about Any. In TypeScript it’s not a top type, it’s supposed to be compatible in both directions. How can this work when you have two different representations of static and dynamic objects?

JX: Static and dynamic world are absolutely separate. In this case, how to create and manage the basic structure of the dynamic objects. Maybe after the next page, you can see a bridge between the dynamic and static world.

BT: One idea is to have a table of func refs of the right types then use an index in the table to get a property or interface?. So, effectively you need an interface dispatch, right? So you need to do sa dynamic search, and you can’t use a struct for that, you have to use an array of some kind if using GC memory You can also use a table for that, so the list would just be the index of functions into that table?

JX: When we access the field of an interface, we need to access it through the field name rather than a static calculated index.

BT: So you have two dynamic lookups, that’s the issue.

JX: We store the name information in the runtime representation.

TL: Is this a nonstandard instruction you’re using or you’re emulating it with a function?

JX: We extend this code to support this feature and we use a function to emulate this behavior and it works.

TL: Have you prototyped this opcode in any implementation and seen the performance benefits?

JX: No, we use the native APIs to emulate this behavior but we haven’t measured the performance. We think this is supported for some limited dynamic typing semantics. We know wasmGC isn’t created for dynamic typing languages. It would be more efficient in their development. We think this indirect function call/get would have lower performance but wouldn’t influence the original opcode.

TL: It would obviously be great for codesize to have this struct.get_indirect opcode because you wouldn't have to generate these dynamic getter functions for every field of the struct, but it would be interesting to see if there are any performance benefits to

JX: Great suggestion.

BT: If you want to take a struct reference and turn it into the Any type, you need to use the extern.externalize instruction? So you rely on the engine to do the externalization?

JX: Yes, we do rely on the engine or host environment to do externalization.

CW: I imagine for a lot of users of TS, the alternative is to compile to JS, which will be pretty fast. Is the intention to compete with compiling to JS, or just to use TS in an environment where JS isn’t available?

JX: Currently we want to bring more application development languages to Wasm. We don’t expect this approach will be much faster than JS.

CW: Who do you imagine will use this as this first choice?

JX: Developers for IoT devices, an app-developer friendly language based on WebAssembly may help build a larger ecosystem for IoT world. In the browser, WebAssembly can be some part of the project, the part more related to UI can be still compiled to JavaScript, while other parts more related to computation can be compiled to WebAssembly, and the developer don't need to write two languages for these two parts. This is more flexible than the current solution.

AR: You mentioned gradual typing here but the approach you are using is something akin to coercive subtyping and that’s difficult. You definitely need sound subtyping to make it work. But for gradual typing you need to be even more flexible, and I’m not sure how that would work with two competing representations. With gradual typing I should be able to pass an array of C as an array of Any and vice versa. And that could be a mutable array, so you can’t just convert an array to fit the right implementation that you need. You have to keep it in this original presentation, so I’m not sure how you would support that in this approach.

JX: I mentioned the gradual type here, but just want to explain the language. For the static languages, they are trying to support dynamic language features and for dynamic languages, javascript/typescript we are trying to bring more type information to their code. So for gradual typing, wasm might be a good package to bring benefits to these kinds of languages. That does not mean we are going to support all the gradual typing semantics in the languages.

AR: Understood, but how do you even support the most basic gradual semantics with this approach? In your language, can you assign an array of C to an array of Any?

JX: Not sure what the C to array means.

AR: So C is some static class and you can have an array of C, right? And with basic gradual typing you should be able to assign that array to a variable of array of type Any. In that case, you cannot wrap the individual objects, it would be expensive and in fact, violate the identity of the objects because this is a mutable array so you can observe the members changing. How do you support that?

NF (chat): Array<C> my_concrete_array = ...; Array<Any> my_any_array = my_concrete_array; This, right?

JX: Good question, I think our POC hasn’t covered this feature. But in our design, we need to static C into dynamic object. So this would bridge the immutable semantics of the array. For the performance impact, we think that if the director uses Any, then they should pay for the performance impact.

BT: Along the lines of AR’s question, this arises because you’re using externref and you need to have a coercion there, but it would also arise if you had an array of Ints too. Is that supported? Array of Ints can be assigned to Array of Any.

JX: … missed response ...

AR: How can you have any form of gradual typing if you don’t have uniform representation? You are converting between dynamic and static objects

CW: Am I interpreting your ..making everything uniform by putting a box around everything?

JX: We are starting from the static type and at the start we want to represent everything as static typing and then we find that adding an interface results in a special design. Proposed opcode? There may be some gaps between static type and dynamic type.

TL: If I understand correctly, you are mostly interested in supporting static type non-gradual subset of the language and because Any does exist and people use it, you also have this escape hatch for representing Any values but it’s not necessarily a goal to perfectly implement all of the original gradual type semantics of typescript, right?

JX: Yeah, we are not experts in type systems or program languages, we are supporting the features that are required by engineering.

TL: Have you looked at existing typescript code and validated how much is compatible with the type system features you support and how much uses the gradual typing features you don’t support?

JX: Currently, no statistics, just started this POC recently. We just support basic features. To get any, we can define properties and pass these to any functions and then access these properties. For some apps, this is good. Not a fully supported language feature.

JX: Key idea of this presentation is to introduce this new op code struct.get/set_indirect, we think it would be useful for many languages.

CW: To implement the dynamic check, if the runtime didn’t want to carry around information about the fields, you’d have to use the tag on the object and effectively work out from there how the bounds check needs to happen. Engine folks, does that sound right?

JX: In this table, we would have an additional column to store runtime type information and we can use that information to do that check.

CW: I was referring to the actual wasm engine implementation, but presumably if you pass the struct ref, the dynamic representation of the struct ref needs to tell whether the index is in bounds at runtime and if the field matches the annotation. And reconstituting the field

JX: The runtime type information is bound to the struct type reference and when we pass the struct ref to that opcode, the runtime can check.

CW: I wouldn’t expect implementations to have all that machinery setup. Maybe the tags are simpler now because you only need enough for the downcasts. I’ll stop speculating.

JX: Because we need to do the downcasts for this, we also need the runtime checking, so we didn’t require additional environment in the runtime.

RH: In SM, we have a link to the full type information from the RTT. We need that for tracing for GC anyway. If we didn’t have tracing, we could perhaps omit it. Theoretically we could load the type information, index into a field, then compare the type.

BT: There would need to be a subtype check there, too, not just an identity check.

RH: Good point

PP: Bottom line is that we require to be implemented

CW: Depends on the implementation strategy of the engine for downcasts.

PP: TL, know anything about it in v8?

TL: Jakob?

JK: Similar to what Ryan has described, we do have the information but the implementation is not optimized for doing fast access, so that would be pretty slow for getting the field offset. Orthogonal, the type issue on the result, when there are reference and non-reference types in there, I’m not sure how we would do that at all. Our existing subtyping expects all ref types. I think if I had to implement a system like this, I would take a very hard look at something based on virtual vTables you can auto-generate that you can generate from the interface information that the compiler sees. Accessors in the vTable to access the fields. My initial reaction.

TL: Was that describing how you would implement it in the engines or how you would fulfill the functionality in userspace.

JK: TypeScript to Wasm compiler.

RH: You’d need performance data to see why to implement it and what it solves. How much performance work to get it to an acceptable level. I could have a built-in that groks the type information, but it would be very slow. Making it fast would be a considerable amount of work, would need to know if its worth it.

JX: Need performance data for the engines to support this implementation?

RH: That would be one thing

CW: Presumably if you are doing the faster for the user code to generate the separate getters and setters, it would probably be faster, though a little annoying.

RH: I would guess it would be in that case. but I’m sure we could put the type information in that shape so it would be really fast. Right now it’s not, so we’d need a reason to do the work like that. There might be extra metadata costs too.

AR: This is a form of reflection right? It’s about putting reflection into core wasm. As you might guess, I’m not a big fan of that idea, it seems to be something that should be handled at a higher level than the basic asm instruction set.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rossberg Sorry for missing this question in the meeting, a quick reply here:

  • I think the proposed opcode struct.get/set_indirect is not related to reflection, it's all about getting field index from operand rather than immediate
  • The idea of interface description table may be something like reflection, but that's not what we want to bring to the core wasm spec, it's just a feasible usage of the proposed opcode

I'll prepare an issue in WebAssembly/gc repo and list some benefits as well as impacts of this opcode for further discussion.

BTW, thanks for your professional questions and suggestions in the meeting, which let me realize there are more things to be considered for the type system

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xujuntwt95329, since these operations have to inspect and dispatch on (the structure of) the runtime type of the object, they represent a form of type reflection. They would be the Wasm equivalent to struct.getClass().getField(x).get/set(struct), the only difference being that x is numeric, not a string, because fields are accessed by number in Wasm. This idea further assumes that every heap object carries sufficient runtime information to perform such reflection, which is not an assumption I'd like to build into Wasm itself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rossberg Sorry for the late reply because I didn't notice the update on this thread. Since this PR has been merged, I've created an issue WebAssembly/gc#397 for further discussion


JX: I’ll check our implementation to see if we are adding performance impact for supporting this. In our experiments, we didn’t change any runtime type information.

CW: What’s the implementation you are extending, Intel interpreter or another one?

JX: WAMR. Wasm micro runtime.

BT: Implementation of wasmGC in WAMR?

JX: Yes, we are developing it in a separate branch.

BT: What garbage collector are you using underneath?

JX: It is a mark and sweep collector. Currently tested on the WAMR engine.

JX (chat): https://github.com/bytecodealliance/wasm-micro-runtime/tree/dev/gc_refactor

PP: Jun, are you doing this as part of the Bytecode alliance SIG? If the SIG is public, you can also share a link.

JX: Not sure about this

PP: Okay, because the Bytecode alliance has initiatives along the same lines.