- Where: zoom.us
- When: June 21st, 4pm-5pm UTC (June 21st, 9am-10am Pacific Time)
- Location: link on calendar invite
None required if you've attended before. Send an email to the acting WebAssembly CG chair to sign up if it's your first time. The meeting is open to CG members only.
The meeting will be on a zoom.us video conference. Installation is required, see the calendar invite.
- Opening, welcome and roll call
- Opening of the meeting
- Introduction of attendees
- Find volunteers for note taking (acting chair to volunteer)
- Adoption of the agenda
- Proposals and discussions
- Discussion: backward-incompatible tweak to text format lexing (WebAssembly/spec#1490) (Thomas Lively, 5 minutes)
- Closure
- Alex Crichton
- Andreas Rossberg
- Ashley Nelson
- Asumu Takiwaka
- Andrew Brown
- Bailey Hayes
- Chris Fallin
- Deepti Gandluri
- Derek Schuff
- George Kulakowski
- Jacob Abraham
- Jakob Kummerow
- Jeff Charles
- Luke Wagner
- Nick Fitzgerald
- Paolo Severini
- Petr Penzin
- Richard Winterton
- Ryan Hunt
- Sabine Schmaltz
- Thomas Lively
- Zalim Bashorov
Discussion: backward-incompatible tweak to text format lexing (WebAssembly/spec#1490) (Thomas Lively, 5 minutes)
TL: I've been working on new parser in Binaryen and noticed an inconsistency. There is a non-normative note saying that tokens in the grammar must be separated by whitespace or parentheses. But in the normative grammar, they can also be separated by double quotes with no space. I was going to add a note about this but Andreas said that it was the intended behavior that you’d need whitespace or parentheses there. So maybe we should update the grammar instead of the note. The reason this might matter is for consistency, e.g. if you have 2 float numbers or identifiers next to each other with no space (e.g. with a $ prefix) they would parse as one (invalid) token. But not if one of them is a string. It would be nice to require a space between them. Technically this is a backward-incompatible change, there could be programs with identifiers next to strings with no space, and this would break them. But it seems pretty unlikely. Does anyone here have a text parser and know what would happen there, or anyone have other opinions?
AR: I could draw an example if that helps. One thing to note that we already had similar decision with numbers in the past, possibly Dan Gohman who pointed out that we would parse 0$x as two tokens, we fixed that we would pass it as one token. Similarly the thing we would need to fix for strings are:
0”a”
“A”0
“A””b”
Do we want to fix that to be consistent, but it would technically be backwards incompatible. Tested this in the Spec format, but didn’t bring up anything.
PP: What was the use case for the first one?
AR: it was really just meant for clarity. We allow lots of characters in keywords and tokens, even special characters, but e.g. 0a0.52*xx4 would parse as two tokens 0 and a0.52*xx4 before, which was rather surprising.
PP: So it’s not about the “$”
AR: No, nothing special there. The only thing we allow without spaces are brackets/parens. Other than we always have to separate with white space, an oversight was that we don’t do that with strings
PP: Right now, we would have two buckets, instead of a single string we would have a token?
AR: Currently “a”0 would be 2 tokens, but if we make this change it would be rejected.
PP: Unless someone was passing a string and a number, and they forgot to put the space in were expecting the parser to put the space in which would no longer work.
AR: in the webassembly syntax, you can’t really pass things to parse as input. You could e.g. have multiple strings in data, e.g
(data _ “a””b”)
TL: Data grammar already concatenates multiple elements so shouldn’t matter?
AR: This change would make that example illegal. You’d need a space.
AR: Even single quotes are allowed in identifiers, but double quotes are not.
TL: Does anybody care if we make this backwards incompatible change to the text format?
DS: We should just turn this into a unanimous consensus vote. Does anyone have objections to changing the text format to disallow strings without spaces around them?
PP: Maybe we should post it somewhere to notify people?, but I have no objections
TL: There’s a PR on the spec repo that’s been up for a while and there’s been no discussion. Probably nobody cares that much.
AR: Text format is a secondary concern to most people. One thing I have to figure out is how easy that change actually is, we could add double quotes to reserved category but not possible because strings also can contain unicode characters, so we have to allow every string elements except white space, e.g. to cover (import “aä””b”)
TL: We can work out the details. I don’t hear any objections to the proposal though.