

Agenda for the February meeting of WebAssembly's Community Group

  • Host: Google, San Francisco, California
  • Dates: Tuesday-Wednesday, February 11-12, 2020
  • Times:
    • Tuesday - 9:30am - 5:00pm
    • Wednesday - 9:30am - 5:00pm
  • Video Meeting:
    • TBD
  • Location:
    • Google SF
    • 121 Spear St, San Francisco, CA 94105
    • Guest entry point: Two Rincon Courtyard
  • Wifi: TBD
  • Code of conduct:

Registration

Registration is now closed, email WebAssembly CG chair if you would like to attend virtually.

Logistics

  • Detailed directions:
    • To find the guest entry point, enter through Rincon Plaza between Spear St. and Steuart St., take the elevator to the 2nd floor, and follow directions to the Google suite 200.
    • Commuting from the SFO airport or the East Bay: Take BART to the Embarcadero station and exit towards Spear St.; Rincon Plaza is a short (~4 minute) walk from the station.
    • Commuting from the South Bay: Take Caltrain to the 4th & King station, then Muni to the Embarcadero station. Renting city bikes or scooters along the Embarcadero is also an option for getting from the Caltrain station to Rincon Plaza (~10-15 minutes).
    • If driving, parking is available at the Rincon Center garage and the 201 Spear St. garage. To book cost-efficient parking ahead of time, try using SpotHero.
  • Contact Information:
  • Breakfast:
    • Breakfast will be available on both days starting at 8:45AM.
  • Dinner Options:

Agenda items

Schedule constraints

  • Effect handlers in the morning, possible remote participants from Europe

Meeting notes

Opening, welcome and roll call

  • Adam Foltzer, Fastly
  • Adam Klein, Google
  • Alex Crichton, Mozilla
  • Alon Zakai, Google
  • Andreas Rossberg, Dfinity
  • Arun Purushan, Intel
  • Asumu Takikawa
  • Ben Titzer
  • Ben Smith, Google
  • Chris Drappier
  • Clemens Backes, Google
  • Charles Vaughn, Tableau Software
  • Conrad Watt
  • Dan Gohman, Mozilla
  • Deepti Gandluri, Google
  • Deian Stefan, UCSD
  • Derek Schuff, Google
  • Eftychios Theodorakis, Dfinity
  • Emmanuel Ziegler, Google
  • Eric Hennenfen, Trail of Bits
  • Eric Prud'hommeaux, W3C
  • Eric Rosenberg, Apple
  • Erik McClure
  • Francis McCabe, Google
  • Gus Caplan
  • Heejin Ahn, Google
  • Hovav Shacham, UT
  • Ingvar Stepanyan, Google
  • Ioanna Dimitriou, Igalia
  • Istvan Szmozsanszky (Flaki), Mozilla
  • Jacob Gravelle, Google
  • Jakob Kummerow, Google
  • Joachim Breitner, Dfinity
  • John Plevyak, Google
  • John Renner, UCSD
  • Jonathan Beri
  • Keith Miller, Apple
  • Lars Hansen, Mozilla
  • Lee Campbell, Google
  • Luke Wagner, Mozilla
  • Mingqiu Sun, Intel
  • Nabeel Al-Shamma, Adobe Inc.
  • Natalie Popescu, Princeton University
  • Nick Fitzgerald, Mozilla
  • Pat Hickey, Fastly
  • Paul Dworzanski, Ethereum
  • Peter Huene, Mozilla
  • Philip Pfaffe, Google
  • Piotr Sikora, Google
  • Radu Matei, Microsoft
  • Rich Winterton, Intel
  • Robin Freyler
  • Ross Tate, Cornell University
  • Ryan Hunt, Mozilla
  • Sam Clegg, Google
  • Sander Spies
  • Sergey Rubanov
  • Shravan Ravi Narayan, UCSD
  • Stephanie Doll
  • Sven Sauleau
  • Svyatoslav Kuzmich, JetBrains
  • Thomas Lively, Google
  • Thomas Trankler
  • Till Schneidereit, Mozilla
  • Vivek Sekhar, Google
  • Wouter van Oortmerssen, Google
  • Yuri Iozzelli, Leaning Technologies
  • Yury Delendik, Mozilla
  • Zalim Bashorov, JetBrains
  • Zhi An Ng, Google

Opening of the meeting

Introduction of attendees

Host facilities, local logistics, code of conduct

Find volunteers for note taking

Joachim and Anonymous Wolf volunteer

Proposals and discussions

Presenter: Ben Smith (Slides)

Ben Smith gives updates on Bulk Memory, see slides.
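The core bulk operations under discussion can be sketched in WebAssembly text format (an illustrative fragment, not from the slides; the segment name $greeting is made up):

```wasm
(module
  (memory 1)
  ;; a passive data segment: not copied into memory at instantiation
  (data $greeting "hello")
  (func (export "setup")
    ;; memory.init: copy 5 bytes from segment $greeting to address 16
    (memory.init $greeting (i32.const 16) (i32.const 0) (i32.const 5))
    ;; data.drop: release the segment once it is no longer needed
    (data.drop $greeting)
    ;; memory.copy: copy 5 bytes from address 16 to address 32
    (memory.copy (i32.const 32) (i32.const 16) (i32.const 5))
    ;; memory.fill: write the byte 0 into 5 bytes starting at address 16
    (memory.fill (i32.const 16) (i32.const 0) (i32.const 5))))
```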

Poll: Move to Phase 4?

AR: Caveat. Open issue on reference types proposal that affects this. Something about ref.null instruction and subtyping between funcref and anyref.

HA: We’re removing nullref? If so isn’t that a separate proposal?

AR: Will present + explain all that as it affects this proposal.

BS: Will continue with reference types and come back to this afterwards. Other concerns?

Presenter: Andreas Rossberg (Slides)

Andreas Rossberg recaps the reference types proposal, including recent changes (pre-declared ref’able functions, nullref). The spec is mutually dependent with the spec for bulk memory operations.

Spec status is at stage 3; Lars promises the missing tests related to multi-table bulk operations by next week.

New open question: get rid of the subtyping funcref <: anyref? Nice for implementations, but it has an annoying impact on bulk memory and the C API.
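To illustrate where the funcref <: anyref question bites, a hypothetical fragment (proposal-era syntax; $t and $f are made-up names):

```wasm
(module
  (table $t 10 anyref)       ;; a table of anyref
  (func $f)
  (elem declare func $f)     ;; pre-declare $f as referenceable
  (func (export "store")
    ;; storing a funcref into an anyref table only type-checks
    ;; if funcref <: anyref; dropping the subtyping makes this invalid
    (table.set $t (i32.const 0) (ref.func $f))))
```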

HA: Remove all subtyping, or just this particular relation?

AR: Yes, that’s part of the question. Without that relation, the types introduced in this proposal have no subtyping among them anymore, so we could rip it out of the proposal. We still need subtyping for references eventually, but we may not want anyref as a top type or nullref as a (sort of) bottom type; the two endpoints would be gone. With GC types we’d still have hierarchies in the middle. It’s slightly weird to still have subtyping in the proposal.

HA: For exnref, it doesn't matter if nullref is bottom or anyref is top.

AR: We recently discussed that it should be both.

HA: We only care that we have a null value, right.

AR: Would have another ref.null form for exnref, so that proposal too is affected.

RT: Question for call_indirect: what are you planning when the callee’s types are different from the caller’s types?

AR: Deliberately require the same type, not subtypes.

JR: Why don't we have funcref and valueref, with funcref sitting above it, and no anyref in the language?

AR: Not sure what you mean?

Perin: We can have a supertype any but … nevermind.

RT: Another problem: we could have existential types. There is a theorem that having nullref as a bottom type affects tooling; if nullref is a subtype of every type, that can break tooling.

AR: Unclear impact of this change on JS API, maybe some lazy boxing?

TL: Repetition of type parameter on ref.null in elems. Why can’t this be fixed using new encoding?

AR: Because the idea is that the elements are expressions, and they are regular instruction sequences, so they need to use the same instructions as are used in other parts of the module.

LW: New elem section encoding maybe?

AR: We have two elem section versions, one expressions and one indices (the latter has no null)
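The two element-segment flavors AR describes might look like this ($f and $g are placeholder functions; the exact ref.null annotation syntax was in flux at the time):

```wasm
;; index form: a plain vector of function indices, no nulls possible
(elem (i32.const 0) func $f $g)

;; expression form: each entry is a constant instruction sequence, so
;; ref.null carries its type annotation like any other instruction
(elem (i32.const 2) funcref (ref.func $f) (ref.null func))
```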

LW: Things start null right [AR: yes]

BT: Basically you're advocating for having separation.

AR: I am lawful neutral on this.

LH: What are good reasons to do it?

LW: You can represent it differently if they are not subtypes. Otherwise they have to have the same boxing format, and that can deoptimize one case.

BT: About boxing: if we have separation, another instance of not having a universal supertype. If we want a universal supertype, introduce boxing instructions. Or you can have boxing in embedding, or boxing instructions in bytecode. Not fundamentally different than having value types different than reference types.

LW: Explicit coercion instructions could be fine -- not yet measured, but we try not to prohibit optimizations. Do we need funcref to be a subtype of anyref? We thought originally we wanted anyref to be universal, but maybe we don't want that. More natural state is that we don't require that.

BT: Generally agree, but another consideration -- when things escape to embedding, JS, then comes back as anyref, then you want to downcast. You may want to downcast anyref to funcref, which would be explicit coercion.

AR: It would no longer be case, it would be conversion. Luke just said, and I agree, not a strong use case to have conversion. If it turns out we need them then it becomes a difficulty in the future.

HA: I understand the rationale, but what is the rationale that we remove nullref? Prefer for tooling not to implement. But why?

AR: Reason is that it only makes sense if you keep type hierarchies disjoint. Can’t have a shared null value. Subtyping should not introduce coercions, if they have different values, then they have different types.

BT: With the typed function references proposal, are those nullable?

AR: Have both. Questions?

LH: If we adopted the current proposal, would that stand in the way of anything really? Would it really preempt other solutions?

AR: Not really, but then you can't implement the way you want now. Not sure how significant.

LW: Would end up with two func reference types.

FM: Is there a requirement that the C-API represent unboxed values -- you could have boxing as part of the C-API.

AR: You would have to box at the boundary, that seems undesirable. We do that in JS, but we don’t really want implicit boxing and allocations in the C API.

LH: Having two function types is not desirable, but reflects the complexity of the domain. It is useful for some purposes.

AR: Once you have GC types, you can always wrap the flat ones in the struct, but we don’t have that yet.

BT: Is there another alternative, we can introduce the supertype later, you can't reference now because it doesn't exist. (supertype of anyref and funcref)

AR: This doesn’t solve any of the problems we have right now; the spec design assumes a supertype now, so if we don’t have it now, we have to change design.

LW: Once you share a supertype you share a boxing format.

AR: How do we proceed?

AK: I am trying to understand the disadvantages to the separation, besides changes to proposals?

AR: Complicates APIs, more work for multiple impls.

BT: Separation is to maintain possibility of optimization -- we can see it there -- but the complication is forever, if we never do the optimization there we need another advantage that makes it worth it.

AR: Uniformity tends to be simpler on the surface.

RT: Can't you add the subtyping later if you want it?

AR: But then you already have the complications.

RT: The complication is that you make it abstract enough -- so you can add subtyping later and decomplicate.

AR: You can’t get rid of the extra, redundant stuff that we have already.

RT: Another side -- are we going to run into these complications anyway -- once we get into more types we get to that anyway?

AR: One thing I could imagine is that eventually we want “unboxed references”. Nothing preventing us from adding that later, would not invalidate anything we do now.

BT: Another consideration -- if we want to add parametric polymorphism, then this discussion is moot -- so adding thirty different reps doesn't matter.

AR: That’s an interesting point. So far the thinking on generics is that (if we get them) for the foreseeable future the type they quantify over is anyref. If we remove anyref, then what?

LW: We did talk about this in the issue -- there's always a bound, func is a subtype of any,

AR: But then you can't compile it generically.

BT: What we have now: any is the universal supertype for reference thingies. Polymorphic functions have to be specialized offline for each of the primitive types, plus one version for any (untyped, with casts).

AR: If you are compiling generics in a language, and then using generics in wasm, then you have a universal rep for that. If you use generics on primitives, then you'd have to specialize that. Languages that have universal rep throughout, would start to have to specialize.

LW: You'd have to specialize for value types...

AR: They could define their own hierarchy.

RT: We have a fleshed out proposal that does this.

AR: That requires more machinery, variant types, which we tried to duck for a while. Puts a cost downstream on other things if you don’t have that shortcut.

BT: I think we're going to need to look at advantages and disadvantages, may need to take this to next CG.

EP: How many downstream proposals depend on this?

AR: Depends on what you count. Many depend, but don’t care about this detail yet. At least three or four do. Exceptions, Bulk, GC, func refs.

TL: On topics of pros/cons, any tooling language people who have a concrete use-case for subtype relationship between funcref+anyref? – silence –

HA: If we go with this version, we remove nullref and anyref?

AR: anyref still exists, but no more relation to exnref. Purpose is host objects.

TL: At the summit someone explained anyref as “crossref”, as it references foreign objects. If it is not a supertype then we should probably rename it.

AR: Thought about this as well. Maybe “foreignref”? But more changes through more proposals and tools.

HA: It seems like we discussed this before, and we made a decision. But now we're revisiting, why?

AR: I don't think we suggested this at the beginning -- I think Ross suggested it in Lyon. At the time we concluded it didn’t matter. The anyref idea is natural when you're coming from typical engines that use uniform rep, like JS engines. You want to have a type that can represent any type of value.

LW: I want to point out that anyref can store any JS value, but isn't supertype, you can convert, but it isn't a supertype.

AR: It has only anyness wrt some embedding

LW: JS values aren't subtypes of anyref, they're just convertible -- boxable.

KM: Do you actually do a conversion?

LH: When you pass in a small value. [KM: box it differently?] Yes.

KM: I don't want to allocate a new object when I do this...

LH: There are reasons for reboxing.

KM: Strongly opposed to any proposal that requires reallocation.

AR: If you do funcref, and it has a flat representation, then you do have to do it? I guess you could give it the uniform representation, but then you have allocations in Wasm.

RT: Some languages benefit from uniform representation. Cost only for values for cross-communication.

KM: I expect that most interactions will be with DOM, so the bridge will be expensive.

RT: I don’t think his example was the DOM. These would just be moved; only primitive stuff (i32) needs conversion, heap stuff can just be moved around.

KM: As long as heap values don’t need to be boxed it’s ok.

AC: Can we just take this and put it in a separate proposal, to unblock reference types to get host references, without dealing with funcrefs.

AR: The MVP already has funcrefs, so we need it.

AC: We can just prohibit touching function tables.

AR: grumble

AC: Is this within the realm of possibilities?

AR: I don’t think it would solve the problems. There is a junction, we still need to decide.

AC: There are two options: Separate hierarchy (always) doable or subtype of anyref. Both could be done separately, it seems we can do anyref now.

AR: But we need to know what to do for null? It would be super ad-hoc if we don’t solve that now.

AC: Just a suggestion…

AR: I wish there would be something to unblock.

BS: Should we do this design work here now?

AR: Yeah, but how do we proceed here? How do we decide?

BT: Which possibilities are one-way streets? Making anything an anyref is a one-way street; keeping them separate is less of a one-way street.

AR: It’s not one-way, we can likewise add flat later

BT: Then we have two kinds of funcref.

AR: So what, we have multiple ref types anyways.

RT: Fundamental question: Do we want uniform representation, or do we want to allow different representations.

AR: Not either-or. Do we want to make it either-or? If not, we can go ahead as is.

CW: Do we need an infinite number of nulls, or one for each root in the hierarchy?

AR: That’s the question. Probably enough to have one for each kind. But how many kinds?

CW: Also about Ross’ question about … I don’t understand that point

DG: Cutting off discussion. 10 min break.

Presenter: Heejin Ahn (Slides)

RT: Why can exnref can be null?

HA: Good question. Basically the toolchain compiler has to have the ability to handle that. We need a null value for reference types.

BT: When you declare a local of reftype, you need to have a default value.

HA: Good point as well.

AR: It's because we don't have let yet (it's in typed function ref proposal)
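BT’s point about defaults can be made concrete (illustrative fragment, using the proposal-era exnref type):

```wasm
(func $handler
  (local $e exnref)   ;; reference-typed local: defaults to null
  ;; reading it before any assignment is legal only because a default
  ;; (null) exists; the `let` construct from the typed function
  ;; references proposal would allow explicitly initialized locals
  (local.get $e)
  (drop))
```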

RT: Could throw check for null?

AR: Throw doesn’t take an operand.

HA: Rethrow might. There it traps.

KM: Back to JS API. JS catch keyword would not catch exceptions from Wasm traps anymore? (Impl. do that at the moment)

HA: This is to be consistent about whether we catch before or after a JS frame. A trap turns into a JS exception when it hits a JS frame, but it should not be catchable in Wasm, since Wasm doesn't catch traps. If the trap could be caught after it hits a JS frame but not before, that's not consistent. So we decided that traps are never caught by Wasm catch blocks; the trap is converted to a JS exception at the first JS frame it reaches, and the original trap is not caught before or after entering a JS frame.

KM: If you had changed the semantics so that Wasm traps could not be caught, that would break lots of applications. But that is not what is happening, so all good.

IS: Can you catch a regular JS exception from wasm code?

HA: It is not going to match any Wasm signature if it was created externally.

IS: Can you import it?

DS: We don’t have type imports yet. They can be caught, but you cannot test on them until we have type import. But you can catch and cleanup.

HA: also print error message.

IS: What happens to a Wasm exception that gets propagated through a JS frame?

HA: It still is a wasm exception. Will be caught and matched like the original.

AR: You don't need type imports, but exception imports.

HA: We can import and export events, but not foreign ones, without type imports. With type imports we can import foreign events.

EP: Is there a way for a JS programmer to wrap their exceptions just inside a callback and turn them into something that Wasm could consume? Synthesize a Wasm exception?

HA: Two points: if they want to disguise their exceptions as wasm exceptions, they can't do it. They need to have internal structure. The other point, it's not in the proposal, but we may extend with function API with type reflection, which would allow us to construct an exception externally.

EP: And you could proceed with the current proposal and add that later?

HA: Yes, it's an extension. We can extend it along with the reflection proposal.

IS: What he is asking for should work already. … Should just work out of the box.

EP: Provide another parameter (…)

EP: If you are about to call a JS callback you could pass a webassembly function that constructs an exception and the JS catch could call the Wasm exception constructor, that way you can propagate back into Wasm an exception it recognizes.

HA: Isn’t that how intertwined frames work?

EP: Yes. the exception happens in JS land and is handled in Wasm

AR: Simply put: You can export from Wasm an auxiliary function generating exception. Could also polyfill missing API.
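AR’s suggestion might look like this (a sketch in the proposal-era exception handling syntax; $may_call_js is a hypothetical import):

```wasm
(module
  (import "js" "callback" (func $may_call_js))
  (event $e (param i32))              ;; exception declaration ("event")
  ;; exported helper: JS calls this to raise a Wasm-recognizable exception
  (func (export "throw_e") (param i32)
    (throw $e (local.get 0)))
  (func (export "guarded") (result i32)
    (block $handled (result i32)
      try (result i32)
        (call $may_call_js)           ;; JS may call back into throw_e
        (i32.const 0)                 ;; normal completion
      catch
        (br_on_exn $handled $e)       ;; our event: branch with its payload
        rethrow                       ;; anything else: rethrow
      end)))
```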

Back to slides

POLL:

Move Exception handling proposal to Phase 2.

| SF | F  | N | A | SA |
|----|----|---|---|----|
| 30 | 14 | 1 | 0 | 0  |

Exception Handling moves to phase 2.

Stack switching / Coroutines / Effect handlers

Presenter: Andreas Rossberg (Slides)

TL: Can you dig in the term “delimited”.

AR: You only capture the continuation up to a defined point. It’s a stack segment.

Back to slides

RT: Can you mark that the exception is expired?

AR: It is a runtime state, it traps on second invocation. We have complete formal semantics for such details.

Back to slides

LH: You can store these in global variables? When does the continuation get captured? When you throw or when you store in global? The stack becomes disjointed (…?…)

AR: That's on the next slides. Basically ....

BT: Resume is unreachable after that?

AR: No, it can come back.

BT: But I thought it was one-shot?

AR: But it can still come back.

Back to slides

(question on slide “effect handler”)

BT: So if cont is a value type, you can store in a local, isn't it the rep of the stack where it was thrown. How does that relate to the stack when you exit the try?

AR: Like with exception values themselves, you have to do some automatic management. If you abandon some continuation, you have to collect the stack.

BT: That means that it's not the case when you exit the try that you end the lifetime of the stack.

AR: You can only exit the try when you have resumed the continuation. You only get back into the try if you resume. Make sense?

BT: Not really... if you get the continuation in the catch, then you exit the catch, which exits the try...

AR: The “regular” exit from the try body ends the stack, not when exiting from catch.

BT: Ah... "regularly" missed that

AR: One thing I gloss over: in the effect handler community there is a distinction between deep vs. shallow handlers. When I resume, am I still under the handler or not? There are tradeoffs; neither is better. You have to annotate everything anyway, so maybe it's not very much like exceptions.

HA: The current exception spec doesn't use the result field at all -- does annotation of result count enough as an annotation?

AR: No, not enough; you could have an effect where you don’t pass anything back, so the result might be empty -- for example a generator where you don’t pass anything back.

HA: You can use the attribute event, I think?

AR: Right, that’s why we put these things in the binary format.

JR: On cont.resume, what exactly does $l refer to? What is the semantic meaning?

AR: That's where the instruction will jump to when the continuation yields. [it's the name of the catch block] yes, the handler.

LW: When it flows through to the next instruction, without the branch. Is that when you can destroy it? That consumes the stack?

AR: At that point the stack is done, yes.

AR: What this adds over usual coroutines is that you get typing. And it makes it composable. You can define different ones, that don't have to know about each other. You can compose continuations that use both without interfering. For example, if you want to do a scheduler, then the computations use generators inside. But you want to be able to switch to another light-weight thread, even inside the generators. So you can have composable control abstractions.
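A very rough shape of the generator idea, reconstructed from the slides’ notation (this predates any concrete proposal; the cont type and instruction names like cont.new and cont.resume are AR’s sketch, not spec):

```wasm
(event $yield (param i32))          ;; the "effect" a generator performs
(func $producer
  (i32.const 1) (throw $yield)      ;; suspend, passing 1 to the handler
  (i32.const 2) (throw $yield))     ;; suspend again, passing 2
(func $consumer
  (local $k cont)                   ;; a first-class continuation value
  (local.set $k (cont.new $producer))
  ;; run $producer; in the slides cont.resume also names the handler
  ;; label control jumps to whenever $producer yields. Resuming the
  ;; same continuation value a second time traps (one-shot).
  (cont.resume (local.get $k)))
```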

RT: If you have br_on_event, then you need to dynamically inspect the event. can you interfere by grabbing someone else's events?

AR: Good question, yes. You can swallow other events. Fair question whether that should be allowed or whether they should list events that they catch.

RT: It would be useful for more useful implementations, too.

AR: Yes. Same true for exceptions. One could argue against a catch all.

Back to slides

(on slide with example execution and long vertical blue arrows)

LH: here you don't use the label, right?

AR: Yes, you need it here.

LH: Really, why?

AR: Otherwise you can't resume...

LH: It seems like there's an implicit label here?

AR: explains

back to slides

AR: Not yet a proposal, but looking for comments now.

RT: Trying to represent people who know more about continuations. What about finally clauses? Dynamic unwind? Any plans?

AR: There are various things that these semantics/languages have -- I hope to get away with just yield for wasm.

RT: Possible to get away with it, maybe.

AR: For a low-level language, maybe the producer should handle that. Needs investigation.

JR: What happens if I run a continuation through a tee.

AR: You can reference it many times, resuming a second time traps dynamically. Will just be a flag in the representation on these.

JR: Can you check if it is run?

AR: That’s some kind of reflection, don’t think so, that might invite insanity. No obvious use case.

LH: What if you resume across a host function? That’s the hard case usually.

AR: Other systems switch to a different stack, can be expensive, there are various tricks, but not an expert. But good question!

LW: continuation would keep a stack alive, locals are mutable, you can create cycles. Does this require GC?

AR: Good question. I don't know, how would you pass back a live continuation to itself without consuming it...?

LW: As long as there is hope that this doesn’t require GC this is good.

HA: (Can you go back to the slide example with try?) In this case br_on_exn returns two values?

AR: Yes, yield exception has param int, if you use this exception that can be resumed, you get continuation as well.

HA: If you resume, it does not have any relationship with rethrowing.

AR: Rethrowing goes the other direction.

HA: At which point do you call continuation?

AR: explains

Break for lunch

Presenter: Ben Smith

BS: We did talk about this in December 2018. People had strong opinions about it then; I’m wondering if those have changed. There have been changes in the environment (the number of browsers has changed, there are non-web VMs). We’ve been word-smithing a lot how exactly to phrase the change. I don’t want to spend too much time on this, but I wanted to bring it up again. Any thoughts?

TS: Some of these discussions are happening, and ongoing, in TC39 - but now there are embedded engines that play a role in production, and to me it feels like a simple rewording doesn’t capture what’s at stake. What we want out of multiple implementations is that each implementation actually works in a feasible way. What’s most realistic here: for each proposal there should be a list of complexities, and of what kinds of implementations address them. Ex: Synchronous I/O

BS: I think Ben has been speaking to the same point. The wording could be a bit looser, which is making people a bit uncomfortable. …

DS: If we’re going to allow differences between proposals, declaring upfront when going from e.g. stage 0 to stage 1 what the criteria should be.

BS: Is there a benefit for having it upfront? DS: No. DG: Yes, at least people know what the bar is.

TS: There’s just value in having a three-word change here, and everybody knows what it means - proposals should have language on which kinds of hosts will implement a given proposal.

DS: maybe we should even flesh out why do we have a requirement in the first place. Like when we are talking about production web VMs we should probably explain what properties we have and why.

FM: There is a complementary requirement on the part of the engine. If an engine doesn’t do anything… We could have the notion of a reference engine. With a reference engine any proposal must be implementable on that reference engine. Conversely if you’re sponsoring a reference proposal, you have to implement it.

AK: My understanding of the TC39 wording, is that it comes somewhat from WHATWG/W3C. I wouldn’t want to require engines to implement things, it’s more demonstrating the implementation’s intentions(?).

BT: It will also be difficult to have one engine that implements everything. There could be a lot of diversity of proposals so it’s unrealistic to expect that one engine can implement everything.

EP: W3C has the conformance section for a reason, discouraging a lot of “MAY”-s. Basically we try to minimize the optionality. We used to have reference implementations, but we dropped that notion (people had to be bug-compatible with them). What we do encourage is having discrete chunks, more modularity: “here are the things you need to have, the modules you need to implement”.

BS: We have talked about certain features that people won’t need to be supporting -- simd is unreasonable to expect to be supported. Optional features have been discussed multiple times and is something we will want to have. What we need here is how we make this process more supportive of the entire WebAssembly universe, as opposed to just web VMs.

KM: Historically, it was based on WebVMs being able to veto everything. If we’re moving towards an optional feature word - instead of having a WebVM requirement - say these features will need to be implemented by WebVMs.

BS: It’s more like getting enough buy in.

KM: We haven’t gotten to a point where we have a contentious feature yet - like WebUSB for Wasm. You want to plan for the possibility of contention

TS: I want to say that at TC39, effectively we have a commitment to implement. Not blocking advancement to stage 3 means we plan to implement the feature, it falls out of the process. There’s value to this, it provides bounds. However not all features make sense for all engines (Ex: WASI). I don’t feel like we should block Stage 4 for WASI just because browsers don’t have it natively implemented yet.

EP: Is it possible to come up with a name for it? Like for the WASI ecosystem, just having a handle for that makes it easier for people to say "I want that" and differentiate.

BT: One thing is kind of what Till said: the dark side of allowing features to be implemented only in certain engines is that some features might only get implemented for a small niche of users. That is a potential downside; some design problems that need to be solved would not be solved for the general case.

BS: That would be the decision the group would make, somewhere early on we would say is this is a requirement - are you saying we may not do a good job of that?

BT: The scenario I have in mind: someone proposes something for a narrow domain, nobody really cares, so they rush through the process. Then later on, they realize, “if we have made this one small change it would be applicable to many more domains”. I want us to avoid over-specializing.

TL: We’ve identified a lot of risk factors that can show up in future proposals, and we’re looking for a mechanism to address them - but we’re not there yet. All we can do is enumerate the challenges; for the actual process change, the best we can do is allow an escape hatch so we have a reasonable default process, and then individual proposals can opt out of that default and we can evaluate them on a case-by-case basis.

EM: As someone developing a non-web VM, I’d prefer defining a set of properties for Web VMs versus properties we expect in a non-web VM, because this makes it more clear what engine implements what and makes figuring out compliance easier for everyone.

BS: There’s precedent: for the JS API, we do limit some things there - we could do something similar. How do people feel about taking this line out and having a line about establishing requirements for Phase 4? Will not do a poll now - I’ll file a design issue and we can discuss offline.

Presenters: Andreas Rossberg (Slides), Thomas Lively

AR: Summarizes

TL: Despite being a simple proposal, it’s been a real pain on the tools side. The fundamentals are there; we know it will work. It’s not landed, but we know it works, and I see no problem moving it forward.

RT: Can you give an insight to what the problems are?

TL: Implemented as an AST, so everything that flows out of it.

AR: It’s interesting, we don’t talk about tooling so far in Stage 4. Any concerns to move to stage 4 at this point?

DS: It’s a requirement in Stage 3(?)

AR: Oh in that case I take it back.
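For reference, the feature being polled is small; in standard text syntax a function (or block) can now have multiple results:

```wasm
(module
  ;; divmod returns quotient and remainder as two results
  (func (export "divmod") (param $a i32) (param $b i32) (result i32 i32)
    (i32.div_u (local.get $a) (local.get $b))
    (i32.rem_u (local.get $a) (local.get $b))))
```

Blocks and loops likewise gain parameters as well as multiple results.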

POLL:

Move Multi-value proposal to Phase 4

| SF | F | N | A | SA |
|----|---|---|---|----|
| 34 | 0 | 7 | 0 | 0  |

Multi-Value proposal moves to Phase 4.

Presenter: Andreas Rossberg (Slides)

AR: What do folks want to hear about?

CW: Nominal vs. Structural

FM: Are we planning to hear from RT today?

RT: Didn’t know this was going to be discussed, so no slides, but can wing it.

AK: Also vote for nominal vs structural

AR: We will need to talk about the other thing at some point in time as well, as that’s more a question of “where do we go from here?”. But it probably makes sense to get this out of the way first.

slides

BT: It’s not just efficient support, but programs will leak memory unless we do this.

AR: yes, we want efficiency and correctness.

Back to slides

AR: talking about avoiding requiring a specific toolchain

TL: Can you explain what you mean by specific toolchain? it doesn't seem to fit with the producer doing all the lowering.

AR: Maybe not the right wording. Different producers can coexist. But we don't want to have something essential outside the module that you need to be able to use GC. It's fine to build toolchains around that, but it should not be an essential requirement.

TL: I should be able to use a single module and have it use GC. There is not a toolchain convention that is more than a convention

Back to slides

FM: What about requirements coming from GC itself. What kind of GC, reference counts. What properties of the GC itself? Minimum jank, etc.

AR: Cost model. We might want to mention things that are not semantic, jank, etc.

FM: What we count as a minimal GC itself.

AR: GC is where you don’t have to free memory yourself

BT: I think the question is “can you just do this with reference counting”. The answer is “yes”, but you might still leak memory. It’s very hard to come up with a definition of a correct GC without leaking memory.

AR: Other than saying "we don't want manual release of memory", we can't say much else about it. Not aware of other languages doing that.

TL: As a matter of implementation practicality, we may have other designs that make things not possible -- incremental vs. full GC. Not semantic, but does matter.

LH: One thing you didn’t mention is how different host systems would impact the type system we are designing.

AR: What level of host interaction do we want here? It depends on the embedding, JS will be easy, … C-API will be ?? Do you have a suggestion?

LH: Don't have a suggestion here...

Back to slides

??: What's the motivation for encoding mutability?

AR: Subtyping. We need a sound type system, and subtyping is only sound when the fields involved are immutable. One place where this shows up is objects + v-tables: you need the v-table field to be immutable or subtyping doesn’t work out.
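The v-table point can be sketched using the GC proposal’s (then still in-flux) text syntax; the exact notation below is illustrative, not final:

```wat
;; Illustrative sketch, not final syntax: subtyping across struct types is
;; only sound for immutable fields; mutable fields must match invariantly.
(type $vt   (struct (field (ref func))))   ;; immutable v-table slot
(type $obj  (struct (field (ref $vt))      ;; immutable: subtyping OK
                    (field (mut i32))))    ;; mutable: invariant
(type $obj2 (struct (field (ref $vt))
                    (field (mut i32))
                    (field (mut i32))))    ;; width subtype of $obj
```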

Back to slides

FM: What about interior pointers?

AR: That is a good question, one of the reason for leaving nested aggregates out of MVP is so we don’t have interior pointers yet. Will be added in follow-up, but as separate type.

BT: Interior pointers is tricky. Even if you represent them as a fat pointer, not easy for concurrent access, I'd actually suggest that you have to provide an operator with a full path, you can only access the scalar parts of the struct in an array. Can discuss later.

AR: Could be races or something?

BT: You either have to organize the way the GC like Go does, or you have fat pointers, which are potentially racy and slightly scary

Back to slides

AR: talking about guaranteed unboxed scalars using pointer tagging

KM: how does the spec guarantee it's unboxed?

AR: We can't guarantee, but it's part of the cost model, that’s informal. We've talked about other types, that can reference a full 32-bits, but that might mean implicit allocations and branching on unboxing.

BT: You mentioned the value range -- v8 now has 30-bit SMIs, to deal with weak refs. Should it always be 30 bits, 31 bits, or engine-specific?

AR: Should be deterministic, pick something we know will be OK for most engines.

AK: No, v8 is still 31 bits.

AR: OK. More fine-tuning in the design, but...

??: Is the goal that we can tell whether it’s boxed or unboxed? Can we statically tell if a value is too large?

AR: Yes, scalar.new will trap... or it could forget the bits, that's maybe cheaper to do. It's more like assembly language that way.

RT: It makes sense that we have some guaranteed size up to which you’re unboxed. But why guarantee trap-or-forget-bits behavior? What’s the value in having it crash?

AR: If you don't then you have hidden allocation, and test+branch on projection. You have to check other reps. If you already know it's a scalar then you have to check whether it's boxed or unboxed...
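The tagged-scalar design under discussion roughly corresponds to the i31 reference type in the GC proposal; the sketch below uses that then-provisional syntax, with the trapping behavior AR describes as one of the options:

```wat
;; Provisional sketch: a 31-bit scalar stored unboxed via pointer tagging.
;; Per the open design question above, i31.new would either trap on values
;; wider than 31 bits or silently forget the high bit.
(func $tag (param $x i32) (result i31ref)
  (i31.new (local.get $x)))
(func $untag (param $r i31ref) (result i32)
  (i31.get_s (local.get $r)))   ;; sign-extending projection, no branch needed
```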

AK: I'd like to get to nominal vs. structural, perhaps we can move forward.

back to slides

break

Structural vs. Nominal typing

Presenter: Andreas Rossberg (Slides)

talking about java's “unsound” type system

RT: The reason why they do that is so they can start running before verifying it.

BT: Late bindings.

AR: Yes, lazy linking / late bindings.

AR: You either have to know the structure of the imports to make it sound, or do runtime checks later.

RT: Or you could eagerly have accesses available. This is because of class loading. The easy solution is to compile the classes when you load them.

AR: But that eliminates the modularity.

RT: In wasm i can compile wasm -> module, and another, and verify that they're linkable. But then you hit the same property.

AR: But you validate and compile before you link.

RT: Java does this to start running faster -- wasm has the same issue.

AR: But that’s not the WebAssembly model, in webassembly you compile and verify an individual module before you link & execute the code.

BT: instantiation never needs to generate code, it's all delegation.

RT: This happens because you didn't check that your links are valid.

AR: Need to be able to compile without knowing your imports. At least need to know the structure. Some basic constraints of what you're using from the outside.

FM: Equivalently you have to import the accessors.

AR: But that is what this is saying...

BT: The way Java linking works doesn't work with wasm modules. As soon as it succeeds it always succeeds, but it can fail multiple times.

RT: Does our nominal linking not work...?

BT: I'll give away the answer, you can't implement Java semantics using wasm's model. You need runtime checks...

RT: I agree, that wasn't the plan.

AR: Do you agree you need structural constraints?

RT: Yes, you need to have structure on the schemes.

AR: OK, but then it's structural typing right?

BT: Java is looser... you can have looser typing.

RT: And that's important.

TL: Considering Java here is making it more confusing... what's the best we can do with nominal and structural.

back to slides

export inversion, naming imports

FM: That's a bug in the way wasm works!

AR: What alternative are you thinking of?

FM: Why should I know where the variable is coming from, why not just provide it?

AR: Ah, I'll get to that ... the importing module itself is determining where it is coming from.

back to slides

AR talks about type sharing by centralisation

SC: To a certain extent, the linker already understands the type system, it needs to dedupe the functions.

AR: It wouldn't have to do that, though it's an optimization. For other types it becomes more involved...

BT: This example uses arrays, does that mean only arrays need to be structurally typed.

AR: Yes, good question. Spoiler: I think you want both structural and nominal types.

RT: So what language did you compile from?

AR: Does it matter?

RT: You say you're not trying to solve magical language interop.

AR: Does it matter?

RT: You wouldn't be linking two things that have unrelated byte arrays ... so why would you try to link this.

RT: The reason I brought this up is that for interface types, we need them to be structural.

AR: But they are desugared into raw WebAssembly.

RT: They have adaptors though.

AR: This is basically pushing everything into the toolchain, so you can use them without complicated infrastructure around it. There are cases where I want to use a primitive wasm module, independent of language.

RT: Your byte array module, won't be able to interact with any major language, ...

FM: There are some types you've subconsciously made structural -- i32, it's a bitstring. Some types we've decided that are so important that they need to be the same anywhere.

AR: No structure in i32, it’s an abstract type. Abstract types can only be nominal...

back to slides

FM: Why is the number of imports not bounded on the number of types used in module?

AR: Because to define the constraints on some of the types, you need the other types. You have to describe the type of that import, and that is this Dvt type, and when you import it you have to describe the structure, and that may use others...

FM: That's still implicitly bound on the types used in the module. For any given object, it has a finite set of...

AR: It's a transitive closure of the entire dependency graph in the worst case.

JR: it's so bad because we're trying to specify subtyping in a nominal system...

AR: It's not specific to subtyping, it's just that you need to keep accumulating...

JR: How do you avoid this with structural types?

AR: The problem is that you can never drop imports on any level. Exports you can forget or abstract. It has a much tighter coupling between modules.

RT: Why can't I just import D?

AR: But then you can't use the structure?

RT: You're not using the structure, though?

CW: I don't see why this is better with structural typing?

AR: Because you can forget the structures that you don't care about, you can omit the parts that have been abstracted over.

BT: IIUC, you only need to specify the types that you actually use. With nominal imports you need to propagate the entire type chain.

RT: You have fewer imports -- but you still haven't saved type space. You haven't saved that?

AR: Not sure ... Inheritance sort of the worst case, is tricky since it needs the structure. But if you were only passing the object around, then you wouldn't need to know anything. If you want to set a value on the instance, you don't necessarily need the vtable type.

BT: I don't think this is covered -- if you import a type A, and you have no constraint. And then you want a compatible struct with an extra field, how do you do that?

AR: Can't do that. So you need the full structure. If you want to be able to produce something, you need to know what you are producing.
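The trade-off BT and AR are discussing can be made concrete with a hypothetical type-import sketch; the syntax here is invented for illustration and was not part of any standardized proposal:

```wat
;; Invented syntax, for illustration only.
;; Importing a type abstractly: instances can be passed around,
;; but the module cannot allocate or extend them.
(import "env" "A" (type $A))

;; Importing with a full structural constraint: the complete layout must be
;; stated, but the module can now produce and initialize instances itself.
;; (import "env" "A" (type $A (eq (struct (field i32) (field (mut f64))))))
```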

RT: Here's a use case ... suppose two modules are compiled with the same language. Languages come with runtimes. You want to share runtimes, so someone will build the standard java runtime, dynamically link to this module...

AR: But then you create a dependency...

RT: But then you have things that you don't want to know about. Different ways of doing locking, objects, etc. Structural approach, everyone who compiles down, they need to specify the structure. If they change anything, then everything is invalidated. Whereas if you have nominal, then the maintainer of the runtime doesn't have to do anything, and it just works.

BT: Not enough... imagine a new type with fields, how do you initialize fields that you don't know about?

RT: Agree, it's in our proposal. But agreeing on structure is not a solution.

AR: It depends on how they want to compile. If you want to reduce coupling ... it's kind of annoying. If you want to avoid it, you can abstract. It's somewhat orthogonal that the producer has to decide. They could compile to some class that doesn't need to know structure.

BT: I did some thinking about extending structures where you don't know the fields... you need another indirection. We have to have an explicit mechanism, but we need to have a way to initialize the fields you don't know about.

AR: The abstraction would have to do some indirection [BT: literally is late binding] yes.

AR: You should do it in userspace though. Inheritance is most complicated -- but for simple structures, you shouldn't have to do this.

RT: What's the indirection you're referring to.

BT: So suppose you have class A in some Java library with two fields. A user library imports the type with one field. Now when you compile the user module offline, it has an extra indirection: it doesn’t know where the field is, so there’s a table of numbers that you have to look up through, another load.

RT: Another way to do it is talking to other communities: two other ways

... no indirection..

BT: We did that in v8, we tried to go this way, but we found that it was too hard, we had to patch machine code...

RT: It's not patching machine code. You don't regenerate the code, you link with a module, ...

AR: That's an indirection.

BT: Then you have a performance cliff at linking mismatch... we found that people are feeding v8 200MB modules, we want it to be cached, no way to modify it. There's no slow path with imports that cause you to generate new machine code. Modules in the limit can be essentially huge.

RT: I'd like to find a way to make sharing runtimes possible... faster load times.

BT: Agree with this goal. I think we need to have an abstract import, when you know a little about the type, then it gets compiled with another load + indirection. It might be a fair solution. I don't see another way to do that.

AR: Fundamentally you have to either use indirection or patching. And then the question is what level -- primitive in the engine, or something in userspace.

RT: Your proposal is that anyone who wants to do this needs to manually encode a layer of indirection, that you handle at import time...?

AR: Depends on what you're trying to do. If you're trying to do inheritance, then it's up to the producer to decide how much coupling they want. We don't want to make that decision. I just don't want this to be a decision on our end.

JR: I think this is a bigger issue -- existential imports, not structural vs. nominal.

RT: The link is that nominal you have to work around it, but structural you just bake in the structure into your module. You don't need to import it anymore.

AR: That's not structural vs. nominal, that's how much abstraction that you have. If you want to abstractions, you need nominal. Don't disagree that you should have that, but is that the only use?

TL: Does your proposal allow -- if I want to say the entire contents of the type, including private fields?

RT: Yeah, we were thinking about doing this. You can do import exhaustive to get that functionality.

TL: it seems like for code generation that may be an important property.

RT: At some point I wanted to have this discussion, about import linking. What kind of linking do we want to do, and how do we evaluate this.

TL: If this is functionality that can be added to your proposal in a straightforward manner...

RT: Another thing we tried doing ... you mention that you have a canonical rep for a type and otherwise. So similarly with schemes you can ask for a canonical scheme, and you get this functionality for free. The reason why we didn't do that, is when you have imported nominal types, it makes things more complex.

AR: But you have to do that anyway. At link time you need to check the structure.

RT: You can't ...

AR: We talked about newtype, here you have structural types, something that's declared nominal. You have structural and nominal types, you need both.

RT: We agree that interface types should be structural, and other schemes for nominal...

BT: import an abstract type, is there a way to specify a nominal constraint, can you specify that they're related? Even if you don't have structure?

AR: Yes, subtyping bounds on types, but just one.

BT: More than one bound right?

RT: What's the use case in mind?

BT: Structural types ... you may want more than one path through the graph.

more design discussion

??: There are still subtypes, if something is an int32 any, and another thing is int32 any any.

AR: One thing about structural subtyping is that you can specify the minimum part you care about. The producer can compute the GLB in this case...

BT: Why would you want to do that? If you want to know A is compatible with B and A is compatible with C, but not.... you only have the parts of the subtyping relationship that you need in the module.

AR: I have a hard time seeing why you’d want this. It sounds like you’re going toward union and intersection types.

RT: You don’t want to go into union types for structural typing because that’s PSPACE-complete.

AR: Yes, I don't think we need this in MVP

JR: Compelling case for structural subtyping that isn't the linking problem? Is this all about hacking the import system.

AR: This gets back to modularity -- you should be able to merge modules. You can't do that with nominal and structural separation. You want these to be the same anywhere.

JR: When do we need to worry about width subtyping ...

AR: That's when you want to have a function that only has a prefix where you only care about certain fields. If you're compiling a single inheritance OO language, then it probably doesn't come up. With other languages, a compiler may be able to say that it doesn't care about the type, so it should be able to make a better choice.

RT: A fundamental question: for modules you want abstraction. When you share data, the abstractions are lost. So the question, how do you share data without exposing everything about yourself. With type parameterization, that requires a different compilation model. But another model where you can compile it independently.

JR: You want to be able to validate independently... so if we want to handle this issue with extra information that you don't want to pass along, then you need existential types. But that question seems to be not a question of nominal -- but it's more like a new feature to help with linking.

BT: There's a case to be made for structural subtyping for languages that have it built-in themselves, like Go.

AR: That would not map directly.

RT: Not many languages that have the same behavior for structural subtyping that matches low-level structural subtyping. They're different.

TL: It sounds like we’ve established what Andreas mentioned here: we need both. Nominal types for abstractions, and structural types in the short term so we can get stuff implemented with wasm’s linking model, where you need to know the complete structure. Given that constraint, we want to add to Ross’s proposal the ability to specify the complete hidden structure. At that point Ross’s proposal sounds kind of like Andreas’s proposal with extra scheme information. These aren’t really competing proposals, then; it’s an incremental roadmap. Is there a way to establish this roadmap, where we start with structural, and then in the future, to enable further abstraction opportunities, we add in some of the extra information from Ross’s proposal?

RT: Yes, so you could add scheme.canon to get a canonical scheme. Then for imports, you want to have a way to support exhaustive imports. Then down the line you can remove that restriction later. Or later maybe we can defer that decision.

AR: I'd like to know the difference between schemes and types ...

AR: Fundamentally, what we converge on is that there are two ways to define a type: structural and nominal. That’s the conclusion I wanted to arrive at. Languages have both, and you need to lower them; cross-lowering is not practical, so we need to find a way to lower them into the same system.

Some other things we've talked about. What is the benefit of nominal types...?

TL: Maybe table this for now.

break

Debugging in non-web embedding - Limitations of source maps

Presenter: Erik McClure (Slides)

DS: What we have right now is a subgroup to discuss the various debugging topics. One of these is the types. All of these things need LLVM to change; it’s a bit unclear.

EM: The only thing we need to change is in practice when all this falls apart.

PP: Maybe we should have a look at this offline, but this (OP_AT…) works for me just fine. But I work on the browser side.

EM: Then we need to teach LLVM how to use this.

PP: In chrome we can use this, with Dwarf.

DS: One thing the LLVM community found is that they could generate code that was working in LLDB, but it wasn’t working in GDB, not because the DWARF was invalid but, GDB simply couldn’t interpret them correctly.

EM: This is an issue we are working on. They don’t want to use anything other than Visual Studio. If GDB is not behaving, we need to try to convince them to behave.

TS: Are you aware of the debugging support in WasmTime.

EM: No.

TS: We have full support for debugging in WasmTime and LLDB. It doesn’t work in GDB, but there are no fundamental blockers.

EM: But how are you using pointers?

YD: Same workaround.

DS: …

EM: Question is whether we want to fix this, or just rely on this workaround.

PP: The DWARF encoding of pointers is correct in relation to the WASM virtual machine. But it breaks outside of it. I wouldn’t expect this to be fixed inside DWARF, but in the context.

PP: We want to settle on who the DWARF is for.

DS: There’s WASM in and Native Code out. There’s no one correct way to represent what your code did. Out of necessity the debug information will depend on the particular ABI. There’s no standard way to do this.

EM: There are minimal changes we could do. We could fix this in LLVM and not make every debugger implement support for this.

End of Day 1

Presenter: Thomas Lively (Slides)

JR: We can say we don't standardize these, but they will be standardized right?

TL: Yeah, they're de-facto standardized, but the core spec is agnostic about it. What we have here is a mechanism to allow turning things on and off. What is exactly turned on and off, because we can punt to tools, it's less work to do.

JR: Follow up Q: If the embedder does not know what is going on, why make it a standard at all?

TL: We do need the mechanism, without conditional sections in the binary, there's no way of -- for example, running a module that has a SIMD engine in an engine that doesn't support SIMD.

CW: In practice, browsers are going to have to converge on features they expose through strings.

TL: The idea: not browsers. Continue to do feature detection using tiny test modules. Emscripten or wasm-pack will insert that feature detection, using the same mechanism, and decide which feature strings to pass.
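The "tiny test module" technique TL refers to can be as small as a single function whose body uses the feature; for SIMD, for example, a module that only validates in SIMD-capable engines (engines without SIMD reject it at validation time):

```wat
;; Validates only where the SIMD proposal is implemented: v128 and
;; v128.const are unknown to engines without SIMD support.
(module
  (func (result v128)
    (v128.const i32x4 0 0 0 0)))
```

A toolchain feeds the binary encoding of such a module to the engine's validator (e.g. `WebAssembly.validate` on the web) and uses the result to pick which build, or which feature strings, to use.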

CW: Do you think that's going to stay the case for a long time. One of the original motivations was so you didn't have to ship these modules.

TL: Yes, it’s a trade off. For something like SIMD, just a few functions, I want to ship it to all. For other proposals that change a lot, you’d just double the size of your module. There you probably still want to build twice and choose which to download.

FM: For the names of the feature detection, each proposal could provide a name for the proposal. There's also overlap on this proposal and luke's proposal and module linking. And also, why just one section that is wrapped? You may have multiple things dependent on a particular feature.

TL: Yes, a conditional section contains only one other section - but because you can have arbitrarily large number of sections, you can also have conditional segments in there

AR: The answer is that we have the ordering constraints, so multiple sections of different types in a conditional section would violate that. Unless we loosen that ordering requirement, but that breaks streaming property.

CW: If there's a thought that you could have an advisory role in having strings for features. Does it make sense to carve out a prefix for strings that represent features?

TL: Good idea

RT: These are basically #ifdefs? [TL: yes] So you basically have to validate only the parts that you care about. [TL: yes]

JB: This is really just -- you want different files, so you want to share bytes between them? Is it worth it? Do you want an out-of-bound mechanism instead?

TL: We have to figure that out, would love to have SIMD users use this code, and just work

JB: But you need a lot of help from tooling -- can the tooling just do this? Dynamically paste things together in the tooling?

TL: Good point, this is easily polyfillable; you don’t need the engine to do the work for you. But parsing the bytes individually would make the module uncacheable, which is a downside.

NF: You have to load the polyfill too -- that's a lot of code to load on your webpage.

DG: Outside the web, we don't have JS to do this. So it would be nice in WASI to have a standard set of feature names for this. Should this be standardized by WASI or wasm?

TL: I would be happy for WASI to specify the feature names, and only keep the mechanism under the core spec.

DG: It feels awkward to say it’s a core spec issue, but we have to have WASI solve it differently and we have JS in the browser to resolve this

TL: Possible, yes. But we could do it in the core spec.

NF: Seems valuable to have users to give names, but also have standard names for features.

Flaki: The mechanism is stateless, so there is no way to do fallbacks. Do engines have to understand fallbacks, how does the engine support this if they don't have conditional sections?

TL: You could come up with a design that’s MVP compatible, but that design would have the downside that you have meaningful custom sections that just have metadata, and the design would be much more complicated - so we decided against that

Flaki: You expect that engines know about conditional sections, even if they don't do anything with them?

TL: I expect engines to be aware of them. Engine can’t pass feature strings, and anything that gets passed that way would get ignored.

AR: Probably won't produce a valid module though...

TL: well, tooling could ensure that, the MVP-compat variant

RT: What are the pros/cons? The engine side seems straightforward... what about the tooling side? Generally optimizers don’t have to deal with #ifdefs; macros in general cause problems for tooling.

TL: Not quite as bad as text macros -- can't replace function bodies... it's a complication for the tools, but I'd be happy to code it. :-)

EM: For the WASI thing -- a lot of people want to run wasm binaries as executables. "Why can't we download five files"? Users don't want to do this. They want to be able to do "hello world"

TL: How does that relate to conditional sections?

EM: This is why we need it. They're going to be confused about that.

RT: What about the challenge of compiling #ifdefs down to conditional sections.

TL: Not doing it -- interesting experiment but I'm not going to do it.

JR : Idea of parts of the spec that are not mandatory - this provides a way to parse optional features, but we may also have to standardize strings

AR: Once we start standardizing subsets that we can take out, that would come with a string associated with it. In the appendix of the core spec.

TL: That’s good feedback

BT: Core spec is already organized around proposals, so it's natural to do it this way.

TL: Is it organized around proposals?

DG: Depends on the proposal

NAS: If the engine didn't support conditional sections, how would it work?

TL: If it didn’t support conditional sections at all?

NAS: What if it ignored conditional sections?

TL: If you mean it understands conditional sections, but doesn't use feature strings -- that means that the tooling would put in a predicate that has no feature strings, it would activate that predicate and it would work. By having no feature strings, you'd activate the default features.

LW: This works with SIMD, but don’t see how this works with threads, interface types etc.

TL: Now that you can use atomics on unshared memory, threads is no problem.

BT: Can you explain the default?

TL: Not necessarily a default -- but the tooling could emit a default section with no feature flags. We have an example in the explainer on github.

RT: If you have multiple conditions satisfied...? [TL: probably broken module] Do you have a prioritization scheme?

TL: Yes, clang has this prioritization. I'd like to leave it out of the core spec, encode that prioritization in the predicates.

LW: Hard to imagine a program that takes away threads and is fine...

TL: Localized performance features -- tail call maybe, non-trapping, [AR: bulk memory] Big ones like reference types, GC, may be harder.

LW: Small things may just be implemented by most engines, so might make this unnecessary.

BT: Bikeshedding comment - if you have a switch and you have to choose one of them you have to put in 4 formulas, and you can only choose one - wondering if it’s just simpler to have a switch?

TL: Possibly. We should discuss further.

Back to slides

TS: I have a further question on the meta-topic of how having these sections would play out in practice. I think this ties in to phase 4 requirements with engines. If we are not careful we’ll easily end up where an engine says “we can implement a custom section with a given name”, and then that’s CSS prefixes all over again. It wasn’t good there and it isn’t good here. If we use something like this, it would have to come with strong coupling to norms, such that an engine wouldn’t ship without consensus of the group; otherwise web VMs end up saying “we are shipping this” while other engines are not on board.

TL: Happy to consider how we can consider these norms in a process document.

JB: You can merge modules -- how do you merge modules that have different conditions, do you get an explosion of sections in the resulting thing?

TL: Talked with AR about this design - his primary motivation is that we should be able to merge modules together.

AR: That's one of the reasons we don't have a switch, a tool doesn't have to understand these things.

JB: But if the first module has a different number of features -- then you need to renumber. You need to do different things to renumber. Seems tricky.

TL: Yep, definitely. Merging is easy, but you need to redo numbering. How that renumbering would work is important.

AR: It can only work without understanding the program if the number of things you create is the same. (Or the tool has to explore all used configurations.)

BT: Could you make that a requirement of the switch that it has to have the same number of items for all the cases?

AR: What if we don't want that

TL: They're skipped entirely -- we don't want engines to reason about how many things are in there.

BT: If you had a switch, they could check that they're all the same.

AR: The point is that you don’t want to enforce same number of items, that will be a strong requirement that breaks use cases (ex: auxiliary functions)

TL: I think we should make sure that we're explicitly writing down use-cases, including merging requirements. So we should write down why that's important to have.

RT: If this is mostly about localized things -- can we have an expression instead? A macro expression instead of a compiled expression.

TL: You have to execute a code snippet to figure out whether it’s deactivated?

RT: You have #if and then some condition, then you can run inside a normal program. It can be precompiled. This side is always true -- don't even parse the false side.

TL: Are you suggesting a completely different design, where this is handled at compile time?

RT: Yeah, it could be an #if expression instead.

TL: Not sure how to expose at the source language level.

AR: We made it coarse-grained because we want the work done at decode time. For example, if you have types that the engine doesn’t understand, then this wouldn’t work.

RT: You need different functions -- do you need to have different types in your functions, and not in the body level?

TL: Do you have use cases that would require...

AR: Seems natural that this will come up quickly.

TL: Good question, what we’ve designed is very general, and powerful - don’t intend to expose the full power of that in the C lang. It’s a fair question

WVO: I can see how this is useful in the short term. But in the long term, given that SIMD is the #1 use case, are there many platforms that can’t support or lower it? That seems not super likely, compared against the complexity. Are we solving a temporary problem?

TL: Hard to say, anyone that has an embedded engine?

BT: There are such engines yes.

DG: If you want to have a deterministic engine, you want to do this.

BT: I like the generality of this approach; we shouldn’t overemphasize the SIMD use case (think of threads, for example). Don’t over-key on SIMD, because this is more broadly applicable.

TL: The observation that this doesn't work for GC is valid. It would be helpful if we added a doc that lays out every use case we can think of.

KM: On the web -- a lot of features can be polyfilled through JS, maybe not fast, but will work... so I can see using this on older browsers, to support your business needs. For SIMD use case, when you lower, you may generate worse code. So that's bad for... that's one of the reasons we pushed back on SIMD. We should not pay that cost for places that don't support it.

TL: Another use -- CT-wasm, you could call out to JS for constant-time stuff.

end of discussion

Presenter: Deepti Gandluri (Slides)

AR: how bad is the non-det in this proposal?

DG: In this proposal we’ve tried to minimize non-determinism. We’ve traded off for non-optimal semantics: on x64 the codegen isn’t nice, because the hardware’s NaN propagation is not IEEE 754-compliant.

AR: just a handful of instructions?

DanG: no fma?

DG: No FMA; everything non-deterministic we try to kick out. We try to make sure the MVP is as consistent with wasm as possible, and be mindful of engines that don't support SIMD -- having the same semantics when scalarized is important. Different engines generating different results would be surprising; we want to (at a minimum) make sure scalar and vector do the same thing.
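A sketch of the scalarization property described above, in illustrative JS rather than engine code: an engine without SIMD support can lower `i32x4.add` lane by lane, and the requirement is that each lane behaves exactly like scalar `i32.add` (wrapping modulo 2^32).

```javascript
// i32.add semantics: wrap to 32 bits (>>> 0 yields the unsigned value).
function i32Add(a, b) {
  return (a + b) >>> 0;
}

// A scalarized i32x4.add is just i32.add applied to each of the 4 lanes,
// so scalar and vector execution observably agree.
function i32x4Add(xs, ys) {
  return xs.map((x, i) => i32Add(x, ys[i]));
}
```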

AR: Are you saying it's fully deterministic, what's in there?

DG: We've tried to be diligent about that -- if not, we'll consider it again.

KM: were all those instructions added between last cg and now?

DG: Yes, added between meetings. These were all requested by applications. If you had to combine instructions it would be much slower -- this was from application feedback.

KM: isn't this something engine can do in the optimizations?

DG: v8 couldn't really do this -- we may be able to find ways to optimize this, but the engine still has to be able to generate one instruction. We haven't done it yet.

KM: do we have benchmarks that show using them works well across all platforms?

DG: It's been increasingly challenging to benchmark instructions -- there was general consensus for the original instructions. None of them are weird; they map to one instruction.

KM: Mapping to 1 instruction is fine; when you have a dozen instructions and they cause a stall, then that's not great.

DG: We've been judicious about merging instructions. If your engine doesn't support SSE4.1, a few may map to two instructions -- that's not to do with the instructions themselves.

KM: A high-level comment towards proposals: I didn't notice this was being polled to go to phase 3. It's hard for someone to make a decision on this without more notice. Future polls should have a lead time of maybe a week -- make a note in the meeting agenda?

DG: We could do provisional poll for now -- but yes, happy to do that in the future. Apologies.

KM: Yeah, it's happened for other polls too -- just hard to follow along for implementers when there's less lead time.

DG: Understood, great feedback. Going forward we'll try to document that and make sure implementers have enough time. Let's move on to the poll.

POLL

Move SIMD proposal to phase 3

SF F N A SA
19 18 4 0 0

SIMD moves to phase 3.

back to slides

BT: I have a question about future evolution. SSE has evolved by drips, adding features here and there. What about wasm -- new instructions here and there, added to simd128 one by one. Will that happen?

DG: Hope not -- adding things to a standard is somewhat cumbersome. We try to get a broad set of ops into the MVP, with an optional extension set that comes with fast SIMD. Going forward it might be problematic as hardware adds new instructions. Long SIMD should be more portable: no specific vector length, the engine decides which instructions to use, not tied to instructions in hardware. It raises an interesting perf question -- what we're thinking of is a perf guarantee, X% better, but not specifying the exact ops that engines will use.

break

Presenter: Andreas Rossberg (Slides)

??: One option is to have a more readable text format, or to have a way to convert anything from binary to text, and text to binary, but not readable.

AR: the custom annotation allows you to do this. Roundtripping from binary -> text -> binary. The concerns about text -> binary -> text -- you lose something. You might not know how to convert it.

??: If you dump text in a binary format, it won't be readable, but you can still dump it into the binary format.

AR: If you originally convert it to text in a binary custom section. But yes, you're right.

RT: Could you make it so that the custom binary section stays where it is -- when you turn the binary into text format, add these annotations to the AST, and then get a round trip?

AR: two things here, 1: concerns are about binary format, they would require changes to custom sections

RT: You could make a custom format -- modify the AST to add things here. You wouldn't have to change anything.

AR: You could do that; it would mean introducing another custom section to dump all the annotations there textually and reconstruct them. Not sure this would satisfy the… syntactic context is lost when going from text to binary: formatting, comments, sugaring. You cannot accurately place the annotations back.

RT: It does seem text and binary formats that aren't compatible with each other.

AR: All these things can be encoded; there's no automated/standardized way of doing that, and no clear use case for it. Usually when you define an annotation format, it corresponds to some custom section format you already defined, and you want to map there. The idea that you can do it generically fundamentally doesn't work, since you don't understand the semantics of the custom section -- leaving it in a certain place doesn't mean it's correct. We generally don't know how to solve this problem; it's out of scope of the simple thing we are trying to do, and there are no concrete suggestions. In practice, text-binary-text is not that relevant; binary-text-binary is the round-trip direction we care about.

AR: This proposal finally allows you to do that.

back to slides

NAS: This is my first exposure to this -- I'm worried about the asymmetry of transformations. It sounds like there's an implicit requirement that wasn't stated: we should have a text -> binary -> text transformation.

AR: Not sure what you're asking...

NAS: Seems odd that we have ability to go from text -> binary, but not go...

AR: You can go both ways, but you need a tool that understands. The idea is that tools can define custom sections and define a text representation; you can go both ways if the tools know. Tools that don't know the custom sections cannot go both ways -- the entire world doesn't have to understand everything.

NAS: makes sense if @custom is the only one.

AR: It's a generic fallback for the case where a tool doesn't understand. You lose info when you go from text to binary -- the syntax is gone, so you don't know where to put the @custom. It's much more difficult and more brittle; not sure it's worth the effort.

JG: There's an interesting question about what you do when you drop a custom section that you don't understand, already the case in tools -- using interface types as an example, if you had a tool that didn't understand the annotations, you would drop them on the floor. So you may say that if it is semantically relevant, you have to say that it's standardized. If your tool supports interface types, you'll have to understand the text format too -- you can't get around that.

AR: Agreed. The idea is that the tools/conventions/layers of standards that define custom annotations also define text formats, if they care about that. If you introduce a custom section, you should standardize/define it.

POLL

Move to Phase 2

SF F N A SA
10 19 12 0 0

Advanced to phase 2.

Presenter: Francis McCabe (Slides)

AR: can you say what you mean by not being an IDL?

FM: Imagine someone said the wasm type system sucks, so you want a better system, and you invent a new type system for wasm. You have C code -- how do you embed C into this system? What happens informally is that the boundary between C and the module becomes a kind of interface layer, where you map between the richer wasm types and what you really have. If you try to formalize that, you get something like interface types. You need a richer type system because API designers need the richer system.

back to slides

AR: structs?

FM: it is in there, but not listed in the slides

back to slides

slides describing new interface types

RT: are those mutable or immutable?

FM: all immutable

back to slides

KM: for copying the strings, it seems that ... in order to copy it into their memory, how do you prevent writing into that module anywhere it wants?

FM: You can't. If you call "getChild", as part of the fn call, I will ensure that any data for the fn call is transmitted.

KM: what do you mean by after the function call? After it returns?

FM: If I'm doing getElementById, after I have made the fn call, before it even returns, the id itself could be anything. The string would have to be copied to

LW: The default semantics is copy, if you can prove that it's safe you can optimize it.

KM: is there a guarantee that there is already memory

FM: You need multiple memories to model where one wasm module calling another, that's where you need it. When you're calling a host API, it's the host doing the lowering, you don't see how the host is doing it.

KM: On the other end, some of these things seem like, for a lot of web APIs, you won't be blocked (most of the blockers are for inter-wasm).

FM: We haven't decided what the Minimum Awesome Product is yet...

KM: A version that allows you to interop with the DOM, but doesn't allow inter-wasm interop -- as a feature this seems pretty important, for a lot of wasm apps to interact with the DOM in a meaningful way.

FM: It's a P0, it's a high priority scenario. Accessing the DOM.

KM: If we have to wait on other proposals that have possibly long lead times, it is worthwhile to not tie this proposal to those blockers.

FM: We're not relying on anything that's well underway.

JG: A subtle note on multiple memories: for static shared-nothing linking, you resolve at build time and ship 1 module instead of shipping multiple, which requires multiple memories within a single wasm module. If the adapter function is fully transparent, the browser has to generate mem.copy, and the engine needs to implement the multiple-memories concept. For the MVP we don't need to block on multiple memories.

FM: To add one thing to that... we can be useful before everything is ready. But there are some things in flight which we need, whose full need other people have not seen yet. For example func.bind -- you need that for callbacks. It acts as a forcing function for other proposals.

KM: Of the proposals you list, multiple memories seems the furthest from implementation. I thought func.bind was part of typed function refs (?), but it is not.

AK: To restate your point -- it seems you think DOM interaction is important use case, so reducing dependencies for that use case is more important.

KM: If you can split out the things that allow this to proceed in a way that is useful for interacting with the DOM, it is a worthwhile fork.

AK: Useful feedback thanks.

FM: We have not drawn the line on the minimum awesome product; we haven't decided what is in or out. We want to get enough of the design out to see how the whole picture will look.

JG: An interesting subtlety: when you have a lifting operator, you don't know what the other side looks like. That means you don't see the lowering that corresponds; only the engine can see it. So the engine can defer making a copy of the string because it's read-once. It doesn't have to follow other rules. If you're going linear memory -> GC language, that "just works". If you're going linear memory string -> GC string, that's not a memory copy, but it should still work.

AR: I like the whole thing with interface types; there's still skepticism about the adapter thing, due to feature creep -- there's an infinite number of combinations you can use inside wasm, and I foresee a lot of pressure to keep adding stuff to the adapter layer. Do you have clearly defined criteria for what to include and where to draw the line?

FM: This is a real issue -- I'm less worried about feature creep than you seem to be. Bottom line is that the data values you're working with are constrained by crossing an ownership boundary. It puts an upper-bound on features you're going to have. We have tried some technical ways of doing this -- no recursion, no loops. We need loops though. No need for recursion yet, but we need chains of interface type calls. The language of types does not permit recursive types. We're trying to limit the data language that limits the desire to feature creep.

AR: I'm not so worried about that, but about the primitives that correspond to whatever primitive thing a language runtime uses for its data types.

JG: The last time this came up, the question of whether we should support UTF-16 strings came up. It should be straightforward to support. What we wound up doing: you must have a pointer+length UTF-8 string. Our heuristic is "types of encodable wasm stuff", so there should be two strings.

AR: Yes: how many string formats should be included? There are many ways you can represent strings in wasm memory.

JG: Right, there's many ways to do strings in GC too. You can convert from utf16 to utf8 today, but you can't convert from utf8 to gc currently.

BT: I didn't see exactly how you solve the allocation problem. The picture is that a module can provide some of the lowering code itself? For example, a module exports a function that accepts an IT string, so the module has to do the work of copying the IT string out to memory.

JG: Provided by the module, probably generated by the toolchain -- depends on how the toolchain works.

BT: Next question: is there a set of ops whereby you can access the characters of an IT string individually?

FM: No, at least not at the moment.

JG: In principle you could with exports -- you can call some subset of functions.

LW: When you're receiving the string, you can provide a malloc function, the engine does the copy to the malloc'd memory. There's cooperation between core wasm, and the adapter layer.

BT: Probably worthwhile to take a look at the details offline -- are there examples?

LW: Yep, walkthrough and explainer have more info.

JB: I see a type system; yesterday there was a type system in the GC proposal. Is IT needed when you have GC types?

JG: The simplest reason is -- how would you have C talk to GC? At the interface boundary you need to have a lower-level ABI or convert to GC at the boundary. Interface types allows you to defer that decision. Another advantage: if you're going 1-1, even if you don't know that statically, at runtime you can effectively annihilate the adapters so they are as cheap as possible and you don't have to re-encode. If you said "use GC all the time", then Rust and C++ would have to do that too -- so that makes it harder for those languages.

FM: A more fundamental aspect: neither core wasm nor GC types include the notion of strings -- that's not by accident. On the other hand, if you're expressing an API, you need strings.

AR: It is a more high level type system... not just strings, even signed/unsigned.

LW: the goal of IT is to talk about high level types

RT: string is very different from the rest of the types…

FM: not really

RT: There's no agreement on strings: UTF-8, UTF-16, null-terminated. The various other types have known, established combinators for constructing and deconstructing them. There's a fundamental difference in how they’re set up.

FM: At the level of tuples, you might agree that a tuple is a heterogeneous cross-product, in my language.

RT: There's no agreement on how the data is laid out, just known combinators. A thing that GC does ... you agree for these types to copy from one tuple format to another. GC won't solve that problem. If you have adapters working, then you can reduce this to tuples of lists etc., and you can make it so there is just one copy. There are known ways of copying these, so it does have value beyond "just using GC".

JG: Another tagline: the security model of IPC, the cost model of a regular function call.

RT: A regular function call, but you don't agree on the data representation, so you have to figure that out later.

AR: In some cases this might actually be serialization -- the language runtime has to do something so you can actually lift. The worst case is that you must copy 3 times: serialize + copy + deserialize. The worst case seems worse than if you had just done serialization in the first place.

FM: The serialization scenario isn't what I've been talking about. Where you deliberately put a network boundary -- for example. RPC. The constraints on the operators are such that they should be when you put them together, you don't need to serialize. If you don't have the adapter, then you're copying an arbitrary number of times anyway.

JG: More so in the case when you go from UTF-16 in one module to UTF-16 in another module, but in between there is UTF-8 -- that's the worst-case scenario. In principle the constraints on the boundary aren't that bad; you shouldn't need to change code/codegen.

TL: A question about how the annihilation works: do you have a system for proving that it is going to be efficient, or possible?

FM: The two-part answer is: we have not written down a set of transformation rules; we're going by examples so far. The examples so far are linear transforms, inductive on the structure. We haven't yet done the necessary work of writing down those transformation rules.

LW: It is a goal that it is always annihilatable. The static use-def property says that this should be true.

TL: you're enforcing a simple data flow

FM: Glue code in general has a tendency to dominate; we're trying to prevent that. E.g. in the UTF-8/UTF-16 scenario, I haven't made up my mind. If we had decided on one, some work would be left to be done on the application side -- it may need some pre-lifting and post-lowering, limited to what's absolutely necessary.

break for lunch

Process questions - Dealing with overlapping proposals

Presenter: Andreas Rossberg (Slides)

DG: Seems a bit hard -- how do you say how long something stays in phase 3?

AR: Yes, I don't know.

LH: Ref types and bulk memory are circularly dependent.

AR: We should avoid that in the future; we screwed up. The exception handling proposal is a motivation for this -- it depends on multi-value and (?). Daniel already implemented EH in his own repo and merged in the dependent proposals; we don't want to repeat this work.

AR: Maybe there's even reason to have ... if there are more cases that depend on the union of proposals, maybe we have auxiliary repos in the middle that do that. I don't want to propose too much here, but I want to make sure folks working on proposals keep it in mind -- whether this is a viable guideline, unless there are reasons otherwise.

TL: It seems like, in principle, these issues can be solved by aggressively chopping up proposals into small proposals.

AR: We tried this, but then we discovered gaps, and we had to do a lot of temporary workarounds. It gets even messier to merge later -- you have to undo and rework stuff. It seems like the opposite is true: if you have a monolithic proposal, it's much easier.

TL: But for bulk memory, for instance, if we had chopped it in half, then the memory instructions would have been unblocked.

AR: In that case we should have done that. We changed the layout of data in the element section; we could have done it separately, if we extend one but not the other in the same way. We should have split out all the table instructions.

AR: We could have also made reference types a dependency for bulk memory, there is no magic bullet. (recently learnt difference between magic bullet and silver bullet. Magic something that solves all your problems, Silver bullet solves problem for which there is no other way to solve it).

JR: Linearize as you enter phase 3? Or as a prerequisite to enter phase 4?

AR: phase 3 is where you do the spec work, so that's natural point to decide.

FM: single repo where you put all your changes?

AR: Is it equivalent to that? No, you just determine which are your upstream and downstream repos.

DG: Rebase on 2 different repos instead of rebasing on the main wasm spec.

FM: You want to rebase early and often?

AR: Yes, essentially -- do it early enough so that you can more easily keep up to date and more easily resolve potential conflicts.

TL: This mirrors a similar problem in the tools -- given that we have one feature enabled, which ones can we assume are enabled? In general, linearization of proposals makes this simpler and the tool UI simpler, but it is restrictive. I don't think the tools have figured out a good solution either.

AR: I suppose the answer depends on what kind of interference. For the join case, do it earlier, because of all the tests we want early. If it's just random interference, deal with merge conflicts -- maybe that is easier earlier, maybe harder, since on each rebase you have to resolve some of the conflicts. I'm thinking more of the join case, from recent experience.

KM: Can I make a related request -- most proposals add tests to the same files, so you have to look at the commit log to see if there are related features there.

AR: There are some conflicting goals there. We plan to follow the structure of a file for each construct -- when looking for something, you can find the test easily; it's the name of the instruction. Usually that goes along with features, but there are features that extend the semantics.

KM: Can we change the files into directories? Have a call directory with a bunch of call directories inside of it.

AR: In principle we could, but *shrugs*

KM: It would help when you want to merge -- you don't have to worry as much about merging.

AR: Agreed, we should have a convention where we could have multiple tests for similar constructs. I don't know how many of these cases we will have in the future. These come up in multi-value / multiple memories, where we generalize some of the things we have and fill in the gaps we left intentionally. We expect things like this to go down as we proceed.

KM: hard to say, until we enumerate all future features :-) Right now we just copy the entire directory for a proposal. Otherwise you delete your tests.

AR: You're right, this has come up a number of times before; maybe we should just split off new stuff.

DanG: This also affects the testsuite repo -- we could use this as a way to detect merge conflicts. What if we had an early-warning system.

AR: I don't know if all of them show up in tests, though.

DanG: we do a full merge, so we got conflicts across the entire repo.

AR: Yes, that might help, although I expect that some amount of merge conflicts will remain.

DanG: Maybe when this shows up you should rebase.

AR: might be a good way of going about this

DeeptiG: Linearizing introduces quite a bit of overhead -- what if we find a sequence for proposals that are independent?

AR: Just meant linearizing with respect to the conflicting proposals, not global

AR: I found your (DanG) thought interesting -- what happens when you have a merge conflict?

DanG: Right now we skip the merge and file an issue on the repo.

AR: Is your merging automated?

DanG: We run a script by hand that pulls in all the proposal repos and tries to merge; it merges across the spec.

BS: It merges with the spec, though, right?

DanG: We need a way to specify what your deps are, so we have a way to determine conflicts early and see this.

AR: That sounds like a good idea.

DeeptiG: Would it be useful to follow up with a design issue to talk about when and how in the process?

WASI Embedding API, Interface Types and WASI

Presenter: Peter Huene (Slides)

FM: What about callbacks?

PH: The list is non-exhaustive -- not everything that interface types supports is listed there. But everything will be supported.

FM: I don’t think you can represent callbacks in the interface_val_t union.

PH: Let's talk about that later, yes. For our use cases for now, we need integer and string.

??: Do we have to have the conversions from interface types to value types now?

PH: No. All the logic for the conversion between interface and wasm types is done in the adaptor functions that the engine implements.

??: In the body of a function that has interface types, do they both need to be consumed? Can the engine help with that?

PH: These values will be passed to the adapter function, and that will realize that it can pass an i32 right down, or convert other things.

RT: I want some context. I know the role the enum of all the types has, but why values in the enum?

PH: The way the C API is designed is that the function callback signature is uniform.

AR: bikeshedding question: is there a reason you didn't call it wasm_interface_string_t, to keep layers clear?

PH: Yeah, could do that.

on slide 12

FM: Actually, I think there's a misunderstanding here -- a string in interface types is a sequence of Unicode code points; it is independent of the representation. We're not mandating UTF-8 or anything else.

PH: I read it as “it's Unicode on the host side, but UTF-8 in linear memory”.

FM: Initially we have to pick something, but the fundamental notion of a string is that it is a sequence of Unicode code points.

PH: Might be a bit too restrictive for embedding.

LW: At the moment there is only one target encoding. If there are multiple encodings on the target side, there will be a direct transcoding on the embedder side.

PH: This interface is modeled for the current present encoding. If more encodings, then this would be extended.

FM: On the C-API side, you want to represent half of the contract. I'm expecting a bunch of UTF-8 [PH: who's you?] -- if I'm using the C-API to access wasm data structures, then what should be in this part of the API is the C side of the contract. Or I have a requirement for UTF-8 or char*, but that doesn't say anything about the other side of the contract. It's up to the fusion of the two adapters to bridge that contract.

PH: In the embedding scenarios we effectively have: engines need an adaptor.

FM: Wasm engine doesn't know what a string is...

PH: The host needs to know. How else would you pass the string?

FM: You have some library that does its own thing with strings inside wasm.

PH: Let me back up. If I take the C API and embed it in Ruby, and I have Ruby FFIing into C, and I have a Ruby string, which is a UTF-16 string: when I embed the engine I know the host representation, and I want to move that string into wasm. The C API is not just for C; it is for embedding wasm into any language.

FM: The C-API should be neutral with respect to the rep. of strings then.

PH: Yes, that’s why this is more generic.

FM: Someone has to convert.

PH: The host does. The C API doesn’t care, it’s a void*.

FM: Someone has to do that bridge work...

LW: Maybe do that offline or let the presentation continue.

NAS: The contract part is the important part for me ... the wasm part can be anything, and if you want to run it on your host, then you need to be able to convert.

PH: What do you mean by host? [Browser] No, that’s not the use case for C API. Languages interacting with Wasm is the use case. They know how to convert their string into utf8, or any encoding that the adaptor wants to use.

NAS: What I'm struggling with, but the encoding is showing up in the adapter.

LW: If other encodings were allowed, then they'd show up in enums.

NAS: What do you mean by encoding on the wasm side?

LW: When it comes out of linear memory, there's only one choice for now. But the host has unbounded choices, so that captures the current state of the proposal. The host is responsible for transcoding into the format.

PH: What's missing: we are abstractly talking about these adapter functions; these are one-to-one with the memory-to-string and string-to-memory functions. (lost some)
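As a concrete illustration of the transcoding under discussion (a sketch, not the C API itself): a JS host holds UTF-16 strings, while the current interface-types proposal lowers strings to UTF-8 bytes in linear memory, so the host side of the contract amounts to a UTF-8 encode/decode.

```javascript
// Host-side transcoding sketch. TextEncoder always produces UTF-8;
// TextDecoder('utf-8') reverses it. The byte arrays stand in for the
// contents of linear memory at a (hypothetical) pointer+length pair.
const utf8Encoder = new TextEncoder();
const utf8Decoder = new TextDecoder('utf-8');

// "Lowering": host (UTF-16) string -> UTF-8 bytes for linear memory.
function lowerString(hostString) {
  return utf8Encoder.encode(hostString);
}

// "Lifting": UTF-8 bytes read from linear memory -> host string.
function liftString(bytes) {
  return utf8Decoder.decode(bytes);
}
```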

back to slides, from slide 13 on

FM: If interface types are not supported by engine, then interface types would be ignored.

PH: that's what happens today, right?

FM: Not exactly -- part of supporting an adapter is exposing malloc... if I'm publishing a module, then I don't want to expose my malloc. If the interface-types section is ignored, you can call my malloc.

PH: Only if you export malloc?

FM: I have to, so that I can use it from the adapter.

LW: Malloc gets exported from the core module -- the host sees all of it. If the host wants to take on the burden of the other end of the C-API, someone has to do the work...

PH: If you opt out of interface types, then you have to do the same work that interface types is going to do for you.

LW: Someone has to implement it.

AK: Can you say more about enabling/disabling interface type?

PH: The host has to be aware of interface types, because of that I wanted interface types to be optional from the host perspective.

AK: Could I as an engine say you can't disable interface types?

PH: Certainly …? I'm not married to the way this works. I was trying to follow the existing C API model.

AK: Just wondering what was motivating this... bigger question here about how to evolve wasm and C-API and restrictions it puts on wasm.

PH: C API is the FFI…

AK: JS API is the FFI…

PH: The two together are all FFI?

back to slide 18

??: Which memory would these env vars live in?

PH: All host memory. This is read by the WASI implementation.

??: The value has to be copied into linear memory?

PH: Yes, but this happens when a WASI function is called.

back to slide 22

AR: What about environments that don’t have IO?

PH: You do nothing here, if the Wasm code writes to these handles then they error or something.

DanG: WASI assumes that there is stdin and stdout, so there might be a noop stream.

PH: Like /dev/null. Could expand to allow hosts to interact using a pipe-like thing.

back to slide 23

??: Is there a scenario where there is a reason to share file descriptors among instances? Is that well-defined?

PH: currently I think... Dan might know better. Currently we tie file descriptors to instances....

DanG: Each nanoprocess has its own file descriptors

back to slide 26

JB: Is there ever a reason to not go through all imports...?

PH: It's up to the embedder to define the best semantics. I expect WASI embedders to use WASI functions, so it would be an error. But it shouldn't be a requirement of the API.

Presenter: Clemens Backes (Slides)

AR: There are some slight differences from the JS API at this point -- in particular, limits are flattened, I think. Minor details.

CB: Yes, it is something we may continue to discuss.

back to slides

AR: (about naming initial memory size minimum in memory types) Moreover, initial would be misleading at this point. If you query an import and it says "initial" it would not be the initial size.

back to slides

KM: when you ask for memory, you ask on the memory, not on the module right?

CB: Only if you export a memory you would see it.

KM: when you get the memory back, is the minimum the maximum of the minimums?

AR: It is the current size.

CB: It is the minimum of the size specified in the export.

KM: when you reflect the size, you get a number. Is it the one you are created with?

CB: The information you get from the reflection API, it's what is specified in the module.

AR: When you reflect, it matches the actual semantics. It may sound weird -- the limits are dynamic, the type narrows over time -- but it has to do with how we check limits on imports. With a memory, the minimum is actually the current size.

CB: I expected that you get the import specification size...

AR: When you are asking for a first-class object, what you specified on the import doesn’t make sense after you imported. What you want to know is the current size, when you pass it onto another import. The type of a memory object or a table object gets refined over time, it is an interval, it can become a smaller type.

CB: That's a bit unexpected...

AK: We report current size as the minimum.

AR: That’s when you dynamically reflect

CB: Should that be the type of it though...?

KM: Seems like it's ok for now, since the current size is just "refined" type.

CB: Maybe we should have a method to query the current size.

AR: That’s not what you want. Let’s discuss offline.
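The behavior under discussion can be observed from JS today. A sketch: `WebAssembly.Memory.prototype.type()` from the type-reflection proposal is what would report the refined minimum, but since it may not be available in every engine, the buffer length is used here as the observable current size.

```javascript
const PAGE = 64 * 1024; // wasm page size in bytes

const mem = new WebAssembly.Memory({ initial: 1, maximum: 10 });
const createdWith = mem.buffer.byteLength / PAGE; // 1 page

mem.grow(2); // grow by 2 pages; the old buffer is detached

// After growing, the current size is 3 pages. Per the discussion, a
// dynamic reflection of this memory's type would now report minimum: 3,
// not the `initial: 1` it was created with.
const currentSize = mem.buffer.byteLength / PAGE;
```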

TS: JFTR: I didn’t write the proposal, Andreas did.

POLL

Move proposal to Phase 2?

SF F N A SA
11 15 14 0 0

Result: Proposal advances to phase 2

break

Presenters: Deian Stefan, Hovav Shacham (Slides)

EP: Are we assuming that the target platform is constant time?

DS: We are assuming something where integer addition is constant time. We're not making assumptions about floating point. On ARM, you get that. If someone from Intel is here … hint hint.

back to slides

slide about importing untrusted code with secret

JR: This guarantee does not just prevent timing, but also breaking regular flow.

back to slides

JR: We just put a prefix on it, maybe that's ok.

AR: Why do you need different instructions? Information flow is substructural. Shouldn’t the substructural aspect of the type system be irrelevant to the operation?

CW: In theory the type validation could work out that it's private or public. But this means at code generation you can determine whether the instruction should do extra work or not, without having to look at types.

AR: We rely on types in other places, kind of. Why do we even have different instructions for different types? The usual answer: we don’t want overloading. But this isn’t overloading, as the semantics is the same.

DS: Would you pay in validation cost? Don't have to do different tracking.

JR: It would work out fine.

DanG: You have new opcodes, do you need new types? If you trust that it has correct crypto, why need new types?

CW: I guess it’s not enough for the functional correctness. But we need the types to tell the VM to not break the timing guarantees.

DanG: Operations aren't enough for that?

DS: also just the issue of basic leaks ...

DanG: But it would be a bug?

AR: The wasm type system is not there to aid the producer, it's there to guide the engine. Its purpose is not to help you generate code correctly -- that would be an argument not to include this in the wasm type system. Instead, you advise the engine that here it has to be careful. You don't want the engine to have to check your work for you.

JR: I completely agree. But you’d lose on things that are hard for the engine to do. The secure leaking story, that’s gone. Maybe that’s fine. The other thing is that we want to protect the data, not the timing of addition. There might be other things happening. The compiler can infer from the instructions what’s secret. But it’s easier if we label the data.

AR: That's what I'm trying to understand. Whether the instructions are generated differently ... function imports, perhaps?

DanG: Naively, you wouldn’t need the type system either.

RT: One thing you mentioned: types seem important. If I have an s32 in a register, I don't turn that into a branch. In other settings, there is an optimizer that will do something fast if it knows it is zero. So at this point the engine needs to know that it can't optimize it (turn it into a branch).

DS: Yup.
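RT's point can be sketched with a hypothetical example in the style of the CT-Wasm proposal (the `s32` secret type is from that proposal; the exact text syntax here is illustrative, not standard Wasm):

```wat
;; Hypothetical CT-Wasm-style syntax -- illustrative only.
(module
  (func $ct_select (param $cond s32) (param $a s32) (param $b s32) (result s32)
    ;; Because the condition has secret type s32, an engine must lower
    ;; this to a constant-time conditional move, never to a branch.
    (select (local.get $a) (local.get $b) (local.get $cond))))
```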

ET: If you do GC, you definitely want the GC to not branch on the secret values.

JR: It may be possible without types... but clearer and obviously correct with new types.

CW: And more local too.

AR: Another question I have ... our type system is trivial now. An information flow type system cross-cuts with new type structure. Does this have to be inserted everywhere?

DS: Not really. This is the dumbest information flow type system ever. I have seen the general case, we can talk about that, but that’s way too much to ask (for now).

AR: What would happen once you extend the language. Would you expect the GC features to never be secret?

DS: Yes. Right now the crypto can only target this subset.

JR: He's concerned that whenever we add a new feature, is that more expensive to think about now. Maybe you can't put s32 inside GC objects...

RT : What stage of proposal is this?

JR: Stage 0 with impl and spec.

CW: The safe thing is for new features not to work with this type system. We have the ability to implement that now with what we have.

AR: When you add, for example, type constructors, I need a kinding system to say which types are allowed where. It is not entirely free.

DG asking to take future discussion offline

DS: What is the level of interest here?

TS: Have you thought about toolchain implications?

DS: Some people are working on that already. Rust adds secret annotations, LLVM are also working on it.

TS: Includes targeting secret memory?

DS: With LLVM we have to work on that.

DS: How do we move forward?

DG: Add links to github etc in the meeting docs. We have CT meetings every couple of weeks, we can invite you there.

AR: Phase 0 would be good.

BS: Add a new repo, then we can discuss there.

WASI security model, reference types, interface types, and POSIX

Presenter: Dan Gohman (Slides)

AR: You also depend on function types proposal?

DanG: Yes, everything that IT depends on, we depend on as well.

back to slides

talking about handles in WASI

??: These handles, are they something at the wasm layer, or a layer below?

DanG: They will literally be reference types on the Wasm level.

??: Are indirect function calls also this type of handle?

DanG: Currently the call indirect table has anyref … when I say handle, I mean references to host objects. Maybe a function reference can be a handle, but that’s not what I am thinking of here.
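A hedged sketch of what a handle-based import could look like; the `fd_read`-style name and signature are illustrative assumptions, not the actual WASI API:

```wat
;; Illustrative only: a WASI-style function imported with a host reference
;; (externref; anyref at the time of this meeting) standing in for a handle,
;; instead of an i32 file-descriptor index.
(module
  (import "wasi" "fd_read"
    (func $fd_read (param externref) (param i32 i32) (result i32))))
```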

Presenter: Andreas Rossberg (Slides)

talking about encoding of memory index

EP: caps the number of memories?

AR: No, it's LEB so it can be up to 4 billion memories of 4 gig each. Should be enough for a while :)

back to slides

talking about the performance hit of multiple memories

LH: I think it's higher than 3%-5% by adding indirections to memory.

??: This is the overhead if every memory access went through extra indirection?

LH: No, we removed the heap register, made the heap pointer a value in SSA. So it could spill/fill as necessary. Best simple solution, but we could probably do better.

RH: When you have a load instruction, and it doesn't have immediate for memory index, we assume 0?

AR: Yes, it defaults to zero. We allow that in the text format too.

RH: Does it make sense to set a default for the most commonly used memory that is not zero... maybe an implementation hint?

AR: We talked about implementation hints… let me get to that.

RH: It wouldn't necessarily be a hint to the engine, but also a binary size saving.

AR: We're not making anything worse, engines could do the same for memory zero. When you use more than 1 memory, you might do something different. Not necessarily a regression. The question of optimization hints has come up multiple times, need to design that more generally.
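A minimal sketch of the default-to-zero behavior using the multi-memory proposal's text syntax (memory names are made up):

```wat
(module
  (memory $m0 1)  ;; index 0, the default memory
  (memory $m1 1)  ;; index 1
  (func (export "copy_byte") (param $i i32)
    ;; the store names $m1 explicitly; the load omits the index
    ;; and therefore defaults to memory 0
    (i32.store8 $m1 (local.get $i)
      (i32.load8_u (local.get $i)))))
```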

back to slides

??: We have a use case where having a pinned register matters -- is there a known use case where pinning is required, where something like this would matter?

KM: No way to tell engines. It would only be useful if you can swap the pin. In this area I want the other memory.

??: That sounds like multi-memory -- not familiar with all the other proposals, just curious about marking the pinned memory in a register.

AR: That would be an optimization hint. There could be a hint to define the default memory.

JB: Have you considered a design where you don't add an immediate to load and store, and instead declare dynamically which memory they go to? This doesn't seem useful for the use case, since you need to have a switch to choose which memory you want. [KM: dynamically index?] Scaling out beyond 4GiB, then I have to.

AR: That's a different thing, it requires memory references. All the references! Something that is potentially useful, but it is another level of proposal; this is simpler.

TL: It's possible that engines treat memory 0 as fast memory, then you get better performance by not merging modules, by keeping it separate.

AR: Defaulting memory zero to be the fast one has limitations, e.g. when merging. A serious limitation: imports come first, so a local memory can’t be fast if you have to import one.

TL: seems like an interesting thing that falls out of this... not fundamental though.

EP: Most of the speed difference for memory 0 is because it is pinned to a register; most uses of multi-memory would be small, so it seems like maybe there isn't much register pressure.

AR: The case where it matters is when you merge and link modules and many have private memories.

BT: pinned registers: v8 doesn't pin, it has a convention that instance is passed in register, memory base is field in instance, one indirection off of that, all are up for register allocation, can spill off that. If memory 1/2/3 are off that instance, they could be the same speed.

AR: That also means that it can be different in different functions or scopes, e.g. based on live ranges.

ET: Swapping between memories -- memory pools for the next X stores: is there some limitation to having both use cases implemented? Allow both instruction kinds, generic stores and local memory indexed.

AR: If you use a technique where the engine uses the register allocator to figure it out it will just fall out for free. An explicit feature in the language that is contextual is always messy.

ET: In CT-wasm, you'd want load/store ops that just use these memories, so it's typed. This would be a way to remove the secret type from CT-wasm?

??: Now I’m confused.

JR: Don't worry about us :) We're trying to get rid of ways to do special memory.

AR: Which memory you are referring to is completely static.

ET: I was talking about memory space is special -- that's what I meant.

AR: It’s part of the memory type. If you have multiple memories you can have it as part of the memory type.

??: If you have multiple memories, and one is with guard pages and one is not, imagine that the executed code has different instruction sequences -- what is the strategy to deal with that?

AR: We want to eventually generalize memory types to page sizes, maybe put some information there to help the engine decide the strategy. That’s fine, that’s static, you know it; if it's in the type it’s stable across imports. It can be different per memory.

??: Worst case is this would require codegen twice.

No.

POLL:

Move multiple memories proposal to Phase 3?

SF F N A SA
15 27 3 0 0

Proposal advances to phase 3.

Module Sharing and Encapsulation

Presenter: Ross Tate (pdf) (pptx)

talking about separate maintainers updating

AR: I would assume even that language runtime might be using third party code...

RT: Yes, definitely, can subdivide further.

back to slides

RT: Do we want to support this model?

AK: Ideally we want this. But it's not obvious to me that we'll immediately move to this world, updating without notification.

AR: I generally assume that there are many different ecosystems that make different choices about this. Sometimes have repositories of modules, semantic versioning... web you don't have that because it's all remote, untrusted, but there will be a large diversity of things that happen.

RT: One thing is: Suppose the runtime has a bug and they want to fix it. What process do we want to be involved?

FM: When you say update -- you mean update running application?

RT: No no, when loading the application, and the components from URLs. How much versioning would we want to bake into that URL.

AK: I don't think this is a wasm question -- it has consequences, but on the web you might cache based on a content scheme; if you're addressing by content then updating by URL doesn't make sense anymore. As for the bug question, it depends on how you will update.

RT: I am not saying Wasm should be responsible for all these updates.

KM: Website admins don't want that, since they have QA processes that assume things work. They don't want things to change out from underneath them. They want to make sure that all updates deploy at once.

RT: For people dynamically loading other modules, you essentially want to bake in the exact version?

KM: Normally they deploy themselves, might put on CDN, but they explicitly deploy to the CDN all the versions they want, they won't even say they want version 12 -- they will actually make their own copy, with their URL.

RT: So most people don’t dynamically link to jQuery?

KM: For the most part no... for the most part every website says I want this specific thing.

RT: This is what I want to know. We can assume that all these things step together.

AR: On the web.

KM: On the web, yes.

AR: For an ecosystem that has more trust over the sources, that might not be so remote.

BT: The solution is to use the blockchain, right Andreas?

AR: Not sure how that applies here. :)

RW: If you're talking about the web -- PWAs have a manifest, they specify what to pull down.

RT: This was phrased abstractly.

RW: They'll have to do something like this somehow, they have version control...

RT: I hear that the Web doesn’t need this.

AR: My use case is -- e.g. using it on desktop, operating system that uses wasm for applications. You have installed libraries and DLLs, you may still get DLL hell. it seems the world has moved away from this. But I can see places you want to do that.

EP: When we talk about the web, we talk about large high-production tools -- there are demo pages that are steps below, and we may care about what they want to. And making the life of developers easy is important.

RT: So incremental compilation is relevant? If two java programs want to communicate, do they have to link against the same runtime all the time?

KM: I think... if it were free to do that, we'd do it. But we're describing the web as it is now. It may be that web developers would be happier if they weren't constrained in this way.

EP: I did not understand what you meant with version universality. With webpack there are a huge number of versions of libraries. Drives me nuts. You can end up with huge variation.

RT: I'm saying that with language runtimes, they can't talk to each other anymore. They have different representations, so they can't talk anymore...

AR: They have to go through some more abstract layer, some IDL, then it is ok.

RT: They go through interface types, copying every time, so there's no benefit for that model.

LW: Shared everything linking is inherently more coupled.

JR: I imagine that ability to release updates to language runtime without breaking everyone would be nice. Decrease application size. Individual components might be nice, but not necessary.

RT: What does it mean to change the interface?

JR: I imagine you have a slide...

Ross reveals C# 1.1's runtime slide -- laughter

RT: It's fairly easy to import something without knowing all of its fields, don't need to know about all the other parts.

FM: if you change the data structure....

RT: No, not if you don't touch the other parts of that structure. Most of it is runtime info, other libraries only care about a small portion of the vtable. Rest of it doesn't matter.

FM: Are you talking about .NET partial classes?

RT: No, I could put Java up there too.

FM: Having a class come from separate files, and then put them together?

RT: Not talking about that.

FM: It doesn't have to be vtables, but complicated data structures, you (the developer) have to decide whether to recompile or not.

RT: In C++/C-land, you have to recompile.

FM: It depends on how to lower a language onto wasm, what does wasm have to do with this?

RT: If another C# code has imported the aspects of the C# RTS, only the aspects it cares about.

??: In C++ we say what's opaque and not-opaque. You have the same thing with opaque ref types. When it appears in the interface, we have to care. We push that onto the developers.

KM: The example is pimpl, pointer to impl, in C, to hide details. It’s possible in the C# work to do the build-out at runtime.

RT: It means that every time I do an export on a vtable, I do a cross-module function call; we've thrown away tons of optimization opportunities, like inline caching.

AR: We don’t want to do inline caching at all.

KM: Oh, it will happen in browsers…

moving off of inline caching

RT: My question is... do we want a way to abstract structure so we don't need to know the internals, so we don't have to recompile? So we don't need to sync.

JB: structure meaning GC proposal structs, or structures in linear memory?

RT: The pieces are sharing a lot...

AR: The short answer is that we don't want to decide that, it should be up to language runtime.

RT: That would require a Java object being represented as an array of its uniformly-typed fields, with every field access being a doubly indirect lookup followed by a cast to the expected static type.

Wrapping up

Adjourn