Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Executing nested 'if-else-end' structures #1422

Closed
jpaulm opened this issue Jun 26, 2021 · 7 comments
Closed

Executing nested 'if-else-end' structures #1422

jpaulm opened this issue Jun 26, 2021 · 7 comments

Comments

@jpaulm
Copy link

jpaulm commented Jun 26, 2021

Apologies if this concern has been raised elsewhere...

It seems to me that the 'if-else-end' opcodes are not really appropriate for an assembler-level language, as executing nested 'if's will involve unnecessarily complex execution logic - I assume that, as WASM becomes more accepted, dedicated hardware will be coming onto the market. Up to now, I get the impression that WASM is mostly interpreted, but dedicated execution hardware seems like a logical next step...

To find the corresponding 'else' or 'end' for a given 'if', such a machine will have to scroll over multiple instructions, ignoring 'else's or 'end's at lower nesting levels. This seems unnecessarily complex, besides resulting in suboptimal performance. To get around this, the executing engine will have to build a tree structure - either before or during execution.

On the other hand, a conventional assembler language would just use traditional conditional or regular branch instructions (which WASM has also). Since WASM is IIUC designed to be generated, I see no reason why the generated code cannot simply use combinations of 'br' and 'br_if' . These are easy to generate, and do not require complex logic on the part of the executing engine.

I spent a large chunk of my working career working with Assembler-level languages, including building very effective Structured Programming macros for a large banking project. This division of labour (between basic instructions and macros) does not require complex logic in the executing engine, while the macros provide the mental model the designers of WASM apparently intended.

I look forward to feedback.

@kripken
Copy link
Member

kripken commented Jun 26, 2021

Overall, yes, you are 100% right that this would be a downside to running wasm on a CPU directly. But it was not a goal for wasm to be run directly on hardware, rather it is intended to be easily translatable to something that runs very efficiently on hardware. And structured control flow helps a lot there, see for example this detailed comment: https://news.ycombinator.com/item?id=22401470

I am not sure how wasm would run directly on a CPU for other reasons, such as the unbounded size of the value stack, the unlimited number of locals, etc. - maybe I'm missing something in your thinking? Overall, wasm's assumption is that VMs specialize the code for a particular CPU's capabilities (registers, etc.). Not doing so would limit speed, I think.

(Note also that wasm is not even optimized for running directly in an interpreter - fast interpreters like wasm3 first translate it to an efficient form.)

@conrad-watt
Copy link
Contributor

conrad-watt commented Jun 26, 2021

Hi Paul, there are a few things to unpack here.

First, as @kripken mentioned, Wasm was designed primarily to be compiled, rather than interpreted. This means that implementations are not expected to walk the bytecode during execution to find matching end blocks - br instructions and if-else conditionals are translated to direct jumps. There are a few people working with Wasm interpreters, but this is relatively fringe.

You're correct that Wasm's control flow constructs do not play well with dedicated hardware. Moreover, this isn't specific to just the if-else-end construct, but is more fundamental to the way Wasm defines block-based (structured) control flow. Because the immediate parameter of a Wasm br indexes an enclosing label (as opposed to denoting a PC address or bytecode offset), when executing a br targetting an enclosing block, the style of interpreter you're describing would also need to walk from the br to the end of the block, ignoring intervening end markers at lower nesting levels. So getting rid of if-else-end and restricting ourselves to br_if wouldn't change things much.

I'm aware there has been some limited hype around dedicated Wasm hardware, but I (completely personally) think such hardware would be trying to force a square peg into a round hole. I think it would be more appropriate for such hardware to work over a "Wasm-like" language with PC/jump-based control flow, requiring Wasm programs to be translated to this form before they can be natively executed. I could vaguely imagine people trying to do something wackier, like lookaside caching of br target bytecode offsets, but I think at that point one would be trying to support Wasm natively for the sake of it, rather than because it's the right choice.

As to why Wasm's control flow is designed this way, perhaps we could have made choices that would more efficiently facilitate direct interpretation/machine execution, but the priority was to create an easy-to-understand abstraction which would work with our type system and could be compiled efficiently (i.e. handled by existing Web VMs). This is tied to the high-level design choice to go with structured control flow, which has been extensively discussed (e.g. @kripken's link above, and here)

@jpaulm
Copy link
Author

jpaulm commented Jun 26, 2021

Thanks, gentlemen, for your clear and constructive answers! It looks as if, when working with WASM, I am going to have to drop the habits of decades of working with traditional assemblers and macro languages! :-) In my own defense, I feel that WASM is a higher-level construct than "assembler", and perhaps needs its own classification!

Thanks, and all the best!

@conrad-watt
Copy link
Contributor

conrad-watt commented Jun 26, 2021

We've often joked that WebAssembly is "neither Web nor Assembly"! I think Wasm is closer in spirit to an IR/"core calculus" that tries to bake in/enforce its invariants through its type system and program structure, as opposed to a native assembly language which relies on the producer obeying the desired invariants during code generation (e.g. with structured programming macros as described in your OP).

@jpaulm
Copy link
Author

jpaulm commented Jun 26, 2021

Like the Holy Roman Empire: "neither holy, nor Roman, nor an empire"!

My (medium) log-term goal was to find a language to replace the JavaScript in the client side of my https://github.com/jpaulm/javafbp-websockets project. I am quite comfortable with using Java, or indeed any type-safe language, on the server side, and HTML on the client side, but heartily dislike JavaScript! :-) Ergo, if WASM has potential longevity and wide "range", I thought I had better learn it! Maybe Rust...?!

Thanks, @conrad-watt and @kripken !

PS The ycombinator comments make a lot of sense - pity I didn't see them earlier! :-(

@titzer
Copy link

titzer commented Jul 30, 2021

FWIW I have a solution to making existing wasm bytecode efficiently interpretable (in-place, without modifying the bytes), which I have implemented in https://github.com/titzer/wizard-engine. The solution is to use a side-table computed during bytecode verification. I've implemented this three times now, once in V8's (testing only) wasm interpreter, once in Wizard's simple interpreter, and now again in Wizard's fast interpreter. The latest iteration achieves O(1) overhead for every bytecode instruction and has excellent performance.

@sunfishcode
Copy link
Member

The questions here appear answered; if there any further questions, please file a new issue!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants