Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request: more examples in the Walt Explorer #101

Open
JobLeonard opened this issue Mar 10, 2018 · 6 comments
Open

Request: more examples in the Walt Explorer #101

JobLeonard opened this issue Mar 10, 2018 · 6 comments

Comments

@JobLeonard
Copy link

JobLeonard commented Mar 10, 2018

Feature Request

Overview

The on-line Walt explorer has more than one function: it shows the power of Walt and WebAssembly, can give inspiration for what to use it for, and works as a pseudo-tutorial or toolkit with "fundamental" solutions to specific problems. Adding more examples would help in all of these matters.

The "C-style union using arrays"-trick I used in #99 is an snippet that could be an advanced example.

Impact

Large (imo). First impressions matter. Also, it might be useful to note here that I have no prior experience in WASM (although I did program a game in z80 once), but nevertheless managed to figure out how to port a simple C-function using the few examples in the playground right now.

It also doubles as a "cookbook" to get people up and running, and the more examples the better.

Details

Walt is a mixture of "it's just JavaScript" and "it's just WebAssembly". If it is more accessible than the latter, it would help if people could learn Walt without prior knowledge of WebAssembly.

Writing documentation and tutorials is one option (issue #73 partially covers this), but the Walt explorer allows us to take a written tutorial and give it much more power. A well-documented piece of code can essentially work as an interactive tutorial. In the words of Bret Victor: it's time to stop drawing dead fish!

Here are some example galleries employed by other languages & frameworks:

Pixi.js examples gallery - http://pixijs.io/examples/#/basics/basic.js

Phaser.io examples gallery - https://phaser.io/examples

"A Tour Of Go" interactive tutorial - https://tour.golang.org/welcome/1

Go Playground - https://play.golang.org/

Basically, currently the explorer is like Go's on-line playground (minus the linking bit). Adding a dash of A Tour Of Go would make things even better.

Some potential examples and topics they would cover:

  • a radix sort implementation. Would cover:
  • some form of string manipulation (say, count the number of spaces in a string). Would cover:
    • passing non-trivial arguments
    • returning non-trivial data
  • Duff's Device, assuming that this would actually be performative, since this depends on the switch implementation and the flexibility of WASM in this regard ;) Would cover:
    • switch statement and its low-level machinery
    • performance benefit of loop unrolling

Due Date

Eh.. whenever someone has an example that fits it could be added, I guess?

@ballercat
Copy link
Owner

This is fantastic. I'll give it a read through tomorrow. Thanks for the other issue as well 👍

@JobLeonard
Copy link
Author

JobLeonard commented Mar 10, 2018

This is fantastic

No, this whole project is fantastic! I get to combine my love of low-level bit-fiddling and the ease and accessability of the browser environment? Oh my...

I edited in another potential example: Duff's Device.

If you have any pointers on how to pass/return an array of data (of unknown size), and of doing the same with a string, then I'd love to have a go at writing these examples myself - I can actually really use that radix sort, since I'm already using the JS versions as an optimisation somewhere. And the string one might open the door for porting LZ-string (on which I've also worked before)

@JobLeonard
Copy link
Author

Also, I have recently written some custom scatter plot code that contains what is likely the fastest bit of software-based sprite blitting code available in JavaScript. Because I can't be bothered to learn WebGL but love bit-fiddling.

/**
 * @param {tObject} context
 * @param {number} width
 * @param {number} height
 * @param {Uint32Array} xySprites
 * @param {Uint32Array} spriteData32
 */
BlitSprites.prototype.littleEndian = function (context, width, height, xySprites, spriteData32) {
	let {
		canvasDataU32,
	} = this;
	const {
		xMask,
		yShift,
		spriteLength,
	} = this;


	for (let cIdx = 0, // canvas index
		sp = 0, // sprite pixel
		cp = 0, // canvas pixel
		sa = 0, // sprite alpha
		ca = 0, // canvas alpha
		j = 0; j < xySprites.length;) {
		for (let xy32 = xySprites[j++],
			x = xy32 & 0xFFFF,
			y = (xy32 >>> 16) & 0x7FFF,
			spriteOffset = xySprites[j++],
			c0 = x + y * width | 0, // top-left corner of sprite on target canvas
			i = 0; i < spriteLength; i++) {
			sp = spriteData32[i + spriteOffset];
			sa = (sp >>> 24);
			if (sa) {
				// index in canvas array
				cIdx = c0 + (i & xMask) + (i >>> yShift) * width;
				// canvas pixel
				cp = canvasDataU32[cIdx];
				// sa is a byte, so in the range [0,256).
				// We introduce a tiny bias towards sa (the newer pixel),
				// to replace /255 with /256, which can be written as
				// >>8, which is faster
				ca = 0x100 - ++sa;
				// A bit of bitmask-trickery lets us use:
				// - 4 instead of 6 multiplies
				// - 6 instead of 9 and-bitmasks
				// - 4 instead of 6 additions
				// - 3 instead of 1 logical shift
				canvasDataU32[cIdx] = (
					( // red + blue channel
						((sp & 0xFF00FF) * sa +
							(cp & 0xFF00FF) * ca) >>> 8
					) & 0xFF00FF
				) | ( // green channel
						(((sp >>> 8) & 0xFF) * sa +
						((cp >>> 8) & 0xFF) * ca) & 0xFF00
					) | 0xFF000000;
				/*
				ca = 0x100 - ++sa;
				// index in canvas array
				cIdx = c0 + (i & xMask) + (i >>> yShift) * width;
				// canvas pixel
				cp = canvasDataU32[cIdx];
				canvasDataU32[cIdx] = ((
					(
						(cp & 0xFF0000) * ca +
						(sp & 0xFF0000) * sa & 0xFF000000
					) + (
						(cp & 0xFF00) * ca +
						(sp & 0xFF00) * sa & 0xFF0000
					) + (
						(cp & 0xFF) * ca +
						(sp & 0xFF) * sa & 0xFF00
					)
				) >>> 8) + 0xFF000000;
				*/
			}
		}
	}
	// put imagedata back on the canvas
	context.putImageData(this.canvasData, 0, 0);
};

Would love to port it to Walt and see if it makes things even faster :D. If so I think that would make a cool example as well.

Again, all I really need is a way to quickly pass Uint32Arrays (or at least the data in it) between WASM and the environment.

@JobLeonard JobLeonard changed the title Request: more examples in the Walt Eplorer Request: more examples in the Walt Explorer Mar 10, 2018
@JobLeonard
Copy link
Author

Ok, I just found out about the existence of WASMfiddle, through this SO answer about strings.

https://wasdk.github.io/WasmFiddle//?w590f

With your permission, I think I'll self-assign the string-example ;).

@ballercat
Copy link
Owner

String literals are not implemented yet, so a traditional string example like above is not possible. It's on the roadmap but is blocked by a few pre-requisites like the Data section #80, followed by the actual string parsing and encoding into the available data section. It's still possible to manually encode a string into memory and then call out to a javascript function to echo it, but I'm not sure it's worth the effort at the moment.

Overall, I like the idea of additional examples and a small re-design of the playground. I think the Pixi.js page is great and I would like the Walt playground to look similar. I wrote it originally over a weekend and it has outlived the original design especially as the number of examples is growing and getting more complex. I particularly like the idea of a guided tour or annotated source which serves as an extension of the documentation.

If you'd like to tackle additional examples feel free to go for it. I would be happy to accept additional example PRs, all related code lives in the walt-docs package. I would be interested in what issues you run into, as that is the main way I find bugs and their eventual fixes 🙃

@JobLeonard
Copy link
Author

If you'd like to tackle additional examples feel free to go for it.

If it's alright with you I'll use this issue in a "public notepad" for my thoughts on how to implement various examples then. As you might have noticed, I'm verbose (sorry), but I kind of work better when I try to write things out first before coding. And maybe it'll be an interesting read. Please jump in with suggestions from time to time :)

Example: count occurrences of character in a string

String literals are not implemented yet, so a traditional string example like above is not possible.

True. But I was thinking of counting occurrences of a character in a string. This would not require a string literal just yet - the string is passed in from the outside, and the returned value is a number. It's the reverse of the situation explained in this SO answer: "How can I return a JavaScript string from a WebAssembly function".

(Tangent: this example is purely that, an example. It is almost guaranteed to be slower than the JS-based solution - split().length-1 is beats every other option by a large margin, especially in Chrome. Also interesting is that Duff's Device is slower in Chrome, but faster in Firefox. However, this example works as a thought-exercise to figure out other string manipulations that would benefit from WASM. Plus, the process of working this problem out, I ended up opening the char and string issues, in case you're wondering where that came from)

Strings are UTF16-encoded in JavaScript, and individual characters can be converted to numbers via charCodeAt (UTF16 codepoints, 16 bit number) or codePointAt(0). The former is possibly faster (since we have less memory to go over), and the vast majority of characters will be 8 or 16 bits wide, but then we have the issue where 𠀬 (codepoint U+2002C) gets mistaken for , (codepoint U+0002C)). How to handle that is discussed further on.

  • the function is split into two parts: a JavaScript function (hereafter countChar), which prepares the string data, and calls the WASM part (hereafter _countChar)
  • _countChar contains a i32[] array as buffer (with the default single memory page it can up to 16384 elements, more than enough for most cases)
  • countChar uses that section of it as the backing array of a Uint32Array variable called strBuffer (native dwords are faster, and we don't have to worry about endianness this way)
  • calling countChar will first iterate over the string, filling up strBuffer with at most 16384*2 characters at a time (packing two per element), until the end of the string is reached. This will be marked with zero in strBuffer, effectively becoming a zero-terminated string
  • it then calls _countChar, which expects the UTF16 code of a character (for example, , is 44)
  • _countChar counts the number of occurrences of that value in strBuffer until it encounters zero or runs out buffer, and returns that value.
  • countChar adds this to the total count, and if it hasn't reached the end of the string yet continues with the next 256 characters.

Dealing with UTF16 pairs

This is the example on MDN's charCodeAt page on handling characters that take up more than 16 bits:

  var code = str.charCodeAt(idx);
  var hi, low;
  
  // High surrogate (could change last hex to 0xDB7F
  // to treat high private surrogates 
  // as single characters)
  if (0xD800 <= code && code <= 0xDBFF) {
    hi = code;
    low = str.charCodeAt(idx + 1);
  }

So for any code point between 0xD800-0xDBFF, we know the next charcode is part of a pair. The easiest place to deal with this in the JavaScript wrapper: we iterate over the string and skip any code pair.

// remember that strBuffer is a Uint32Array
let i = 0, j = 0;
while(i < str.length && j < strBuffer.length){
  let char = str.charCodeAt(i++) | 0;
  if (0xD800 <= char && char <= 0xDBFF) i++; // skip next char
  else strBuffer[j] = code << 16; // write char to buffer
  char = str.charCodeAt(i++) | 0; // if we went out of range, this forces `undefined` to 0
  if (0xD800 <= char && char <= 0xDBFF) i++; // skip next char
  else strBuffer[j++] = code; // write char to buffer
}
// ensure that the string is zero terminated
if (j < strBuffer.length) strBuffer[j] = 0;

.. of course, we need to write a special function for counting 32-bit characters, but how to handle that is probably fairly self-explanatory after this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants