Alon Zakai edited this page Feb 9, 2016 · 13 revisions

Split memory is an experimental compilation mode that splits the normal single typed array of memory into multiple chunks. It is a non-asm.js mode, enabled by building with -s SPLIT_MEMORY=N where N is the chunk size.
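For example, a build might look like the following (the chunk size here is illustrative; pick a value larger than your largest single allocation):

```shell
# Hypothetical build command: 16MB chunks.
emcc src.c -o out.js -s SPLIT_MEMORY=16777216
```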

Warning: As just mentioned, this is an experimental mode. It has not been heavily optimized, and was done mostly as a research experiment. In practice, almost all the memory benefits of this mode - not using a fixed memory size - can be achieved with memory growth in normal Emscripten output (the main downside is that it then depends on the malloc implementation to not fragment too much). Memory growth will be much faster than this mode, even without asm.js optimizations.

Note: For the future, WebAssembly will have memory growth from 1.0, meaning it will have full asm.js-like speed (or better) with an adjustable memory size. There is also an intention to later add madvise-like capabilities, which would remove most of the fragmentation problem as well.

Benefits

  • Each chunk of memory is a separate allocation from the browser. That means that 32-bit browsers can use more memory, without worrying about issues with memory fragmentation. For example, 2GB of memory can reliably be allocated in this mode (if the machine has that much memory), while 1GB is otherwise often the limit, and even 512MB can be unreliable. Note, however, that this is rarely an issue on 64-bit browsers, which are becoming the norm.
  • Each chunk of memory can be allocated from the browser when actually needed, and can be freed when no longer required. This reduces the problem of fragmentation which memory growth can encounter.
  • Existing data can be accessed as if it were in normal memory, by allocating a chunk backed by that data, avoiding the copy into the heap that asm.js requires. See the section "Allocating a chunk with existing data" below.

Model

Each split chunk gets an independent malloc/free space, implemented by a dlmalloc mspace. That means each malloc is served from a single chunk, and therefore allocations cannot span chunks. It also means the chunk size must be big enough for the single largest allocation. (That includes filesystem allocations, so if you have large amounts of file data, you might want to run the file packager with --no-heap-copy to avoid a copy into the heap, or --lz4 to both compress the data and avoid the heap copy.)

We implement HEAP*.subarray by returning a slice into the proper chunk. When an end offset is not provided, we slice to the end of the current chunk, relying on the fact that no allocation can span chunks. A runtime exception is thrown if an end offset is provided and it implies the slice would span more than 1 chunk.
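The slicing logic can be sketched as follows. This is a simplified model, not the actual runtime code: `SPLIT_MEMORY`, `buffers`, and `subarray8` are assumed names standing in for the real implementation.

```javascript
// Sketch of split-memory subarray semantics (assumed names).
// SPLIT_MEMORY is the chunk size; `buffers` holds one ArrayBuffer per chunk.
const SPLIT_MEMORY = 1024; // example chunk size
const buffers = [new ArrayBuffer(SPLIT_MEMORY), new ArrayBuffer(SPLIT_MEMORY)];

function subarray8(start, end) {
  const chunk = Math.floor(start / SPLIT_MEMORY);
  const offset = start - chunk * SPLIT_MEMORY;
  // Default: slice to the end of the current chunk, since no allocation
  // may span chunks.
  let endOffset = SPLIT_MEMORY;
  if (end !== undefined) {
    endOffset = end - chunk * SPLIT_MEMORY;
    if (endOffset > SPLIT_MEMORY) {
      throw new Error('subarray spans more than one chunk');
    }
  }
  return new Uint8Array(buffers[chunk], offset, endOffset - offset);
}

const view = subarray8(1030, 1040); // both ends inside chunk 1
console.log(view.length); // 10
```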

Modifying non-compiled code

When this option is enabled, all heap accesses go through methods such as get32() and set32(), which access the proper chunk for the pointer. Non-compiled code is fixed up as well: hand-written HEAP8[x] = 5 is automatically rewritten to set8(x, 5), and so forth. This is done on all code seen at compile time, which includes all JS library code, --pre-js and --post-js files, and EM_ASM code chunks.
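The rewritten accessors can be modeled roughly like this. This is a sketch, not the actual generated code: with a power-of-two chunk size, a pointer splits into a chunk index (high bits) and an offset (low bits). `HEAPS8`, `SHIFT`, and `MASK` are assumed names.

```javascript
// Sketch of split-memory accessors (assumed names).
const SPLIT_MEMORY = 65536;            // example chunk size (power of two)
const SHIFT = Math.log2(SPLIT_MEMORY); // 16
const MASK = SPLIT_MEMORY - 1;
// One Int8Array view per chunk (only chunk 0 allocated for this demo).
const HEAPS8 = [new Int8Array(SPLIT_MEMORY)];

function get8(ptr) {
  return HEAPS8[ptr >> SHIFT][ptr & MASK];
}
function set8(ptr, value) {
  HEAPS8[ptr >> SHIFT][ptr & MASK] = value;
}

// Hand-written HEAP8[x] = 5 becomes:
set8(100, 5);
console.log(get8(100)); // 5
```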

However, code that we do not see at compile time cannot be fixed up. If you have some external code in another script tag on the same page, you must make sure it uses the proper get*/set* methods if it accesses memory. Similarly, code that is evaled at runtime, for example via emscripten_run_script, is only seen at runtime, so it too must use those methods directly.

Performance

Each memory access must first find the right chunk of memory, which adds significant overhead; in addition, this code generation mode is not asm.js-compatible. There is therefore a significant slowdown. The slowdown depends on the chunk size: it is generally lower for larger chunks (around 2.5x on Firefox and 5.0x on Chrome). As non-asm.js output, this code is harder to optimize and more variable, so you may see inconsistent performance across browsers and runs.

Allocating a chunk with existing data

Each chunk of split memory has its own buffer of data, and you can in fact use an existing buffer, as shown here:

var success = allocateSplitChunk(2, existingBuffer);
assert(success);
var base = SPLIT_MEMORY*2; // each chunk is size SPLIT_MEMORY, so chunk 2 begins at 2*that value
var read = HEAP8[base + 10]; // reads from offset 10 in existingBuffer!
[..]
releaseSplitChunk(2); // allow other code to use this range of memory

This works both in handwritten JS, as in that example (assuming the code is seen at compile time; see "Modifying non-compiled code" above), and in normal compiled code. Basically, the existing buffer is "mapped" into the normal linear memory the program sees.

Note that allocateSplitChunk returns whether it succeeded, as the chunk might already be used. You can check the buffers array to see which locations are free.
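Such a check might look like the following sketch. `buffers` stands in for the runtime's array of per-chunk buffers (an empty slot meaning the chunk is free), and `findFreeChunk` is a hypothetical helper, not part of the actual API.

```javascript
// Sketch (assumed names): scan the per-chunk buffers array for a free slot
// before calling allocateSplitChunk.
const buffers = [new ArrayBuffer(16), null, new ArrayBuffer(16), null];

function findFreeChunk() {
  for (let i = 0; i < buffers.length; i++) {
    if (!buffers[i]) return i; // unused slot: safe to allocate a chunk here
  }
  return -1; // no free slot
}

console.log(findFreeChunk()); // 1
```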