
DOM Design

Ms2ger edited this page Feb 9, 2016 · 3 revisions

This page is retained for its historical value only. The AbstractNode approach has been replaced, and the copy-on-write DOM abandoned.

I have started work on a new DOM implementation in the "dom" branch. It currently builds except for an internal compiler error, the fix for which is being upstreamed to Rust.

This new DOM implementation is based on structs instead of enums. The base type is AbstractNode, which is an opaque pointer with many accessors that let you query what type of node it is and downcast appropriately. (I'm thinking of changing AbstractNode to Node and changing the concrete Node to something else, though, because the word Abstract is littering the codebase.) When structs are allowed to inherit from other structs in Rust, then the implementation will become a bit simpler and will use less unsafe code internally.
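To make the "opaque pointer with accessors" idea concrete, here is a minimal sketch in present-day Rust. It is not the code from the "dom" branch: the names `NodeTypeId`, `new_element`, `new_text`, `is_element`, `as_element` and friends are hypothetical, and the layout trick (a `#[repr(C)]` struct whose first field is the shared `Node` header) is an assumption about how struct-based "inheritance" could be emulated. Nodes are deliberately leaked here, standing in for memory that some external collector owns.

```rust
use std::ptr::NonNull;

/// Tag stored in every node header (hypothetical name).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum NodeTypeId {
    Element,
    Text,
}

/// Shared header. Its field is private so only this module can write a tag.
#[repr(C)]
pub struct Node {
    type_id: NodeTypeId,
}

/// Concrete node types: `#[repr(C)]` with the header as the first field,
/// so a pointer to the struct is also a valid pointer to its header.
#[repr(C)]
pub struct Element {
    pub node: Node,
    pub tag_name: String,
}

#[repr(C)]
pub struct Text {
    pub node: Node,
    pub data: String,
}

/// Opaque handle: a pointer to the header of some concrete node.
#[derive(Clone, Copy)]
pub struct AbstractNode {
    ptr: NonNull<Node>,
}

impl AbstractNode {
    pub fn new_element(tag_name: &str) -> AbstractNode {
        let header = Node { type_id: NodeTypeId::Element };
        let boxed = Box::new(Element { node: header, tag_name: tag_name.to_owned() });
        // Leaked on purpose: in this sketch nothing ever frees a node.
        AbstractNode { ptr: NonNull::from(Box::leak(boxed)).cast() }
    }

    pub fn new_text(data: &str) -> AbstractNode {
        let header = Node { type_id: NodeTypeId::Text };
        let boxed = Box::new(Text { node: header, data: data.to_owned() });
        AbstractNode { ptr: NonNull::from(Box::leak(boxed)).cast() }
    }

    /// Query accessor: reads the tag out of the header.
    pub fn type_id(self) -> NodeTypeId {
        // Sound in this sketch because nodes are never freed.
        unsafe { self.ptr.as_ref().type_id }
    }

    pub fn is_element(self) -> bool {
        self.type_id() == NodeTypeId::Element
    }

    pub fn is_text(self) -> bool {
        self.type_id() == NodeTypeId::Text
    }

    /// Checked downcast: only casts when the header tag matches.
    pub fn as_element(&self) -> Option<&Element> {
        if self.is_element() {
            Some(unsafe { self.ptr.cast::<Element>().as_ref() })
        } else {
            None
        }
    }

    pub fn as_text(&self) -> Option<&Text> {
        if self.is_text() {
            Some(unsafe { self.ptr.cast::<Text>().as_ref() })
        } else {
            None
        }
    }
}
```

Callers only ever hold an `AbstractNode`; they ask what kind of node it is and get `None` back from a downcast that does not match, rather than a misinterpreted pointer.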

The advantages of the new implementation over the previous one are (a) several layers of indirection are gone; (b) the copy-on-write interface is dramatically simplified, eliminating the need for answers to annoying questions like how to collect dead handles; (c) DOM nodes now only take up as much memory as they need to; (d) we can make borrowing of nodes sound, because the new interface is amenable to the dynamic checks described in Niko's "Imagine Never Hearing the Phrase 'Aliasable, Mutable' Again" blog post.

This new DOM implementation currently exposes an unsafe interface in a few ways:

(1) When creating a node (converting from the base type to AbstractNode), there is no check performed to ensure that the thing you passed in actually is a Node, and moreover that it is the node that you claimed it was. This can be fixed by providing safe wrappers around node constructors and moving the low-level node constructors into the trusted computing base (hereafter TCB). These constructor primitives are very simple operations, so the TCB should remain easy to audit for security. It can also be fixed to some degree by adding struct inheritance to Rust, as I plan to propose. But note that node construction must remain part of the TCB to some degree, because nodes are owned by the SpiderMonkey garbage collector and not the Rust garbage collector.
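The split between a low-level constructor primitive and safe wrappers could look roughly like the sketch below. Everything named here (`from_raw`, `new_element`, `new_text`, `NodeTypeId`) is hypothetical, and the sketch leaks its nodes with `Box::leak` as a stand-in for handing ownership to an external garbage collector; it does not talk to SpiderMonkey at all.

```rust
use std::ptr::NonNull;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum NodeTypeId {
    Element,
    Text,
}

/// Shared header; the tag is private, so code outside this module cannot
/// forge a header that lies about the node's concrete type.
#[repr(C)]
pub struct Node {
    type_id: NodeTypeId,
}

#[repr(C)]
pub struct Element {
    pub node: Node,
    pub tag_name: String,
}

#[repr(C)]
pub struct Text {
    pub node: Node,
    pub data: String,
}

#[derive(Clone, Copy)]
pub struct AbstractNode {
    ptr: NonNull<Node>,
}

impl AbstractNode {
    /// TCB primitive: performs no checks at all.
    ///
    /// Safety: `ptr` must point to a live `#[repr(C)]` struct whose first
    /// field is a `Node`, and that header's tag must match the struct.
    unsafe fn from_raw(ptr: NonNull<Node>) -> AbstractNode {
        AbstractNode { ptr }
    }

    /// Safe wrapper: it builds the struct and the header itself, so the
    /// precondition of `from_raw` holds by construction.
    pub fn new_element(tag_name: &str) -> AbstractNode {
        let header = Node { type_id: NodeTypeId::Element };
        let boxed = Box::new(Element { node: header, tag_name: tag_name.to_owned() });
        let raw: NonNull<Element> = NonNull::from(Box::leak(boxed));
        unsafe { AbstractNode::from_raw(raw.cast()) }
    }

    pub fn new_text(data: &str) -> AbstractNode {
        let header = Node { type_id: NodeTypeId::Text };
        let boxed = Box::new(Text { node: header, data: data.to_owned() });
        let raw: NonNull<Text> = NonNull::from(Box::leak(boxed));
        unsafe { AbstractNode::from_raw(raw.cast()) }
    }

    pub fn type_id(self) -> NodeTypeId {
        // Sound in this sketch because nodes are never freed.
        unsafe { self.ptr.as_ref().type_id }
    }
}
```

The part that needs auditing is then `from_raw` plus the two wrappers; code outside the module can only reach the wrappers, which cannot mislabel a node.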

(2) Borrowing of nodes (i.e. downcasting from AbstractNode to a concrete Node subclass) currently does not perform the dynamic checks needed for safety. What this means is that it is possible to cause segfaults with certain combinations of mutations and pointer borrowing. The fix will simply be to add these checks. Once these checks are added to the TCB, segfaults should not be possible in the safe Servo code.
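As a rough illustration of what such a dynamic check buys, the sketch below uses `std::cell::RefCell` from today's standard library as a stand-in. That is an assumption made for illustration only: the checks described in this page predate that API and would live inside the DOM's own borrowing code. The `Element` struct, `NodeHandle` alias and `try_append_child` helper are hypothetical.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// A toy node; children are plain strings to keep the sketch small.
pub struct Element {
    pub tag_name: String,
    pub children: Vec<String>,
}

/// Shared handle whose borrows are checked at run time.
pub type NodeHandle = Rc<RefCell<Element>>;

pub fn new_element(tag_name: &str) -> NodeHandle {
    Rc::new(RefCell::new(Element {
        tag_name: tag_name.to_owned(),
        children: Vec::new(),
    }))
}

/// Tries to mutate the node. If someone else currently holds a borrow,
/// the mutation is refused instead of invalidating their pointer.
pub fn try_append_child(node: &NodeHandle, child: &str) -> bool {
    match node.try_borrow_mut() {
        Ok(mut element) => {
            element.children.push(child.to_owned());
            true
        }
        Err(_) => false,
    }
}
```

With a check like this, the "mutation while borrowed" combination turns into a detectable failure:

```rust
let node = new_element("ul");
assert!(try_append_child(&node, "li"));
{
    let reader = node.borrow();              // outstanding shared borrow
    assert!(!try_append_child(&node, "li")); // mutation refused
    assert_eq!(reader.children.len(), 1);
}
assert!(try_append_child(&node, "li"));      // borrow released, mutation allowed
```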

(3) There is nothing currently preventing Rust code in the script task from accessing (and racing on) layout data structures, and layout from accessing dirty nodes. I believe this can be fixed with phantom types: layout will see a type that prevents access to the dirty parts of nodes, and script will see a type that prevents direct access to layout info.
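A minimal sketch of the phantom-type idea, assuming a made-up `NodeData` with one script-only field (`dirty_text`) and one layout-only field (`layout_width`). The marker types `ScriptView` and `LayoutView` and the helper functions are hypothetical names, and the sketch only models the type-level restriction, not the actual cross-task sharing.

```rust
use std::marker::PhantomData;

/// Marker types: never instantiated, they only select which methods exist.
pub struct ScriptView;
pub struct LayoutView;

/// Toy node data with a script-only part and a layout-only part.
pub struct NodeData {
    pub tag_name: String,
    pub dirty_text: String, // may be mid-mutation by script
    pub layout_width: u32,  // owned by layout
}

/// A reference to a node, parameterised by who is looking at it.
pub struct NodeRef<'a, View> {
    data: &'a NodeData,
    _view: PhantomData<View>,
}

pub fn script_view(data: &NodeData) -> NodeRef<'_, ScriptView> {
    NodeRef { data, _view: PhantomData }
}

pub fn layout_view(data: &NodeData) -> NodeRef<'_, LayoutView> {
    NodeRef { data, _view: PhantomData }
}

/// Available to both sides.
impl<'a, View> NodeRef<'a, View> {
    pub fn tag_name(&self) -> &str {
        &self.data.tag_name
    }
}

/// Only script can reach the dirty parts.
impl<'a> NodeRef<'a, ScriptView> {
    pub fn dirty_text(&self) -> &str {
        &self.data.dirty_text
    }
}

/// Only layout can reach layout info.
impl<'a> NodeRef<'a, LayoutView> {
    pub fn layout_width(&self) -> u32 {
        self.data.layout_width
    }
}
```

Calling `layout_view(&data).dirty_text()` or `script_view(&data).layout_width()` fails to compile because those methods do not exist for that view, so the restriction costs nothing at run time.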

I believe there are solutions to each of these problems, and of course fixing them is a high priority for the project. But note that, as I described before, there will always be some unsafe code relating to node memory management as part of the TCB, because SpiderMonkey's garbage collector manages the nodes. The goals are (a) to use as little unsafe code as possible, and, most importantly, (b) to prevent the unsafeness from leaking out into script and layout code.

Finally, note that the copy-on-write scheme is not yet implemented; right now script will just block on layout. Fixing this is a high priority as well.

Relevant source files
