
Transcription Notes from Servo Architecture talk in Suwon

Jack Moffitt edited this page Nov 12, 2013 · 1 revision

Servo

  • j: talk about general servo architecture. i'll talk about a few pieces. will try to go slow
  • j: you know most of our goals. modern browser engine from scratch. when browsers were designed 15 years ago they made a lot of decisions that are no longer relevant. starting from scratch hopefully we can make better decisions, don't have to be burdened by old ones.
  • j: not completely from scratch. most code is new code, but the ideas have been in the heads of gecko engineers for a long time. we try to learn from both gecko and other browser vendors. lots of experimentation.
  • j: a lot of what we're doing here are our current 'best guesses' for how to solve all the design constraints.

Rust

  • j: written in rust to take advantage of increased memory safety

parallelism on the web (slide 6)

  • j: this is an example of things that can be parallelized. lots of content but none of it interacts with each other. looks like there's a lot of parallelism on the web but browsers don't take advantage.

Browser Architecture (slide 7)

  • j: how many of you have done web programming? programmed javascript?

Document Object Model (DOM) (slide 8)

  • j: when doing web programming in js we interact a lot with the 'dom tree'. bunch of html parsed and turned into this data structure. Document object points to first element ('html'). This is a very simple page.
  • j: red blocks are text nodes, blue are elements, green is the document object. each has different methods in js.

Browser Architecture (slide 9)

  • j: we load and parse html, generate dom tree, then calculate where all these pieces end up on screen. we run layout to calculate positions and sizes of everything in the dom tree. then generate a 'display list', a high-level list of graphics commands, send it to the renderer, which executes the commands in the display list to generate an image.
  • j: once it's on the screen we wait for events, which are then processed by the javascript engine, which may manipulate the dom tree, add nodes to tree, change colors.
  • j: once dom is modified we go back around again, calculate layout, generate display list, render, wait for events.
  • j: this cycle needs to be very fast. there are two ways to make this fast: 1) skip steps. e.g. if we make changes to the dom that don't need to display yet we can skip layout. if you make a change that does nothing we won't run layout. 2) make each of these things individually faster. the way we do that is through incremental computation. e.g. change the dom, and instead of recalculating layout for the entire structure we can just recalculate what changed.
  • j: try to do as little work as possible in each step. this is why incremental reflow is important.
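A minimal sketch of the first idea (skipping steps), assuming modern Rust. `Page`, `set_text` and `turn` are hypothetical names for illustration, not Servo code. A DOM change that changes nothing leaves the dirty flag clear, so the next turn of the cycle skips layout and rendering.

```rust
/// Hypothetical page state: one piece of text plus a "needs layout" flag.
struct Page {
    text: String,
    needs_layout: bool,
    layouts_run: u32,
}

impl Page {
    fn new(text: &str) -> Self {
        Page { text: text.to_string(), needs_layout: true, layouts_run: 0 }
    }

    /// Script side: a DOM change marks layout dirty only if something changed.
    fn set_text(&mut self, new_text: &str) {
        if self.text != new_text {
            self.text = new_text.to_string();
            self.needs_layout = true;
        }
    }

    /// One turn of the cycle. Returns a display list, or None when layout
    /// and rendering were skipped because nothing changed.
    fn turn(&mut self) -> Option<Vec<String>> {
        if !self.needs_layout {
            return None;
        }
        self.needs_layout = false;
        self.layouts_run += 1;
        Some(vec![format!("DrawText({})", self.text)])
    }
}
```

Incremental reflow goes further than this sketch: a real engine would track dirtiness per node rather than per page.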

Single Threaded (slide 10)

  • j: problem is this is a single-threaded set of steps. all browsers run the script, calculate layout, and back again. always in a single thread of execution.

Script & Layout (slide 11)

  • j: one way to make this faster is to do both of these things at the same time. this is a key idea in parallelism.

Concurrent Script & Layout (slide 12)

  • j: run javascript, make changes to dom, start layout and immediately begin running script again. hopefully everything can be twice as fast. unfortunately there are javascript functions that return layout information. e.g. getBoundingClientRect needs to wait until layout is calculated before returning.
  • j: in other cases they can run in parallel at the same time. currently in servo they are not running at the same time because the COW DOM is not finished
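A rough sketch of this arrangement using standard-library threads and channels in modern Rust. The message protocol (`LayoutMsg`) and the 800-pixel clamp are assumptions made up for the example and do not reflect Servo's real interfaces. Script sends a reflow request and keeps working; only a layout query makes it block on a reply.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical messages from the script side to the layout thread.
enum LayoutMsg {
    /// Start layout; script does not wait for it.
    Reflow { content_width: u32 },
    /// Ask for a layout result; script blocks on the reply channel.
    QueryWidth(mpsc::Sender<u32>),
    Exit,
}

fn run_script_and_layout() -> (u32, u32) {
    let (to_layout, from_script) = mpsc::channel::<LayoutMsg>();

    let layout = thread::spawn(move || {
        let mut width = 0;
        for msg in from_script {
            match msg {
                // Pretend layout clamps content to an 800px viewport.
                LayoutMsg::Reflow { content_width } => width = content_width.min(800),
                LayoutMsg::QueryWidth(reply) => {
                    let _ = reply.send(width);
                }
                LayoutMsg::Exit => break,
            }
        }
    });

    // Script kicks off layout and immediately continues with its own work.
    to_layout.send(LayoutMsg::Reflow { content_width: 1000 }).unwrap();
    let script_work: u32 = (1..=10).sum();

    // A getBoundingClientRect-style call: script must wait for layout here.
    let (reply_tx, reply_rx) = mpsc::channel();
    to_layout.send(LayoutMsg::QueryWidth(reply_tx)).unwrap();
    let width = reply_rx.recv().unwrap();

    to_layout.send(LayoutMsg::Exit).unwrap();
    layout.join().unwrap();
    (script_work, width)
}
```

Messages from a single sender arrive in order, so the query is always answered after the reflow it follows.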

The Problem (slide 14)

  • j: We need the DOM tree to do layout calculations, figure sizes, colors, etc. but also need the dom on the script side because the user code is changing all the time. so DOM needs to be shared. any synchronization we add to make DOM available in multiple threads is perf we lose to chromium. so we can't use locks or other typical structures because they are too slow. it's not that they are objectively slow, but compared to no synchronization they are.

Copy-On-Write DOM (slide 15)

  • j: COW DOM: layout always looks at the blue nodes. this data is immutable. when script makes a change to a node - here changing 'a' to color red - we copy all the node data. script can see the red modified nodes, but layout continues to see the old versions.

Copy-On-Write DOM (slide 16)

  • j: if we continue to make changes we do more copies, updating dirty pointers

Copy-On-Write (slide 18)

  • j: when layout is finished we can go back through and clean up: copy all the dirty script data pointers back to the clean pointers, since layout is no longer using the old ones. next time around, layout will see these new pointers and script will start making new pointers.
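The copy-on-write scheme from these three slides can be sketched in a few lines of modern Rust. This is a single-threaded illustration under assumed names (`CowNode`, `clean`, `dirty`); the real design shares nodes across tasks and works with pointers rather than owned values.

```rust
#[derive(Clone, Debug, PartialEq)]
struct NodeData {
    color: String,
}

/// Hypothetical node with a clean copy (layout's view) and an optional
/// dirty copy (script's view), mirroring the blue and red nodes on the slides.
struct CowNode {
    clean: NodeData,
    dirty: Option<NodeData>,
}

impl CowNode {
    fn new(color: &str) -> Self {
        CowNode { clean: NodeData { color: color.to_string() }, dirty: None }
    }

    /// Layout always reads the clean, unchanging data.
    fn layout_view(&self) -> &NodeData {
        &self.clean
    }

    /// Script sees its own modifications if it has made any.
    fn script_view(&self) -> &NodeData {
        self.dirty.as_ref().unwrap_or(&self.clean)
    }

    /// The first write copies the node data; later writes reuse the copy.
    fn script_set_color(&mut self, color: &str) {
        let clean = &self.clean;
        let data = self.dirty.get_or_insert_with(|| clean.clone());
        data.color = color.to_string();
    }

    /// Once layout is finished, the dirty data becomes the new clean data.
    fn flush_after_layout(&mut self) {
        if let Some(data) = self.dirty.take() {
            self.clean = data;
        }
    }
}
```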

C.O.W. DOM Safety (slide 19)

  • j: use phantom types to enforce type safety: layout and script see different versions of these data structures. have to be sure we don't let one of these pointers escape: don't use the layout view in the script task, or the script view in the layout task.
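A small illustration of the phantom-type idea in modern Rust, using `std::marker::PhantomData`. The names `ScriptView`, `LayoutView` and `NodeHandle` are assumptions for this sketch and may not match Servo's types. The marker parameter costs nothing at runtime but lets the compiler reject a layout handle used where a script handle is required.

```rust
use std::marker::PhantomData;

/// Marker types: never instantiated, they only tag which side owns a handle.
struct ScriptView;
struct LayoutView;

struct NodeHandle<View> {
    id: usize,
    _view: PhantomData<View>,
}

impl<View> NodeHandle<View> {
    fn new(id: usize) -> Self {
        NodeHandle { id, _view: PhantomData }
    }
}

/// Only script-side handles may mutate node data.
impl NodeHandle<ScriptView> {
    fn set_color(&self, colors: &mut [String], color: &str) {
        colors[self.id] = color.to_string();
    }
}

/// Layout-side handles are read-only.
impl NodeHandle<LayoutView> {
    fn color<'a>(&self, colors: &'a [String]) -> &'a str {
        &colors[self.id]
    }
}

// Calling `set_color` on a `NodeHandle<LayoutView>` does not compile,
// because that method exists only for `NodeHandle<ScriptView>`.
```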

Flow Tree (slide 21)

  • j: same dom tree here. when doing layout we don't use the dom tree itself. we make a data structure called the 'flow tree'. A flow is 'how something should be laid out'. block flow is things on top of each other. inline flow is wrapped content like text. there are others but these are the main ones. for each node in the tree we walk from top to bottom. as walking the tree we build flows.
  • j: it's similar to dom but for a different purpose. notice that even though there are multiple inline dom nodes they create only a single inline flow. blocks nest, but inlines don't
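A simplified sketch of flow construction in modern Rust. The `Dom` and `Flow` enums are hypothetical and far smaller than real DOM or flow types, and the sketch assumes blocks never appear inside inline elements (if one does, its text is simply flattened into the inline run). It shows the two properties mentioned above: block flows nest, and adjacent inline nodes collapse into one inline flow.

```rust
/// Hypothetical, heavily simplified DOM.
enum Dom {
    Block(Vec<Dom>),
    Inline(Vec<Dom>),
    Text(String),
}

#[derive(Debug, PartialEq)]
enum Flow {
    Block(Vec<Flow>),
    /// One inline flow holding every piece of text in the run.
    Inline(Vec<String>),
}

/// Gather all text under an inline subtree into the current run.
fn collect_text(node: &Dom, out: &mut Vec<String>) {
    match node {
        Dom::Text(t) => out.push(t.clone()),
        Dom::Inline(kids) | Dom::Block(kids) => {
            for kid in kids {
                collect_text(kid, out);
            }
        }
    }
}

/// Walk the children top to bottom, building a block flow for them.
fn build_flow(kids: &[Dom]) -> Flow {
    let mut flows = Vec::new();
    let mut run: Vec<String> = Vec::new();
    for kid in kids {
        match kid {
            Dom::Block(grandkids) => {
                // A block ends the current inline run and nests a new block flow.
                if !run.is_empty() {
                    flows.push(Flow::Inline(std::mem::take(&mut run)));
                }
                flows.push(build_flow(grandkids));
            }
            inline_or_text => collect_text(inline_or_text, &mut run),
        }
    }
    if !run.is_empty() {
        flows.push(Flow::Inline(run));
    }
    Flow::Block(flows)
}
```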

Flow Tree Boxes (slide 22)

  • j: flow trees have render boxes (red squares). block flow has single render box, tracks borders, margin, padding, etc. inline flow has vector of render boxes, one for each child, so all the text will end up in an inline flow.
  • j: key thing: flow tree on left here - once we create one it's immutable, render boxes though may change. reason: if we have a lot of text in an inline flow we have to do line breaking: measure text, break it into lines. after line breaking we may have different render boxes. right hand side changes - left hand doesn't.
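To make the line-breaking point concrete, here is a toy sketch in modern Rust. `InlineFlow` and `break_lines` are hypothetical names, text is "measured" by counting bytes instead of using font metrics, and the algorithm is a plain greedy fill. The flow itself stays the same object; only its vector of boxes is replaced.

```rust
/// Hypothetical inline flow: its identity is stable, its boxes are not.
struct InlineFlow {
    boxes: Vec<String>,
}

impl InlineFlow {
    /// Greedy line breaking: replaces the boxes with one box per line.
    /// `max_chars` stands in for the available width.
    fn break_lines(&mut self, max_chars: usize) {
        let mut lines: Vec<String> = Vec::new();
        for word in self.boxes.iter().flat_map(|b| b.split_whitespace()) {
            let fits = lines
                .last()
                .map_or(false, |line| line.len() + 1 + word.len() <= max_chars);
            if fits {
                let line = lines.last_mut().unwrap();
                line.push(' ');
                line.push_str(word);
            } else {
                // Start a new line (a word longer than the limit gets its own line).
                lines.push(word.to_string());
            }
        }
        self.boxes = lines;
    }
}
```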

Calculating Layout (slide 23)

  • j: here's another flow tree. when calculate layout go through flow tree calculate sizes and positions of all render boxes. generate display list. in gecko and chromium, they have a similar structure (frame tree and render tree), they walk tree to calculate layout with a virtual method called 'layout' or 'reflow'. it's a virtual method call that recursively calls on its children. problem is that there's no restriction on what these calls can do; if you want to calculate info and need info from a node in a different part of the tree you can just grab it. because everything is possible in gecko layout, everything happens; it's a big problem for parallelization because the pattern used to access memory is uncontrolled. very hard to parallelize layout in other browsers.
  • j: in servo we have the same structure but the way we calculate layout involves predictable memory access. type system in Rust makes sure that we use layout info only from children (never parents or siblings). in cases where that's true it can be parallelized.

slide 24

  • j: three calculation passes for layout. 1) bottom to top (bubble widths) calculate how wide things should be. 2) top to bottom (assign widths), 3) bottom to top (assign height).
  • j: if you know you're not going to access data from all over tree these are easy to parallelize
  • j: in top-down you can spawn a task per child, bottom up you can spawn tasks for the bottom row, then a row at a time. very easy to parallelize
  • j: this is implemented except we're not doing the passes in parallel yet.
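The three passes can be sketched as follows in modern Rust. Everything here is an assumption for illustration: the `FlowNode` fields, the width rule (a child gets its preferred width, capped by its parent's width) and the height rule (children stack vertically). The top-down pass spawns one scoped thread per child via `std::thread::scope` to show why restricted data access makes parallelism easy; a real engine would use a work queue rather than a thread per node.

```rust
use std::thread;

struct FlowNode {
    intrinsic_width: u32,
    intrinsic_height: u32,
    children: Vec<FlowNode>,
    preferred_width: u32,
    width: u32,
    height: u32,
}

impl FlowNode {
    fn leaf(w: u32, h: u32) -> Self {
        FlowNode {
            intrinsic_width: w,
            intrinsic_height: h,
            children: Vec::new(),
            preferred_width: 0,
            width: 0,
            height: 0,
        }
    }

    fn block(children: Vec<FlowNode>) -> Self {
        FlowNode { children, ..FlowNode::leaf(0, 0) }
    }
}

/// Pass 1, bottom to top: each node reads only its children.
fn bubble_widths(n: &mut FlowNode) {
    for c in &mut n.children {
        bubble_widths(c);
    }
    let widest_child = n.children.iter().map(|c| c.preferred_width).max().unwrap_or(0);
    n.preferred_width = n.intrinsic_width.max(widest_child);
}

/// Pass 2, top to bottom: the parent hands its width down, one task per child.
fn assign_widths(n: &mut FlowNode, available: u32) {
    n.width = n.preferred_width.min(available);
    let w = n.width;
    thread::scope(|s| {
        for c in &mut n.children {
            s.spawn(move || assign_widths(c, w));
        }
    });
}

/// Pass 3, bottom to top: height is own content plus stacked children.
fn assign_heights(n: &mut FlowNode) {
    for c in &mut n.children {
        assign_heights(c);
    }
    n.height = n.intrinsic_height + n.children.iter().map(|c| c.height).sum::<u32>();
}

fn layout(root: &mut FlowNode, viewport_width: u32) {
    bubble_widths(root);
    assign_widths(root, viewport_width);
    assign_heights(root);
}
```

Because each `&mut FlowNode` child is handed to exactly one thread, the borrow checker itself rules out a task reaching into a sibling or parent.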

JAVASCRIPT

  • j: js is a big component of servo, also biggest piece we don't write. we use spidermonkey, same as gecko. it's a bunch of C++, need to use it from Rust. I'll talk about how we interact with DOM.

DOM In JS (slide 27)

  • j: js is the way developers interact with the DOM. also need the DOM in Rust because layout must be really fast and to do that we can't be walking js objects.

DOM Implementation (slide 28)

  • j: can't just implement it in JS or just in Rust. DOM APIs in standards like HTML5 are defined in WebIDL. we create the DOM structures in Rust code and use a generator that uses WebIDL to create Javascript wrappers that call into the Rust DOM implementation.

DOM Example (slide 29)

  • j: example: document.createElement. this is javascript code.

WEBIDL (slide 30)

  • j: this is the WebIDL for this DOM API. interface for 'document', inherits from 'node'. interesting thing is def of 'createElement'. code generator uses these attributes to generate Rust.

CODEGEN Wrapper (slide 31)

  • j: code generator generates this code. takes all of these unsafe pointers. code is ugly because it's auto-generated, but somewhere in this code it gets a pointer to the Rust object and calls 'CreateElement'.

Rust Implementation (slide 32)

  • j: this is the 'CreateElement' implementation, written by a human.
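The slides themselves are not reproduced in these notes, so the following is only a guess at the general shape of the wrapper/implementation split, written in modern Rust. `JsValue`, `create_element_wrapper` and the `Document` fields are hypothetical stand-ins; none of this is SpiderMonkey's API or Servo's generated code. The wrapper receives an untyped pointer and untyped arguments, converts them, and calls the hand-written method.

```rust
use std::ffi::c_void;

/// Hypothetical stand-in for a JS value; not the SpiderMonkey type.
#[derive(Debug, PartialEq)]
enum JsValue {
    Str(String),
    Undefined,
}

struct Document {
    created: Vec<String>,
}

impl Document {
    /// Hand-written implementation, named after the WebIDL operation.
    #[allow(non_snake_case)]
    fn CreateElement(&mut self, local_name: &str) -> String {
        self.created.push(local_name.to_string());
        local_name.to_string()
    }
}

/// Roughly what a generated wrapper has to do: recover the Rust object
/// from an untyped pointer, convert the arguments, and call the implementation.
///
/// Safety: `this` must point to a live `Document` with no other active borrows.
unsafe fn create_element_wrapper(
    this: *mut c_void,
    args: &[JsValue],
) -> Result<JsValue, String> {
    let doc = unsafe { &mut *(this as *mut Document) };
    match args.first() {
        Some(JsValue::Str(name)) => Ok(JsValue::Str(doc.CreateElement(name))),
        _ => Err("createElement: expected a string argument".to_string()),
    }
}
```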

DOM Memory (Current) (slide 33)

  • j: this is how memory is structured. JS Object contains a pointer to the Rust implementation of the actual object. for Document there's a Rust structure called 'Document' with the data to implement these APIs.
  • j: it's expensive to create these wrappers. when we create dom nodes we also create the js wrappers. this is not ideal because most nodes won't be used from js. wastes a lot of memory. gecko creates the wrappers lazily.
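The cost difference between eager and lazy wrappers can be shown with a toy model in modern Rust. `JsWrapper` and `Node` are hypothetical; a real wrapper is a GC-managed JS object, not a plain struct.

```rust
/// Hypothetical stand-in for the JS wrapper object.
struct JsWrapper {
    node_id: usize,
}

struct Node {
    id: usize,
    wrapper: Option<JsWrapper>,
}

impl Node {
    /// Eager: the wrapper is allocated together with the node.
    fn new_eager(id: usize) -> Self {
        Node { id, wrapper: Some(JsWrapper { node_id: id }) }
    }

    /// Lazy: no wrapper exists until script first touches the node.
    fn new_lazy(id: usize) -> Self {
        Node { id, wrapper: None }
    }

    /// Returns the wrapper, creating it on first use.
    fn wrapper(&mut self) -> &JsWrapper {
        let id = self.id;
        self.wrapper.get_or_insert_with(|| JsWrapper { node_id: id })
    }
}

fn wrappers_allocated(nodes: &[Node]) -> usize {
    nodes.iter().filter(|n| n.wrapper.is_some()).count()
}
```

With 100 nodes of which script touches one, the eager scheme allocates 100 wrappers and the lazy scheme allocates one.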

DOM Memory (Future) (slide 34)

  • j: we have a different idea. want to 'fuse' the js and rust objects in memory. the rust structure lives inside the js object. these js objects allocate extra pointers to store hot attributes and such. so we're going to put as much of the node data structure inside the js object. we'll still create them eagerly but it will be a single allocation
  • pcwalton: chromium is experimenting with this in their 'oilpan' project. they are doing it for Dart.
  • j: josh has a branch that does this. seems to work, don't know the status. this is where we're going

DOM Memory Management (slide 35)

  • j: in gecko js objects are GC'd by js, and C++ objects are reference counted. if a GC object points to a C++ object and vice-versa, it creates a cycle between the two systems. in gecko they have a cycle collector that tries to detect cycles between JS and C++. it's very complicated so we don't want to do it. in Rust we want all DOM memory to be owned by the JS GC. way this works is JS has trace hooks: the GC calls a function in Rust code which then enumerates what other structures the object is pointing to.
  • j: eliminates cycle collector from browser architecture. should be simpler and safer. nice also because JS GC is very good.
  • pcwalton: cycle collector causes perf problems - GC pauses + cycle collector pauses. lots of work to make them talk together. webkit and blink don't have a cycle collector but their approach doesn't work well, results in vulnerabilities, hard to understand.
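A sketch of the trace-hook idea in modern Rust, with the GC reduced to a mark phase over an index-based heap. The `Traceable` trait, `Element` and `mark` are assumptions for illustration and do not reflect SpiderMonkey's tracing API. Each object reports what it points to, and the collector follows those edges from the roots, so cycles need no special handling.

```rust
use std::collections::HashSet;

type ObjId = usize;

/// Hypothetical trace hook: the object enumerates everything it points to.
trait Traceable {
    fn trace(&self, visit: &mut dyn FnMut(ObjId));
}

struct Element {
    parent: Option<ObjId>,
    children: Vec<ObjId>,
}

impl Traceable for Element {
    fn trace(&self, visit: &mut dyn FnMut(ObjId)) {
        if let Some(parent) = self.parent {
            visit(parent);
        }
        for &child in &self.children {
            visit(child);
        }
    }
}

/// Mark phase: everything reachable from the roots is live.
fn mark(heap: &[Element], roots: &[ObjId]) -> HashSet<ObjId> {
    let mut marked = HashSet::new();
    let mut stack: Vec<ObjId> = roots.to_vec();
    while let Some(id) = stack.pop() {
        // `insert` returns false for already-marked objects, which ends cycles.
        if marked.insert(id) {
            heap[id].trace(&mut |child| stack.push(child));
        }
    }
    marked
}
```

In the test heap, objects 0 and 1 point at each other and are reachable, while 2 and 3 point at each other and are not, so the second cycle is simply never marked.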

Parallel iframes (slide 36) & Same-Origin iframes (slide 37)

  • j: first feature we've added that others don't have: parallel iframes. iframes are like putting one page inside another. outside frame here is one web page, other two are other pages on same site (a.com). this is called 'same origin' because they are on the same site.
  • j: when you do iframes like this they have the same dom, all pages can communicate. when this happens we have to run all scripts in the same task. problem with this - if on the left side you're running some code that takes a long time and on the right side you're running code, then only one can be running at a time.

Cross-Origin iframes (slide 38)

  • j: in 'cross-origin' case where two iframes come from different domains, then they can't talk to each other - they are 'sandboxed', isolated. because they are isolated they can run in parallel. they run in different js engines and at the same time. we run both iframes in parallel. no other browser does this, but we do.
  • j: there's an attribute in HTML for iframes called 'sandbox' for same-origin iframes. if you do this then the iframes can't communicate. sandboxed same-origin iframes are like cross-origin iframes and can be run in parallel.
  • j: (chrome sandbox iframe demo)
  • j: in servo the multiplication is happening right now but the animation doesn't stop. first example of a new feature that we're able to introduce to the web.
  • j: other thing we have. iframe on left and iframe on right. the link on the right goes to a page that crashes. on servo we can not only detect the crash but limit it to the iframe. when the iframe crashes the other iframe keeps running.
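A toy model of the scheduling rule and the crash isolation described above, in modern Rust with standard-library threads. `Frame`, `assign_tasks` and `run_tasks` are hypothetical, and a "crash" is modelled as a thread panic. Same-origin frames share one script task, while sandboxed or cross-origin frames get their own.

```rust
use std::collections::BTreeMap;
use std::thread;

struct Frame {
    origin: &'static str,
    sandboxed: bool,
}

/// Group frame indices into script tasks: sandboxed frames run alone,
/// all other frames share a task with every frame of the same origin.
fn assign_tasks(frames: &[Frame]) -> Vec<Vec<usize>> {
    let mut shared: BTreeMap<&str, Vec<usize>> = BTreeMap::new();
    let mut tasks = Vec::new();
    for (i, frame) in frames.iter().enumerate() {
        if frame.sandboxed {
            tasks.push(vec![i]);
        } else {
            shared.entry(frame.origin).or_default().push(i);
        }
    }
    tasks.extend(shared.into_values());
    tasks
}

/// Run each task on its own thread. Returns, per task, whether it survived.
/// `crashing` names a frame whose script panics.
fn run_tasks(tasks: Vec<Vec<usize>>, crashing: Option<usize>) -> Vec<bool> {
    let handles: Vec<_> = tasks
        .into_iter()
        .map(|task| {
            thread::spawn(move || {
                for frame in task {
                    if Some(frame) == crashing {
                        panic!("frame {} crashed", frame);
                    }
                }
            })
        })
        .collect();
    // A panic in one thread surfaces as Err from join; the others are unaffected.
    handles.into_iter().map(|h| h.join().is_ok()).collect()
}
```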

Partial Layout (slide 40)

  • j: partial layout. no browser has this. some on chrome are talking about adding it, so we may want to add it to servo.

Partial Layout (slide 41)

  • j: say script changes the font size of the 'a' node and then asks 'where is it on screen?'. normally it has to wait for the full layout.
  • j: in the case where we ask for something high in the tree, we don't need to calculate the whole tree, just enough to locate the 'a' node: partial layout

Normal Layout (serial) (slide 42)

  • j: normal layout runs script, layout, more layout, then returns result back to script

Partial Layout (Serial) (slide 44)

  • j: in partial layout we short-circuit layout and return the result immediately. in gecko and chrome they have a problem implementing this.... to implement this in gecko and chrome you have to make it so that doing less work is worth the perf cost of doing some of the work twice.

Partial Layout (Concurrent) (slide 45-46)

  • j: in servo however layout runs in parallel, so layout can just send a message as soon as we have the result, then we can continue running layout. other browsers have to skip that calculation and come back to it later. so when we want to draw to screen we don't have to run layout again.
  • j: we don't have a perf cost for this optimization. we can do it without the tradeoff that others do. even if this wasn't faster, the code is very clear. there's a benefit to structuring the code like this - send a message once we have the result and keep going.
  • j: very complex optimization for traditional browsers. very simple for us.
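A sketch of the "send a message once we have the result and keep going" structure, in modern Rust with a standard-library thread and channel. `layout_with_early_reply` is a hypothetical function and layout is reduced to stacking boxes vertically. The layout thread answers the query as soon as the queried box is positioned, then finishes the rest of the tree without being restarted.

```rust
use std::sync::mpsc;
use std::thread;

/// Lays out boxes of the given heights top to bottom on a layout thread.
/// Returns the early answer for `queried` and the full list of positions.
/// Panics if `queried` is out of range, since no reply is ever sent.
fn layout_with_early_reply(heights: Vec<u32>, queried: usize) -> (u32, Vec<u32>) {
    let (reply_tx, reply_rx) = mpsc::channel();

    let layout = thread::spawn(move || {
        let mut y = 0;
        let mut positions = Vec::with_capacity(heights.len());
        for (i, h) in heights.iter().enumerate() {
            positions.push(y);
            if i == queried {
                // Answer the script's question now, then keep laying out.
                let _ = reply_tx.send(y);
            }
            y += h;
        }
        positions
    });

    // Script unblocks here, possibly long before layout is done.
    let answer = reply_rx.recv().expect("queried node must exist");
    // Later, when it is time to draw, the full layout is already available.
    let all_positions = layout.join().unwrap();
    (answer, all_positions)
}
```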