Workweek graphics stack

SDL + graphics stack real talk

jack: should we ditch glut/glfw for egl, sdl, or something of its nature? also, should we implement bas' idea of a new API for browsers?
pcwalton: no thoughts beyond the mailing thread.
jack: context is samsung wants to ditch azure dependency, and bas discussed this new API built directly on openGL and would be really fast.
zwarich: until you add svg.
jack: problem with gecko is that it draws so many different things, it's a multi-year project. we could bring up a new graphics library without too much extra effort if it's small.
pcwalton: worried about scope creep. if someone on the graphics team wants to do it, adding it for servo would be really nice. not only because we don't draw as much, but also because we do display lists per node. gecko builds a display list, throws it away, lots of recursive calls that don't relate to dom tree - for parallelism, servo's way is lots of display items that are merged as we progress up the tree. in theory we could retain display list items from layout to layout. bas was saying that retained items can hold on to the gpu items they correspond to, which is very nice. text could retain cached texture atlas for glyphs for fonts, could have items retain gpu state. when coupled with incremental reflow, can know when font is no longer needed and can trash gpu resource. with imm. mode api like skiagl, can't do that and need cache of most recently used fonts. with retained mode api like bas' idea (and cached display items), we could have more precise information about what items need to be stored on gpu. much easier to do on servo. blink is thinking about doing it with skiapicture-per-element due to stacking contexts.
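
A minimal sketch of the retained-resource idea above, assuming a hypothetical `GlyphAtlas` type standing in for a GPU texture and a hypothetical `AtlasCache`. None of these names come from Servo, Azure, or bas' proposal. Each retained text item holds a reference-counted handle to its font's atlas, so the atlas is freed exactly when the last item using that font goes away, with no most-recently-used heuristic.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::{Rc, Weak};

/// Hypothetical GPU-side glyph atlas for one font.
struct GlyphAtlas {
    font: String,
    /// Shared log so the example can observe when the atlas is freed.
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for GlyphAtlas {
    fn drop(&mut self) {
        // A real implementation would delete the GPU texture here.
        self.log.borrow_mut().push(format!("freed atlas for {}", self.font));
    }
}

/// A retained text display item; it keeps its font's atlas alive.
struct TextItem {
    text: String,
    atlas: Rc<GlyphAtlas>,
}

/// Hands out atlases, holding only weak references itself.
struct AtlasCache {
    atlases: HashMap<String, Weak<GlyphAtlas>>,
    log: Rc<RefCell<Vec<String>>>,
}

impl AtlasCache {
    fn new() -> Self {
        AtlasCache {
            atlases: HashMap::new(),
            log: Rc::new(RefCell::new(Vec::new())),
        }
    }

    /// Build a text item, reusing the font's atlas if any item still holds it.
    fn text_item(&mut self, font: &str, text: &str) -> TextItem {
        let existing = self.atlases.get(font).and_then(|weak| weak.upgrade());
        let atlas = match existing {
            Some(atlas) => atlas,
            None => {
                let atlas = Rc::new(GlyphAtlas {
                    font: font.to_string(),
                    log: Rc::clone(&self.log),
                });
                self.atlases.insert(font.to_string(), Rc::downgrade(&atlas));
                atlas
            }
        };
        TextItem { text: text.to_string(), atlas }
    }
}
```

Dropping the display items that incremental reflow no longer needs is then enough to release the matching GPU resource; an immediate mode API never sees the items, so it cannot make that decision.
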
jack: we have really heavyweight library we're not using much of, designed around single-threaded rendering which already causes problems for us.
pcwalton: would like to switch to cpu rendering by default. good PR, download, install, be happy.
zwarich: big problem of gpu-backed 2d renderer is all bugs for glteximage2d - typical text rendering is mosaic of chunks of larger texture. right way to do it is uniform grid of 32x32. for each glyph, if it's over the box size, split it up. if you implement it that way using teximage2d, performance problems on mobile. none of the extensions to make it good are standardized. interested in what azure or skia are doing on linux.
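
The uniform-grid scheme described above can be sketched as follows. This is an illustration only: `GlyphChunk` and `split_glyph` are hypothetical names, and the sketch covers just the splitting step, not the texture upload that causes the performance problems on mobile.

```rust
/// Side length of one atlas cell, per the 32x32 grid mentioned above.
const CELL: u32 = 32;

/// One piece of a glyph bitmap, small enough to fit in a single atlas cell.
#[derive(Debug, PartialEq)]
struct GlyphChunk {
    /// Offset of this chunk inside the glyph bitmap.
    x: u32,
    y: u32,
    /// Size of this chunk; at most CELL in each dimension.
    width: u32,
    height: u32,
}

/// Split a glyph of the given size into chunks no larger than CELL x CELL.
/// A glyph that already fits in one cell yields a single chunk.
fn split_glyph(width: u32, height: u32) -> Vec<GlyphChunk> {
    let mut chunks = Vec::new();
    let mut y = 0;
    while y < height {
        let chunk_height = CELL.min(height - y);
        let mut x = 0;
        while x < width {
            let chunk_width = CELL.min(width - x);
            chunks.push(GlyphChunk { x, y, width: chunk_width, height: chunk_height });
            x += CELL;
        }
        y += CELL;
    }
    chunks
}
```

For example, `split_glyph(40, 70)` yields six chunks (two columns by three rows), while `split_glyph(20, 30)` yields one.
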
SimonSapin: There was an article about how Android's font renderer does GPU rendering: https://medium.com/on-coding/androids-font-renderer-c368bbde87d9
jack: remember, samsung driving this discussion talking about deploying on specialized hardware under their control.
zwarich: at a hardware level, this problem shouldn't exist. (something about cache flushing having no impact). most drivers try to get state back into state for immediate rendering, will flush cache on every texture render(?). purely problem caused by opengl's interface. might be better if can fill up a texture and not use teximage2d, need fallback path to do what current renderers are doing. don't know how to do that in a platform-independent way.
pcwalton: on android there's surfacetexture, but there's also android fragmentation. these are arguments against gpu in general. lots of gfx team gung-ho about gpu rendering. we should do cpu-rendering by default. if you have control over the hardware you're in a better position. orthogonal to creating a new graphics API - don't feel we have the resources to do this. I would strongly push the gfx team to experiment with servo first. could write the library in rust! question about canvas and SVG - canvas needs to be immediate mode API regardless, since the API is immediate mode, so you need to use azure or something.
zwarich: tangent was just one aspect of writing graphics library. just that one problem is a hard one. writing a new (??) wouldn't be too terrible, and would be faster than what we're currently doing.
pcwalton: could switch to cpu rendering and use all cores.
brson: preference for distributing work for painting?
pcwalton: spawn a task per tile. could cache tasks, but surprising if it matters.
jack: send message with index of tile and pointer to display list?
pcwalton: yes
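
A rough sketch of the task-per-tile scheme just discussed, written against today's standard library (`std::thread`, `std::sync::mpsc`, `Arc`) rather than the task API Servo had at the time. `DisplayList`, `TileResult`, and `paint_tiles` are hypothetical names, and the "painting" is a placeholder. Each worker receives only a tile index and a shared pointer to the display list.

```rust
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a display list: just a list of item names.
struct DisplayList {
    items: Vec<String>,
}

/// What a tile worker reports back when it is done.
#[derive(Debug)]
struct TileResult {
    tile_index: usize,
    items_painted: usize,
}

/// Spawn one worker per tile; each gets its tile index and a shared
/// pointer to the same display list (cloning the Arc copies no items).
fn paint_tiles(display_list: Arc<DisplayList>, tile_count: usize) -> Vec<TileResult> {
    let (sender, receiver) = mpsc::channel();
    let mut handles = Vec::new();
    for tile_index in 0..tile_count {
        let sender = sender.clone();
        let list = Arc::clone(&display_list);
        handles.push(thread::spawn(move || {
            // Placeholder: a real painter would rasterize only the items
            // that intersect this tile.
            let items_painted = list.items.len();
            sender.send(TileResult { tile_index, items_painted }).unwrap();
        }));
    }
    // Drop the original sender so the receiver ends once all workers finish.
    drop(sender);
    let mut results: Vec<TileResult> = receiver.iter().collect();
    for handle in handles {
        handle.join().unwrap();
    }
    results.sort_by_key(|result| result.tile_index);
    results
}
```
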
zwarich: if running on a GPU, already doing heavy-weight synchronization. spawning tasks shouldn't be comparable.
pcwalton: (gfx technobabble)
zwarich: on many platforms doesn't make sense to render on multiple threads since just takes locks in the driver.
pcwalton: bas' problem with the atlas is due to immediate mode API, don't know whether to get rid of it, so use heuristics. whereas we know what's on the page, but can't tell skia about it. we can do better if we hold on to gpu resources via retained api.
gw: with some knowledge don't have to build atlas per thread. you can share them.
kmc: save on gpu memory
gw: when rendering on 1080p will be blown away by gpu
kmc: re: gpu scheduling being disaster, could work with vendors and talk about using servo as good demo for improvements.
pcwalton: I want to be able to reserve cores.
kmc: talk to amd and nvidia about this, probably have contacts.
pcwalton: should try coregraphics accelerated on mac, should be better than skia. apparently safari is shipping that today.
mbrubeck: firefox uses azure with direct2d/3d backends on windows. there's a reason we ended up with azure.
jack: andreas thinks we should use ANGLE. we need it for webgl eventually anyway.

Graphics stack (several days later, with special guest Martin Robinson)

jack: You're working on overflow next: how is that going?
martin: While working on test cases for overflow, I found compositor and flowtree bugs. The two I was looking at this week are just related to reflow. I fixed those, and the test case I made (third bug) is the overflow bug. Once I figure that out (which is probably another reflow bug), I will start looking at displayport and then culling the display lists. This has been good because it helped me figure out how reflow works and how absolutely positioned elements work.
pcwalton: I've added some code. Hopefully will land incremental reflow this week. Also added the beginning of a display list optimization pass. Although, your work will be on a different level because there's display list optimization on it, but also display port optimization that avoids creating the display list items in the first place.
martin: Now for fixed pos elements, it looks like overflow is calculated as if the display port is at 0,0. Does that seem correct?
pcwalton: For fixed position elements, they get a separate layer, so you can use 0,0 and it should be fine... maybe check via mail to the mailing list on what to do there so we can ask roc and/or bz? I don't know what the right thing to do with position:fixed is off the top of my head.
pcwalton: I think you're going to always want to render all position:fixed things and never cull them out. So, how do we know to descend into elements that contain fixed position things? Maybe we should just have a list of them on the root, since that's their containing block anyway, and always go into them unconditionally. You want to descend into containing blocks that intersect with the viewport. In this case, it's the root, so you'll always want to.
martin: In addition to abs_descendents...
pcwalton: Yes, fixed_descendents
martin: The bug was with fixed & abs positioned. When static, they're still relative to their containing block. The idea you proposed sounds good for display lists creation.
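
The culling rule proposed above can be sketched like this, under simplifying assumptions: a flat list of items instead of a flow tree, integer rectangles, and a `fixed` flag in place of a separate list of fixed descendants on the root. `Rect`, `DisplayItem`, and `cull` are hypothetical names, not Servo's types.

```rust
/// Axis-aligned rectangle in page coordinates.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rect {
    x: i32,
    y: i32,
    width: i32,
    height: i32,
}

impl Rect {
    /// True if the two rectangles overlap in a region of non-zero area.
    fn intersects(&self, other: &Rect) -> bool {
        self.x < other.x + other.width
            && other.x < self.x + self.width
            && self.y < other.y + other.height
            && other.y < self.y + self.height
    }
}

/// Hypothetical display item; `fixed` marks position:fixed content.
struct DisplayItem {
    name: &'static str,
    bounds: Rect,
    fixed: bool,
}

/// Keep items that intersect the viewport, plus every fixed item:
/// fixed content is always rendered and never culled.
fn cull<'a>(items: &'a [DisplayItem], viewport: &Rect) -> Vec<&'a DisplayItem> {
    items
        .iter()
        .filter(|item| item.fixed || item.bounds.intersects(viewport))
        .collect()
}
```

With the viewport scrolled down the page, ordinary items above or below it are dropped, while a fixed toolbar is kept regardless of where its bounds fall.
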
jack: Other topics? In layers refactoring?
pcwalton: We figured out why rendering is so slow, at least in CPU rendering mode. The problem is that skia wasn't clipping well, so now we do it in a display list optimization. The upshot of that is that we don't have to worry about performance too much now in the refactoring. What's not good is maintainability, so we should focus on that - clarity, etc. Our perf is OK now with the current design, which is good because it's one less thing to worry about. At some point, I will try to look into why we're so racy when uploading textures and try to solve that. Well, maybe, if I have time...
jack: Other thing was SDL vs. EFL stuff. Best thing to do long-term there? cameron mentioned concerns with SDL as the wrong way to go.
pcwalton: Shouldn't use SDL and just use a better set of code. It's only a few hundred lines of opengl code. But SDL may be a reasonable choice for setting up things.
czwarich: I just don't want to use SDL for the layers. For Windowing... it's better than glfw.
larsberg: I was in favor of EFL because of DRM support built-in (for cross process texture sharing on Wayland, etc.) and because zmike offered to do the work for it.
pcwalton: If it can do cross-process texture sharing on Linux then that's an advantage over everything else.
larsberg: zmike says that already landed and supports Windows also.
mrobinson: Doesn't Ozone have support for cross-process surface sharing?
mrobinson: http://www.chromium.org/developers/design-documents/ozone
larsberg: Has DRM support for cross-process texture landed?
zmike: Yes, and wayland support has been in for 2 years now.
pcwalton: It can do cross-process texture sharing on linux, right?
zmike: DRM, yes, though no way other than that.
pcwalton: I don't think there's anything other than that...
czwarich: Mac, too? Or linux only?
zmike: I believe it's linux-only...
pcwalton: we have code for Mac
martin: DRM is intel but not nvidia or amd, right?
pcwalton: I think nvidia is hopeless anyway.
martin: Should be possible with the x composite extension... and in wayland, you can make an embedded compositor. Not as flexible as DRM, but works in webkit.
pcwalton: Haven't tried x composite extension before, but it should work.
martin: What's the problem with texfrompixmap right now?
pcwalton: Ideally, I really want the content process to have no connection to the x server at all. Reason is that X is super insecure and that's bad for an untrusted content process. DRM works well for this b/c for DRI3, it's reasonably secure. But if you have to have a connection to the x server, you don't have a lot of security with that content process anyway. On nvidia, I don't know of anything other than having a texture shared via x. Maybe don't be multi-process on nvidia?
martin: wayland is a bit easier because you can use embedded wayland compositors
pcwalton: Not as scared of having a wayland client as an x client. On linux, we use DRM where available for multi-process servo. If you have nvidia blob, single-process servo.
martin: rasterization to compositor?
pcwalton: Could try to IPC them to the display lists, but that's a lot of overhead. No other browser has tried to do that. Maybe wiregl in chromium... but that's kind of at a different level. I just don't want to do a ton of work in the design just for nvidia blob on linux. Seems uninteresting for Servo right now.
martin: Chromium is moving to this model (impl-side painting). Main reason is so they can do eager painting of exposed regions.
pcwalton: We can already do that because the render task is in a separate thread. I'm very curious what that will do to chromium's perf... seems like a big hit.
czwarich: Complicates caching.
pcwalton: So skia pictures over the wire?
martin: Yes. Skia pictures.
pcwalton: Seems like a lot of overhead. By contrast, our display lists are just a pointer and one atomic operation.
larsberg: I also asked zmike about the footprint of EFL, and he says based on Samsung's metrics it has a smaller footprint than SDL and other frameworks, and doesn't depend on the rest of Enlightenment.
pcwalton: Also fears about impl-side painting - security!
czwarich: Wouldn't want to do it unless it was written in rust.
pcwalton: Yeah, moving skia into the sandboxed process seems dangerous
martin: Might just be in the compositor thread and not in the gpu process (unless you're in the opengl version because of the driver).
pcwalton: Nothing else to chat about. Let's investigate efl!
jack: If we use EFL, does that affect embedding? Do apps that embed Servo need to use EFL too?
larsberg: We'll have to ask zmike.
jack: Specifically, if we want to make a ServoView component... that's my only remaining question.
acavalcanti: EFL is really modular. Can probably select a smaller subset of the libraries.
jack: the question is: can we create a GTK servo view widget, does it hide the EFL details?
acavalcanti: Are you just using EFL to replace Skia for rasterization? Or for other stuff?
jack: Just replace what glut/glfw are doing. Open window, get a framebuffer... basic mouse/keyboard events. That's about it.
acavalcanti: It should be possible to do what you want as far as I can tell.
jack: Experimentation is needed. glfw certainly has the same issues, at the very least.
bjz: Which issues?
jack: doesn't work on android
bjz: Yeah, that's the annoying thing. SDL has much better Android support.
jack: Viewport-only display lists? Only thing left.
pcwalton: Yeah, already did, unless there's something specific.
jack: Guess that's it then, thanks!
