
Workweek rasterization


Servo workweek - rasterization

  • zwarich: Currently using Skia (or SkiaGL). We used to use the Skia GPU backend via Azure. We've had the facade of supporting different backends, but they all crash.
  • gw: Removed the option.
  • jack: Used to work with Cairo, but now do Skia and SkiaGL.
  • zwarich: Still use Azure, but hardcode to Skia. We're in a situation where we're using an abstract API but using a specific rasterizer under the hood. We're also using Skia in a really bad way - every time we go to draw, we create a DrawTarget. We still do that on the CPU, but it's not as bad as the behavior on the GPU where those allocations are really expensive.
  • jack: Created DepthBuffer, StencilBuffer, etc. on every DrawTarget creation.
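A rough sketch of the per-paint pattern being criticized versus the reuse being argued for, assuming Moz2D's Factory/DrawTarget interface; the names and the caching strategy are illustrative, not Servo's actual code:

```cpp
#include "mozilla/gfx/2D.h"  // Moz2D: Factory, DrawTarget, Rect, ColorPattern

using namespace mozilla::gfx;

// Today: a fresh DrawTarget per paint. On the GPU backend this also means a
// new FBO plus depth/stencil buffers every time, which is the expensive part.
void PaintOnce(const IntSize& size) {
  RefPtr<DrawTarget> dt = Factory::CreateDrawTarget(
      BackendType::SKIA, size, SurfaceFormat::B8G8R8A8);
  dt->FillRect(Rect(0, 0, 100, 100), ColorPattern(Color(1, 0, 0, 1)));
  // ...hand the surface to the compositor, then drop dt and all its buffers...
}

// Alternative: keep one DrawTarget per tile alive and only clear it between
// paints, so the GPU allocations happen once.
class TilePainter {
 public:
  explicit TilePainter(const IntSize& size)
      : mSize(size),
        mTarget(Factory::CreateDrawTarget(BackendType::SKIA, size,
                                          SurfaceFormat::B8G8R8A8)) {}

  void Paint() {
    mTarget->ClearRect(Rect(0, 0, mSize.width, mSize.height));
    mTarget->FillRect(Rect(0, 0, 100, 100), ColorPattern(Color(1, 0, 0, 1)));
  }

 private:
  IntSize mSize;
  RefPtr<DrawTarget> mTarget;
};
```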
  • zwarich: We might not even need those buffers. Couple of things to decide. 1) Do we want to keep using CPU rasterization or switch back to a GPU rasterizer? 2) Do we want to keep using Moz2D and Skia through it, use Skia directly, or some other rasterizer directly? We talked about writing our own rasterizer, which makes sense because web pages only render solid-colored rectangles, images, and text. That's very similar to our DisplayList, and it's enticing to write our own. But if you need the full Canvas API surface, it gets complicated (e.g., curved rendering), and even CSS features need blend modes, etc. The feeling we had last time was that it was not worth going down this path, since we will have to support Canvas, and probably soon. Implementing all of that ourselves is probably not the best idea. But periodically new 2D rasterizer projects are brought up, including a) a new effort by nical, b) EFL...
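The point about page content reducing to a few primitives, sketched as a hypothetical display-list item type (the names are illustrative, not Servo's actual DisplayList):

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// A page mostly decomposes into solid-colored rectangles, images, and text
// runs, which is what makes a purpose-built rasterizer tempting: rasterizing
// this list is much simpler than supporting the full Canvas 2D surface.
struct DisplayRect { float x, y, width, height; };
struct DisplayColor { float r, g, b, a; };

struct DisplayItem {
  enum class Kind { SolidColor, Image, Text } kind;
  DisplayRect bounds;
  DisplayColor color;                          // SolidColor
  std::shared_ptr<std::vector<uint8_t>> rgba;  // Image: decoded pixels
  std::string text;                            // Text: stand-in for glyph runs
};

using DisplayList = std::vector<DisplayItem>;
```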
  • larsberg: Not EFL, after our discussions at the EFL dev days. I believe the new effort by nical is still nascent.
  • zwarich: Anyway, those are the questions. Thoughts?
  • kmc: On keeping CPU rendering at all - yes.
  • zwarich: More of which should be the default? Now that GPU is not...
  • gw: If power management is the goal, GPU is important.
  • matt: Chrome makes a tile by tile decision to choose between Skia and Ganesh.
  • jack: What are the heuristics?
  • matt: Don't know; interested in looking at it.
  • jack: My intuition the last time I looked was that rendering text should be fast. But SkiaGL was incredibly slow.
  • mrobinson: The version of Skia in Servo? For performance decisions, the version in Servo isn't representative - it's old.
  • gw: I think pcwalton found that Skia's slowness in font rendering was due to blowing the font cache.
  • jack: We've talked about dropping Azure a few times before. Having multiple backends was the bonus - having D3D, etc. But when I've talked to the gfx team, they were in favor of the stateless API approach. They thought that was an important abstraction because each of the backends has a different way of approaching it.
  • kmc: This is an abstraction that doesn't exist?
  • jack: No, it's what Moz2D gives us - a stateless API. Unlike Cairo.
  • kmc: But if we're using Skia's API...
  • matt: Skia does not have an API. They implement whatever Chrome/Blink need. We had discussions at a recent workweek to encourage people to use Moz2D over Skia...
  • zwarich: Skia's native API is C++.
  • matt: Two years ago ARM started working with Szeged on 2D rendering on top of OpenGL ES. Took WebKit and ripped out the graphics context backend so that we could run the browser without rendering anything. Then slowly started implementing the APIs in GLES. The project evolved with some people in Samsung Research UK. Szeged has been approved to open source it; should appear soon. The project was to show we could do efficient 2D rasterization on top of purely 3D APIs. The motivation was that the Khronos APIs are falling behind and the GPU vendors are not interested in driving those. There's NVPath, but that's Nvidia. I think the conclusion is that we can do 2D rasterization well. You can get up to 12x perf improvement by doing 2D rendering using the 3D APIs on benchmarks like Wikipedia, falling leaves, etc. We're pushing to get Google to use it in Ganesh; we'd love to get Servo to pick it up; and we're going to push hard on WebKit in China.
  • zwarich: The issue in OpenGL APIs with 2D rendering is partial texture updates. So, when you issue commands, choosing between updating immediately versus deferring until the next draw calls is hard. Looking at the Skia pages online, it appears they have a lot of issues with partial texture updates as well. Not a problem on ARM GPUs?
  • matt: It is a problem on embedded GPUs. I don't have an answer to that specific question, but can put you in touch with people who do. There are some issues here.
  • zwarich: Issue is that they're problems with the API, not with the hardware.
  • gw: Can you not use pixel buffer objects?
  • zwarich: AFAIK, there isn't anything. There are vendor-specific extensions, so you could poke at individual vendor settings.
  • gw: I have used pixel buffers for streaming updates of video.
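A minimal sketch of the streaming updates gw mentions, using a pixel unpack buffer so a texture update doesn't synchronously read from client memory; this is generic GL 3.0/GLES3-level usage (assuming a loader exposes these entry points), not Servo code:

```cpp
#include <GL/gl.h>   // in practice via a loader (GLEW/epoxy) for GL 3.0+/GLES3
#include <cstring>

// Stream a sub-rectangle of new pixel data into `tex`: write into a pixel
// unpack buffer, then let the driver copy from it asynchronously instead of
// having glTexSubImage2D pull from client memory on the spot.
void StreamTileUpdate(GLuint tex, GLuint pbo,
                      int x, int y, int w, int h,
                      const void* rgba_pixels) {
  const GLsizeiptr bytes = GLsizeiptr(w) * h * 4;

  glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
  // Orphan the old storage so we never wait on a previous transfer.
  glBufferData(GL_PIXEL_UNPACK_BUFFER, bytes, nullptr, GL_STREAM_DRAW);
  void* dst = glMapBufferRange(GL_PIXEL_UNPACK_BUFFER, 0, bytes,
                               GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT);
  if (dst) {
    std::memcpy(dst, rgba_pixels, size_t(bytes));
    glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);

    glBindTexture(GL_TEXTURE_2D, tex);
    // With a PBO bound, the final pointer argument is an offset into the buffer.
    glTexSubImage2D(GL_TEXTURE_2D, 0, x, y, w, h,
                    GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
  }
  glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
}
```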
  • matt: 5x faster render time on scrolling Wikipedia. It significantly reduced checkerboarding.
  • zwarich: In the past, CPU vs. GPU rasterization has been within 2x.
  • mrobinson: Probably depends on the CPU and GPU?
  • matt: On a Chromebook with a recent ARM Cortex-A15.
  • zwarich: Interesting!
  • jack: Should be open source soon!
  • matt: I can get you in touch with the tech lead.
  • mrobinson: As far as Servo's rasterizer is concerned, if these techniques are a lot faster, I assume they'll just be integrated piecemeal into Skia. Or are they separate?
  • zwarich: The options are basically: 1) Skia directly; 2) Moz2D on top of Skia (primarily to isolate the C++ changes); 3) these techniques from ARM directly.
  • jack: How is using Moz2D different from using Skia directly? We'd have to wrap it anyway?
  • zwarich: Because when we want to use some Skia features, we have to also tweak Moz2D.
  • jack: Oh, so basically Moz2D is strictly more complicated?
  • zwarich: Moz2D is annoying because it's another wrapper, but it's also kinda nice if the Skia C++ wrapper is changing because they might provide us a stable C API.
  • jack: If Moz2D changes slowly?
  • zwarich: That would be my hope.
  • kmc: Since we haven't been updating Skia, we also don't know how much it hurts.
  • jack: Last time I updated Skia, I just pulled master on Skia and Moz2D and it "just" worked. But AFAIK, the Skia backend is not even actively used from Moz2D.
  • kmc: Why is it there?
  • jack: GPU rasterization on Linux. Which is not on by default. I think Andreas suggested using Skia directly and ANGLE.
  • kmc: The ARM work does suggest that would be practical.
  • jack: Is that what Chrome does? Just use Skia directly?
  • zwarich: Yes. So, Skia directly is an option. Anyway, a big part of the reason we were blocked from using the GPU by default was that we didn't have the level of control over the GPU buffers that we want. The render task should not be allocating and deallocating render targets, as that prevents us from sharing them between pages and introduces a second caching layer. This design means we cannot use Moz2D, because its surfaces don't give us a platform-independent way of managing buffers. There are some specific APIs for D3D that might work...
  • jack: That's why I added those specific APIs.
  • kmc: We've already made a fair number of modifications to those APIs to make this work, right?
  • jack: Not a fair amount; just added a function to Azure that calls into a new file in Skia.
  • zwarich: There was something that didn't work correctly for allocating the buffers in the compositor and passing it over...
  • jack: Skia wants to create a framebuffer with all those things created. When you want to pass that to the compositor, it just wants the pixel data, so we detach the FBO, etc. from it and pass the render buffer to the compositor. But then if you want to reuse that, you have to recreate it all...
  • mrobinson: On Linux (with pixmaps) you can just pass it, so you don't need to attach all that stuff. You just need a way to lock/unlock the pixmap.
  • jack: We were just trying to get something working.
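A small GL sketch of the hand-off jack describes above, using a texture color attachment for concreteness; the allocation and cross-process sharing machinery (EGLImage, IOSurface, etc.) is omitted and the names are illustrative:

```cpp
#include <GL/gl.h>  // assumes a loader exposing the FBO entry points

// The rasterizer draws into a texture through an FBO, then detaches it so
// only the texture (the pixel data) crosses over to the compositor. The FBO
// and its depth/stencil storage stay behind and, in the current code, get
// torn down and rebuilt for the next paint.
GLuint DetachColorAttachment(GLuint fbo, GLuint color_tex) {
  glBindFramebuffer(GL_FRAMEBUFFER, fbo);
  // Drop the color attachment; the texture itself keeps its contents.
  glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                         GL_TEXTURE_2D, 0, 0);
  glBindFramebuffer(GL_FRAMEBUFFER, 0);
  return color_tex;  // handed to the compositor for compositing
}
```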
  • mrobinson: The choice seems to be between using Azure and modifying it for our use case vs. using Skia directly and rebuilding the abstractions we use in Azure.
  • jack: What are we using in Azure?
  • mrobinson: There are some Azure DrawTarget abstractions and primitives that we use, which have different names in Skia. So we'd need an abstraction.
  • jack: AFAIK, it's just terminology changes.
  • kmc: Stateful vs. stateless API, too, right? Canvas 2D is stateful, right?
  • jack: Yes.
  • kmc: Our displaylist drawing code is still really small. Boxes, gradients, images. So it might not be a big deal to switch everything to the stateful API.
  • jack: Why didn't Firefox want to support stateful APIs?
  • kmc: And CoreGraphics.
  • larsberg: Firefox moved away from stateful APIs due to them being a bugfest, according to the notes from gfx.
  • zwarich: Skia is closer to Canvas-style than to the Moz2D style. Also hard to decide on those. But I'd like to have GPU rendering supported and working as well as possible. Even if we use CPU rendering for the time being, have to make sure GPU rasterization works with them.
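To make the stateful/stateless distinction concrete, a small sketch contrasting Cairo-style implicit state with Moz2D-style explicit arguments; the calls are real API names, but the snippet is illustrative rather than Servo code:

```cpp
#include <cairo.h>
#include "mozilla/gfx/2D.h"

// Stateful (Cairo/Canvas style): the source color, transform, and clip are
// properties of the context, set by earlier calls.
void DrawStateful(cairo_t* cr) {
  cairo_save(cr);
  cairo_translate(cr, 10, 10);
  cairo_set_source_rgba(cr, 1, 0, 0, 1);
  cairo_rectangle(cr, 0, 0, 50, 50);
  cairo_fill(cr);
  cairo_restore(cr);  // forgetting this leaks state into later drawing
}

// Stateless (Moz2D style): each draw call names its pattern and options
// explicitly, so calls can be understood in isolation.
void DrawStateless(mozilla::gfx::DrawTarget* dt) {
  using namespace mozilla::gfx;
  dt->FillRect(Rect(10, 10, 50, 50),
               ColorPattern(Color(1, 0, 0, 1)),
               DrawOptions(1.0f, CompositionOp::OP_OVER));
}
```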
  • gw: I agree; maybe the plan should be to update Skia + Azure, and benchmark SkiaGL vs. our CPU code.
  • matt: There's also a BlackBerry GLES-enabled backend, GENTL (it's on GitHub). That's been live since BlackBerry 10 was released.
  • Simon: We have done tests of GPU rasterization on Chrome. Looked at the Alexa 50, etc. on the Note 3. Found that for the Alexa sites, there was a 4% regression for GPU vs. CPU. But for the smoothness/jank tests, GPU was much better.
  • mrobinson: The goal for the Chrome team is smoothness.
  • jack: How do you handle scheduling the GPU rasterization? Couldn't you starve it?
  • zwarich: But couldn't the CPU preempt the work when there are interactivity issues? It's surprising that the GPU was beating it. I'd expect the CPU to show white/checkerboard but keep scrolling, and the GPU to run out of resources and be unable to composite.
  • matt: This question came up in the graphics workweek, too. My understanding is that the pipeline can be preempted based on task priorities. So, if you have a higher priority (e.g., the compositor), then even if there's a bunch of rendering tasks in the queue, you can drop those.
  • jack: The graphics driver knows this?
  • matt: The process can interrupt. When there are commands on the GPU already, another process can interrupt.
  • kmc: Hard to do.
  • matt: Supported by all embedded GPUs.
  • gw: GLES3 has fences.
  • matt: Still 4 years before Android will mandate GLES3. Until that happens, GLES2 is a safer bet.
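A sketch of the GLES3 fence mechanism gw refers to: polling whether a batch of rasterization work has finished so the compositor can decide whether to use the new tile rather than blocking on it. Generic GLES3 usage, not Servo code:

```cpp
#include <GLES3/gl3.h>  // fence syncs are core in OpenGL ES 3.0

// Insert a fence after issuing a batch of rasterization draw calls.
GLsync EndRasterBatch() {
  // ...rasterization commands were issued before this point...
  return glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
}

// Poll with a zero timeout so the compositor thread never stalls.
bool RasterBatchDone(GLsync fence) {
  GLenum status = glClientWaitSync(fence, 0, 0);
  if (status == GL_ALREADY_SIGNALED || status == GL_CONDITION_SATISFIED) {
    glDeleteSync(fence);
    return true;  // safe to composite the freshly rasterized tile
  }
  return false;   // keep showing the previous contents for now
}
```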
  • zwarich: Trying to figure out what we can do using the standard APIs.
  • larsberg: Isn't matt saying there is a standard?
  • zwarich: Vendor extension?
  • matt: No.
  • Simon: Khronos at SIGGRAPH this year announced an aggressive timetable for standardizing a bunch of new APIs.
  • kmc: Need to keep the door open to CPU and GPU rasterization.
  • zwarich: I'd like the default, at least on OSX, to move to GPU rasterization, because if we can't make that work, there's an architectural problem.
  • gw: The OSX build machines should run and pass the GPU ref tests.
  • larsberg: Looks like I have my work cut out for me...
  • jack: We have to have a logged-in user anyway, so it's not a big deal. The Linux builders, though, crash due to GPU issues.
  • zwarich: Related to our toolkit discussions, the reason you need to be logged in is because of GLFW. If we didn't do that, we could do headless GPU reftests.
  • jack: Nothing like that for X on Linux.
  • kmc: Desktop EGL on some drivers.
  • jack: We want to update Skia to the newest version. Want GPU rendering to be the default on OSX. Get the GPU reftests passing.
  • zwarich: And headless!
  • jack: And benchmark how close we can get this. Anything else?
  • larsberg: Mixed CPU/GPU rendering.
  • pcwalton: Culled display lists.
  • mrobinson: Chrome gave up on per-tile decisions because pixel-perfect output would be lost at the seams between those tiles.
  • kmc: So we couldn't have CPU and GPU rendering match.
  • jack: Couldn't get highest-quality text rendering on the GPU anyway.
  • zwarich: Two issues: 1) gamma-correct compositing with the background. That works out well - shadows, etc. look great. 2) Subpixel AA (per-channel alpha) has been dying out anyway as browsers have become more GPU-accelerated, because there's no GPU that gives you efficient alpha per color channel. So, it's been dropped further and further. There's still the question of whether you want to do gamma-correct text compositing. GPUs have specific functionality for gamma, and there are also pretty good fallbacks, though not as good as subpixel AA. I don't know if anyone is doing that, though I know Chrome+Skia abandoned subpixel AA. They do some tricks with the APIs on various platforms to fake it. I think we don't need to worry about subpixel.
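For the gamma-correct compositing zwarich describes, a minimal per-channel sketch, assuming a simple 2.2 power transfer function (real sRGB uses a piecewise curve, and GPUs can get the same effect via sRGB framebuffer formats):

```cpp
#include <cmath>

// Blend glyph coverage in linear light rather than directly on the
// sRGB-encoded values, which is what keeps dark-on-light and light-on-dark
// text looking equally heavy. Inputs and output are in [0, 1].
float BlendChannel(float dst_srgb, float src_srgb, float coverage) {
  float dst_lin = std::pow(dst_srgb, 2.2f);
  float src_lin = std::pow(src_srgb, 2.2f);
  float out_lin = src_lin * coverage + dst_lin * (1.0f - coverage);
  return std::pow(out_lin, 1.0f / 2.2f);
}
```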
  • mrobinson: And HiDPI doesn't change it much.
  • kmc: Is subpixel AA used on mobile?
  • zwarich: No. And you can rotate the device. There are some subpixel AA approaches that are rotation-agnostic, but the standard ones are not.
  • kmc: So, since mobile doesn't use it and HiDPI is becoming more widespread...
  • zwarich: You don't even know during rendering what orientation you are in.
  • mrobinson: Does subpixel AA interact correctly with subpixel positioning?
  • zwarich: In CoreGraphics you have to re-render at each quantized position.
  • larsberg: Can we do one layer on GPU and one on CPU?
  • zwarich: On OSX, if you use IOSurface, you can render into it on the CPU (though it might create a copy). If we architect for making GPU rasterization and CPU rasterization both work well, it shouldn't be too hard to change the architecture to make that work.
  • mrobinson: If we do CPU rasterization, would it be better to keep the data in shared memory and upload it at the last possible minute, instead of having a backing shared-memory buffer?
  • zwarich: Depends on the hardware. On NVidia, with DMA shared-memory textures, unless it's a video frame, reading from them repeatedly will saturate the PCI bus. If you're on an Intel integrated GPU, it all works great because you eliminate a copy. OpenGL sort of assumes you're copying all your data.
  • jack: We also need a mixed CPU + GPU rendering test. And a call with the folks from ARM about the new API work they've been doing on 2D rasterization on top of 3D APIs.
  • gw: Was there any conclusion on whether we should use Azure or Skia directly?
  • jack: Send an RFC to the list + chat with the Gecko GFX folks to make sure we want to remove Moz2D. There doesn't seem to be much of a reason to have Moz2D around. It's not hard to remove, but will we regret it? Just updating Skia will help us investigate things, though.
  • zwarich: This discussion has made me strongly in favor of ripping out Moz2D.