Skip to content
This repository has been archived by the owner on Nov 1, 2021. It is now read-only.

wlr-screencopy: Allow repeated capture #48

Open
jb-wisemo opened this issue Jun 12, 2019 · 10 comments
Open

wlr-screencopy: Allow repeated capture #48

jb-wisemo opened this issue Jun 12, 2019 · 10 comments

Comments

@jb-wisemo
Copy link

This is a proposed backward-compatible optimization to the wlr-screencopy-unstable-v1 protocol based on experience with similar interfaces in other systems:

The goal is to minimize the number of API round trips for repeated screen capture, such as for recording it to a video stream, or any other use.

In zwlr_screencopy_manager_v1:

  • Add a setable property "repeatable" (defaults to false) which becomes an implicit argument to "capture_output". In the next incompatible protocol version, change "repeatable" to either an additional boolean argument to "capture_output" or a flag bit in a "flags" argument to "capture_output" (either is fine, a flags argument is more easily expandable for other future features).

In "zwlr_screencopy_frame_v1" change the following:

  • In the overall description replace the last paragraph by:

A frame object may be created as repeatable by the screencopy_manager, it is permitted for a client to create more than one frame object, repeatable or not. A repeatable frame object may be more costly than a non-repeatable frame, but should be more efficient than repeatedly creating and destroying frame objects.

Once a "ready" event is received for a non-repeatable frame, the client should destroy the frame.

Once a "failed" event is received for a frame, the client should destroy the frame.

  • In the flags enum, add a bit indicating that the frame object was created as repeatable.

  • In the description of the "copy" request add this:

If the frame is marked as repeatable, this request can be repeated once the "ready" event has been received, as often as the client wants, each such request will return the next available composited frame, possibly (at the compositors discretion) waiting until a change occurs. If the frame is not repeatable, has already failed, or is no longer applicable to the underlying compositor state, such a repeated "copy" request will fail with the "already_used" error.

Repeated "copy" requests may specify the same wl_buffer arg value, the compositor may at its discretion consistently reject other values with the "invalid_buffer" error, clients may at their discretion adapt their behavior accordingly.

The "buffer" and "flags" events are not repeated for subsequent copy requests, reducing the continued interaction to "copy" requests and "ready" events until a "failed" event occurs or the client voluntarily makes a "destroy" request.

  • In the description of the "ready" event change

After receiving this event, the client should destroy the object.

to

After receiving this event, the client should destroy the object, unless it is marked as repeatable, in which case the client may issue a fresh "copy" request

@ascent12
Copy link
Member

I was toying around with a similar idea myself awhile ago, but I never got around to submitting it or trying to implement it yet: https://gist.github.com/ascent12/62c3b591a8abdf3017410e6f61c10f44
Basically the idea was to enqueue many copy requests, and the compositor would give back one buffer every time it draws a frame. And release fences were thrown in there too.

@jb-wisemo
Copy link
Author

My proposal is about eliminating that cycle of reallocation for every frame (maybe at 100fps), instead having the compositor repeatedly update the same few shared buffer(s).
Having looked further at the Wayland protocol, it seems that even passing the unchanged wl_buffer object across would cause some significant overhead in object lookups, thus perhaps a 0-argument "copyagain" request instead of special semantics for 2nd and later "copy" requests.

@ascent12
Copy link
Member

I doubt the wayland protocol itself is going to be any sort of noteworthy bottleneck. Maybe it would be if you were trying to push over 10000 FPS, but the GPU is going to get in the long before that's an issue
wl_surface.frame works fine at high framerates, and that has a new object allocation associated with it too.

@emersion
Copy link
Member

emersion commented Jun 12, 2019

Add a setable property "repeatable" (defaults to false) which becomes an implicit argument to "capture_output".

-1, this changes the semantics of the protocol too much, making implementations complicated. I'd much rather see something similar to how linux-dmabuf is handling this: make it necessary for the first capture to have a roundtrip, then optimize for the case in which buffer parameters don't change (e.g. by adding copy_imm).

wlr-writeback-unstable.xml

-1, the compositor needs to decide the buffer format.

I doubt the wayland protocol itself is going to be any sort of noteworthy bottleneck.

Agreed.

@emersion
Copy link
Member

Also: reminder that screencopy is going to be slow anyway because it's downloading frames from the GPU. It would be nice to have benchmarks before over-optimizing.

@ascent12
Copy link
Member

-1, the compositor needs to decide the buffer format.

In the dma-buf case, I don't think it's a particularly big deal for the format to change in the copy. Maybe not for wl_shm, though.
The buffer format bit is hardly something I care about or will defend very hard and I don't think that whole protocol I threw together needs to be accepted as-is.

The main idea is just queuing a bunch of frames to make it easier to use the interface for capturing video and not accidentally missing frames.

@emersion
Copy link
Member

The main idea is just queuing a bunch of frames to make it easier to use the interface for capturing video and not accidentally missing frames.

I don't understand why this is a concern for screen capture and not for frame submission?

If a client really needs not to miss a frame, just do the processing in a different thread or something.

@ascent12
Copy link
Member

ascent12 commented Jun 12, 2019

It's just nice to remove some of the synchronization required between the client and the server. If the system is under high load, and the recording process doesn't get scheduled in time, it may end up missing a frame because it couldn't get its request in time.
But I'm making some pretty weak arguments; this alone isn't really justification to make a breaking change to the protocol.

I don't understand why this is a concern for screen capture and not for frame submission?

It is somewhat a concern for things like video players. I'm sure they'd love a different way to schedule a frame instead of needing to precisely call wl_surface.attach at the right time, and trying to guess what the compositor's deadline is. But this is off-topic, and I think there are some VERY old discussions on wayland-devel about this.

@emersion
Copy link
Member

emersion commented Jun 12, 2019

If the system is under high load, and the recording process doesn't get scheduled in time, it may end up missing a frame because it couldn't get its request in time.

Same for submitting frames of e.g. a video game. I'm really not convinced that queuing would improve anything. Pretty sure it's not the bottleneck.

I'm sure they'd love a different way to schedule a frame instead of needing to precisely call wl_surface.attach at the right time, and trying to guess what the compositor's deadline is.

They don't need to do it "at the right time": the presentation-time protocol allows them to know when the next frame will be displayed. They just need to submit a frame when they receive a frame event.

Also, given that almost nobody uses presentation-time, I'm not sure video players are that interested (yet?).

@emersion
Copy link
Member

emersion commented Nov 1, 2021

wlr-protocols has migrated to gitlab.freedesktop.org. This issue has been moved to:

https://gitlab.freedesktop.org/wlroots/wlr-protocols/-/issues/48

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

No branches or pull requests

3 participants