Create a virtual machine with a virtio-gpu device #558

Open
wants to merge 1 commit into
base: main

nesteroff
Contributor

@nesteroff nesteroff commented Apr 11, 2024

Description of changes

Currently, Ghaf uses Waypipe to run graphical applications in virtual machines. It works quite well over VSOCK, but it may not be optimal in terms of performance because it serializes all data, including pixel buffers, over sockets. Waypipe also has some security implications; see this section of the man page.

As an alternative designed specifically for virtual machines, we can evaluate crosvm virtual machines with a virtio-gpu device. This requires a proxy Wayland compositor running in the guest, which can be either Sommelier or Wayland-proxy-virtwl.

This PR adds a virtiogpu-vm virtual machine, which can be enabled for the generic-x86_64 target. It has Sommelier, Wayland-proxy-virtwl, and Waypipe installed. The idea is to use it as a playground for testing the virtio-gpu device implemented in crosvm. Applications can be run with each of the available methods using the following scripts: run-sommelier, run-wayland-proxy, run-waypipe.

My current observations show that Sommelier is a bit unstable: it sometimes crashes, for example when moving windows. Wayland-proxy-virtwl seems to work quite well. Compared to Waypipe, performance looks better and the system load is lower, but more testing is required.

Checklist for things done

  • Summary of the proposed changes in the PR description
  • More detailed description in the commit message(s)
  • Commits are squashed into relevant entities - avoid a lot of minimal dev time commits in the PR
  • Contribution guidelines followed
  • Ghaf documentation updated with the commit - https://tiiuae.github.io/ghaf/
  • PR linked to architecture documentation and requirement(s) (ticket id)
  • Test procedure described (or includes tests). Select one or more:
    • Tested on Lenovo X1 x86_64
    • Tested on Jetson Orin NX or AGX aarch64
    • Tested on Polarfire riscv64
  • Author has run nix flake check --accept-flake-config and it passes
  • All automatic Github Action checks pass - see actions
  • Author has added reviewers and removed PR draft status

Testing

  • Enable virtgpu-vm for the generic-x86_64 target by uncommenting virtualization.microvm.virtgpuvm.enable = true;
  • Build the image, boot the Lenovo X1, and connect to the VM: ssh virtgpu-vm.ghaf
  • Run Chromium or Firefox using different methods, test performance, stability, system load, etc:
ghaf@virtiogpu-vm$ run-sommelier chromium --ozone-platform=wayland --disable-gpu
ghaf@virtiogpu-vm$ run-wayland-proxy chromium --ozone-platform=wayland --disable-gpu
ghaf@virtiogpu-vm$ run-waypipe chromium --ozone-platform=wayland --disable-gpu
ghaf@virtiogpu-vm$ run-sommelier firefox --display none
ghaf@virtiogpu-vm$ run-wayland-proxy firefox
ghaf@virtiogpu-vm$ run-waypipe firefox
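
To make the comparison between the three methods a little more repeatable, the load can be sampled while each launcher runs. The following is only a minimal sketch under stated assumptions: the function names load1 and measure are hypothetical helpers, not part of this PR, and reading /proc/loadavg inside the guest only reflects guest load, so host load still has to be observed on the host. The launcher names and application flags are the ones listed above.

```shell
#!/bin/sh
# Sketch only: sample the 1-minute load average while one launcher runs an app.
# load1 and measure are hypothetical helper names, not part of the PR.

# Print the 1-minute load average (first field of /proc/loadavg, Linux only).
load1() {
    cut -d ' ' -f 1 /proc/loadavg
}

# measure LAUNCHER DURATION APP [ARGS...]
# Starts "LAUNCHER APP ARGS" in the background, waits DURATION seconds,
# prints "LAUNCHER <load>", then stops the background process.
measure() {
    launcher=$1
    duration=$2
    shift 2
    "$launcher" "$@" >/dev/null 2>&1 &
    pid=$!
    sleep "$duration"
    printf '%s %s\n' "$launcher" "$(load1)"
    kill "$pid" 2>/dev/null || true
    wait "$pid" 2>/dev/null || true
}

# Example (run manually inside the VM; use at least 60 seconds so that the
# 1-minute average mostly covers the test run):
#   for l in run-sommelier run-wayland-proxy run-waypipe; do
#       measure "$l" 60 chromium --ozone-platform=wayland --disable-gpu
#   done
```

The load average is a coarse metric; it is used here only because system load is what was compared informally in this PR. Frame rate or CPU time per process would need different tooling.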

@nesteroff nesteroff temporarily deployed to internal-build-workflow April 11, 2024 12:12 — with GitHub Actions Inactive
@Mic92
Collaborator

Mic92 commented Apr 11, 2024

I also played around with wayland-proxy-virtwl and added some features needed for gaming (talex5/wayland-proxy-virtwl#75, talex5/ocaml-wayland#40). Things I noticed: Chromium sort of worked, but Firefox crashed on startup. That didn't happen with waypipe. I think the whole thing is not yet compatible with any GPU stuff. I couldn't notice any performance benefit when testing it with QEMU rutabaga versus waypipe, although this was only perceived performance; I did not take any measurements.

@Mic92
Collaborator

Mic92 commented Apr 11, 2024

I think crosvm might be more stable than the stuff in QEMU. Back then I did not test crosvm at all because I don't think it has audio support on Linux. Then there is also this cloud-hypervisor fork: https://spectrum-os.org/software/cloud-hypervisor/

@nesteroff
Contributor Author

> I also played around with wayland-proxy-virtwl and added some features needed for gaming (talex5/wayland-proxy-virtwl#75, talex5/ocaml-wayland#40). Things I noticed: Chromium sort of worked, but Firefox crashed on startup. That didn't happen with waypipe. I think the whole thing is not yet compatible with any GPU stuff. I couldn't notice any performance benefit when testing it with QEMU rutabaga versus waypipe, although this was only perceived performance; I did not take any measurements.

That's cool! To me it's even hard to read OCaml to be honest. I just tested Firefox and it seems to work fine in this configuration: at least it starts and it's possible to open tabs, etc. Performance is hard to measure, but I noticed that when watching a 4K video in full screen the system load is not as high as with waypipe.

Performance problems become really noticeable in the lenovo-x1 configuration, where the compositor runs in the GUIVM. In this scenario the data goes from one VM to the host and then from the host to another VM through vsockproxy. There is a lot of extra I/O, and 4K video becomes more like a slide show. I wonder if there is a way to make wayland-proxy-virtwl support the use case when the compositor is in its own VM?

@nesteroff
Contributor Author

> I think crosvm might be more stable than the stuff in QEMU. Back then I did not test crosvm at all because I don't think it has audio support on Linux. Then there is also this cloud-hypervisor fork: https://spectrum-os.org/software/cloud-hypervisor/

Yes, I believe Alyssa developed the initial version when she worked with us. It's a really good patch, but it uses the same GPU backend implemented in crosvm, so I assume it should work more or less the same.

@nesteroff
Contributor Author

Updated to support autostart of the VM with virtio-gpu

@Mic92
Collaborator

Mic92 commented Apr 23, 2024

> I wonder if there is a way to make wayland-proxy-virtwl support the use case when the compositor is in its own VM?

I think something related to nested compositing was added only very recently to the Wayland protocol, or was it related to having an X11 window manager in a Wayland session? I cannot remember exactly.

@Mic92
Collaborator

Mic92 commented Apr 23, 2024

> That's cool! To me it's even hard to read OCaml to be honest.

Yeah. It's also not my favorite language, and the person maintaining the project writes abstractions that are advanced, at least for my level. Anyway, I happened to build up some understanding of the Wayland protocol, in case there is some need for it.

@nesteroff
Contributor Author

> I think something related to nested compositing was added only very recently to the Wayland protocol, or was it related to having an X11 window manager in a Wayland session? I cannot remember exactly.

Nested compositing seems to work fine on Wayland nowadays, but I'm not sure it's something we could use for the Lenovo X1 target. That target has a single compositor, but it runs in its own VM with GPU passthrough instead of on the host. I guess we can try to figure out how to share blobs and pipes over virtio-gpu in a guest-to-guest scenario instead of guest-to-host, if that is even feasible.

@nesteroff nesteroff marked this pull request as ready for review April 29, 2024 14:39
@nesteroff nesteroff requested a review from mikatammi May 2, 2024 10:28
Signed-off-by: Yuri Nesterov <yuriy.nesterov@unikie.com>

2 participants