
Memory sharing between virtual machines #473

Open
wants to merge 1 commit into base: main from jkuro-add-wayland-shm-lenovo_rebase

Conversation

@jkuro-tii commented Feb 13, 2024

A draft PR for a memory-sharing solution used for socket-to-socket data transfer between virtual machines. It can
be used for Wayland displaying.

The purpose of this PR is to share the code and try to find answers to the problem that arose.

Jira ticket:
https://ssrc.atlassian.net/browse/SP-3805

Documentation:
https://ssrc.atlassian.net/wiki/spaces/~62552e6ffdb60b006927ad98/blog/2022/09/29/612958326/Memory+sharing+between+virtual+machines
https://ssrc.atlassian.net/wiki/spaces/~62552e6ffdb60b006927ad98/pages/825720835/Wayland+displaying+with+shared+memory

Scenario that executes properly

Lenovo P1 with Fedora 38 installed. For the display and application VMs, there are two
separate vm-debug virtual machines with the shared-memory driver and the memsocket application
additionally installed.

Actual
YouTube playback is smooth, up to a resolution of 1440p.
CPU consumption by the memsocket application is below 5%, usually 2-3%.

Regular Ghaf Lenovo X1 scenario

Build Ghaf for the Lenovo X1 target and run it. 
Connect to WiFi, run Chrome, and play YouTube videos.

Actual
Video is choppy; the memsocket application uses around 100% CPU.

Expected
YouTube playback is smooth, with a resolution of up to 1440p.

Another test was performed on a Lenovo X1 with Fedora installed, running virtual machines
extracted from the Ghaf build.
The result was the same as with Ghaf, i.e., video quality was unacceptable and CPU consumption
by the memsocket application was also high, around 100%.

Observations
Performance measurements were taken on both the GuiVM and the host.
The perf data files:
perf_host.tar.gz
perf_guivm.tar.gz

Results on GuiVM

```
Samples: 387K of event 'cpu-clock:pppH', Event count (approx.): 96985250000
  Children      Self  Command    Shared Object      Symbol
    99.96%     0.00%  memsocket  [kernel.kallsyms]  [k] entry_SYSCALL_64_after_hwframe
    99.96%     0.00%  memsocket  [kernel.kallsyms]  [k] do_syscall_64
    99.84%     0.00%  memsocket  [kernel.kallsyms]  [k] __sock_sendmsg
    99.83%     0.00%  memsocket  [kernel.kallsyms]  [k] unix_stream_sendmsg
    99.82%     0.00%  memsocket  libc.so.6          [.] __send
    99.82%     0.00%  memsocket  [kernel.kallsyms]  [k] __x64_sys_sendto
    99.81%     0.00%  memsocket  [kernel.kallsyms]  [k] __sys_sendto
    99.71%     0.00%  memsocket  [kernel.kallsyms]  [k] skb_copy_datagram_from_iter
    99.70%     0.01%  memsocket  [kernel.kallsyms]  [k] _copy_from_iter
    99.69%    99.55%  memsocket  [kernel.kallsyms]  [k] copy_user_enhanced_fast_string
    89.30%     0.00%  memsocket  [kernel.kallsyms]  [k] copy_page_from_iter
     0.14%     0.12%  memsocket  [kernel.kallsyms]  [k] __softirqentry_text_start
     0.14%     0.00%  memsocket  [kernel.kallsyms]  [k] __irq_exit_rcu
```

Results on host

```
Samples: 152K of event 'cpu_atom/cycles/', Event count (approx.): 5979663105
  Children      Self  Command          Shared Object      Symbol
    78.00%     0.00%  .qemu-system-x8  libc.so.6          [.] __GI___ioctl
    76.53%     0.00%  .qemu-system-x8  [kernel.kallsyms]  [k] entry_SYSCALL_64_after_hwframe
    76.53%     0.02%  .qemu-system-x8  [kernel.kallsyms]  [k] do_syscall_64
    70.29%     0.00%  .qemu-system-x8  [kernel.kallsyms]  [k] __x64_sys_ioctl
    70.28%     0.00%  .qemu-system-x8  [kvm]              [k] kvm_vcpu_ioctl
    70.27%     2.07%  .qemu-system-x8  [kvm]              [k] kvm_arch_vcpu_ioctl_run
    31.98%     2.91%  .qemu-system-x8  [kvm_intel]        [k] vmx_vcpu_run
    21.27%    21.20%  .qemu-system-x8  [kernel.kallsyms]  [k] native_write_msr
    17.35%     0.80%  .qemu-system-x8  [kvm]              [k] kvm_vcpu_halt
    13.89%     0.19%  .qemu-system-x8  [kvm]              [k] kvm_vcpu_block
    12.63%     0.23%  .qemu-system-x8  [kernel.kallsyms]  [k] schedule
    12.41%     0.41%  .qemu-system-x8  [kernel.kallsyms]  [k] __schedule
    10.47%     0.33%  .qemu-system-x8  [kvm]              [k] kvm_load_host_xsave_state
    10.19%     0.17%  .qemu-system-x8  [kvm]              [k] kvm_load_guest_xsave_state
```

Both measurements indicate that the time is spent in the kernel's __sock_sendmsg path,
when the send() function is used to write a buffer located in shared memory into
the Wayland socket.
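To cross-check this on a running system, one can also attach perf counters to the memsocket process while a video is playing; a minimal sketch, assuming perf and pgrep are available inside the GuiVM (process name and duration are just examples):

```bash
# Count default perf events (task-clock, context switches, cycles,
# instructions, ...) for the running memsocket process for 10 seconds.
# Substitute the PID from "systemctl status --user memsocket.service"
# if pgrep matches nothing.
sudo perf stat -p "$(pgrep -o memsocket)" -- sleep 10
```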

@jkuro-tii jkuro-tii temporarily deployed to internal-build-workflow February 13, 2024 10:17 — with GitHub Actions Inactive
@jkuro-tii jkuro-tii force-pushed the jkuro-add-wayland-shm-lenovo_rebase branch from d6017d1 to eef36be on February 13, 2024 10:19
@jkuro-tii jkuro-tii temporarily deployed to internal-build-workflow February 13, 2024 10:19 — with GitHub Actions Inactive
@jenninikko

Are you planning to add documentation to the repository?

@vilvo left a comment

  1. Please write the PR description according to the template - https://github.com/tiiuae/ghaf/blob/main/.github/pull_request_template.md
  2. Please fill in the PR template checklist, describe testing, and remove the "Draft" status.
  3. Re-request review.

@jkuro-tii

The purpose of this PR is to share the code and to work together on solving the mentioned problem of high kernel overhead.

@jkuro-tii

Getting perf data: perf is a standard Linux tool (https://perf.wiki.kernel.org/index.php/Main_Page). In this PR the perf component is included in the host, GuiVM and AppVM VMs. The first stage of using perf is collecting execution data for a certain period of time (e.g. sudo perf record -a -g -p PID); the data is written into the perf.data file. The data can then be visualized using the sudo perf report subcommand. perf report uses /proc/kallsyms to obtain information about kernel symbols. If you want to process the data offline, copy /proc/kallsyms and refer to it with the perf report --kallsyms=<file> option. The files that I attached contain both the perf.data and the kernel symbols files.

To check the memsocket application performance on the GuiVM:

  1. Find the PID of the memsocket service: systemctl status --user memsocket.service | grep "Main PID"
  2. Collect perf data: sudo perf record -a -g -p PID
  3. Process it with perf report.

To check GuiVM's qemu performance on the host:

  1. Start terminal
  2. Login to host: ssh 192.168.101.2
  3. Find PID of qemu instance running GuiVM: systemctl status microvm@gui-vm.service | grep "Main PID"
  4. Collect perf data: sudo perf record -a -g -p PID
  5. Process it with perf report.

All files kept in the /home/ghaf directory on the host are stored on the Ghaf boot disk, which can be used to access them from another computer. For example, files from the GuiVM can be copied to the host using scp _file_ ghaf@192.168.101.2: and accessed offline.
Sample reading about perf:
1, 2
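Putting the host-side steps above together, a minimal sketch of collecting the data for offline analysis (the duration and file names are examples, not part of the PR):

```bash
# On the host: record 30 seconds of call-graph samples for the gui-vm QEMU.
PID=$(systemctl show -p MainPID --value microvm@gui-vm.service)
sudo perf record -a -g -p "$PID" -- sleep 30

# Keep the kernel symbol table next to perf.data so the profile can be
# decoded on another machine.
sudo cp /proc/kallsyms kallsyms.host

# Later, offline:
#   perf report -i perf.data --kallsyms=kallsyms.host
```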

@vilvo commented Feb 14, 2024

> To check the memsocket application performance on GuiVM

Thanks. I can reproduce your perf measurement and perf report analysis in the Youtube playing scenario. In addition, the user experience is terrible.

Behind the link in the description, there is an interesting mention of:

> Result
> Throughput socket to socket:
> 2.8 GB/s = 22.4 Gb/s
> using ssh: 140 MB/s = 1.1 Gb/s
>
> Lenovo Thinkpad P1 Intel i7 @2.5 GHz
>
> Sending 1 MiB buffer 8 K times (total 80 GiB sent)

Can you please share a reproducible iperf test description? I'd like to reproduce it and establish a Lenovo X1 Carbon Gen 11 baseline throughput. Another thing we could then check is parametrizing the buffer size to find where we hit the ceiling.

@jkuro-tii

I did iperf measurements with the unsock tool, in order to tunnel TCP traffic over the socket and shared memory.

Test configuration:
chromium-vm: iperf -c with unsock -> [socket] -> memsocket -> [shared memory]
guivm: [shared memory] -> memsocket -> [socket] -> iperf -s with unsock

Performance achieved:
Lenovo X1 with Ghaf: ~90 Mbits/sec
Lenovo P1 with Fedora Linux and two Ghaf vm-debug machines (my development environment): ~32 Gbit/s

result.client_x1.log
result.server_x1.log
result.server.log
result.client.log
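For reference, the client invocation (quoted by @vilvo below) pairs roughly with a matching server; a sketch under the assumption that libunsock is built locally and memsocket already exposes a socket in the current directory (library path and port are placeholders):

```bash
# Receiving VM: iperf server, with unsock's LD_PRELOAD shim mapping the
# TCP listen address onto a Unix socket in UNSOCK_DIR.
UNSOCK_ADDR=127.0.0.1 UNSOCK_DIR="$PWD" \
  LD_PRELOAD=~/unsock/libunsock.so.1.1.0 iperf -s -p 1234

# Sending VM: iperf client over the same shared-memory-backed socket.
UNSOCK_ADDR=127.0.0.1 UNSOCK_DIR="$PWD" \
  LD_PRELOAD=~/unsock/libunsock.so.1.1.0 iperf -c 127.0.0.1 -p 1234
```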

@vilvo commented Feb 19, 2024

> I did iperf measurements with the unsock tool, in order to tunnel TCP traffic over the socket and shared memory.
>
> Test configuration: chromium-vm: iperf -c with unsock -> [socket] -> memsocket -> [shared memory]; guivm: [shared memory] -> memsocket -> [socket] -> iperf -s with unsock
>
> Performance achieved: Lenovo X1 with Ghaf: ~90 Mbits/sec; Lenovo P1 with Fedora Linux and two Ghaf vm-debug machines (my development environment): ~32 Gbit/s
>
> result.client_x1.log result.server_x1.log [result.server.log](https://github.com/tiiuae/ghaf/files/14308083/result.server.log) result.client.log

[ghaf@ghaf-host:~/shmsockproxy/app]$ UNSOCK_ADDR=127.0.0.1 UNSOCK_DIR=`pwd` LD_PRELOAD=~/unsock/libunsock.so.1.1.0 iperf -c 127.0.0.1 -p 1234

This indicates you have compiled some dependency in the current working directory (pwd?) of the shmsockproxy app, to use some version (which git commit hash?) of libunsock with iperf. From the above instructions it's unclear how to replicate the test setup, and I'd rather not guess.

Optionally, the test tools you have used could be Nix-packaged and included in the PR's -debug build.

Would it make sense to add some simple unit throughput test through memsocket instead? What do you think?

@vilvo commented Feb 19, 2024

If I go a bit further and try to compile memsocket on ghaf-host, I get a lot of gcc warnings about -Wformat-extra-args and -Wformat-overflow= (just a small snippet below).

[ghaf@ghaf-host:~]$ nix-shell -p git stdenv

[nix-shell:~]$ cd shmsockproxy/app/

[nix-shell:~/shmsockproxy/app]$ gcc memsocket.c
memsocket.c: In function ‘server_init’:
memsocket.c:175:8: warning: too many arguments for format [-Wformat-extra-args]
  175 |   INFO("server initialized", "");
...
memsocket.c: In function ‘wayland_connect’:
memsocket.c:50:19: warning: ‘%s’ directive writing up to 255 bytes into a region of size 235 [-Wformat-overflow=]
...

Edit. git was used to clone:

[nix-shell:~]$ git clone https://github.com/tiiuae/shmsockproxy.git
[nix-shell:~]$ cd shmsockproxy/
[nix-shell:~/shmsockproxy]$ git log --oneline -1
8b1bf60 (HEAD -> main, origin/main, origin/HEAD) Removed forking, as systemd sends SIGTERM

@vilvo commented Feb 19, 2024

Also, I think this is relevant to the performance comparison between the X1 and the desktop - #340 (review).
So we can now use the Ghaf internal NVMe instead of an external SSD over USB3.

@jkuro-tii

> If I try a bit further, to compile the memsocket on ghaf-host I get a lot of gcc warnings on -Wformat-extra-args and -Wformat-overflow= (just a small snippet below).
>
> memsocket.c: In function ‘wayland_connect’:
> memsocket.c:50:19: warning: ‘%s’ directive writing up to 255 bytes into a region of size 235 [-Wformat-overflow=]
> ...
>
> Edit. `git` was used to clone:
>
> [nix-shell:~]$ git clone https://github.com/tiiuae/shmsockproxy.git
> [nix-shell:~]$ cd shmsockproxy/

Fixed.

@jkuro-tii

The data path is:

chromium -> [socket] -> waypipe -> [socket] -> memsocket -> [shared memory] -> memsocket send() -> [socket] -> waypipe -> wayland

> If I try a bit further, to compile the memsocket on ghaf-host I get a lot of gcc warnings on -Wformat-extra-args and -Wformat-overflow= (just a small snippet below).
>
> [ghaf@ghaf-host:~]$ nix-shell -p git stdenv
>
> [nix-shell:~]$ cd shmsockproxy/app/
>
> [nix-shell:~/shmsockproxy/app]$ gcc memsocket.c
> memsocket.c: In function ‘server_init’:
> memsocket.c:175:8: warning: too many arguments for format [-Wformat-extra-args]
>   175 |   INFO("server initialized", "");
> ...
> memsocket.c: In function ‘wayland_connect’:
> memsocket.c:50:19: warning: ‘%s’ directive writing up to 255 bytes into a region of size 235 [-Wformat-overflow=]
> ...
>
> Edit. git was used to clone:
>
> [nix-shell:~]$ git clone https://github.com/tiiuae/shmsockproxy.git
> [nix-shell:~]$ cd shmsockproxy/
> [nix-shell:~/shmsockproxy]$ git log --oneline -1
> 8b1bf60 (HEAD -> main, origin/main, origin/HEAD) Removed forking, as systemd sends SIGTERM

I added a binary library and a few commands by hand into the chromium and gui VMs. Is it OK if I attach them here, or do you want to include them in the Ghaf build?

@vilvo commented Feb 19, 2024

> I added a binary library and a few commands by hand into the chromium and gui VMs. Is it OK if I attach them here, or do you want to include them in the Ghaf build?

This command:

[ghaf@ghaf-host:~/shmsockproxy/app]$ UNSOCK_ADDR=127.0.0.1 UNSOCK_DIR=`pwd` LD_PRELOAD=~/unsock/libunsock.so.1.1.0 iperf -c 127.0.0.1 -p 1234

from here indicates that some commands are also required on ghaf-host, not only on chromium-vm and gui-vm, if one wants to reproduce the iperf throughput measurements. Now I'm not quite sure whether that's the case or not. So I still have some trouble following the testing procedure, and whether we are talking about the integrated YouTube-streaming case or the iperf measurement case. It would probably be clearer if we added a reproducible test setup to the Ghaf -debug build and clarified the instructions.

@jkuro-tii commented Feb 19, 2024

The memsocket app in the gui-vm forwards all the data coming from shared memory to Waypipe, so we can't run socket tests in a regular setup. If we want to perform them, memsocket must be stopped in the gui-vm (sudo systemctl stop --user memsocket.service) and then started by hand with a custom socket path. The socket it creates can then be used by libunsock + iperf.
I uploaded sample scripts - update the memsocket binary location for your build (e.g. from systemctl status --user memsocket.service); I hope they'll clarify how to do it. Otherwise I'll prepare the build.
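As a rough outline of that manual setup (the flags, socket path, and numeric argument follow the sample sessions later in this thread; the exact meaning of each memsocket option is defined by the scripts in the PR):

```bash
# In both peer VMs: stop the service that normally owns the channel,
# otherwise all data is forwarded to Waypipe's socket.
systemctl stop --user memsocket.service

# Receiving VM (e.g. gui-vm): expose a test socket backed by the shared memory.
memsocket -c ./test.sock &

# Sending VM (e.g. chromium-vm): attach to the same shared memory and create
# the sender-side socket (the trailing number is used as in the examples below).
memsocket -s ./test.sock 2 &

# The two ./test.sock endpoints can now be used by libunsock + iperf.
```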

iperf.tar.gz
I also ran:
time dd if=/dev/zero of=/dev/ivshmem
time dd of=/dev/null if=/dev/ivshmem
in the Chromium and Gui VMs. It seems something is wrong with the GuiVM: dd achieves 17.7 MB/s there, whereas on the Chromium VM it's 1.6 GB/s. The kernel configuration of both machines is identical.
This is in line with playing YouTube: the high overhead happens only on the GuiVM.

@vilvo commented Feb 19, 2024

> The memsocket app in the gui-vm forwards all the data coming from shared memory to Waypipe, so we can't run socket tests in a regular setup. If we want to perform them, memsocket must be stopped in the gui-vm (sudo systemctl stop --user memsocket.service) and then started by hand with a custom socket path. The socket it creates can then be used by libunsock + iperf. I uploaded sample scripts - update the memsocket binary location for your build (e.g. from systemctl status --user memsocket.service); I hope they'll clarify how to do it. Otherwise I'll prepare the build.

Ok, thanks. That confirms my understanding of the real scenario vs. the perf test scenario. I tried to rebase your PR onto the main branch to install it to NVMe (instead of the external SSD) using the installer that was merged on Friday, but there was a merge conflict. Would you mind resolving that and updating the draft PR branch?

> iperf.tar.gz I also ran: time dd if=/dev/zero of=/dev/ivshmem and time dd of=/dev/null if=/dev/ivshmem in the Chromium and Gui VMs. It seems something is wrong with the GuiVM - dd achieves 17.7 MB/s, whereas on the Chromium VM it's 1.6 GB/s. The kernel configuration of both machines is identical. This is in line with playing YouTube: the high overhead happens only on the GuiVM.

This is a good approach - super simple, and it confirms the scenario. So there's nothing throttling the write on the chromium-vm side; we get good throughput. The reader number indicates it can't handle reading as fast. If the same happens in the YouTube scenario (e.g. lost frames), I'm guessing the decoder will request resending of the frames, which is not only visible as jittery playback but should also be visible as increased network usage. Another scenario to test is the other way round: dd so that the gui-vm writes and the chromium-vm reads. The problem with the dd approach is that we don't see how much of the sent data we are losing. If it was done using a known big file (instead of /dev/zero) - e.g. a randomly generated 20 GB file with a calculated md5sum - we could control the block size and number of iterations, regenerate the read data into a file, and calculate a corresponding checksum.
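A minimal sketch of that idea, assuming the manual test-socket setup described above is used as the transport (sizes, file names and the checksum tool are only examples):

```bash
# Sender side: generate a known payload and record its checksum
# (1 GiB here for illustration; the comment above suggests ~20 GB).
dd if=/dev/urandom of=payload.bin bs=1M count=1024
sha256sum payload.bin | tee payload.sha256

# ... transfer payload.bin over the shared-memory channel in fixed-size
# blocks and reassemble it on the receiver as received.bin ...

# Receiver side: checksum the reassembled file and compare with the
# sender's value.
sha256sum received.bin
```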

@jkuro-tii jkuro-tii force-pushed the jkuro-add-wayland-shm-lenovo_rebase branch from eef36be to 6224584 on February 20, 2024 08:21
@jkuro-tii jkuro-tii temporarily deployed to internal-build-workflow February 20, 2024 08:21 — with GitHub Actions Inactive
@jkuro-tii

It doesn't seem we are facing data loss. The connection between the remote and local Waypipe instances is multichannel and data is sent in both directions. After a portion of data is sent from the AppVM, the GuiVM sends a short packet of data, probably an acknowledgement; if it is not received by the AppVM, there is no further transmission. Also, if we faced data loss, GUI elements - windows, menus, scroll bars - would be damaged, and we don't see that.

I've pushed my branch rebased with main.

@vilvo commented Feb 20, 2024

> I've pushed my branch rebased with main.

Thanks, I installed your rebased draft branch build to X1 NVMe with:

nix build .#packages.x86_64-linux.lenovo-x1-carbon-gen11-debug-installer
dd if=result/iso/nixos-23.11.20231218.3a9928d-x86_64-linux.iso ...
<boot and install to nvme0n1>

The problem with the dd method is that the numbers tell very little. Writing stops with:

[ghaf@gui-vm:~]$ time dd if=/dev/zero of=/dev/ivshmem
dd: writing to '/dev/ivshmem': No space left on device
32769+0 records in
32768+0 records out
16777216 bytes (17 MB, 16 MiB) copied, 0.688331 s, 24.4 MB/s

real	0m0.692s
user	0m0.000s
sys	0m0.690s

@vilvo commented Feb 20, 2024

During the YouTube streaming scenario there's very heavy load on the QEMU processes, as seen from htop on ghaf-host.
The message-signaled interrupts on chromium-vm and gui-vm are roughly ~50-60/s, as seen with watch -n1 "cat /proc/interrupts".
(screenshot: qemu-load-with-youtube-streaming)
When you stop the streaming, there's no QEMU load.

@vilvo commented Feb 20, 2024

We're not using aio=native with QEMU. Please check
https://hex.ro/wp/blog/kvm-qemu-high-cpu-load-but-low-cpu-usage/
I checked strace during streaming and we've got the same symptom - the output is littered with events=POLLIN, and there is a ton of anon_inode:[eventfd] entries under the QEMU /proc/<pid>/fd.
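For context, aio=native is a per-drive QEMU option; a hypothetical fragment of what the drive definition could look like (image path and device are placeholders - note that aio=native has to be combined with cache=none / O_DIRECT, while QEMU defaults to aio=threads):

```bash
# Hypothetical fragment of a QEMU command line using Linux native AIO for
# the disk instead of the default thread pool.
-drive file=/path/to/disk.img,if=none,id=drive1,cache=none,aio=native \
-device virtio-blk-pci,drive=drive1
```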

@vilvo commented Feb 20, 2024

And this presentation, linked from the blog post in the previous comment, is extremely relevant for optimizing our case - https://vmsplice.net/~stefan/stefanha-kvm-forum-2017.pdf

@jkuro-tii commented Feb 21, 2024

I designed a utility for testing the connection.
On one side (e.g. chromium-vm) it sends a file to a socket; on the other side (e.g. gui-vm) it receives it, without writing anything. The transmission is protected by a CRC-8 checksum.
Both sockets are connected by the memsocket app.
It's possible to specify how much data to send, and to force a broken CRC (by adding anything as a 4th parameter), e.g.:

memtest socket_path /dev/random 10M
memtest socket_path /dev/random 10M anything # for sending a broken CRC

First start it on the receiving side (any of the VMs):

memsocket -c ./test.socket & # run it once
memtest ./test.socket

Now file(s) can be sent from another VM:

memsocket -s ./test.socket 2 & # run it once
memtest ./test.socket /dev/random 10M
memtest ./test.socket /dev/random 10M wrong_crc

It prints some statistics.
Before using it, it's necessary to stop the memsocket service (systemctl stop --user memsocket.service) on the gui-vm and the other peer VM (e.g. chromium)! The service directs all data to Waypipe's socket, so it must be stopped.

Sample sessions:

[ghaf@chromium-vm:~]$ memtest
Usage: memtest socket_path for receiving.
memtest socket_path input_file [size CRC_error] for sending.
Before using stop the memsocket service: 'systemctl stop --user memsocket.service'.
For receiving run e.g.: 'memsocket -c ./test.sock &; memtest ./test.sock'.
To send a file run on other VM: 'memsocket -s ./test.sock 3 &; memtest ./test.sock /dev/random 10M.'
To force wrong CRC on sending: 'memtest ./test.sock /dev/random 10M xxx'

[ghaf@chromium-vm:~]$ systemctl stop --user memsocket.service

[ghaf@chromium-vm:~]$ memsocket -s ./test.sock 3 &
[1] 776

[ghaf@chromium-vm:~]$ memtest ./test.sock /dev/random 10M
Sent/received 10485761 bytes 2 Mbytes/sec
real 5.030s
user 0.010s
sys 0.030s

[ghaf@chromium-vm:~]$ memtest ./test.sock /dev/random 100M
Sent/received 104857601 bytes 4 Mbytes/sec
real 27.640s
user 0.250s
sys 0.320s

[ghaf@chromium-vm:~]$ memtest ./test.sock /dev/random 100M
Sent/received 104857601 bytes 244 Mbytes/sec
real 0.410s
user 0.200s
sys 0.200s

[ghaf@chromium-vm:~]$ memtest ./test.sock /dev/random 100M ui # to break the CRC
Sent/received 104857601 bytes 263 Mbytes/sec
real 0.380s
user 0.170s
sys 0.200s

[ghaf@chromium-vm:~]$ memtest ./test.sock /dev/random 100M
Sent/received 104857601 bytes 263 Mbytes/sec
real 0.380s
user 0.190s
sys 0.180s

Usage: memtest socket_path for receiving.
memtest socket_path input_file [size CRC_error] for sending.
Before using stop the memsocket service: 'systemctl stop --user memsocket.service'.
For receiving run e.g.: 'memsocket -c ./test.sock &; memtest ./test.sock'.
To send a file run on other VM: 'memsocket -s ./test.sock 3 &; memtest ./test.sock /dev/random 10M.'
To force wrong CRC on sending: 'memtest ./test.sock /dev/random 10M xxx'

[ghaf@zathura-vm:~]$ systemctl stop --user memsocket.service

[ghaf@zathura-vm:~]$ memsocket -c ./test.sock &
[1] 670

[ghaf@zathura-vm:~]$ memtest ./test.sock
Waiting for a connection.
Connected...
Sent/received 104857602 bytes 159 Mbytes/sec
real 0.630s
user 0.270s
sys 0.030s
Read 104857601 bytes. CRC OK
Connected...
Sent/received 104857602 bytes 244 Mbytes/sec
real 0.410s
user 0.200s
sys 0.010s
Read 104857601 bytes. CRC OK
Connected...
Sent/received 104857602 bytes 263 Mbytes/sec
real 0.380s
user 0.180s
sys 0.020s
Read 104857601 bytes. TRAMISSION ERROR!
Connected...
Sent/received 104857602 bytes 263 Mbytes/sec
real 0.380s
user 0.190s
sys 0.010s
Read 104857601 bytes. CRC OK

@jkuro-tii jkuro-tii closed this Feb 21, 2024
@jkuro-tii jkuro-tii reopened this Feb 21, 2024
@jkuro-tii jkuro-tii temporarily deployed to internal-build-workflow February 21, 2024 10:16 — with GitHub Actions Inactive
@jkuro-tii

I noticed that sending data between the gui-vm and any other machine (zathura-vm, chromium-vm) is significantly slower than sending between app VMs (e.g. zathura-vm and chromium-vm).

On my desktop I use qemu version 8.0.5. The full run command line (it's a Ghaf vm-debug build):
exec /nix/store/ppfkk3h94b3s3vhifp55q80kr5y8bzvc-qemu-host-cpu-only-8.0.5/bin/qemu-kvm -cpu max \
  -name ghaf-host \
  -m 1024 \
  -smp 1 \
  -device virtio-rng-pci \
  -net nic,netdev=user.0,model=virtio -netdev user,id=user.0,"$QEMU_NET_OPTS" \
  -virtfs local,path=/nix/store,security_model=none,mount_tag=nix-store \
  -virtfs local,path="${SHARED_DIR:-$TMPDIR/xchg}",security_model=none,mount_tag=shared \
  -virtfs local,path="$TMPDIR"/xchg,security_model=none,mount_tag=xchg \
  -drive cache=writeback,file="$NIX_DISK_IMAGE",id=drive1,if=none,index=1,werror=report -device virtio-blk-pci,bootindex=1,drive=drive1 \
  -device virtio-keyboard \
  -usb \
  -device usb-tablet,bus=usb-bus.0 \
  -kernel ${NIXPKGS_QEMU_KERNEL_ghaf_host:-/nix/store/sgv6vmzrbkghz8a0hr05garms07cqz2m-nixos-system-ghaf-host-23.05.20230927.5cfafa1/kernel} \
  -initrd /nix/store/sgv6vmzrbkghz8a0hr05garms07cqz2m-nixos-system-ghaf-host-23.05.20230927.5cfafa1/initrd \
  -append "$(cat /nix/store/sgv6vmzrbkghz8a0hr05garms07cqz2m-nixos-system-ghaf-host-23.05.20230927.5cfafa1/kernel-params) init=/nix/store/sgv6vmzrbkghz8a0hr05garms07cqz2m-nixos-system-ghaf-host-23.05.20230927.5cfafa1/init regInfo=/nix/store/w7b65bipqayzsi6w2ksjvd1yi7m8b0x3-closure-info/registration console=ttyS0,115200n8 console=tty0 $QEMU_KERNEL_PARAMS" \
  $QEMU_OPTS \
  "$@"

@jkuro-tii

I ran my tool in several configurations. Results:

  • there is no data loss or data corruption in any scenario
  • transfers between chromium-vm and zathura-vm are fast - ~250 MBytes/s
  • transfers between any of the above VMs and gui-vm are very slow, hardly reaching 12 MB/s

Shutting down Weston alone gave no improvement, but doing that and eliminating the GPU passthrough solved the problem. After that, transfers to/from the gui-vm reach the same high speed as the inter-app-VM transfers.

I'll continue with different virtual PCI device setups and try to use QEMU's own monitor to find the bottleneck.

@jkuro-tii commented Feb 26, 2024

The driver already contains interrupt counters. I verified the number of interrupts sent by one VM against the numbers listed in /proc/interrupts. They are in line, i.e. the number of interrupts raised by one VM matches the number in /proc/interrupts in the other VM, as well as the count kept by the device driver. This means there is no interrupt loop and no missed interrupts.

Delays and huge CPU load happen only when the GPU is passed through. Without it, the CPU load is less than 5%.
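A quick way to double-check the interrupt rate from inside a VM is to diff /proc/interrupts over a fixed interval; a small sketch, assuming the ivshmem vectors can be matched by name in /proc/interrupts (the grep pattern may need adjusting to the actual device name):

```bash
# Snapshot the matching interrupt counters, wait, and show what changed.
grep -i ivshmem /proc/interrupts > /tmp/irq.before
sleep 10
grep -i ivshmem /proc/interrupts > /tmp/irq.after
diff /tmp/irq.before /tmp/irq.after
```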

@jkuro-tii jkuro-tii temporarily deployed to internal-build-workflow March 19, 2024 11:09 — with GitHub Actions Inactive
@jkuro-tii jkuro-tii had a problem deploying to external-build-workflow March 19, 2024 11:09 — with GitHub Actions Failure
@jkuro-tii jkuro-tii force-pushed the jkuro-add-wayland-shm-lenovo_rebase branch from 379f0cf to d8e1c2a on March 19, 2024 11:11
@jkuro-tii jkuro-tii temporarily deployed to internal-build-workflow March 19, 2024 11:11 — with GitHub Actions Inactive
@jkuro-tii jkuro-tii had a problem deploying to external-build-workflow March 19, 2024 11:11 — with GitHub Actions Failure
@jkuro-tii jkuro-tii changed the title from "Memory sharing for wayland displaying" to "Memory sharing between virtual machines" on Apr 5, 2024
@jkuro-tii jkuro-tii force-pushed the jkuro-add-wayland-shm-lenovo_rebase branch from d8e1c2a to f0ca97a on April 5, 2024 09:03
@jkuro-tii jkuro-tii temporarily deployed to internal-build-workflow April 5, 2024 09:03 — with GitHub Actions Inactive
@jkuro-tii jkuro-tii had a problem deploying to external-build-workflow April 5, 2024 09:03 — with GitHub Actions Failure
@jkuro-tii commented Apr 5, 2024

Added some improvements:

  • allocating the shared memory area using huge pages (2 MB size) - a QEMU fragment is sketched below
  • adding to the QEMU ivshmem driver the ability to map the above memory directly into the VM physical address space (currently it is mapped into the PCI memory area); the physical memory address is set on the kernel command line.
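For context, the hugepage-backed allocation in the first bullet can be expressed with standard QEMU options; a sketch of the host-side fragment, assuming a /dev/hugepages mount and an example region size (the direct mapping into guest physical address space from the second bullet is a custom ivshmem modification and is not shown here):

```bash
# Hypothetical QEMU fragment: back the shared-memory region with 2 MiB huge
# pages and expose it to the guests via the stock ivshmem-plain device.
-object memory-backend-file,id=shmmem,mem-path=/dev/hugepages,size=32M,share=on \
-device ivshmem-plain,memdev=shmmem
```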

@jkuro-tii jkuro-tii force-pushed the jkuro-add-wayland-shm-lenovo_rebase branch from f0ca97a to 5ea3c2c on April 5, 2024 09:20
@jkuro-tii jkuro-tii temporarily deployed to internal-build-workflow April 5, 2024 09:20 — with GitHub Actions Inactive
@jkuro-tii jkuro-tii had a problem deploying to external-build-workflow April 5, 2024 09:21 — with GitHub Actions Failure
@jkuro-tii jkuro-tii marked this pull request as ready for review April 5, 2024 09:46
@jkuro-tii jkuro-tii force-pushed the jkuro-add-wayland-shm-lenovo_rebase branch from 5ea3c2c to 2967781 on April 23, 2024 10:21
@jkuro-tii jkuro-tii temporarily deployed to internal-build-workflow April 23, 2024 10:21 — with GitHub Actions Inactive
@jkuro-tii jkuro-tii had a problem deploying to external-build-workflow April 23, 2024 10:21 — with GitHub Actions Failure
Signed-off-by: Jaroslaw Kurowski <jaroslaw.kurowski@tii.ae>
@jkuro-tii jkuro-tii force-pushed the jkuro-add-wayland-shm-lenovo_rebase branch from 2967781 to dec7369 on April 25, 2024 06:24
@jkuro-tii jkuro-tii temporarily deployed to internal-build-workflow April 25, 2024 06:24 — with GitHub Actions Inactive
@jkuro-tii jkuro-tii had a problem deploying to external-build-workflow April 25, 2024 06:24 — with GitHub Actions Failure