Notification of VM Events #8547

izhouwu · 2024-01-19T08:02:00Z

When any of the following VM events occurs, the hypervisor and device model shall send an event in JSON format through the device model command monitor. Each event shall encode the event type and the information listed below.

Changes to the RTC (the offset of the new vRTC value from the old value is required).
Crashes due to triple fault.
Watchdog timeout.
Shutdown.
Reboot.
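
As an illustration only, the following C sketch formats one such event as a JSON string. The JSON field names (`vm_event`, `offset`), the enum values, and the function name are hypothetical; the requirement above fixes only that the event type and the listed information are encoded.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical event types mirroring the list above. */
enum vm_event_type {
    VM_EVENT_RTC_CHG,
    VM_EVENT_TRIPLE_FAULT,
    VM_EVENT_WATCHDOG_TIMEOUT,
    VM_EVENT_SHUTDOWN,
    VM_EVENT_REBOOT,
    VM_EVENT_COUNT
};

static const char *const vm_event_names[VM_EVENT_COUNT] = {
    "rtc_chg", "triple_fault", "watchdog_timeout", "shutdown", "reboot"
};

/*
 * Format an event into buf. Only the RTC change event carries extra data:
 * the offset, in seconds, of the new vRTC value from the old one.
 * Returns the number of characters written, or -1 on bad input or truncation.
 */
int vm_event_to_json(enum vm_event_type type, long long rtc_offset_sec,
                     char *buf, size_t len)
{
    int n;

    if ((int)type < 0 || (int)type >= VM_EVENT_COUNT || buf == NULL)
        return -1;

    if (type == VM_EVENT_RTC_CHG)
        n = snprintf(buf, len, "{\"vm_event\":\"%s\",\"offset\":%lld}",
                     vm_event_names[type], rtc_offset_sec);
    else
        n = snprintf(buf, len, "{\"vm_event\":\"%s\"}", vm_event_names[type]);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

A caller would pass the resulting string to whatever transport the command monitor uses; that part is not shown here.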

izhouwu · 2024-01-19T08:02:48Z

[External_System_ID] ACRN-9966

Some of the DM's virtual timer devices use CLOCK_REALTIME as either their clock counter source or their periodic timer source, including:
- virtual RTC
- virtual PIT
- virtual HPET

According to the Linux manual, CLOCK_REALTIME is the 'wall clock', which is affected by discontinuous jumps in the system time.

The issue is that the Service VM system time can be changed, either manually by the root user or automatically by NTP calibration. When that happens, the DM's virtual timer devices that rely on CLOCK_REALTIME experience a discontinuous time jump and become inaccurate. This affects both timestamp reads and periodic timers. In particular, when the Service VM system time is moved backwards, WaaG's system software stops responding and stalls for quite a long time.

To solve this issue, we need to switch from CLOCK_REALTIME to CLOCK_MONOTONIC, which is described as: 'A nonsettable monotonically increasing clock that measures time from some unspecified point in the past that does not change after system startup'.

Tracked-On: projectacrn#8547
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>
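
A minimal sketch of the idea, assuming a Linux host: a virtual timer's counter is derived from CLOCK_MONOTONIC through `clock_gettime()`, so a step in the system time cannot make it jump. The helper names `clock_now_ns()` and `vtimer_ticks_since()` are hypothetical and do not come from the DM sources.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ULL

/* Read a clock and return its value in nanoseconds (0 on failure). */
static uint64_t clock_now_ns(clockid_t id)
{
    struct timespec ts;

    if (clock_gettime(id, &ts) != 0)
        return 0;
    return (uint64_t)ts.tv_sec * NSEC_PER_SEC + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical counter source for a virtual timer device: the number of
 * ticks at freq_hz elapsed since start_ns. Because the value comes from
 * CLOCK_MONOTONIC, manual or NTP changes to the wall clock do not affect it.
 */
uint64_t vtimer_ticks_since(uint64_t start_ns, uint64_t freq_hz)
{
    uint64_t delta_ns = clock_now_ns(CLOCK_MONOTONIC) - start_ns;

    /* Split the multiplication to reduce overflow risk for large deltas. */
    return (delta_ns / NSEC_PER_SEC) * freq_hz +
           (delta_ns % NSEC_PER_SEC) * freq_hz / NSEC_PER_SEC;
}
```

The same function fed from CLOCK_REALTIME could return a huge or wrapped value after the system time is moved backwards, which matches the stall described above.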

This patch creates vm_event support in the HV, including:
1. Create the vm_event data type.
2. Add the vm_event sbuf and its initializer. The sbuf is allocated by the DM in the Service VM. Its page address is then shared with the HV through a hypercall.
3. Add an API to send HV-generated events.

Tracked-On: projectacrn#8547
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
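
To make the first item concrete, here is one possible shape for a fixed-size event record that fits in a single sbuf element. The layout, the 28-byte payload size, and the helper `vm_event_init()` are assumptions for illustration, not the definitions from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VM_EVENT_DATA_LEN 28U   /* assumed payload size */

/* Assumed layout: a type tag followed by an opaque, type-specific payload. */
struct vm_event {
    uint32_t type;
    uint8_t  event_data[VM_EVENT_DATA_LEN];
};

/* A fixed element size lets producer and consumer agree on slot boundaries. */
_Static_assert(sizeof(struct vm_event) == 32, "vm_event must be 32 bytes");

/* Fill an event; returns 0 on success, -1 if the payload does not fit. */
int vm_event_init(struct vm_event *ev, uint32_t type,
                  const void *data, size_t len)
{
    if (ev == NULL || len > VM_EVENT_DATA_LEN || (len > 0 && data == NULL))
        return -1;

    memset(ev, 0, sizeof(*ev));
    ev->type = type;
    if (len > 0)
        memcpy(ev->event_data, data, len);
    return 0;
}
```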

The sbuf will be used by the DM to send and receive vm_events.

Tracked-On: projectacrn#8547
Signed-off-by: Wu Zhou <wu.zhou@intel.com>

This patch adds vm_event sbuf and notification initialization.

There are two types of event source, DM and HV, and they differ slightly:
- The sbuf for the DM event source is a memory page shared between threads. Event notifications are delivered through a userspace eventfd.
- The sbuf for the HV event source is a memory page shared with the HV. Its address (GPA) is passed to the HV through a hypercall. Its notifications are generated by an HV upcall, then delivered through a kernel/userspace eventfd.

An sbuf message path acts like a one-way ‘tunnel’, so a data structure, ‘vm_event_tunnel’, is created to organize those sbufs.

Tracked-On: projectacrn#8547
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>

This patch creates a thread for vm_event delivery. The thread uses epoll to poll for event notifications, then reads out the message data queued in the sbuf. An event handler is called upon successful reception. Both the HV and DM event sources go through this same processing flow.

The vm_event tx API for the DM event source is also added in this patch.

Tracked-On: projectacrn#8547
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>
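
The notification half of that flow can be sketched with the Linux `eventfd()` and `epoll` APIs. The function names below are hypothetical, and the sbuf drain and handler call that would follow a wakeup are left out; this shows only how a producer signals and how one iteration of the delivery loop waits.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create an epoll instance watching one eventfd. Returns the epoll fd or -1. */
int notifier_setup(int *efd_out)
{
    struct epoll_event ev = { .events = EPOLLIN };
    int efd = eventfd(0, EFD_NONBLOCK);
    int epfd;

    if (efd < 0)
        return -1;
    epfd = epoll_create1(0);
    if (epfd < 0) {
        close(efd);
        return -1;
    }
    ev.data.fd = efd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) < 0) {
        close(efd);
        close(epfd);
        return -1;
    }
    *efd_out = efd;
    return epfd;
}

/* Producer side: signal that one message has been queued. */
int notifier_kick(int efd)
{
    uint64_t one = 1;

    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Consumer side, one loop iteration: wait up to timeout_ms for a
 * notification. Returns the number of kicks accumulated since the last
 * read, 0 on timeout, or -1 on error. A real delivery thread would now
 * read the queued messages and call the event handler for each.
 */
int64_t notifier_wait(int epfd, int timeout_ms)
{
    struct epoll_event ev;
    uint64_t count = 0;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);

    if (n < 0)
        return -1;
    if (n == 0)
        return 0;
    if (read(ev.data.fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return -1;
    return (int64_t)count;
}
```

Because an eventfd accumulates a counter, several kicks may be merged into one wakeup, so the consumer should drain the queue rather than assume one message per notification.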

This patch adds vm_event support to the command monitor, so that vm_events can be sent to a client (e.g., Libvirt) through the monitor.

As the command monitor works in socket server mode, the vm_event sending process is designed as follows:
1. If a client wishes to receive vm_events, it issues a REGISTER_VM_EVENT_CLIENT command to the monitor.
2. The command monitor then handles the REGISTER_VM_EVENT_CLIENT command. If it is legitimate, the client is registered as a vm_event receiver. The command monitor then sends an ACK to the client and keeps the socket connection open.
3. When a vm_event is generated, the command monitor sends it out through the socket connection.
4. Only one event client is allowed.
5. The registration is cancelled on socket disconnection.

Tracked-On: projectacrn#8547
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>
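
The registration rules in steps 2, 4, and 5 amount to a small piece of state. The sketch below models only that state, with socket I/O omitted; the type and function names are hypothetical. How the real monitor answers a second registration request is not stated above, so rejecting it here is an assumption.

```c
#include <stdbool.h>

#define NO_CLIENT (-1)

/* Tracks the single registered vm_event receiver by its socket fd. */
struct event_monitor {
    int client_fd;
};

void monitor_init(struct event_monitor *m)
{
    m->client_fd = NO_CLIENT;
}

/*
 * Handle a REGISTER_VM_EVENT_CLIENT request arriving on fd.
 * Returns true if the client was registered (the caller would then send
 * the ACK and keep the connection), false if the request is refused.
 */
bool monitor_register(struct event_monitor *m, int fd)
{
    if (fd < 0 || m->client_fd != NO_CLIENT)
        return false;   /* only one event client is allowed */
    m->client_fd = fd;
    return true;
}

/* Socket fd was closed: cancel the registration if it belonged to fd. */
void monitor_disconnect(struct event_monitor *m, int fd)
{
    if (m->client_fd == fd)
        m->client_fd = NO_CLIENT;
}

/* Returns the fd to send a vm_event to, or NO_CLIENT if nobody listens. */
int monitor_event_target(const struct event_monitor *m)
{
    return m->client_fd;
}
```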

The default event handler generates the vm_event message in JSON format, then emits it through the command monitor. The event data JSON text is currently left blank. When a specific event type is implemented, its event data generation handler can be added accordingly.

Tracked-On: projectacrn#8547
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>

The idea of the event throttle is to allow only a certain number of vm_events to be emitted per second. This feature is implemented with an event counter and a timer_fd periodic timer. The event counter increases until it reaches the throttle rate limit, and the periodic timer resets the counter once per time window. Events that exceed the throttle rate are dropped.

Tracked-On: projectacrn#8547
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>
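
The counter-and-window scheme can be sketched as below. The names are hypothetical, the window reset is exposed as a plain function that a periodic timer callback would call, and locking between the emitting thread and the timer is omitted for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

struct event_throttle {
    uint32_t limit;    /* maximum events allowed per time window */
    uint32_t count;    /* events emitted in the current window */
    uint64_t dropped;  /* total events dropped so far */
};

void throttle_init(struct event_throttle *t, uint32_t limit)
{
    t->limit = limit;
    t->count = 0;
    t->dropped = 0;
}

/* Returns true if the event may be emitted, false if it must be dropped. */
bool throttle_allow(struct event_throttle *t)
{
    if (t->count >= t->limit) {
        t->dropped++;
        return false;
    }
    t->count++;
    return true;
}

/* Called on each periodic timer expiry to open a new time window. */
void throttle_window_reset(struct event_throttle *t)
{
    t->count = 0;
}
```

This is a fixed-window limiter: a burst straddling a window boundary can briefly emit up to twice the limit, which is the usual trade-off for its simplicity.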

Though it is best to halt the RTC before changing the date/time, some OSes still write the date/time while the RTC is not halted. The DM vRTC already handles the case where OpenBSD writes the century byte outside of a vRTC halt, by updating the vRTC time on century byte writes.

Now WaaG is found to write all date/time regs outside of a vRTC halt. Because those date/time writes are not applied instantly, WaaG’s vRTC time is not actually changed. This bug did not affect anything until now, when we are adding support for the RTC change vm_event.

To make WaaG’s vRTC work properly, this patch adds a vRTC time update on all date/time writes outside of a vRTC halt.

Tracked-On: projectacrn#8547
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>

The DM vrtc has been using time(NULL) as the vrtc base time. When the Service VM system time is adjusted, the vrtc experiences a time jump, which makes the vrtc time inaccurate.

Changing the source of the base time to monotonic time resolves this issue, as the monotonic time is not settable.

Tracked-On: projectacrn#8547
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>
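
One way to picture the change: the vrtc stores the guest-visible time together with the monotonic timestamp at which it was set, and derives the current value from the monotonic delta. The structure and function names below are hypothetical, and second-level resolution is assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

struct vrtc_base {
    time_t base_rtctime;  /* guest RTC value when it was last set */
    time_t base_uptime;   /* CLOCK_MONOTONIC seconds at that same moment */
};

/* Seconds from the monotonic clock; not affected by system time changes. */
static time_t monotonic_sec(void)
{
    struct timespec ts = { 0, 0 };

    (void)clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec;
}

/* Record a new guest RTC value and anchor it to the monotonic clock. */
void vrtc_set(struct vrtc_base *v, time_t rtctime)
{
    v->base_rtctime = rtctime;
    v->base_uptime = monotonic_sec();
}

/* Current guest RTC value: the base plus monotonic time elapsed since set. */
time_t vrtc_get(const struct vrtc_base *v)
{
    return v->base_rtctime + (monotonic_sec() - v->base_uptime);
}
```

With time(NULL) in place of `monotonic_sec()`, an adjustment of the Service VM clock between `vrtc_set()` and `vrtc_get()` would leak directly into the guest's RTC.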

When a guest OS changes its RTC, we want developers to be able to capture this event and decide what to do with it (e.g., whether to change the physical RTC).

Several facts make the RTC change event a bit complicated:
- There are 7 RTC date/time regs (year, month…). They can only be updated one by one.
- The RTC time is not reliable before the date/time update is finished.
- Guests can update the RTC date/time regs in any order.
- Guests may update the RTC date/time regs while the RTC is either halted or not halted.

A single date/time update event is therefore not reliable; we have to wait for the guest to finish the update process. So the DM's event handler sets up a timer and waits for some time (1 second). If no further change happens before the timer expires, we conclude that the RTC change is done, and the RTC change event is emitted.

The same event handler logic can also be used to process the HV vrtc time change event.

Tracked-On: projectacrn#8547
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>
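
The wait-until-quiet logic is a debounce. The sketch below models it with explicit millisecond timestamps instead of a real timer so that it is easy to follow; all names are hypothetical. How the DM computes the reported offset is not described above, so taking the difference between the value before the first write and the value after the last write is an assumption, and time elapsed during the burst is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

#define RTC_CHG_WAIT_MS 1000U   /* quiet period of 1 second, as in the text */

struct rtc_chg_debounce {
    bool     pending;      /* a burst of register writes is in progress */
    uint64_t deadline_ms;  /* when the quiet period ends */
    int64_t  first_time;   /* vRTC time before the first write of the burst */
    int64_t  last_time;    /* vRTC time after the most recent write */
};

/* Called for every date/time register write; re-arms the timer. */
void rtc_chg_on_write(struct rtc_chg_debounce *d, uint64_t now_ms,
                      int64_t old_time, int64_t new_time)
{
    if (!d->pending) {
        d->pending = true;
        d->first_time = old_time;
    }
    d->last_time = new_time;
    d->deadline_ms = now_ms + RTC_CHG_WAIT_MS;
}

/*
 * Called when the timer fires. Returns true and stores the overall offset
 * if the guest has been quiet long enough and an event should be emitted.
 */
bool rtc_chg_poll(struct rtc_chg_debounce *d, uint64_t now_ms, int64_t *offset)
{
    if (!d->pending || now_ms < d->deadline_ms)
        return false;
    d->pending = false;
    *offset = d->last_time - d->first_time;
    return true;
}
```

For example, a guest that writes the minutes register at t = 0 ms and the hours register at t = 400 ms produces a single event at t = 1400 ms, not two.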

This patch adds support for the HV vrtc vm_event. An RTC change event is sent upon each date/time reg write. Those events are handled in the DM, which tries to emit an RTC change event (to Libvirt) based on its strategy. Only post-launched VMs are supported.

The DM event handler already implements the RTC change event. Events from the HV vrtc are processed the same way as vrtc events from the DM vrtc.

Tracked-On: projectacrn#8547
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>

When the virtual PM port is written, we can infer that the guest has just initiated a poweroff action, so we send a poweroff event upon this port write. The DM event handler will try to emit it (to Libvirt). Developers can write an app or script to decide what to do with this event.

Tracked-On: projectacrn#8547
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>

In the triple fault handler, post-launched VMs are instantly turned off. Now a vm_event is generated at the same time, so that developers can capture the event and decide what to do with it (e.g., logging and populating diagnostics, or powering off the VM).

Tracked-On: projectacrn#8547
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>

sbuf_put() copies sbuf->ele_size bytes of data and puts them into the ring. Currently this function assumes that the size of the caller's data is no less than sbuf->ele_size.

But as sbuf->ele_size is usually set up by a source outside the HV (e.g., the Service VM), it should not be trusted. The caller should therefore provide the max length of the data for safety. sbuf_put() returns UINT32_MAX if the max_len of the data is less than the element size.

Additionally, a helper function, sbuf_put_many(), is added for putting multiple entries.

Tracked-On: projectacrn#8547
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
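
The bounds check can be illustrated on a simplified ring. The `struct ring` layout and the `ring_put()` / `ring_put_many()` functions below are stand-ins written for this example, not the ACRN sbuf definitions; only the rule "return UINT32_MAX when max_len is smaller than the element size" is taken from the description above. Rejecting a zero-sized ring and returning 0 when the ring is full are additional assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified ring; ele_size stands for a field set by an untrusted party. */
struct ring {
    uint32_t ele_num;   /* number of slots (one is kept empty) */
    uint32_t ele_size;  /* bytes per slot */
    uint32_t head;      /* read index, in slots */
    uint32_t tail;      /* write index, in slots */
    uint8_t *buf;       /* ele_num * ele_size bytes of storage */
};

/*
 * Copy one element into the ring. max_len is the number of valid bytes the
 * caller guarantees at data. Returns ele_size on success, 0 if the ring is
 * full, or UINT32_MAX if the caller's buffer is too small or the ring is
 * malformed.
 */
uint32_t ring_put(struct ring *r, const uint8_t *data, uint32_t max_len)
{
    uint32_t next;

    if (r->ele_num == 0U || r->ele_size == 0U || max_len < r->ele_size)
        return UINT32_MAX;

    next = (r->tail + 1U) % r->ele_num;
    if (next == r->head)
        return 0U;

    memcpy(r->buf + (size_t)r->tail * r->ele_size, data, r->ele_size);
    r->tail = next;
    return r->ele_size;
}

/* Put as many whole elements from data as fit. Returns the bytes stored. */
uint32_t ring_put_many(struct ring *r, const uint8_t *data, uint32_t len)
{
    uint32_t sent = 0U;

    while (sent < len) {
        uint32_t ret = ring_put(r, data + sent, len - sent);

        if (ret == 0U || ret == UINT32_MAX)
            break;
        sent += ret;
    }
    return sent;
}
```

Without the `max_len` check, a corrupted `ele_size` larger than the caller's buffer would make `memcpy()` read past the end of `data`.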