Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switching workspaces via Openbox keybindings leads to missed X events #48

Open
M4he opened this issue Apr 4, 2024 · 7 comments
Open
Assignees
Labels
bug Something isn't working question Further information is requested

Comments

@M4he
Copy link

M4he commented Apr 4, 2024

First of all, thank you for your work on cortile! This is an excellent addition to classic WMs like Openbox and Xfwm.

Unfortunately, I'm experiencing frequent hiccups on my setup that can mostly be attributed to cortile not receiving all events from the X connection occasionally. I'm running LXQt with Openbox on NixOS using a multi-monitor setup with 2 monitors.

The issue happens when switching workspaces via Openbox's keybindings. Sometimes (roughly every 5~10 switches), cortile does not register the workspace change. This leads to all kinds of glitches because cortile doesn't update its internal state properly, so cycling the tiling mode applies to the wrong workspace, windows not snapping back into position etc.

While trying to identify the problem I added more log output to cortile at places where X events are initially received and processed. I noticed that occasionally some events are never received by cortile at all, for example the _NET_CURRENT_DESKTOP property change from the root window when changing the workspace.

At first I thought it was about my picom config and all the stuff going on (fade animation, blur) that choked the X connection and event queue somehow but then I noticed it doesn't happen when changing the workspace via external means (wmctrl or the workspace switcher of the LXQt panel) instead of Openbox's keybindings. See the reproduction steps below.

For the time being I can resort to wmctrl and other external means for workspace switching but it would be great to get to the bottom of this and fully support Openbox's native behavior.

Unfortunately, I have no clue yet why this is happening at all. Since the workspace indicator of LXQt's panel keeps updating correctly everytime (it also listens to the same X events afaik) it does not seem like the X events are not emitted in general, just that cortile doesn't receive them occasionally when they originate from Openbox keybinding actions.

How to reproduce

Consider the following Openbox keybindings:

<keybind key="A-1">
  <action name="GoToDesktop">
    <to>1</to>
  </action>
</keybind>
<keybind key="A-2">
  <action name="GoToDesktop">
    <to>2</to>
  </action>
</keybind>
<keybind key="A-Escape">
  <action name="GoToDesktop">
    <to>last</to>
  </action>
</keybind>

To reproduce:

  1. Switch to the second workspace and change the tiling layout in cortile so that the tray icon displays a different layout than on workspace 1.
  2. Continuously switch between workspace 1 and 2 by either alternating between Alt+1 and Alt+2 keypresses or repeatedly press Alt+Escape to switch back and forth.

Eventually, cortile will miss events which is visible by the tray icon not updating the layout indicator after changing the workspace. For example, if workspace 2 was set to maximized and the keypress switched to workspace 1, the icon will keep displaying the maximized layout.

Now in contrast, binding Alt+1 and Alt+2 outside of Openbox (e.g. via LXQt's hotkey handler) to wmctrl -s 0 and wmctrl -s 1 respectively or using the scroll wheel on the workspace indicator of the LXQt panel to continuously switch workspaces back and forth does not seem to trigger the issue at all.

@leukipp leukipp self-assigned this Apr 5, 2024
@leukipp leukipp added question Further information is requested bug Something isn't working labels Apr 5, 2024
@leukipp
Copy link
Owner

leukipp commented Apr 5, 2024

Thanks for your efforts and detailed description, nailing the problem down:

I noticed that occasionally some events are never received by cortile at all, for example the _NET_CURRENT_DESKTOP property change from the root window when changing the workspace.

At first I thought it was about my picom config and all the stuff going on (fade animation, blur) that choked the X connection and event queue somehow but then I noticed it doesn't happen when changing the workspace via external means (wmctrl or the workspace switcher of the LXQt panel) instead of Openbox's keybindings.

This could be an upstream issue, that can only be fixed in the xgb or xgbutil library which cortile relies on.

Trying to reproduce this bug on an environment will cost me some time, so I would kindly ask you to run the minimal example below to confirm that this is caused by xgb dependencies.

minimal example

main.go

Please build via go build main.go and run the binary.

package main

import (
	"fmt"

	"github.com/BurntSushi/xgb/xproto"

	"github.com/BurntSushi/xgbutil"
	"github.com/BurntSushi/xgbutil/ewmh"
	"github.com/BurntSushi/xgbutil/xevent"
	"github.com/BurntSushi/xgbutil/xprop"
	"github.com/BurntSushi/xgbutil/xwindow"
)

func main() {
	X, err := xgbutil.NewConn()
	if err != nil {
		fmt.Println("Connection to X server failed")
		return
	}

	root := xwindow.New(X, X.RootWin())
	root.Listen(xproto.EventMaskSubstructureNotify | xproto.EventMaskPropertyChange)

	fmt.Println("Listen")

	xevent.PropertyNotifyFun(StateUpdate).Connect(X, root.Id)
	xevent.Main(X)
}

func StateUpdate(X *xgbutil.XUtil, e xevent.PropertyNotifyEvent) {
	aname, err := xprop.AtomName(X, e.Atom)

	fmt.Println(aname)

	if err != nil {
		fmt.Println("Error retrieving atom name")
		return
	}

	if IsInList(aname, []string{"_NET_CURRENT_DESKTOP"}) {
		fmt.Println("--------- _NET_CURRENT_DESKTOP --------->", CurrentDesktopGet(X))
	}
}

func IsInList(item string, items []string) bool {
	for i := 0; i < len(items); i++ {
		if items[i] == item {
			return true
		}
	}
	return false
}

func CurrentDesktopGet(X *xgbutil.XUtil) uint {
	currentDesk, err := ewmh.CurrentDesktopGet(X)

	if err != nil {
		fmt.Println("Error retrieving current desktop")
	}

	return currentDesk
}

While trying to identify the problem I added more log output to cortile at places where X events are initially received and processed.

The minimal example should include the cortile implementations (root.go) you mentioned, where the issue original arises.

Please let me know, if the --------- _NET_CURRENT_DESKTOP ---------> output still reports the wrong (previous) workspace. It may be some race condition where _NET_CURRENT_DESKTOP fires, but ewmh.CurrentDesktopGet(X) still reports the old state.

@leukipp leukipp assigned leukipp and M4he and unassigned leukipp Apr 5, 2024
@M4he
Copy link
Author

M4he commented Apr 6, 2024

Thanks for creating the minimal example! I tried it and running it at the same time as cortile, I cannot reproduce the issue in the minimal example whereas it keeps happening in cortile.

As a next step I added the code below to the example to include event listener callbacks for all clients/windows mimicking what cortile does. I thought that maybe having a lot of listeners to different X clients in a single program might be the reason for the issues.

But even with the code shown below added, the example still registers each and every workspace change correctly while at the same time cortile frequently fails to do so. The root cause must be something else ...

code patch for attaching client handlers in the minimal example
@@ -25,9 +25,42 @@ func main() {
fmt.Println("Listen")

xevent.PropertyNotifyFun(StateUpdate).Connect(X, root.Id)
+    windows, err := ewmh.ClientListStackingGet(X)
+    if err != nil {
+		fmt.Println("Error retrieving client list: ", err)
+	}
+    for _, win := range windows {
+        fmt.Println("attaching window handlers")
+        attachClientHandlers(X, xwindow.New(X, win))
+    }
xevent.Main(X)
}

+func attachClientHandlers(X *xgbutil.XUtil, c *xwindow.Window) {
+	c.Listen(xproto.EventMaskStructureNotify | xproto.EventMaskPropertyChange | xproto.EventMaskFocusChange)
+
+	// Attach structure events
+	xevent.ConfigureNotifyFun(func(x *xgbutil.XUtil, ev xevent.ConfigureNotifyEvent) {
+		fmt.Println("Client structure event")
+	}).Connect(X, c.Id)
+
+	// Attach property events
+	xevent.PropertyNotifyFun(func(x *xgbutil.XUtil, ev xevent.PropertyNotifyEvent) {
+		aname, _ := xprop.AtomName(X, ev.Atom)
+		fmt.Println("Client property event ", aname)
+	}).Connect(X, c.Id)
+
+	// Attach focus in events
+	xevent.FocusInFun(func(x *xgbutil.XUtil, ev xevent.FocusInEvent) {
+		fmt.Println("Client focus in event")
+	}).Connect(X, c.Id)
+
+	// Attach focus out events
+	xevent.FocusOutFun(func(x *xgbutil.XUtil, ev xevent.FocusOutEvent) {
+		fmt.Println("Client focus out event")
+	}).Connect(X, c.Id)
+}
+
func StateUpdate(X *xgbutil.XUtil, e xevent.PropertyNotifyEvent) {
aname, err := xprop.AtomName(X, e.Atom)

@leukipp
Copy link
Owner

leukipp commented Apr 7, 2024

But even with the code shown below added, the example still registers each and every workspace change correctly while at the same time cortile frequently fails to do so. The root cause must be something else ...

...just a guess in the dark, please comment those 4 lines and report back:
https://github.com/leukipp/cortile/blob/main/input/mousebinding.go#L50-L53

@leukipp
Copy link
Owner

leukipp commented Apr 7, 2024

Please also try the version from the xgb-jezek branch, where github.com/BurntSushi/xgb* is replaced with the github.com/jezek/xgb* fork, which provides those features:

I've forked the XGB repository from BurntSushi's github to apply some
patches which caused panics and memory leaks upon close and tests were added,
to test multiple server close scenarios.

Much of the code has been rewritten in an effort to support thread safety and multiple extensions.

It is thread safe and gets immediate improvement from parallelism when
GOMAXPROCS > 1. (See the benchmarks in xproto/xproto_test.go for evidence.)

@M4he
Copy link
Author

M4he commented Apr 7, 2024

But even with the code shown below added, the example still registers each and every workspace change correctly while at the same time cortile frequently fails to do so. The root cause must be something else ...

...just a guess in the dark, please comment those 4 lines and report back: https://github.com/leukipp/cortile/blob/main/input/mousebinding.go#L50-L53

That's it! After commenting those lines I cannot reproduce the issue anymore.

@M4he
Copy link
Author

M4he commented Apr 7, 2024

Please also try the version from the xgb-jezek branch [...]

No difference to the main branch's behavior in terms of the issue observed.

@leukipp
Copy link
Owner

leukipp commented Apr 7, 2024

Thanks for your feedback! A proper fix will be included in the next cortile release.

M4he pushed a commit to M4he/cortile that referenced this issue Apr 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants