Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8285893: Hiding dialog and showing new one causes dialog to be frozen #1449

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

beldenfox
Copy link
Contributor

@beldenfox beldenfox commented May 4, 2024

This PR is based on a discussion that happened over in PR #1324. Some of this explanation is copied from that thread.

When exitNestedEventLoop is called on the innermost loop the invokeLaterDispatcher suspends operation until the loop finishes. But if you immediately start a new event loop the previous one won't finish and the dispatcher will jam up and stop dispatching indefinitely.

When the invokeLaterDispatcher is told that the innermost loop is exiting it sets leavingNestedEventLoop to true expecting it to be set to false when the loop actually exits. When the dispatcher is told that a new event loop has started it is not clearing leavingNestedEventLoop which is causing the jam. Basically it should follow the same logic used in glass; leaving the innermost loop updates a boolean indicating that the loop should exit but if a new loop is started the boolean is set back to a running state since it now applies to the new loop, not the previous one.

I suspect the invokeLaterDispatcher exists in part to deal with the specifics of how deferred runnables are handled on the Mac. I investigated this a bit and wrote up some comments in the Mac glass code.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 1 Reviewer, 1 Author)

Issue

  • JDK-8285893: Hiding dialog and showing new one causes dialog to be frozen (Bug - P3)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jfx.git pull/1449/head:pull/1449
$ git checkout pull/1449

Update a local copy of the PR:
$ git checkout pull/1449
$ git pull https://git.openjdk.org/jfx.git pull/1449/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 1449

View PR using the GUI difftool:
$ git pr show -t 1449

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jfx/pull/1449.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented May 4, 2024

👋 Welcome back mfox! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented May 4, 2024

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot added the rfr Ready for review label May 4, 2024
@mlbridge
Copy link

mlbridge bot commented May 4, 2024

Webrevs

@kevinrushforth
Copy link
Member

As with the earlier PR, this will need a lot of testing and careful review.

@Maran23 I'd be interested in your thoughts on this approach.

/reviewers 2

@openjdk
Copy link

openjdk bot commented May 6, 2024

@kevinrushforth
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 1 Reviewer, 1 Author).

@kevinrushforth
Copy link
Member

This looks like a safe, targeted fix for the problem. It will need to be tested on all platforms with a full run of all system tests.

@kevinrushforth
Copy link
Member

I get the following internal assertion failure when running the test on Linux:

NestedEventLoopTest > testCanExitAndThenEnterNewLoop FAILED
    java.lang.AssertionError: Internal inconsistency - wrong EventLoop
        at javafx.graphics@23-ea/com.sun.glass.ui.EventLoop.enter(EventLoop.java:108)
        at javafx.graphics@23-ea/com.sun.javafx.tk.quantum.QuantumToolkit.enterNestedEventLoop(QuantumToolkit.java:656)
        at javafx.graphics@23-ea/javafx.application.Platform.enterNestedEventLoop(Platform.java:310)
        at test.javafx.stage.NestedEventLoopTest.lambda$testCanExitAndThenEnterNewLoop$29(NestedEventLoopTest.java:326

It ran cleanly on Mac and Windows.

@beldenfox
Copy link
Contributor Author

I'll look into the Linux failure. The core EventLoop code passes an object into the application's leaveNestedEventLoop and expects to see that object returned from the applications's matching enterNestedEventLoop call. On Mac and Windows this object is passed through to glass as an argument on the stack. On Linux this value is handled as an application global. I suspect the Linux bookkeeping isn't robust enough to handle this case but it will take a bit to nail down the details.

@beldenfox
Copy link
Contributor Author

I now understand what’s going wrong on Linux but I don’t know how to fix it.

When the test case tries to leave the innermost loop (let’s call it A) the cross-platform Java code calls Application.leaveNestedEventLoop to tell glass that the loop should exit. But before the loop can exit the test case starts a new loop leading to a call to Application.enterNestedEventLoop.

On Mac the leaveNestedEventLoop call sets a few global state variables in glass and the enterNestedEventLoop call wipes out all those changes. So as far as glass is concerned the leaveNestedEventLoop call was never made. This works because the cross-platform Java code is holding onto an object that represents loop A; it’s down one level in the event loop stack and marked as LEAVING. When that object reaches the top of the event loop stack the core will once again call Application.leaveNestedEventLoop to exit loop A.

On Linux leaveNestedEventLoop makes changes in glass that cannot be erased or undone (it calls gtk_main_quit which updates internal GTK state that we don’t have access to). In the end GTK will exit the loops in the correct order but it will exit loop A before the core code has a chance to call Application.leaveNestedEventLoop again. This is throwing off the bookkeeping including some debug checks.

(JavaFX struggles with this scenario because the original test case behaves in such an unexpected way. While processing a click event on a modal stage the event handler hides that modal and then immediately starts a new modal. The original modal can’t really go away; the handler for the click event originated there and is still in progress. That first modal might not be visible but it can’t truly close until the second modal is dismissed.)

@Maran23
Copy link
Member

Maran23 commented May 16, 2024

I'll look into the Linux failure. The core EventLoop code passes an object into the application's leaveNestedEventLoop and expects to see that object returned from the applications's matching enterNestedEventLoop call. On Mac and Windows this object is passed through to glass as an argument on the stack. On Linux this value is handled as an application global. I suspect the Linux bookkeeping isn't robust enough to handle this case but it will take a bit to nail down the details.

There is one additional change I made, which might be relevant for Linux:
https://github.com/openjdk/jfx/pull/1324/files#diff-af779aafb50953f57cab2478dd220d0322592b60e92065cf658644866572b7e7R117

Worth to check. I remember that the code there was problematic to me.

Otherwise this looks good.
#1324 also cleans up the unused return value which is allocated in the C++/Objective-C side, but never really used on the Java side. So might be worth to do at some point, but I would agree to do the minimal changes first.

@Maran23
Copy link
Member

Maran23 commented May 23, 2024

On Linux leaveNestedEventLoop makes changes in glass that cannot be erased or undone (it calls gtk_main_quit which updates internal GTK state that we don’t have access to). In the end GTK will exit the loops in the correct order but it will exit loop A before the core code has a chance to call Application.leaveNestedEventLoop again. This is throwing off the bookkeeping including some debug checks.

That is exactly what I also figured out. If a second new nested event loop is started directly after the first one was requested to leave, the native code will never 'return'.
That's why this code exists: https://github.com/openjdk/jfx/blob/master/modules/javafx.graphics/src/main/java/com/sun/glass/ui/EventLoop.java#L118
which I slightly modified to not run invokeLater in my PR.

Otherwise it would never return any value and code after enterNestedEventLoop is never called:

        Object o = Platform.enterNestedEventLoop(this);
        System.out.println(o); // <- never called

And since you fixed the event jam so that we are not stuck anymore and Linux is implemented different with a global stateful variable, we will get that 'inconsistency', which I don't think is relevant or a problem here, hence I removed it in my PR.

I also don't found any other way to deal with that situation, as you also stated, there is nothing else we can do, and while the other OS implementation have some kind of while loop where we can try to implement something, on the Linux side we completely rely on gtk_main and gtk_main_quit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rfr Ready for review
3 participants