Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8322619: Parts of SG no longer update during rendering - overlapping - culling - dirty #1310

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

FlorianKirmaier
Copy link
Member

@FlorianKirmaier FlorianKirmaier commented Dec 21, 2023

In some situations, a part of the SG is no longer rendered.
I created a test program that showcases this problem.

Explanation:

This can happen, when a part of the SG, is covered by another Node.
In this part, one node is totally covered, and the other node is visible.

When the totally covered Node is changed, then it is marked dirty and it's parent, recursively until an already dirty node is found.
Due to the Culling, this totally covered Node is not rendered - with the effect that the tree is never marked as Clean.

In this state, a Node is Dirty but not It's parent. Based on my CodeReview, this is an invalid state which should never happen.

In this invalid state, when the other Node is changed, which is visible, then the dirty state is no longer propagated upwards - because the recursive "NGNode.markTreeDirty" algorithm encounters a dirty node early.

This has the effect, that any SG changes in the visible Node are no longer rendered. Sometimes the situation repairs itself.

Useful parameters for further investigations:
-Djavafx.pulseLogger=true
-Dprism.printrendergraph=true
-Djavafx.pulseLogger.threshold=0

PR:
This PR ensures the dirty flag is set to false of the tree when the culling is used.
It doesn't seem to break any existing tests - but I'm not sure whether this is the right way to fix it.
It would be great to have some feedback on this solution - maybe guiding me to a better solution.

I could write a test, that just does the same thing as the test application, but checks every frame that these nodes are not dirty - but maybe there is a better way to test this.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 1 Reviewer, 1 Author)

Issues

  • JDK-8322619: Parts of SG no longer update during rendering - overlapping - culling - dirty (Bug - P4)
  • JDK-8332040: Parts of SG no longer update during rendering - overlapping - culling - dirty (CSR) (Withdrawn)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jfx.git pull/1310/head:pull/1310
$ git checkout pull/1310

Update a local copy of the PR:
$ git checkout pull/1310
$ git pull https://git.openjdk.org/jfx.git pull/1310/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 1310

View PR using the GUI difftool:
$ git pr show -t 1310

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jfx/pull/1310.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Dec 21, 2023

👋 Welcome back fkirmaier! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Ready for review label Dec 21, 2023
@mlbridge
Copy link

mlbridge bot commented Dec 21, 2023

Webrevs

@FlorianKirmaier FlorianKirmaier changed the title 8322544 Parts of SG no longer update during rendering - overlapping - culling - dirty 8322619 Parts of SG no longer update during rendering - overlapping - culling - dirty Dec 21, 2023
@FlorianKirmaier
Copy link
Member Author

Fixed wrong ticket-number in title and commit

@openjdk
Copy link

openjdk bot commented Dec 21, 2023

@FlorianKirmaier Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information.

@kevinrushforth
Copy link
Member

/reviewers 2

@openjdk
Copy link

openjdk bot commented Dec 21, 2023

@kevinrushforth
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 1 Reviewer, 1 Author).

Copy link
Member

@kevinrushforth kevinrushforth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see an automated (probably robot-based) test, if it is feasible.

This will need careful review.

Reviewers: @arapte @lukostyra

@@ -0,0 +1,6 @@
<?xml version="1.0" encoding="UTF-8"?>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revert the addition of this file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the accidental change!

@@ -2003,6 +2003,7 @@ protected void doRender(Graphics g) {
if ((bits & DIRTY_REGION_CONTAINS_OR_INTERSECTS_NODE_BOUNDS) == 0) {
// If no culling bits are set for this region, this group
// does not intersect (nor is covered by) the region
clearDirtyTree();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need to look at this closely to ensure that there are no functional regressions, and that the performance impact is minimal.

@FlorianKirmaier
Copy link
Member Author

FlorianKirmaier commented Dec 21, 2023

If someone who understands the rendering could provide some feedback on what is happening - than I can probably make a better fix, and write a reasonable unit-test for this bug.

removed some text with edit, because after thinking about it, it was wrong

@hjohn
Copy link
Collaborator

hjohn commented Dec 22, 2023

The fully covered node which was dirty is now marked clean in the PR when it is culled. Now that it is clean, what happens when it is uncovered (being careful not to change it in any other way)? Does it get rendered correctly, or does it miss the last change made to it while it was fully covered?

If that works correctly still, then I think this is the right fix. Culling the node is (IMHO) the same as rendering it (there is nothing to render) so it should be clean after being culled.

@FlorianKirmaier
Copy link
Member Author

It is worth noting that the idea of cleaning the dirty tree, for the culled node, is also done at line NGNode:1414, but for some reason, it doesn't catch the case from this bug.

@FlorianKirmaier
Copy link
Member Author

FlorianKirmaier commented Jan 16, 2024

I tried out a bit, by removing the overlapping area in my test program.
I added the changed program to the ticket.
Didn't notice anything wrong.
There are probably many ways to test for regressions.

Can't guarantee that there are no cases that go wrong - but if something doesn't work, then it probably happens more rarely than the original bug.

@bridgekeeper
Copy link

bridgekeeper bot commented Feb 20, 2024

@FlorianKirmaier This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@FlorianKirmaier
Copy link
Member Author

push
It's still a important bug.
If someone can guide me on how to write tests for this, i would put in the effort.
Maybe we could make snapshot based tests? That should work on all platforms, right?
(Like testing whether a specific RGB value appears)
Has something in this direction been done somewhere?

@lukostyra
Copy link
Contributor

Maybe we could make snapshot based tests? That should work on all platforms, right?

We have a whole suite of tests which use similar methods to perform checks, called robot tests. Since you want to check if a node was properly updated, and the best way to do that seems to be checking RGB values, I think a test for this case belongs right there.

You could use robot utility to check color values of specific pixels on the screen. I think the best way to go about it would be to do something similar to what your test bug apps do (the ones attached to the JBS issue). Ideally I would try and look for first some form of confidence check (if correct controls are "on" and "off"), then progressing the animation and seeing if controls updated as intended.

All robot tests are located in tests/system/src/test/java/test/robot and then they follow package names. I'm not 100% sure where this test should go though (I assume .../javafx/scene?).

@openjdk
Copy link

openjdk bot commented Mar 13, 2024

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot changed the title 8322619 Parts of SG no longer update during rendering - overlapping - culling - dirty 8322619: Parts of SG no longer update during rendering - overlapping - culling - dirty Apr 10, 2024
@bridgekeeper
Copy link

bridgekeeper bot commented Apr 10, 2024

@FlorianKirmaier This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@eduardsdv
Copy link
Contributor

I analysed the bug and can confirm the fix.
I summarized it in the following table.
First column is the node structure, the others the steps: update nodes, render, update nodes, render, ...

Nodes (L - left and R - right) / Steps 1st -> update Circles 2nd -> render 3rd -> update Lines 4th -> render
HBox dirty clearDirty dirty clearDirty
HBox >StackPane-L dirty clearDirty dirty clearDirty
HBox >StackPane-L > Group-L dirty clearDirty dirty clearDirty
HBox >StackPane-L > Group-L >Line-L dirty clearDirty
HBox >StackPane-L > Group-L >Circle-L dirty clearDirty
HBox >StackPane-R dirty clearDirty *1 *3
HBox >StackPane-R > Group-R dirty dirty dirty
HBox >StackPane-R > Group-R > Line-R dirty *2 dirty
HBox >StackPane-R > Group-R > Circle-R dirty dirty dirty dirty

*1: > no culling bits for first dirty region (Circle-L) -> skip children -> clearDirty() is not called on children
> no root path for the second dirty region (Circle-R) -> no rendering and root.clearDirtyTree() stops on StackPane-R (the children stay dirty: Group-R and Circle-R)

*2: > Line-R.markTreeDirty() skips marking of further parents because Group-R is already dirty -> the StackPane-R stays clean

*3: > the dirty regions are not collected because the StackPane-R is clean -> Line-R is not rendered and the dirty-flag is not reset


The error is that clearDirty() is not called in the second step on the children of StackPane-R (see *1). Subsequent changes in Line-R and Circle-R are not propagated to the root node because Group-R is already dirty (see *2) and therefore StackPane-R remains clean. The last changes are therefore ignored in the next rendering phase (see *3).


We can simply say: the dirty flags must be deleted on all nodes after rendering.

If this is true, the test can simply check this instead of screenshots or similar.

…y are skipped due to visible==false or opacity==0
@eduardsdv
Copy link
Contributor

After further investigation and testing I would suggest to remove all clearDirty() and clearDirtyTree() calls and put only one clear-dirty pass after rendering is done (at the end of ViewerPainter.paintImpl(final Graphics backBufferGraphics)).

Since the markDirty() method does both, sets the dirty flag and propagates it to the ancestor nodes, it is better if there is only one method for deleting the dirty flag from the node and all its children. Therefore, I would remove the clearDirtyTree() method and move its content to the clearDirty() method.

This way we can guarantee that all dirty flags are removed after the rendering is completed.
We also guarantee that a clean parent with dirty children will never occur (the bug this PR addresses).

Furthermore, my quick test shows that fewer clearDirty() calls are required in the new version. Bonus: We are even faster.

    public void clearDirty() {
        if (dirty != DirtyFlag.CLEAN || childDirty) {
            dirty = DirtyFlag.CLEAN;
            childDirty = false;
            dirtyBounds.makeEmpty();
            dirtyChildrenAccumulated = 0;

            if (this instanceof NGGroup) {
                List<NGNode> children = ((NGGroup) this).getChildren();
                for (NGNode child : children) {
                    child.clearDirtyTree_();
                }
            }
        }

        if (getClipNode() != null) {
            getClipNode().clearDirty();
        }
    }

…n if they are skipped due to visible==false or opacity==0"

This reverts commit 5f9f157.
@eduardsdv
Copy link
Contributor

I added a test that shows that the @FlorianKirmaier's fix works. You can start it with: gradlew -PFULL_TEST=true :systemTests:test --tests NGNodeDirtyFlagTest.

To avoid such errors in the future, I would still suggest the refactoring, which I wrote about in my last comment.

@kevinrushforth and @arapte please review.

@eduardsdv
Copy link
Contributor

I created an alternative solution for this bug. See PR: #1451

@openjdk openjdk bot added the csr Need approved CSR to integrate pull request label May 10, 2024
@kevinrushforth
Copy link
Member

/csr unneeded

@openjdk
Copy link

openjdk bot commented May 11, 2024

@kevinrushforth The CSR requirement cannot be removed as CSR issues already exist. Please withdraw JDK-8332040 and then use the command /csr unneeded again.

@kevinrushforth
Copy link
Member

/csr unneeded

@openjdk openjdk bot removed the csr Need approved CSR to integrate pull request label May 11, 2024
@openjdk
Copy link

openjdk bot commented May 11, 2024

@kevinrushforth determined that a CSR request is not needed for this pull request.

@FlorianKirmaier
Copy link
Member Author

Sorry for creating the CSR, it was an accident.

Now that Eduard has both created an alternative solution, but also has reviewed that this solution is correct, and provided an unit-test - I think this (or the other) PR is ready to move forwards.

@kevinrushforth
What do you think?

This is still quite a fundamental bug which is probably quite prevalent,
because when it happens it is really hard to notice it.
But it probably happens more often than we think.

@kevinrushforth
Copy link
Member

Sorry for creating the CSR, it was an accident.

No problem.

Now that Eduard has both created an alternative solution, but also has reviewed that this solution is correct, and provided an unit-test - I think this (or the other) PR is ready to move forwards.

@kevinrushforth What do you think?

Yes, I've asked @arapte to be the primary reviewer on this. I'll leave it to his judgment as to which of the two approaches to take forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rfr Ready for review
5 participants