Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix performance issue in CoroutineDescendantAxis #609

Open
JohannesLichtenberger opened this issue May 10, 2023 · 3 comments
Open

Fix performance issue in CoroutineDescendantAxis #609

JohannesLichtenberger opened this issue May 10, 2023 · 3 comments

Comments

@JohannesLichtenberger
Copy link
Member

https://github.com/sirixdb/sirix/blob/master/bundles/sirix-kotlin-api/src/main/kotlin/org/sirix/axis/CoroutineDescendantAxis.kt

It should read mainly from the page cache (BufferManagerImpl => caches) / in the test, actually always, but I assume it'll make I/Os all the time. That's why it's more than 100x slower than the standard "serial" DescendantAxis.

The idea of the axis is to prefetch right sibling nodes in parallel in a second read-only trx.

So we'll have to debug and profile...

@JohannesLichtenberger
Copy link
Member Author

It seems in the Producer there's a right sibling key stack, which is an issue with the proposed algorithm (I thought maybe we could process all right siblings in parallel (perhaps we can also take the descendantCount of each inner node into account and only fetch in parallel if the count is beyond a threshold.

@Kiriakos1998
Copy link

Hello, @JohannesLichtenberger, just realized that perhaps it was not clear that I looked into this issue eventually. So I specifically added a timestamp in the testAxisConventions in AbsAxisTest.
iteration -> 1001400
iteration -> 4000400
iteration -> 4000400
iteration -> 87000500
iteration -> 87000500
iteration -> 97001000
iteration -> 97001000
iteration -> 97001000
iteration -> 98001800
iteration -> 98001800
Here is the result in Nanos per iteration. Perhaps the way to go is to see which methods are specifically involved in these iterations. Also, it will be interesting to see which line of code inside the while loop is creating these delays.

@JohannesLichtenberger
Copy link
Member Author

JohannesLichtenberger commented May 31, 2023

I think the implementation itself is not good. First of all the producer seems to explicitly block until it's finished and only afterwards the next right sibling of a node during preorder is traversed to send the nodes to the axis. Second, AFAICS in the producer itself the right sibling key stack is used instead of a kind of fork join approach.

I'm also not sure if thread synchronization itself might be an issue as well as starting of new transactions...

Maybe there's also a better approach for parallelization of preorder traversal with the DOM like pointer based encoding (firstChild/lastChild/rightSibling/leftSibling/parent) of every node.

The other issue with the slow IO using based backend might also be interesting...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Development

No branches or pull requests

2 participants