Strange performance difference between xmldiff:compare() and deep-equal() #4050

Open
xatapult opened this issue Sep 29, 2021 · 7 comments
Labels: performance (bottlenecks, opportunities for rewriting, optimization), xquery (issue is related to xquery implementation)

Comments

@xatapult

Describe the bug
We have tried to do XML compares using xmldiff:compare() and the XPath equivalent deep-equal().

When comparing two identical documents of ~200 KB on my machine, xmldiff:compare() takes 60 ms, while deep-equal() takes over 9 seconds!

Expected behavior
The functions vary slightly in interface, but simply comparing two documents should take roughly the same amount of time. And even if not, 9 seconds seems excessive.

To Reproduce
Unzip, import in eXist, execute test-equals.xql (I'm doing this from oXygen)
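
The attached test-equals.xql is not reproduced in this thread. A minimal sketch of such a timing comparison might look like the following, assuming hypothetical document paths (`/db/test/a.xml` and `/db/test/b.xml`) and using eXist's `util:system-time()` to measure elapsed time:

```xquery
xquery version "3.1";

import module namespace xmldiff = "http://exist-db.org/xquery/xmldiff";
import module namespace util = "http://exist-db.org/xquery/util";

(: Hypothetical paths: substitute the documents imported from TEST.zip :)
let $a := doc("/db/test/a.xml")
let $b := doc("/db/test/b.xml")

(: Take a wall-clock reading before and after each comparison :)
let $t0 := util:system-time()
let $diff-result := xmldiff:compare($a, $b)
let $t1 := util:system-time()
let $deep-result := deep-equal($a, $b)
let $t2 := util:system-time()

(: Dividing a duration by one millisecond yields elapsed milliseconds :)
let $ms := xs:dayTimeDuration("PT0.001S")
return
    <timings>
        <xmldiff-compare result="{$diff-result}" ms="{($t1 - $t0) div $ms}"/>
        <deep-equal result="{$deep-result}" ms="{($t2 - $t1) div $ms}"/>
    </timings>
```

This is only an approximation of the reported test: absolute numbers depend on the documents, caching, and the eXist-db version in use.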

Context (please always complete the following information):

  • OS: W10
  • eXist-db version: 5.3
  • Java version: Amazon Corretto 1.8.0_202

Additional context

  • How is eXist-db installed? Jar
  • Any custom changes: various but none seem relevant (mostly external SQL database things)
@xatapult
Author

TEST.zip

@xatapult changed the title [BUG] Strange performance difference between xmldiff:compare() and deep-equal() Sep 29, 2021
@joewiz added the performance and xquery labels Sep 30, 2021
@joewiz added this to the eXist-5.3.1 milestone Sep 30, 2021
@dizzzz
Member

dizzzz commented Oct 10, 2021

AFAIK for xmldiff:compare() two files are serialised and compared with an external tool; I suspect that deep-equal() uses a different mechanism and might compare more?

@line-o
Member

line-o commented Nov 17, 2021

I have recently played around with fn:deep-equal#2 and am wondering what it actually should do.
In my testing, it does not do well comparing two node sets. I ended up serialising both elements and calling deep-equal on the results.
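
The workaround described above could be sketched roughly as follows. This assumes the standard XQuery 3.1 `fn:serialize#1` function with default serialization parameters; `local:serialized-equal` is a hypothetical helper name, not part of eXist-db:

```xquery
xquery version "3.1";

(: Hypothetical helper: compares two nodes by their serialised string form.
   deep-equal() on two strings reduces to a plain string comparison. :)
declare function local:serialized-equal($a as node(), $b as node()) as xs:boolean {
    deep-equal(serialize($a), serialize($b))
};

local:serialized-equal(
    <doc id="1"><item>text</item></doc>,
    <doc id="1"><item>text</item></doc>
)
```

Note that this is a stricter test than node-level deep equality: differences in attribute order, namespace prefixes, or insignificant whitespace in the serialised output would make otherwise equivalent nodes compare as unequal.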

@dizzzz self-assigned this Nov 18, 2021
@adamretter modified the milestones: eXist-5.3.1, eXist-5.3.2 Dec 16, 2021
@PieterLamers

@dizzzz @adamretter I just heard that there are plans to ditch xmldiff:compare. I'd really hope for this issue to be looked into before this is effectuated.

@adamretter
Member

@PieterLamers I don't think we would intentionally remove it if there is not a good replacement. However, please remind me as we prepare eXist 7, so we don't remove it by accident as part of our tech debt cleanup.

@adamretter modified the milestones: eXist-5.3.2, eXist-6.0.2 Feb 14, 2022
@adamretter
Member

adamretter commented Sep 9, 2022

@PieterLamers We are definitely not removing xmldiff:compare, as we just added a new implementation of it in #4554.

@adamretter
Member

@xatapult I reran your tests on my laptop:

  1. With the new implementation (New implementation of the XmlDiffModule #4554) of xmldiff:compare: 228 ms
  2. With the new implementation (Support for XQuery Tumbling and Sliding Window #4529) of fn:deep-equal: 12871 ms

It looks like we would still need to optimise fn:deep-equal...

@adamretter modified the milestones: eXist-6.0.2, eXist-6.1.1 Jan 15, 2023
@adamretter modified the milestones: eXist-6.1.1, eXist-7.0.1 Feb 4, 2023

6 participants