[Overview] Improving performance #846

Open
7 tasks
rubensworks opened this issue Aug 4, 2021 · 1 comment

rubensworks commented Aug 4, 2021

This issue contains an overview of all (known) performance issues in Comunica, and steps to resolve them.

Sub-optimal query plan

The following issues already have concrete solutions, and just need implementation:

The following are just ideas:

  • Call the optimize bus right before each Bind Join subquery? This could improve the join plans.
  • Always use a nested-loop join (NLJ) if there are no overlapping variables.
  • Prefer the symmetric hash join (SHJ) over the hash join (HJ).
  • Check the Content-Length response header to decide how many indexes to create in rdf-store.
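
A rough sketch of how the NLJ and SHJ/HJ ideas above could be combined into a single selection rule. The names `JoinEntry`, `overlappingVariables` and `chooseJoinAlgorithm` are hypothetical and do not correspond to Comunica's actual join actors or bus interfaces; this only illustrates the heuristic, assuming each join entry exposes the variables it can bind.

```typescript
// Hypothetical types for illustration; not Comunica's real interfaces.
type JoinAlgorithm = 'nested-loop' | 'symmetric-hash' | 'hash';

interface JoinEntry {
  // Names of the variables this join entry can bind.
  variables: string[];
}

// Returns the variables that occur in both entries.
function overlappingVariables(left: JoinEntry, right: JoinEntry): string[] {
  return left.variables.filter(v => right.variables.indexOf(v) >= 0);
}

// Picks a join algorithm following the two heuristics from the list above.
function chooseJoinAlgorithm(
  left: JoinEntry,
  right: JoinEntry,
  symmetricHashAvailable: boolean = true,
): JoinAlgorithm {
  // Without shared variables there is no join key to hash on,
  // so the result is a cartesian product: use a nested-loop join.
  if (overlappingVariables(left, right).length === 0) {
    return 'nested-loop';
  }
  // With shared variables, prefer the symmetric hash join when it is available.
  return symmetricHashAvailable ? 'symmetric-hash' : 'hash';
}
```

In a real setup this decision would more likely be expressed through the priorities or cost estimates of the individual join actors than through one central function; the sketch only makes the intended ordering explicit.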

Federation:

  • In the directors query (see examples in http://query.linkeddatafragments.org/ ), we have 2 filters, but they are combined into 1 filter during query parsing. This makes filter pushdown more difficult. Should we decouple them so our filter pushdown optimizer can handle it?
  • Look at FedShop RSA queries to see which optimizations can still be done.
  • In the new bind-based join actors: don't push down if there are no common variables, as seen in https://arxiv.org/pdf/2102.03269.pdf (3 conditions on page 8).
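
To make the filter decoupling idea above more concrete, here is a minimal sketch that splits a combined conjunctive filter back into its separate conjuncts, and then checks for each conjunct whether it could be pushed into an operand. The expression model (`Atom`, `And`) and the helper names are hypothetical simplifications; they are not the algebra types Comunica uses, and the pushdown condition shown (all of a filter's variables are bound by the operand) is an assumption rather than the behaviour of the existing filter pushdown optimizer.

```typescript
// Hypothetical, simplified expression model; not the algebra used by Comunica.
interface Atom {
  kind: 'atom';
  // Illustrative textual form of the expression.
  text: string;
  // Variables mentioned in the expression.
  variables: string[];
}

interface And {
  kind: 'and';
  left: Expr;
  right: Expr;
}

type Expr = Atom | And;

// Flattens nested conjunctions into a list of independent filter expressions.
function splitConjunction(expr: Expr): Atom[] {
  if (expr.kind === 'and') {
    return splitConjunction(expr.left).concat(splitConjunction(expr.right));
  }
  return [expr];
}

// Assumed pushdown condition: every variable of the filter
// must be bound by the operand it is pushed into.
function canPushInto(filter: Atom, operandVariables: string[]): boolean {
  return filter.variables.every(v => operandVariables.indexOf(v) >= 0);
}
```

Once the conjuncts are separated, each one can be considered for pushdown on its own, so a conjunct that only touches one source's variables is no longer blocked by a sibling that spans several sources.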

Memory issues over files

Low-level optimizations

Other

rubensworks added this to To do (prio:low) in Development via automation Aug 4, 2021
rubensworks added this to Needs triage in Research via automation Aug 4, 2021
github-actions bot commented Aug 4, 2021

Thanks for reporting!

rubensworks added this to Triage in Maintenance Aug 4, 2021
rubensworks moved this from Needs triage to High priority in Research Aug 4, 2021
rubensworks pinned this issue Aug 4, 2021
rubensworks moved this from Triage to In Progress in Maintenance Aug 27, 2021
rubensworks added a commit that referenced this issue Sep 29, 2021
This is a prerequisite for more complex join algorithms,
such as the Bind-Join.

Required for #846, #552
rubensworks added a commit that referenced this issue Sep 29, 2021
This abstracts the logic from our bgp-left-deep-smallest actor
and implements it as a join operation.
This allows the optimizations of this logic to be used in
non-BGPs as well.

Closes #427

Required for #846
rubensworks added a commit that referenced this issue Oct 21, 2021
All join algorithms now use it to estimate the joined cardinality.

Related to #846
rubensworks moved this from To do (prio:low) to To do (prio:high) in Development Oct 26, 2021
rubensworks removed this from In Progress in Maintenance Apr 19, 2024