Dynamic inner joins can create excessive number of requests #1196

jeswr · 2023-04-13T03:07:20Z

Issue type:

🐌 Performance issue

Description:

As shown here, performing dynamic inner joins can result in orders of magnitude more requests to a TPF endpoint than are required to select the entire dataset. This can cause major performance degradation, as network latency is the main bottleneck in such situations.

We need a way to ensure that cases such as those in the example repo perform the join as a plain inner join on the patterns ?s ex:worksFor ?o1 and ?o1 ex:name ?o, rather than as a dynamic inner join.
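To make the requested behaviour concrete, here is a minimal sketch of a local hash join on ?o1 over the results of the two patterns ?s ex:worksFor ?o1 and ?o1 ex:name ?o, assuming both result sets have already been paged in completely. The `Bindings` type, the `hashJoin` helper and the sample data are hypothetical and only for illustration; this is not Comunica's actual join actor.

```typescript
// Hypothetical bindings shape: variable name (without "?") -> term as a string.
type Bindings = Record<string, string>;

// Hash join: index the right-hand results by the join variable,
// then probe that index once per left-hand result. No extra requests are made.
function hashJoin(left: Bindings[], right: Bindings[], joinVar: string): Bindings[] {
  const index = new Map<string, Bindings[]>();
  for (const r of right) {
    const key: string | undefined = r[joinVar];
    if (key === undefined) continue;
    const bucket = index.get(key);
    if (bucket) {
      bucket.push(r);
    } else {
      index.set(key, [r]);
    }
  }

  const out: Bindings[] = [];
  for (const l of left) {
    const key: string | undefined = l[joinVar];
    if (key === undefined) continue;
    for (const r of index.get(key) ?? []) {
      out.push({ ...l, ...r });
    }
  }
  return out;
}

// Made-up data standing in for the fully fetched results of each pattern.
const worksFor: Bindings[] = [
  { s: 'ex:alice', o1: 'ex:acme' },
  { s: 'ex:bob', o1: 'ex:acme' },
  { s: 'ex:carol', o1: 'ex:initech' },
];
const names: Bindings[] = [
  { o1: 'ex:acme', o: '"Acme"' },
  { o1: 'ex:globex', o: '"Globex"' },
];

const joined = hashJoin(worksFor, names, 'o1');
```

With this approach the number of requests depends only on how many pages each pattern spans, not on how many bindings the first pattern produces.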

Environment:


github-actions · 2023-04-13T03:07:44Z

Thanks for reporting!

rubensworks · 2023-04-13T06:59:12Z

Looks similar to #548.

Note that this is mainly a research problem (for which some solutions exist already), not really an implementation problem.

Adding it to the list in #846.

jeswr · 2023-04-14T01:09:44Z

> Note that this is mainly a research problem (for which some solutions exist already), not really an implementation problem.

Yup - my 2c: a key missing heuristic seems to be the estimated time required to retrieve all the data for a quad pattern, measured in number of requests (which would be Math.ceil(cardinality / pageSize)); and, for dynamic joins, the estimated time required to do the join, again in number of requests (which would be (cardinality of first stream) * Math.ceil((approx. cardinality of each stream requested as part of the join) / pageSize)).

Is there a way of achieving this as part of the dataset cardinality work being added in #1194?
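The two estimates above could be sketched roughly as follows. This follows the formulas as written in the comment; the function names, the `estimateJoinPlan` helper, the idea of summing the two full scans, and the numbers in the example are all assumptions made for illustration, not part of Comunica.

```typescript
// Requests needed to page through every result of a single quad pattern.
function fullScanRequests(cardinality: number, pageSize: number): number {
  return Math.ceil(cardinality / pageSize);
}

// Requests for a dynamic join, per the formula in the comment:
// one paged sub-query for each binding produced by the first stream.
function dynamicJoinRequests(
  firstCardinality: number,
  perBindingCardinality: number,
  pageSize: number,
): number {
  return firstCardinality * Math.ceil(perBindingCardinality / pageSize);
}

interface JoinPlanEstimate {
  fullScan: number;
  dynamic: number;
  preferred: 'fullScan' | 'dynamic';
}

// Assumption: a plain inner join costs the sum of both full scans.
function estimateJoinPlan(
  leftCardinality: number,
  rightCardinality: number,
  perBindingCardinality: number,
  pageSize: number,
): JoinPlanEstimate {
  const fullScan =
    fullScanRequests(leftCardinality, pageSize) +
    fullScanRequests(rightCardinality, pageSize);
  const dynamic = dynamicJoinRequests(leftCardinality, perBindingCardinality, pageSize);
  return { fullScan, dynamic, preferred: fullScan <= dynamic ? 'fullScan' : 'dynamic' };
}

// Hypothetical numbers: two patterns of 10,000 results each, 100 results per page,
// and roughly one match per binding in the dynamic join.
const plan = estimateJoinPlan(10_000, 10_000, 1, 100);
```

In this made-up case the full scans come to 200 requests against 10,000 for the dynamic join, which is the kind of gap the issue describes. The dynamic estimate here does not count the requests needed to read the first stream itself.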

rubensworks added the performance 🐌 label Apr 13, 2023

rubensworks added this to Triage in Maintenance Apr 13, 2023

rubensworks mentioned this issue Apr 13, 2023

[Overview] Improving performance #846

Open

7 tasks

rubensworks moved this from Triage to To Do (prio:low) in Maintenance Apr 19, 2024

rubensworks removed this from To Do (prio:low) in Maintenance Apr 19, 2024

rubensworks added this to To do (prio:low) in Development via automation Apr 19, 2024

rubensworks removed this from To do (prio:low) in Development Apr 19, 2024
