Only first page of Hydra paged collection returned #1180
Comments
Thanks for reporting!
@rubensworks Is this an easy fix on your side? If not, I will implement a workaround on ours.
Not sure yet, will look into it next week.
Support for paged collections (non-TPF/QPF) in Comunica is not well-tested at the moment, so issues like these are not surprising.
Some notes to self:
Simpler query that also fails to produce results:
SELECT *
WHERE {
<https://opendata.picturae.com/dataset/dre_a2a_webservice> <http://purl.org/dc/terms/identifier> ?i.
<https://opendata.picturae.com/dataset/dre_a2a_webservice> <http://purl.org/dc/terms/issued> ?issued.
}
The problem is that the linked hypermedia iterator overwrites the metadata with each new page. In this case, each page defaults to the none-source-type, which provides exact cardinalities for the matches in that page (whereas TPF uses the Hydra cardinality). This causes the empty-join actor to be used, which returns an empty result stream.
One solution would be to merge (and test) the
Also, if comunica/comunica-feature-link-traversal#102 is the same problem, we will want to change the none-source-type to emit only lowerLimit cardinalities instead of exact ones. The empty-join actor must then not be used for lowerLimit cardinalities.
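The failure mode described above can be illustrated with a minimal TypeScript sketch. It is not Comunica's actual code: the `Cardinality` type, `accumulateCardinality`, and `canUseEmptyJoin` are hypothetical names made up for this illustration, and the per-page counts are invented. It assumes a pattern that matches one triple on the first page and nothing on the later pages.

```typescript
// Hypothetical, simplified model of per-pattern cardinality metadata.
// Comunica's real metadata objects carry more fields than this.
type CardinalityType = 'exact' | 'lowerLimit';
interface Cardinality { type: CardinalityType; value: number }

// Combine the cardinality known so far with that of a newly fetched page,
// instead of letting the new page overwrite it.
function accumulateCardinality(previous: Cardinality | undefined, page: Cardinality): Cardinality {
  if (previous === undefined) {
    return page;
  }
  // The sum is exact only if both inputs are exact; otherwise it is a lower bound.
  const type: CardinalityType =
    previous.type === 'exact' && page.type === 'exact' ? 'exact' : 'lowerLimit';
  return { type, value: previous.value + page.value };
}

// An empty-join shortcut is only safe when some pattern is known to have
// exactly zero matches; a lower limit of zero proves nothing.
function canUseEmptyJoin(cardinalities: Cardinality[]): boolean {
  return cardinalities.some(c => c.type === 'exact' && c.value === 0);
}

// Invented per-page counts for one triple pattern (assumption for illustration).
const pagesForPattern: Cardinality[] = [
  { type: 'exact', value: 1 },
  { type: 'exact', value: 0 },
  { type: 'exact', value: 0 },
];

// Overwriting keeps only the last page's metadata.
const overwritten: Cardinality = pagesForPattern[pagesForPattern.length - 1];
// Accumulating keeps the matches seen on earlier pages.
const accumulated: Cardinality = pagesForPattern.reduce<Cardinality | undefined>(
  (acc, page) => accumulateCardinality(acc, page),
  undefined,
)!;

console.log(canUseEmptyJoin([overwritten])); // true: the join is wrongly short-circuited
console.log(canUseEmptyJoin([accumulated])); // false: the join is evaluated
```

In this model, both remedies mentioned in the comment avoid the empty result: accumulating across pages preserves the match from the first page, and downgrading page-local counts to lowerLimit keeps a zero from triggering the shortcut.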
This caused problems related to dataset-level cardinalities that were found in the initial source being overridden without proper accumulation with exact cardinalities from later sources.
Closes #1156
Closes #1180
May be related to comunica/comunica-feature-link-traversal#102
@rubensworks This issue was closed, but the situation persists where
returns only the first page (100 results), not all (209 results).
Probably a regression in v3. We should make sure to add a proper integration test for this case. But at least we know where to fix the problem now.
@ddeboer This has been fixed in release 3.1.1!
Issue type:
Description:
Assume a paginated Turtle file at https://opendata.picturae.com/catalog.ttl?page=1. (This uses the legacy hydra:PagedCollection rather than the newer hydra:PartialCollectionView, but Comunica seems to support both.)
A query using two predicates returns the expected output (a count of 209 resources that are spread over 3 pages):
However, when adding a third predicate to the query (that all resources have), things fall apart:
Only 100 resources, so only the first page of them, are returned. All resources have the predicate dct:issued. We can validate this by replacing dct:identifier with dct:issued, which returns 209 results again.
Environment: