Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Malformed QPF with literal subject #1163

Open
langsamu opened this issue Mar 2, 2023 · 5 comments
Open

Malformed QPF with literal subject #1163

langsamu opened this issue Mar 2, 2023 · 5 comments

Comments

@langsamu
Copy link

langsamu commented Mar 2, 2023

Issue type:

  • 🐛 Bug

Description:

The issue

Comunica translates some SPARQL queries to malformed QPF queries where the subject parameter is a literal, though

The subject MUST either be a variable or an IRI

-- Triple Pattern Fragments 3.2
-- Quad Pattern Fragments 3.2

Some servers ignore such QPF queries but others reject them.
When such a server responds with 400, Comunica does not recover and query execution fails.

What should be done

I think Comunica should not even attempt to send QPF requests with literal subjects because

  • they're malformed,
  • they always return empty results and
  • they might fail entirely.

Instead I think Comunica should just ignore these requests.
(If Linked Data Fragments ever supported generalized RDF then this would need to change.)

Assuming you agree, I'd gladly try my hand at a first contribution, but I'd need some guidance.
Unfamiliar as I am with the codebase, I naively think this might be as easy as a condition on the term type of the QPF request subject, if any.

To reproduce

Assume a QPF endpoint https://example.org/qpf that exposes the following data:

BASE <http://example.org/>

<s> <p> "o" .

Assume a SPARQL query

SELECT * {
  GRAPH ?g {
    ?s1 ?p1 ?o1 .
    ?o1 ?p2 ?o2 .
  }
}

Comunica translates this query into two calls:

  1. https://example.org/qpf
  2. https://example.org/qpf?subject=%22o%22

Environment:

I'm using

http://rdf.js.org/comunica-browser/versions/v2.3.0/engines/query-sparql/comunica-browser.js
@github-actions
Copy link

github-actions bot commented Mar 2, 2023

Thanks for reporting!

@rubensworks rubensworks added this to Triage in Maintenance Mar 2, 2023
@jeswr
Copy link
Member

jeswr commented Mar 2, 2023

Assuming you agree, I'd gladly try my hand at a first contribution, but I'd need some guidance. Unfamiliar as I am with the codebase, I naively think this might be as easy as a condition on the term type of the QPF request subject, if any.

It should be possible to implement this by adding a condition here which returns an empty set of quads if any of the terms do not match the expected term type (in particular you could enforce " The subject MUST either be a variable or an IRI; the predicate MUST either be a variable or an IRI; the object MUST either be a variable, an IRI, or a literal; the graph MUST either be a variable or an IRI").

To return an empty set of quads with the correct metadata you should use this logic. I would suggest refactoring that logic into a emptyWithMetadata function.

@rubensworks
Copy link
Member

As of today, nothing I'm aware of prevents Comunica from working with generalized RDF.
As such, implementing the suggested change may restrict some valid use cases (e.g. querying over N3 files).

But the TPF spec does indeed make the explicit restriction, so we should handle at least that part.

I'm wondering if we should not go for a more high-level approach, and introduce a proper generalizedRdf option, which is false by default. And if false, we can modify our quadpattern query operation actor so that it always returns empty results when literals appear in non-object positions.
This would still enable users to force generalized RDF mode if they want to do so.

@jeswr
Copy link
Member

jeswr commented Mar 3, 2023

I'm wondering if we should not go for a more high-level approach, and introduce a proper generalizedRdf option, which is false by default. And if false, we can modify our quadpattern query operation actor so that it always returns empty results when literals appear in non-object positions.

Just to clarify - do you mean implement this in addition to doing an explicit restriction in the case of the TPF/QPF spec. I think the in addition to part is important in order to allow use-cases such as querying over the union of a TPF/QPF endpoint and an N3 document.

@rubensworks
Copy link
Member

do you mean implement this in addition to doing an explicit restriction in the case of the TPF/QPF spec

No, I was referring to just applying this restriction at a higher level. To avoid us having to implement the same restriction at different places.

I think the in addition to part is important in order to allow use-cases such as querying over the union of a TPF/QPF endpoint and an N3 document.

But then we would be mixing regular RDF with generalized RDF, and crossing two formal boundaries, which would probably open up another can of worms. We may want to avoid going into this, without knowing for certain that this is a valid use case.

@rubensworks rubensworks added this to Triage in Development Mar 28, 2023
@rubensworks rubensworks removed this from Triage in Maintenance Apr 19, 2024
@rubensworks rubensworks moved this from Triage to To do (prio:low) in Development Apr 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development
  
To do (prio:low)
Development

Successfully merging a pull request may close this issue.

3 participants