Null values in RDF Lists #835

Open
ebremer opened this issue Mar 15, 2024 · 6 comments
Labels: defer (Issue deferred to future Working Group), spec-design

@ebremer

ebremer commented Mar 15, 2024

I need to express a list of ordered items with potentially missing values, for example 20, null, 77.
I've read through the documentation: null values are filtered out during compaction, but I need the second position to be held.
I can model a list in RDF and omit the second rdf:first value, but JSON-LD filters the nulls away. Is there any way framing can be used to express the null?
Having { "dcm:00286040": [20, null, 77] } would be a nice, concise way of expressing this.
My specific use case is modelling DICOM image metadata:

https://dicom.nema.org/medical/dicom/current/output/chtml/part18/sect_F.2.5.html

I currently use the blank node method, but I'm trying to make the DICOM JSON model more "RDF" by moving the value representations to custom data types. For example:

  1. one triple
    [ dcm:00191030 "262144"^^dcm:UL ]

  2. three triples
    [ dcm:00191030 [ dcm:Value ( "262144"^^xsd:long ); dcm:vr "UL" ]]

@ebremer
Author

ebremer commented Mar 18, 2024

I looked at #76. Would it make sense for null in lists to be handled differently? For example,
in Turtle:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sdo: <https://schema.org/> .
@prefix xs:  <http://www.w3.org/2001/XMLSchema#> .

[ sdo:about  ( "1"^^xs:long "2"^^xs:long "3"^^xs:long )
] .

becomes this in JSON-LD:

{
  "@context": {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "xs": "http://www.w3.org/2001/XMLSchema#",
    "sdo": "https://schema.org/",
    "about": {
      "@id": "sdo:about",
      "@container": "@list",
      "@type": "xs:long"
    }
  },
  "@id": "_:b0",
  "about": [
    "1",
    "2",
    "3"
  ]
}

In Turtle, I could remove the [] rdf:first "2"^^xs:long triple, thus asserting nothing about the second position while still holding that position in the list for some future value. However,

select *
where {
  ?s sdo:about (1 ?two 3)
}

doesn't match, as it expects ?two to be bound. I don't have a compact notation for saying:

select *
where {
  ?s sdo:about (1 optional {?two} 3)
}

Alternatively, I could say:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sdo: <https://schema.org/> .
@prefix xs:  <http://www.w3.org/2001/XMLSchema#> .

[ sdo:about  ( "1"^^xs:long rdf:nil "3"^^xs:long )
] .

which would now work with select * where { ?s sdo:about (1 ?two ?3)}, but expressing this in JSON-LD yields:

{
  "@context": {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "xs": "http://www.w3.org/2001/XMLSchema#",
    "sdo": "https://schema.org/",
    "about": {
      "@id": "sdo:about",
      "@container": "@list",
      "@type": "xs:long"
    },
    "nil": "rdf:nil"
  },
  "@id": "_:b0",
  "sdo:about": {
    "@list": [
      {
        "@type": "xs:long",
        "@value": "1"
      },
      "http://www.w3.org/1999/02/22-rdf-syntax-ns#nil",
      {
        "@type": "xs:long",
        "@value": "3"
      }
    ]
  }
}

Not the compact, plain-ish JSON look I was hoping for. My preference would be something like the following when rdf:nil is used as a placeholder:

{
  "@context": {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "xs": "http://www.w3.org/2001/XMLSchema#",
    "sdo": "https://schema.org/",
    "about": {
      "@id": "sdo:about",
      "@container": "@list",
      "@type": "xs:long"
    }
  },
  "@id": "_:b0",
  "about": [
    "1",
    null,
    "3"
  ]
}

@gkellogg
Member

Based on discussions in the RDF-star Working Group, you're not likely to see support for a model with non-well-formed lists. Many systems expect lists to be well-formed, with each node having both rdf:first and rdf:rest components.

You could use rdf:nil as one of the entries, but note that this is equivalent to an empty list, not a null value, so your example above might be better rendered as ("1"^^xs:long () "3"^^xs:long). Arguably, the JSON-LD algorithm might render it the same way.

As null has a more specific meaning in JSON, it has no corresponding RDF interpretation. There are ways to get null emitted from JSON-LD (via framing), but in general usage in JSON-LD it stands for no value, which is why it can be eliminated when expanding or generating RDF. But I could see how a tweak to the "Convert List to RDF" algorithm might emit something closer to what you have with Turtle: (1 () 3).
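
To make the difference concrete, here is a minimal Python sketch. It is only an illustration, not the JSON-LD expansion algorithm: `drop_nulls` and `nulls_to_nil` are hypothetical helper names, and the second function models the possible tweak mentioned above (emitting an empty list, `rdf:nil`, where a null appears), which JSON-LD 1.1 does not do.

```python
RDF_NIL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"

def drop_nulls(items):
    """Model of the current effect: null entries vanish, so later positions shift."""
    return [item for item in items if item is not None]

def nulls_to_nil(items):
    """Hypothetical tweak: keep each position by substituting rdf:nil (the empty list)."""
    return [RDF_NIL if item is None else item for item in items]

values = [20, None, 77]
print(drop_nulls(values))    # [20, 77]
print(nulls_to_nil(values))  # three entries, rdf:nil in the middle
```

With dropping, 77 moves from the third position to the second. With the `rdf:nil` placeholder the positions survive, but the slot then holds an empty list rather than "no value".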

@ebremer
Author

ebremer commented Mar 19, 2024

Thanks for replying Gregg,
I know a triple store will take an RDF List without a [] rdf:first, and I can use an optional { [] rdf:first ?value } or a minus { [] rdf:first ?value } and query via SPARQL the way I would like. The JSON-LD playground will emit N-Quads without a _:myList a rdf:List. Is a "well-formed" RDF List defined somewhere? I always understood that "null maps to no triple". Apache Jena's custom list function:
?list list:index (1 ?member)
works well with omitted rdf:first elements in a List, even wrapped in an optional {} or a minus {}. I cannot find anything in the RDF specs that explicitly says an rdf:first cannot be omitted. Please correct me if I'm wrong.

I understand most of the uses of null in JSON-LD and agree with them, save the "@container": "@list" case, which is the one place I think null could be used safely. I would think a null used in a JSON array (when "@container": "@list" is specified) would, yes, remove a triple, but only the [] rdf:first ?value triple and not the entire _:current rdf:first ?value; rdf:rest _:next . When compacting, JSON-LD would just put null to indicate no rdf:first for that entry, leaving the scaffolding of the RDF List in place when serialized to N-Triples or such.

There are use cases where the positional information in an ordered list is important but there are no values for that position. If RDF Lists in the end are defined as "must always have rdf:first for each node element", one would have to mirror the work of RDF Lists but then allow the missing rdf:first (or clone rdf:first) and all of the associated support functions. It just seems a missed opportunity if this case isn't handled by the specifications (both SPARQL and JSON-LD). Would it put anyone out in the cold?
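
The proposed behaviour can be sketched in Python as follows. This is a hypothetical illustration of the idea, not JSON-LD's "Convert List to RDF" algorithm; the function name `list_to_triples`, the `_:n0`-style blank node labels, and the plain-tuple triple representation are all assumptions made for the example.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST, RDF_REST, RDF_NIL = RDF + "first", RDF + "rest", RDF + "nil"

def list_to_triples(items, prefix="_:n"):
    """Return (head, triples) for a first/rest ladder; None entries get no rdf:first."""
    if not items:
        return RDF_NIL, []
    nodes = [f"{prefix}{i}" for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        if item is not None:
            # Only positions with a value assert rdf:first.
            triples.append((nodes[i], RDF_FIRST, item))
        # Every position keeps its rdf:rest link, so the scaffolding stays intact.
        rest = nodes[i + 1] if i + 1 < len(items) else RDF_NIL
        triples.append((nodes[i], RDF_REST, rest))
    return nodes[0], triples

head, triples = list_to_triples([20, None, 77])
for triple in triples:
    print(triple)
```

For `[20, None, 77]` this yields five triples: an `rdf:first` for the first and third nodes only, and an `rdf:rest` for all three, so the second position is held without asserting a value.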

SPARQL allows compact forms like (1 ?x 3 4) but no way to indicate whether a list variable is optional {} or a minus {} (so, I added w3c/sparql-dev#196) - Erich

@TallTed
Contributor

TallTed commented Mar 19, 2024

@ebremer — Your #835 (comment) cries out for some liberal application of codefences (mostly inline backtick wrappers, `like this`) to set the data aside from the rest of the text. These are especially desirable around the @words, which GitHub treats as user handles, and therefore pings those users who have not chosen to participate in this discussion.

@ebremer
Author

ebremer commented Mar 19, 2024

@TallTed the cries have been heard and answered along with my apologies.

@gkellogg
Member

> Thanks for replying Gregg, I know a triple store will take an RDF List without a [] rdf:first, and I can use an optional { [] rdf:first ?value } or a minus { [] rdf:first ?value } and query via SPARQL the way I would like. The JSON-LD playground will emit N-Quads without a _:myList a rdf:List. Is a "well-formed" RDF List defined somewhere?

This is generally required to properly serialize Collections, but indeed, RDF Semantics currently says: "Also, RDF imposes no 'well-formedness' conditions on the use of this vocabulary, so that it is possible to write RDF graphs which assert the existence of highly peculiar objects such as lists with forked or non-list tails, or multiple heads". In general, vocabularies can't impose any conditions on their use, although some profiles may create OWL restrictions.

Well-formedness is likely in RDF 1.2 for describing the use of triple terms, and there is a possibility that it will be extended to describe a similar expectation for Collections/Lists. The primary concern is about nodes with multiple values for rdf:first/rdf:rest; it may be a valid use case, such as your suggestion, for nodes to have rdf:rest properties without rdf:first, and certainly you can construct queries for which this makes sense. If that were determined to be well-formed, a possible update to JSON-LD (and many Turtle serialization) algorithms would be to allow such nodes to be represented using the JSON-LD @list container type.
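
One way to picture such a well-formedness rule is a small checker over the list nodes. The Python sketch below is an assumption-laden illustration, not anything defined by RDF 1.2 or JSON-LD: `check_list_nodes` and its `allow_missing_first` flag are hypothetical, and the rules it encodes (at most one `rdf:first`, exactly one `rdf:rest`, with a missing `rdf:first` optionally tolerated) are just one reading of the constraints discussed here.

```python
from collections import defaultdict

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST, RDF_REST, RDF_NIL = RDF + "first", RDF + "rest", RDF + "nil"

def check_list_nodes(triples, allow_missing_first=False):
    """Return problem descriptions for every subject of rdf:first or rdf:rest."""
    firsts, rests = defaultdict(list), defaultdict(list)
    for s, p, o in triples:
        if p == RDF_FIRST:
            firsts[s].append(o)
        elif p == RDF_REST:
            rests[s].append(o)
    problems = []
    for node in sorted(set(firsts) | set(rests)):
        n_first, n_rest = len(firsts[node]), len(rests[node])
        if n_first > 1:
            problems.append(f"{node}: multiple rdf:first values")
        if n_first == 0 and not allow_missing_first:
            problems.append(f"{node}: missing rdf:first")
        if n_rest != 1:
            problems.append(f"{node}: expected one rdf:rest, found {n_rest}")
    return problems

sparse = [
    ("_:a", RDF_FIRST, "1"), ("_:a", RDF_REST, "_:b"),
    ("_:b", RDF_REST, RDF_NIL),  # position held, no value asserted
]
print(check_list_nodes(sparse))                            # ['_:b: missing rdf:first']
print(check_list_nodes(sparse, allow_missing_first=True))  # []
```

Under the strict setting the sparse list is rejected; with the flag set, only forked or multi-valued nodes would be reported.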

> I always understood that "null maps to no triple". Apache Jena's custom list function: ?list list:index (1 ?member) works well with omitted rdf:first elements in a List, even wrapped in an optional {} or a minus {}. I cannot find anything in the RDF specs that explicitly says an rdf:first cannot be omitted. Please correct me if I'm wrong.

Indeed, nothing prohibits this; it's just not used when representing such lists using the @list container in JSON-LD 1.1, and many Turtle processors would similarly not emit this (although there are no normative requirements for serializing Turtle).

> I understand most of the uses of null in JSON-LD and agree with them, save the "@container": "@list" case, which is the one place I think null could be used safely. I would think a null used in a JSON array (when "@container": "@list" is specified) would, yes, remove a triple, but only the [] rdf:first ?value triple and not the entire _:current rdf:first ?value; rdf:rest _:next . When compacting, JSON-LD would just put null to indicate no rdf:first for that entry, leaving the scaffolding of the RDF List in place when serialized to N-Triples or such. There are use cases where the positional information in an ordered list is important but there are no values for that position. If RDF Lists in the end are defined as "must always have rdf:first for each node element", one would have to mirror the work of RDF Lists but then allow the missing rdf:first (or clone rdf:first) and all of the associated support functions. It just seems a missed opportunity if this case isn't handled by the specifications (both SPARQL and JSON-LD). Would it put anyone out in the cold?

I expect the JSON-LD WG to follow the guidance from the RDF-star WG, and these constraints have not yet been decided. If it is determined that Lists can have well-formedness constraints, and this includes allowing rdf:first to have a cardinality of 0 or 1, then a future update to the JSON-LD API could be to include serializing and comparing such lists.

> SPARQL allows compact forms like (1 ?x 3 4) but no way to indicate whether a list variable is optional {} or a minus {} (so, I added w3c/sparql-dev#196) - Erich

SPARQL support for Lists is quite limited: its expressivity is pretty much what you get by simply expanding the list into the first/rest ladder. More complicated things are possible with Property Paths, for which List navigation was an important use case. Notation3 does have better support for lists, where a List is a first-class resource, although I don't believe it allows empty elements either.
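
For comparison, walking the first/rest ladder outside SPARQL makes an optional `rdf:first` easy to handle. The Python sketch below is illustrative only: `read_list` is a hypothetical helper working on plain tuples, and it reports a node without `rdf:first` as `None`, roughly what wrapping each `rdf:first` pattern in OPTIONAL would yield.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST, RDF_REST, RDF_NIL = RDF + "first", RDF + "rest", RDF + "nil"

def read_list(head, triples):
    """Follow rdf:rest from head to rdf:nil, collecting rdf:first values (None if absent)."""
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    items, node, seen = [], head, set()
    while node != RDF_NIL:
        if node in seen or node not in rest:
            # A cycle or a node with no rdf:rest cannot be read as a list.
            raise ValueError(f"broken list at {node}")
        seen.add(node)
        items.append(first.get(node))  # missing rdf:first becomes None
        node = rest[node]
    return items

triples = [
    ("_:a", RDF_FIRST, "1"), ("_:a", RDF_REST, "_:b"),
    ("_:b", RDF_REST, "_:c"),  # no rdf:first: the held position
    ("_:c", RDF_FIRST, "3"), ("_:c", RDF_REST, RDF_NIL),
]
print(read_list("_:a", triples))  # ['1', None, '3']
```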

gkellogg added the spec-design and defer (Issue deferred to future Working Group) labels on Mar 19, 2024

3 participants