Support for SKOS OrderedCollection #723

Open
mielvds opened this issue Sep 5, 2022 · 4 comments

Comments

@mielvds
Contributor

mielvds commented Sep 5, 2022

Would it be possible to extend the support for Collection to OrderedCollection as well?

@koenedaele
Member

What's the use case you're trying to support?

The one we have is sorting concepts chronologically in our thesaurus of periods: https://thesaurus.onroerenderfgoed.be/conceptschemes/DATERINGEN

If you look at https://thesaurus.onroerenderfgoed.be/conceptschemes/DATERINGEN/c/1251, you will see that its narrower periods are ranked chronologically, from oldest to newest. We do this by adding something called a Sort Label to the concepts in that collection:

  • GDF: paleozoïcum
  • GDH: mesozoïcum
  • GDJ: Tertiair
  • GDL: Quartair

Basically, it can be any string you want, and the concepts get sorted according to that string (so we left some empty spots in case the ordering needs to change again).
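
A minimal sketch of the sort-label mechanism described above. This is not Atramhasis code: the `Concept` class and its `sort_label` field are hypothetical names used only for illustration. Ordering reduces to a plain string sort on the label:

```python
from dataclasses import dataclass


@dataclass
class Concept:
    # Hypothetical stand-in for a concept with a sort label (not the Atramhasis model).
    label: str
    sort_label: str


# The periods from the example above, deliberately listed out of order.
periods = [
    Concept("Quartair", "GDL"),
    Concept("paleozoïcum", "GDF"),
    Concept("Tertiair", "GDJ"),
    Concept("mesozoïcum", "GDH"),
]

# Sorting is a plain string comparison on the sort label.
ordered = sorted(periods, key=lambda concept: concept.sort_label)
ordered_labels = [concept.label for concept in ordered]
print(ordered_labels)
```

The gaps between the labels are what make later changes cheap: a concept given the (hypothetical) label "GDG" would sort between GDF and GDH without relabelling anything else.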

A few consequences of this:

  • The sort labels are attached to both concepts and collections, since it's necessary to control the display order regardless of whether these are narrower concepts of another concept or members of a collection.
  • The sort label is attached to the concept or collection, not to the relation with its broader concept or collection. This means that a concept that is part of two collections cannot be sorted differently in each collection. The sort order would be universal.
  • The sort labels are just another type of label with a language attribute. This means that it's (theoretically) possible to sort them differently in different languages.

So, I'm not sure if this is sufficient for your needs. Attaching the sort order to a relation is theoretically more accurate, but probably not that easy on the data model and the UI. The current implementation isn't the easiest on the editor either (you have to think of sort labels that make some sense), but since it's a use case we only have in one thesaurus it's manageable (and I see that even in that thesaurus some of the data has lost its sortLabel).

If the above is sufficient for your needs, it should be possible to export this information to RDF as a skos:OrderedCollection and, vice versa, to import it again by generating sortLabels for an OrderedCollection. One complexity I see is how to export a concept whose narrower concepts have been ordered. I think this could be done by inserting a kind of anonymous OrderedCollection to indicate that the subconcepts should be ordered, but I expect that brings other complexities we would need to think through.

So, first things first, would something like this be enough for you?

@mielvds
Contributor Author

mielvds commented Sep 6, 2022

My concrete use case: maintaining the fixed order I got from what used to be a table of term definitions in a way that is SKOS compliant. They were manually sorted according to the order in which they should appear in end-user applications.
But in the end, I'm mainly concerned about losing the OrderedCollection after importing into Atramhasis.

Regarding the practical implementation: I think what you describe would suffice if it can somehow be translated to the SKOS import/export. As far as I know, skos:OrderedCollection is the only way to describe order using SKOS, but I have to admit that the use of an RDF list is not practical, especially when using SPARQL (queries are very complicated and extremely slow).

I see you currently map to skos:hiddenLabel, but that is not semantically accurate and will cause problems in the long run (e.g. in full-text search indexes and when there are multiple hidden labels). I recently got the suggestion to (also) use schema:position.

So to sum up:

  • supporting this would mainly be for reaching full SKOS spec compliance and not losing information after import
  • adding a sort label of some sort is recommended anyway for practical reasons (SPARQL), but it must not conflict with the SKOS semantics of hiddenLabel.

@koenedaele
Member

The mapping to skos:hiddenLabel was mostly a quick fix to be able to keep all data on export. On import they just stay as hidden labels that can then be changed back to sort labels by the editor. I've never really seen a good use case for hidden labels so far, but someone probably has.

I'm certainly open to other options. skos:OrderedCollection does indeed look to be the official way. I'm also not too keen on the RDF lists, but I think we can make something work, certainly for simple cases.

The schema:position is interesting. I did think about creating an atramhasis:sortLabel or such, but I didn't really want to create yet another ontology. So, schema:position might be a good fit. The range looks to be schema:Integer or schema:Text. The simplest implementation would be exporting the sortLabel to schema:position as an RDF literal and reading that on import as well, ignoring non-literal values. We could decouple the position from the list of labels in the skosprovider interface or even make it language independent, but that is a bigger change and I'm not certain it's worth it for a fairly rare use case.

How would you handle skos:OrderedCollection when dealing with sorting the narrower concepts of a concept? Adding an orderedcollection as an anonymous resource in between the broader concept and the narrower concepts just to create the order? Collections with a URI get imported as editable collections, collections without a URI just pass on the ordering to their narrower concepts.

So, we would have two ways of defining arbitrary orders on export/import: schema:position and skos:OrderedCollection, where on import schema:position would take precedence over skos:OrderedCollection.
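
The precedence rule could be sketched like this. The function name and the scheme for generating labels from list order (zero-padded, in steps of 10 to leave gaps) are assumptions made for illustration:

```python
def resolve_sort_labels(member_list, positions):
    """Return {concept: sort_label}; an explicit position wins over list order.

    member_list: concepts in skos:memberList order.
    positions: {concept: label} read from schema:position.
    """
    labels = {}
    for index, concept in enumerate(member_list, start=1):
        # Fallback label derived from the list order (assumed scheme).
        labels[concept] = f"{index * 10:04d}"
    # schema:position takes precedence wherever it is present.
    labels.update(positions)
    return labels


labels = resolve_sort_labels(["a", "b", "c"], {"b": "0005"})
print(sorted(labels, key=labels.get))
```

Here "b" is second in the list, but its explicit position moves it to the front on import.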

Do you have any example files you could share? Would be useful to have as fixtures for unit tests.

@mielvds
Contributor Author

mielvds commented Sep 7, 2022

> The mapping to skos:hiddenLabel was mostly a quick fix to be able to keep all data on export. On import they just stay as hidden labels that can then be changed back to sort labels by the editor. I've never really seen a good use case for hidden labels so far, but someone probably has.

We are probably going to need them to make concepts findable using "slang": labels you don't want to endorse or display, but that you know users search for. For example, even though the NL label is "Sinaasappel", they would search for "Appelsien".

> I'm certainly open to other options. skos:OrderedCollection does indeed look to be the official way. I'm also not too keen on the RDF lists, but I think we can make something work, certainly for simple cases.

I think it's just a matter of respecting the order when writing to the database at import? RDFLib luckily has this Collection class that allows handling lists as a python iterable.

> The schema:position is interesting. I did think about creating an atramhasis:sortLabel or such, but I didn't really want to create yet another ontology. So, schema:position might be a good fit. The range looks to be schema:Integer or schema:Text. The simplest implementation would be exporting the sortLabel to schema:position as an RDF literal and reading that on import as well, ignoring non-literal values.

That would work!

> We could decouple the position from the list of labels in the skosprovider interface or even make it language independent, but that is a bigger change and I'm not certain it's worth it for a fairly rare use case.

I agree

> How would you handle skos:OrderedCollection when dealing with sorting the narrower concepts of a concept? Adding an orderedcollection as an anonymous resource in between the broader concept and the narrower concepts just to create the order? Collections with a URI get imported as editable collections, collections without a URI just pass on the ordering to their narrower concepts.

We don't have this use case where collections sit in between concepts, but I think you can use a skos:OrderedCollection wherever you use a skos:Collection. The only difference is that it adds a skos:memberList predicate that repeats the members in sequence. In the UI you could even solve it with a checkbox that indicates that the current (sort) order should be maintained.

Granted, an OrderedCollection doesn't make much sense if the order is the result of sorting; it's the ability to introduce and maintain a custom order that you want. The UI should support that, but it's a low-priority feature.

> So, we would have two ways of defining arbitrary orders on export/import: schema:position and skos:OrderedCollection, where on import schema:position would take precedence over skos:OrderedCollection.

I'd say these are two different things.

  • skos:OrderedCollection is for communicating a fixed order
  • schema:position is for sorting

In many cases these are the same, but not always. If I arrange my concepts in ascending alphabetical order as an OrderedCollection, that's the truth I want to communicate. If I just use schema:position, I have no control over ascending or descending order.

> Do you have any example files you could share? Would be useful to have as fixtures for unit tests.

This list of "Vakken": https://github.com/i-Learn-SKOS/common-conceptschemes/blob/main/common/schemes/vak-norelated-final.skos.ttl

@koenedaele koenedaele added the Epic label Oct 3, 2022