The spec/doc should be reorganised #330

gouttegd · 2023-11-09T12:17:10Z

Right now, the SSSOM website is a bit of a mess.

The home page and the about page are mostly redundant.
The specification page is merely the auto-generated documentation of the underlying LinkML model; calling that a “specification” is misleading, as it is woefully insufficient for anyone wishing to implement the standard (necessary, sure, but not sufficient).
The overview page (which, also misleadingly, is under the URL /sssom/spec/) is a bit of everything:
- a list of contributors (nothing wrong with that, but it could maybe get its own page or be added to the existing credits page);
- a general introduction to the concept of mapping;
- a fleeting reference to the data model;
- a list of the commonly used and recommended mapping predicates;
- an attempt at formally specifying the OWL/RDF and SSSOM/TSV serialisation formats;
- a list of use cases.
The “resources for users” section is itself a mix of many things. Strikingly, it contains bits that really belong to the “specification” part, such as this (in the basic tutorial):

All three must be referred to by an identifier in CURIE syntax (Compact URI) when using the SSSOM table format or JSON, or an IRI (Internationalized Resource Identifier) when you are using the RDF representation of SSSOM.

Overall this makes it very difficult for implementers to figure out which parts of the website are really “normative” and where to find them. This is of course at least partially due to the fact that the website is clearly a “work in progress”, but also, more fundamentally, because the website somehow tries to be simultaneously a specification for the standard, a documentation of that standard for end-users, a documentation of the reference implementation (sssom-py), and an academic paper on semantic mappings.

More immediately, this makes it difficult for me to figure out where I should put the various improvements to the spec I have been thinking about (such as the propagation of metadata slots #305, the recommendations on backwards compatibility #325, or the recommendations on how to deal with non-standard slots #328).

Therefore I’d like to propose that the website be reorganised so as to clearly separate the specification, the documentation, and the general notions on mappings. I welcome any suggestion for a better organisation, but right now here’s what I’m considering:

About this document (overall purpose of this document)
Introduction to semantic mappings (mostly, what is currently in the introduction of the overview, though I think some of the stuff in the “resources for users” could probably belong there as well)
SSSOM specification (everything that developers must know to develop software compatible with the standard)
- Introduction and use cases (what the standard is and what it is for)
- Specification of the data model
  - Auto-generated LinkML-derived documentation
  - Complements to the auto-generated documentation (anything that is not described in the schema but is necessary to understand the data model)
- Specification of the serialisation formats
  - OWL/RDF serialisation
  - SSSOM/TSV serialisation
  - JSON serialisation
User documentation (most of the “resources for users” stuff)
Other stuff (credits, glossary, how to contribute, etc.)

Thoughts?


matentzn · 2023-11-09T12:42:28Z

I 100% agree with all you propose. I would love to try and implement (at least in spirit) a version of https://diataxis.fr/, but your proposal is mostly reorganisation at a higher level.

The examples provided in the SSSOM/TSV section of the "overview" document are full of errors and would fail the most basic validation by our own tools:

- use of `Lexical` instead of `semapv:LexicalMatching` in the `mapping_justification` field (probably a remnant of the time prior to the adoption of the SEMAPV vocabulary);
- bogus IRI prefix for the SKOS namespace (missing terminal `#`);
- use of a full-length identifier (instead of a CURIE) for `creator_id`.

This PR fixes those errors.

In addition, it also ensures that the fields are listed in the *recommended order*. It’s not critical but if we take the time to recommend that fields be sorted in a given order, the least we can do is to follow our own advice in our examples.

While we are at it, we also add a small note about the requirement for using CURIEs in the SSSOM/TSV format, since that requirement currently does not appear anywhere but is already enforced by `sssom validate`.

This is a band-aid until the docs are completely overhauled as part of #330.

Co-authored-by: Nico Matentzoglu <nicolas.matentzoglu@gmail.com>
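
For illustration only, here is a minimal sketch of how the three kinds of errors listed in the commit message above could be detected. It does not use sssom-py and is not how `sssom validate` works. The function names (`is_curie`, `find_problems`), the prefix map, the placeholder ORCID, and the simplification that each field holds a single value are all assumptions made for this example.

```python
import re

# Well-known namespace we compare against (the SKOS namespace ends with '#').
WELL_KNOWN = {"skos": "http://www.w3.org/2004/02/skos/core#"}

# Fields assumed, for this sketch, to hold exactly one CURIE each.
CURIE_FIELDS = ("mapping_justification", "creator_id")

# Loose CURIE shape: a prefix, a colon, and a non-empty local part.
CURIE_RE = re.compile(r"^(?P<prefix>[A-Za-z_][\w.-]*):(?P<local>\S+)$")


def is_curie(value, prefix_map):
    """Return True if value looks like prefix:local and the prefix is declared."""
    match = CURIE_RE.match(value)
    return bool(match) and match.group("prefix") in prefix_map


def find_problems(prefix_map, row):
    """Return a list of human-readable problems for one mapping row."""
    problems = []
    # Check declared prefixes against the well-known namespaces.
    for prefix, expected in WELL_KNOWN.items():
        declared = prefix_map.get(prefix)
        if declared is not None and declared != expected:
            problems.append(f"prefix {prefix!r}: expected {expected!r}, got {declared!r}")
    # Check that CURIE-valued fields really contain CURIEs.
    for field in CURIE_FIELDS:
        value = row.get(field)
        if value and not is_curie(value, prefix_map):
            problems.append(f"{field}: {value!r} is not a CURIE with a declared prefix")
    return problems


# Hypothetical data reproducing the three errors described above.
bad_map = {
    "skos": "http://www.w3.org/2004/02/skos/core",  # missing terminal '#'
    "semapv": "https://w3id.org/semapv/vocab/",
    "orcid": "https://orcid.org/",
}
bad_row = {
    "mapping_justification": "Lexical",
    "creator_id": "https://orcid.org/0000-0000-0000-0000",  # placeholder ORCID
}

good_map = dict(bad_map, skos="http://www.w3.org/2004/02/skos/core#")
good_row = {
    "mapping_justification": "semapv:LexicalMatching",
    "creator_id": "orcid:0000-0000-0000-0000",
}

for problem in find_problems(bad_map, bad_row):
    print(problem)
```

A full IRI such as `https://orcid.org/...` is rejected here only because `https` is not a declared prefix; a real validator would need a stricter rule, and would also have to handle multi-valued fields.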

gouttegd self-assigned this Nov 9, 2023

This was referenced Apr 8, 2024

Update documentation for entity reference to clarify CURIE/URI type #358

Merged

Fix examples of SSSOM/TSV files. #362

Merged
