The spec/doc should be reorganised #330

Open
gouttegd opened this issue Nov 9, 2023 · 1 comment
@gouttegd
Contributor

gouttegd commented Nov 9, 2023

Right now, the SSSOM website is a bit of a mess.

  • The home page and the about page are mostly redundant.
  • The specification page is merely the auto-generated documentation of the underlying LinkML model. Calling that a “specification” is misleading, as it is woefully insufficient for anyone wishing to implement the standard (necessary, sure, but not sufficient).
  • The overview page (which, also misleadingly, is under the URL /sssom/spec/) is a bit of everything:
    • a list of contributors (nothing wrong with that, but maybe it could get its own page, or be added to the existing credits page);
    • a general introduction to the concept of mapping;
    • a fleeting reference to the data model;
    • a list of the commonly used and recommended mapping predicates;
    • an attempt at formally specifying the OWL/RDF and SSSOM/TSV serialisation formats;
    • a list of use cases.
  • The “resources for users” section is itself a mix of many things. Strikingly, it contains bits that really belong to the “specification” part, such as this (in the basic tutorial):

All three must be referred to by an identifier in CURIE syntax (Compact URI) when using the SSSOM table format or JSON, or an IRI (Internationalized Resource Identifier) when you are using the RDF representation of SSSOM.
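To illustrate the CURIE/IRI distinction in the sentence quoted above, here is a minimal Python sketch of how a CURIE expands to an IRI through a prefix map. The function name `expand_curie` and the `EX` prefix are hypothetical and used only for illustration; this is not the sssom-py API.

```python
# Hypothetical prefix map; only the SKOS namespace is a real, well-known IRI prefix.
PREFIX_MAP = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "EX": "http://example.org/",  # made-up prefix for illustration
}


def expand_curie(curie: str, prefix_map: dict) -> str:
    """Expand a CURIE such as 'skos:exactMatch' into a full IRI."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in prefix_map:
        raise ValueError(f"cannot expand {curie!r}: unknown or missing prefix")
    return prefix_map[prefix] + local_id


# CURIE form (as used in SSSOM/TSV or JSON) versus IRI form (as used in RDF)
print(expand_curie("skos:exactMatch", PREFIX_MAP))
print(expand_curie("EX:0001", PREFIX_MAP))
```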

Overall this makes it very difficult for implementers to figure out which parts of the website are really “normative” and where to find them. This is of course at least partially due to the fact that the website is clearly a “work in progress”, but also, more fundamentally, because the website somehow tries to be simultaneously a specification for the standard, a documentation of that standard for end-users, a documentation of the reference implementation (sssom-py), and an academic paper on semantic mappings.

More immediately, this makes it difficult for me to figure out where I should put the various improvements to the spec I have been thinking about (such as the propagation of metadata slots #305, the recommendations on backwards compatibility #325, or the recommendations on how to deal with non-standard slots #328).

Therefore I’d like to propose that the website be reorganised so as to clearly separate the specification, the documentation, and the general notions on mappings. I welcome any suggestion for a better organisation, but right now here’s what I’m considering:

  • About this document (overall purpose of this document)
  • Introduction to semantic mappings (mostly, what is currently in the introduction of the overview, though I think some of the stuff in the “resources for users” could probably belong there as well)
  • SSSOM specification (everything that developers must know to develop software compatible with the standard)
    • Introduction and use cases (what the standard is and what it is for)
    • Specification of the data model
      • Auto-generated LinkML-derived documentation
      • Complements to the auto-generated documentation (anything that is not described in the schema but that is necessary to understand the data model)
    • Specification of the serialisation formats
      • OWL/RDF serialisation
      • SSSOM/TSV serialisation
      • JSON serialisation
  • User documentation (most of the “resources for users” stuff)
  • Other stuff (credits, glossary, how to contribute, etc.)

Thoughts?

@gouttegd gouttegd self-assigned this Nov 9, 2023
@matentzn
Collaborator

matentzn commented Nov 9, 2023

I 100% agree with all you propose. I would love to try and implement (at least in spirit) a version of https://diataxis.fr/, but your proposal is mostly reorganisation at a higher level.

gouttegd added a commit that referenced this issue Apr 21, 2024
The examples provided in the SSSOM/TSV section of the "overview"
document are full of errors and would fail the most basic validation by
our own tools:

- use of `Lexical` instead of `semapv:LexicalMatching` in the
`mapping_justification` field (probably a remnant of the time prior to
the adoption of the SEMAPV vocabulary);
- bogus IRI prefix for the SKOS namespace (missing terminal `#`);
- use of a full-length identifier (instead of a CURIE) for `creator_id`.
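The three kinds of errors listed above could be caught by a check along these lines. This is only a rough sketch: the function names, the simplified CURIE pattern, and the placeholder ORCID are assumptions made for illustration, and it does not reproduce what `sssom validate` actually does.

```python
import re

# Simplified CURIE pattern (an assumption, looser than the formal CURIE grammar).
CURIE_RE = re.compile(r"^[A-Za-z_][\w.\-]*:\S+$")
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"  # note the terminal '#'


def is_curie(value: str) -> bool:
    # Full-length IRIs also contain a colon, so reject anything with '://'.
    return bool(CURIE_RE.match(value)) and "://" not in value


def find_problems(prefix_map: dict, row: dict) -> list:
    """Return human-readable problems for one example mapping row."""
    problems = []
    if prefix_map.get("skos") != SKOS_NS:
        problems.append("skos prefix should expand to " + SKOS_NS)
    for field in ("mapping_justification", "creator_id"):
        value = row.get(field, "")
        if not is_curie(value):
            problems.append(f"{field} is not a CURIE: {value!r}")
    return problems


# A row with the errors described in the commit message (placeholder ORCID).
bad_row = {
    "mapping_justification": "Lexical",
    "creator_id": "https://orcid.org/0000-0000-0000-0000",
}
bad_prefixes = {"skos": "http://www.w3.org/2004/02/skos/core"}
for problem in find_problems(bad_prefixes, bad_row):
    print(problem)
```

A corrected row would use `semapv:LexicalMatching` and a CURIE such as `orcid:0000-0000-0000-0000`, together with the SKOS prefix ending in `#`.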

This PR fixes those errors. In addition, it also ensures that the fields
are listed in the *recommended order*. It’s not critical but if we take
the time to recommend that fields be sorted in a given order, the least
we can do is to follow our own advice in our examples.

While we are at it, we also add a small note about the requirement for
using CURIEs in the SSSOM/TSV format, since that requirement currently
does not appear anywhere but is already enforced by `sssom validate`.

This is a band-aid until the docs are completely overhauled as part of
#330.

Co-authored-by: Nico Matentzoglu <nicolas.matentzoglu@gmail.com>