Add new slot "mapping_status" (mapping lifecycle) to SSSOM core model #347

matentzn
Collaborator

Resolves #345

  • docs/ have been added/updated if necessary
  • make test has been run locally
  • tests have been added/updated (not applicable)
  • CHANGELOG.md has been updated.

If you are proposing a change to the SSSOM metadata model, you must

  • provide a full, working and valid example in examples/
  • provide a link to the related GitHub issue in the see_also field of the linkml model
  • provide a link to a valid example in the see_also field of the linkml model

This is a draft PR adding a mapping_status metadata field to the SSSOM core model to describe the life cycle phase a specific mapping is in. Please use the issue for general discussion, and the PR only for specific suggestions.

| subject_id | predicate_id | object_id | mapping_justification | author_id | mapping_date | status | comment |
|---|---|---|---|---|---|---|---|
| alum8:Marsh-or-wetlandsaline | skos:exactMatch | get:groups/MFT1.3 | semapv:ManualMappingCuration | orcid:0009-0001-6090-9959 | 2023-12-01 | draft | |
| alum8:Marsh-or-wetlandsaline | skos:exactMatch | get:groups/MFT1.3 | semapv:ManualMappingCuration | orcid:0000-0002-2934-5497 | 2024-01-03 | reviewed | Looks Ok but let's check with Piers |
| alum8:Marsh-or-wetlandsaline | skos:exactMatch | get:groups/MFT1.3 | semapv:ManualMappingCuration | orcid:0000-0002-2568-59457 | 2024-01-10 | approved | Mapping has been approved by Pier and team. |
Member

I don't get what the difference between approved and published is based on this example. Also, this doesn't match the enum values.

There is commonly a delay between approval and the actual release, which may be phased with other items or on pre-defined calendar dates.

| subject_id | predicate_id | object_id | mapping_justification | author_id | mapping_date | status | comment |
|---|---|---|---|---|---|---|---|
| alum8:Marsh-or-wetlandsaline | skos:exactMatch | get:groups/MFT1.3 | semapv:ManualMappingCuration | orcid:0009-0001-6090-9959 | 2023-12-01 | draft | |
| alum8:Marsh-or-wetlandsaline | skos:exactMatch | get:groups/MFT1.3 | semapv:ManualMappingCuration | orcid:0000-0002-2934-5497 | 2024-01-03 | reviewed | Looks Ok but let's check with Piers |
| alum8:Marsh-or-wetlandsaline | skos:exactMatch | get:groups/MFT1.3 | semapv:ManualMappingCuration | orcid:0000-0002-2568-59457 | 2024-01-10 | approved | Mapping has been approved by Pier and team. |
| alum8:Marsh-or-wetlandsaline | skos:exactMatch | get:groups/MFT1.3 | semapv:ManualMappingCuration | orcid:0000-0002-2934-5497 | 2024-01-13 | published | |
Member

If the mapping is in a document, then it's published, so I'm not really sure why this even exists.

That view may hold among developers, but not in the statutory (government) sector or even among academics (who regularly withhold information 'until the journal article appears').

Member

cthoyt commented Jan 31, 2024

Overall, I think this is a weak proposal. It increases the complexity of the format while being somewhat redundant with several things that already exist. Why not use some of the fields where you can put whatever arbitrary metadata you want to facilitate this for the minority of users who might want to use it?

I think we will keep getting more and more requests to add arbitrary enums/metadata fields based on single use cases, and I want to oppose these.

@gouttegd
Contributor

My main problem with this is that it’s not merely the addition of a field (that in itself wouldn’t be a big deal); it’s an addition that changes a rather important aspect of the format.

Until now, any single mapping record (i.e. “row” in the TSV format) is “standalone” – the record contains all the metadata needed to evaluate it and decide what to do with it. No need to look anywhere else in the mapping set to make a decision.

With that change, for every mapping I now need to check if there’s another record further down in the set for the same triple {subject, predicate, object} but with a different status (in particular, with a status of withdrawn, which would invalidate a previous record with a status of published).

I am not fundamentally opposed to that, but I do feel this requires more consideration. In particular, if the idea is to “provide the metadata for the client to make the decision about "which records are valid"”, then the specification needs to be much more clear about how that decision of “which records are valid” should be made. Otherwise we run the risk of ending up with a format that is interoperable in name only, because each implementation will have its own logic to determine the validity of records.
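
To make the concern above concrete, here is a minimal Python sketch of one way a client could decide which records are valid. It is not part of the proposal: the rule used here (the most recent `mapping_date` wins for each {subject, predicate, object} triple, and only `published` counts as valid) is an assumption, the function names `latest_status` and `valid_triples` are hypothetical, and the status values are taken from the example rows in this thread plus the `withdrawn` status mentioned above rather than from the actual enum.

```python
from datetime import date

# Example rows from this thread, reduced to the fields the sketch needs:
# (subject_id, predicate_id, object_id, mapping_date, status)
records = [
    ("alum8:Marsh-or-wetlandsaline", "skos:exactMatch", "get:groups/MFT1.3", date(2023, 12, 1), "draft"),
    ("alum8:Marsh-or-wetlandsaline", "skos:exactMatch", "get:groups/MFT1.3", date(2024, 1, 3), "reviewed"),
    ("alum8:Marsh-or-wetlandsaline", "skos:exactMatch", "get:groups/MFT1.3", date(2024, 1, 10), "approved"),
    ("alum8:Marsh-or-wetlandsaline", "skos:exactMatch", "get:groups/MFT1.3", date(2024, 1, 13), "published"),
]


def latest_status(rows):
    """Map each triple to the (mapping_date, status) of its most recent record.

    Assumption: the latest mapping_date wins; on equal dates, the later row wins.
    """
    latest = {}
    for subject, predicate, obj, mapping_date, status in rows:
        triple = (subject, predicate, obj)
        if triple not in latest or mapping_date >= latest[triple][0]:
            latest[triple] = (mapping_date, status)
    return latest


def valid_triples(rows, accepted=frozenset({"published"})):
    """Return the triples whose most recent status is in `accepted` (assumed rule)."""
    return sorted(t for t, (_, status) in latest_status(rows).items() if status in accepted)
```

A different implementation could just as reasonably use row order instead of dates, or treat `approved` as valid, which is exactly the interoperability risk described above.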

Successfully merging this pull request may close these issues.

[New metadata element]: status
4 participants