Capturing the time of mapping review #309

Open
gouttegd opened this issue Aug 1, 2023 · 7 comments
Labels
discussion question Further information is requested

Comments

@gouttegd
Contributor

gouttegd commented Aug 1, 2023

The current version of the schema allows capturing the fact that a mapping has been reviewed, through the reviewer_id and/or reviewer_label fields.

Should the time at which a mapping has been reviewed also be captured, and if so, how?

The use case is a (group of) human reviewer(s) checking whether a mapping is still correct, possibly years after the mapping was first established, and concluding that indeed, the mapping is still 100% correct, there’s nothing to change.

(The case where the reviewers find that the mapping needs some changes is out of scope: if the mapping needs changing, that’s a new mapping that is asserted, so this can be recorded in mapping_date.)

There are several options.

A) Not actually recording the review date (more or less the current situation). The reviewers just add their IDs to the reviewer_id field but leave the mapping otherwise untouched. Pros: nothing to do. Cons: consumers of the mapping have no way to know when the mapping was reviewed, just that it has been reviewed. Has the review occurred in the past 5 weeks or the past 5 years?

A1) Variant: not recording the review date, but considering that when a mapping set has been reviewed, it is a new mapping set that is being published after review. The new mapping set therefore has a new publication_date, which can be inferred as the date the mappings of the set were last reviewed. Pros: no changes required. Cons: assuming all mappings are systematically (re-)reviewed prior to any new publication of the mapping set seems like a bold assumption.

B) Re-using mapping_date. That field is intended to capture the “date the mapping was asserted”. We can consider that reviewing a mapping, even if the review does not lead to any change to the mapping, is tantamount to asserting that mapping (again). Reviewers should reset that date to the date of the review. Pros: allows capturing the date of review without requiring any change to the schema. Cons: we lose the information about when the mapping was first asserted.

C) Adding a new field review_date, to be filled in by the reviewers at the same time they fill in reviewer_id. Pros: the review date is captured without any loss of information. Cons: requires adding a new field to the schema.
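
To make the trade-off between options B and C concrete, here is a minimal Python sketch. It models a mapping as a plain dict keyed by SSSOM slot names. review_date is the hypothetical new field from option C; the helper names, identifiers and dates are made up for illustration and are not part of any SSSOM tooling.

```python
from copy import deepcopy

# A single mapping, modelled as a dict keyed by SSSOM slot names.
# All identifiers and dates below are made up for illustration.
mapping = {
    "subject_id": "EX:0001",
    "predicate_id": "skos:exactMatch",
    "object_id": "EX:0002",
    "mapping_date": "2020-03-15",
}


def review_option_b(row, reviewer, date):
    """Option B: record the review by overwriting mapping_date."""
    reviewed = deepcopy(row)
    reviewed["reviewer_id"] = reviewer
    reviewed["mapping_date"] = date  # the original assertion date is lost
    return reviewed


def review_option_c(row, reviewer, date):
    """Option C: record the review in a new (hypothetical) review_date field."""
    reviewed = deepcopy(row)
    reviewed["reviewer_id"] = reviewer
    reviewed["review_date"] = date  # mapping_date is left untouched
    return reviewed


b = review_option_b(mapping, "orcid:0000-0000-0000-0000", "2023-08-01")
c = review_option_c(mapping, "orcid:0000-0000-0000-0000", "2023-08-01")
```

With option B, the 2020 assertion date can no longer be recovered from the reviewed row; with option C both dates survive, at the cost of an extra field.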

Other ideas?

@gouttegd gouttegd added question Further information is requested discussion labels Aug 1, 2023
@matentzn
Collaborator

matentzn commented Aug 1, 2023

mapping_date could indeed be redefined as "date the mapping was last confirmed". This has the least churn.

IMO there should be a reasonably high bar now to add additional metadata elements. But given how important reviews may be for some organisations (especially for non-exact matches), I could absolutely be convinced to vote either way.

@gouttegd
Contributor Author

gouttegd commented Aug 1, 2023

Re-using mapping_date has my preference as well.

> IMO there should be a reasonably high bar now to add additional metadata elements.

I agree.

@dr-shorthair

dr-shorthair commented Jan 16, 2024

We would like review_date in addition to mapping_date

In general there may be many stages in the lifecycle of a mapping. Currently it is only possible to record the timing of two stages - using mapping_date and publication_date - though other lifecycle stages are implied by the metadata set, review being the most obvious. Retirement or withdrawal is another obvious stage.

@matentzn
Collaborator

@dr-shorthair feel free to use the issue template for new elements here: https://github.com/mapping-commons/sssom/issues/new/choose

I will respond to the other suggestions individually when the tickets come.

As you say, there are a lot of things that we could capture about a mapping. It is important that we only add elements to SSSOM that are of general importance (multiple stakeholders agree they are needed). This could be the case with your requests!

I gather you don't agree with the idea to interpret mapping_date as the point in time the mapping was last confirmed?

Note that you can always add a "semapv:MappingReview" as a separate justification about a mapping. This seems a bit idiosyncratic at first, but it could do the job.

@gouttegd
Contributor Author

> Note that you can always add a "semapv:MappingReview" as a separate justification about a mapping. This seems a bit idiosyncratic at first, but it could do the job.

Yes, and I am inclined to think this should be the way to go.

Rather than having a new distinct field to record the time of every step in the “life cycle” of a mapping (e.g. creation_date, review_date, withdrawal_date, etc.), we’d have one mapping justification (that is, one “row” in an SSSOM TSV file) for every step:

  • one row for when the mapping was first assessed;
  • one row for when the mapping was reviewed (there could be several such rows, if a mapping has been reviewed several times);
  • one row for when the mapping is withdrawn (which could be marked by the use of the Not predicate modifier maybe?).

This way we would only ever need one date field.
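
The one-row-per-step idea can be sketched in Python as follows. Each dict stands for one row, with a single mapping_date. semapv:MappingReview is the term mentioned above; using semapv:ManualMappingCuration for the initial row, the last_review_date helper, and all identifiers and dates are assumptions made for illustration.

```python
from datetime import date

# Each dict stands for one row (one mapping justification) of an SSSOM TSV file.
# Identifiers and dates are made up for illustration.
rows = [
    {"subject_id": "EX:0001", "predicate_id": "skos:exactMatch",
     "object_id": "EX:0002",
     "mapping_justification": "semapv:ManualMappingCuration",
     "mapping_date": "2020-03-15"},
    {"subject_id": "EX:0001", "predicate_id": "skos:exactMatch",
     "object_id": "EX:0002",
     "mapping_justification": "semapv:MappingReview",
     "mapping_date": "2022-01-10"},
    {"subject_id": "EX:0001", "predicate_id": "skos:exactMatch",
     "object_id": "EX:0002",
     "mapping_justification": "semapv:MappingReview",
     "mapping_date": "2023-08-01"},
]


def last_review_date(rows, subject_id, predicate_id, object_id):
    """Return the most recent review date of a mapping, or None if never reviewed."""
    wanted = (subject_id, predicate_id, object_id)
    dates = [
        date.fromisoformat(r["mapping_date"])
        for r in rows
        if (r["subject_id"], r["predicate_id"], r["object_id"]) == wanted
        and r["mapping_justification"] == "semapv:MappingReview"
    ]
    return max(dates, default=None)
```

A consumer who wants to know how fresh a mapping is would then only need to look at mapping_date on the review rows, and the date of the first assertion stays available on the initial row.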

@dr-shorthair

A more scalable pattern might be to add a status slot, and use a generic date slot to record when that status was reached.
(The author/creator/reviewer slots could be collapsed into one 'agent', though this may be a step too far given the legacy.)

Combine this with the one-row-per-status-transition pattern proposed above by @gouttegd

Potential list of status values:

  • draft
  • submitted
  • reviewed
  • issued / published
  • retired / withdrawn
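
A rough Python sketch of what this pattern might look like, assuming hypothetical status, date and agent slots (none of which exist in the schema today). The status values come from the list above, with "issued / published" and "retired / withdrawn" shortened to a single name each; the current_status helper and the sample data are made up for illustration.

```python
from datetime import date

# Status values from the list above, shortened to one name per stage.
STATUSES = ("draft", "submitted", "reviewed", "published", "withdrawn")

# One entry per status transition of a single mapping.
# "status", "date" and "agent" are hypothetical slot names.
transitions = [
    {"status": "draft", "date": "2020-03-01", "agent": "EX:alice"},
    {"status": "submitted", "date": "2020-03-15", "agent": "EX:alice"},
    {"status": "reviewed", "date": "2023-08-01", "agent": "EX:bob"},
]


def current_status(transitions):
    """Return (status, date) for the most recent transition."""
    for t in transitions:
        if t["status"] not in STATUSES:
            raise ValueError(f"unknown status: {t['status']}")
    latest = max(transitions, key=lambda t: date.fromisoformat(t["date"]))
    return latest["status"], date.fromisoformat(latest["date"])
```

Adding a new lifecycle stage would then mean adding a value to the status list rather than a new date field to the schema.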

@dr-shorthair

See #345 for a sketch of a possible solution. @gouttegd @matentzn I would be grateful for your feedback.

3 participants