Extend what's allowed in curie_map to enable extended prefix maps #339

Open
cthoyt opened this issue Nov 28, 2023 · 2 comments
@cthoyt
Member

cthoyt commented Nov 28, 2023

Currently, the curie_map element takes a dictionary with string keys and string values. I propose we extend the data model of what can go in here:

  1. If a string is given, treat it as a URL pointing to an (extended) prefix map. The target should be a JSON file: if its top-level value is a list, it is an EPM; if it contains an @context element, it is a JSON-LD context; otherwise, it is a simple prefix map.
  2. If a list is given, interpret it as an extended prefix map.
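
The dispatch proposed above could be sketched roughly as follows. This is only an illustration, not part of SSSOM or of any existing library: the function names `classify_curie_map_value` and `classify_prefix_document` are hypothetical, and the sketch assumes the referenced JSON file has already been fetched and parsed.

```python
def classify_curie_map_value(value):
    """Classify what was put in the curie_map slot (hypothetical helper)."""
    if isinstance(value, str):
        return "url"        # a reference to an external (extended) prefix map
    if isinstance(value, list):
        return "extended"   # an inline extended prefix map
    if isinstance(value, dict):
        return "simple"     # today's behaviour: prefix -> URI prefix
    raise TypeError(f"unsupported curie_map value: {type(value).__name__}")


def classify_prefix_document(doc):
    """Classify an already-parsed JSON document fetched from a URL."""
    if isinstance(doc, list):
        return "extended"   # a list of records is taken to be an EPM
    if isinstance(doc, dict):
        if "@context" in doc:
            return "jsonld"  # a JSON-LD context
        return "simple"      # a plain prefix -> URI prefix dictionary
    raise ValueError("not a recognisable prefix map")
```
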
@gouttegd
Contributor

gouttegd commented Feb 5, 2024

Strongly opposed to any kind of extension where we need to peek into the contents of a field to guess its type of value.

In fact the curie_map used to be defined as “either a URL pointing to a curie map, OR the curie map itself”. We changed that in #284 because it was agreed that such Frankenstein-typed slots, where the same slot can be either a string or a dictionary, were a bad idea.

If different types of curie map are desired (e.g. simple or extended), or if a curie map can be either included directly or referenced from an external resource, then we should use different slots (e.g. curie_map for an included simple map, curie_map_ref for a link to an external simple map, extended_map for an included EPM, extended_map_ref for a link to an external EPM).
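
To make the "one slot, one type" idea concrete, here is a minimal sketch. The slot names are the examples given above; the class `MappingSetPrefixSlots` and its `validate` method are hypothetical and do not correspond to the actual SSSOM data model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MappingSetPrefixSlots:
    """Hypothetical container: each slot accepts exactly one type."""
    curie_map: Optional[Dict[str, str]] = None   # included simple map
    curie_map_ref: Optional[str] = None          # link to an external simple map
    extended_map: Optional[List[dict]] = None    # included EPM
    extended_map_ref: Optional[str] = None       # link to an external EPM

    def validate(self):
        expected = {
            "curie_map": dict,
            "curie_map_ref": str,
            "extended_map": list,
            "extended_map_ref": str,
        }
        for name, expected_type in expected.items():
            value = getattr(self, name)
            # No peeking into the contents: the slot name alone decides the type.
            if value is not None and not isinstance(value, expected_type):
                raise TypeError(f"{name} must be a {expected_type.__name__}")
```

With this layout, a parser never has to guess what a value means; a string in `curie_map` is simply an error.
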

For what it’s worth I am mildly against allowing the use of a pointer to an external map (simple or extended). I think that SSSOM mapping sets should be self-sufficient and should not require accessing an external resource to be used.

I am also unconvinced that an EPM brings anything useful in the context of a mapping set. When a mapping set contains a MESH:12345678 curie, all I need to know is what URL prefix MESH stands for (which the simple curie map provides). Why would I need to know all the alternative prefix names or URL prefixes associated with the MESH namespace?

I do understand that one might want to reconcile the prefixes used in one dataset to fit the “preferred URLs” that person wants (or needs). For example, if I get a dataset that was provided to me with

#curie_map:
#  MESH: "http://meshb.nlm.nih.gov/record/ui?ui="

and for some reason my application requires MESH IDs to use the http://id.nlm.nih.gov/mesh/ form, then of course I would use an EPM (where the “preferred prefix” for MESH is the one I need, such as the OBO EPM) to automatically remap the MESH curies. But in that case it’s up to me to provide an EPM that suits my needs. Whoever creates the dataset cannot know in advance which EPM I need, so what would be the benefit of including (or referring to) an EPM in the set?
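
As a rough illustration of that consumer-side remapping: the sketch below uses plain Python rather than any particular library. The helpers `expand` and `standardize_uri` are hypothetical, and the EPM record keys (`prefix`, `uri_prefix`, `uri_prefix_synonyms`) are assumed here to follow the usual extended prefix map layout.

```python
def expand(curie, curie_map):
    """Expand a CURIE using the simple curie_map shipped with the dataset."""
    prefix, _, local_id = curie.partition(":")
    return curie_map[prefix] + local_id


def standardize_uri(uri, epm):
    """Rewrite a URI so it uses the preferred uri_prefix of the consumer's EPM."""
    best = None  # (matched prefix, preferred prefix); the longest match wins
    for record in epm:
        candidates = [record["uri_prefix"], *record.get("uri_prefix_synonyms", [])]
        for candidate in candidates:
            if uri.startswith(candidate) and (best is None or len(candidate) > len(best[0])):
                best = (candidate, record["uri_prefix"])
    if best is None:
        return uri  # unknown namespace: leave the URI untouched
    matched, preferred = best
    return preferred + uri[len(matched):]


# The map provided with the dataset, as in the example above.
dataset_map = {"MESH": "http://meshb.nlm.nih.gov/record/ui?ui="}

# An EPM supplied by the consumer (contents assumed for illustration).
my_epm = [
    {
        "prefix": "MESH",
        "uri_prefix": "http://id.nlm.nih.gov/mesh/",
        "uri_prefix_synonyms": ["http://meshb.nlm.nih.gov/record/ui?ui="],
    }
]

remapped = standardize_uri(expand("MESH:12345678", dataset_map), my_epm)
```

Note that the EPM here comes entirely from the consumer; the dataset only needs to ship its simple map.
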

@gouttegd
Contributor

gouttegd commented Feb 5, 2024

Ah, and if we do allow referring to an external map: strongly opposed to allowing the external map to be represented as a JSON-LD @context. This is the Simple Standard for Sharing Ontological Mappings, so let’s keep it simple and not bring needlessly complex stuff just because we can.
