Persistent Identifiers for Organizations and People #1

Closed
leightonlc opened this issue May 16, 2023 · 5 comments
Labels: Final Review (tagged for final review before closing); Identifiers (issues related to identification); Requirements Request (proposed requirements based on user and stakeholder needs)

Comments

@leightonlc

Leighton Christiansen
National Transportation Library, USDOT

Requirement(s)

Metadata fields that record information about Organizations and People should also record one or more persistent identifiers for that entity.
publisher: should include subfields for "identifier" and "identifierType" where persistent identifiers such as SAM numbers, Crossref ids, and/or Research Organization Registry (ROR) ids can be recorded, with the type specified.
contactPoint: Besides "email", contactPoint should include subfields for "identifier" and "identifierType" where persistent identifiers such as ORCIDs, ResearcherID, and/or arXiv Author Identifiers can be recorded, with the type specified.

Examples:
"publisher": {
  "organizationIdentifier": "https://ror.org/02xfw2e90",
  "identifierType": "ROR"
}

"contactPoint": {
  "personalIdentifier": "https://orcid.org/0000-0002-0543-4268",
  "identifierType": "ORCID"
}
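
A consumer of such metadata might want to sanity-check a recorded ORCID before trusting it. The following is a minimal sketch, not part of the proposal: it assumes the contactPoint is a flat object with the "personalIdentifier" and "identifierType" keys used in the example, and it applies the ISO 7064 MOD 11-2 check character that ORCID iDs carry. The function names are hypothetical.

```python
import re

# Full https form of an ORCID iD; the last character may be a digit or "X".
ORCID_PATTERN = re.compile(r"^https://orcid\.org/(\d{4}-\d{4}-\d{4}-\d{3}[\dX])$")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(identifier: str) -> bool:
    """Return True if the value is an https ORCID URL with a correct check character."""
    match = ORCID_PATTERN.match(identifier)
    if match is None:
        return False
    digits = match.group(1).replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


# Assumed (hypothetical) flat shape, reusing the key names from the example.
contact_point = {
    "personalIdentifier": "https://orcid.org/0000-0002-0543-4268",
    "identifierType": "ORCID",
}

if contact_point["identifierType"] == "ORCID":
    print(is_valid_orcid(contact_point["personalIdentifier"]))  # True
```

A check like this only confirms that the string is well formed; it does not confirm that the ORCID record exists or belongs to the named contact.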

Problem Statement

The use of persistent identifiers to unambiguously identify researchers and research-related organizations is a standard practice in publishing and repositories.
People may change their name several times over a lifetime, or variants of their name may be used by different publishers, causing confusion about a researcher's lifetime of work. Further, many people have the same names. Persistent identifiers, such as ORCID, help to globally disambiguate researchers.
Organizations may also go through name changes, especially after internal reorganizations or when rebranding to better express their mission or to use more inclusive language. But their mission, and role vis-à-vis research, may not change. Further, researchers sometimes do not use the proper or preferred name or acronym for a funding agency or publisher when citing support or publication history. Unique identifiers help to identify a specific organization throughout its lifecycle, and may disambiguate it from an international or regional organization with the same or a similar name.
The use of persistent identifiers is specifically called out by the FAIR Principles https://www.go-fair.org/fair-principles/ as the first three steps of Findability.
Further, the implementation of digital persistent identifiers is required by National Security Presidential Memorandum 33 (NSPM-33) and explained in the guidance document for NSPM-33 at https://www.whitehouse.gov/wp-content/uploads/2022/01/010422-NSPM-33-Implementation-Guidance.pdf
Expanding the use of persistent identifiers will help to bring DCAT-US in closer alignment with standard practice and new federal policies.

Target Audience / Stakeholders

Researchers
Repository managers
Research Funders
Data consumers
Metadata experts

Intended Uses / Use Cases

In DCAT-US 3

"publisher": {
  "organizationIdentifier": "https://ror.org/02xfw2e90",
  "identifierType": "ROR"
}

"contactPoint": {
  "personalIdentifier": "https://orcid.org/0000-0002-0543-4268",
  "identifierType": "ORCID"
}

Existing Approaches - Optional

Left blank intentionally

Additional context, comments, or links - Optional

Left blank intentionally

fellahst added the "Requirements Request" label May 17, 2023
@torrin47

torrin47 commented May 19, 2023

Yes, and...

contactPoint is implemented in the DCAT-US schema by means of the vCard specification, which has an optional UID element that would be a good fit for ORCID and easy to implement. Documented here:
USEPA/EPA_Environmental_Dataset_Gateway#19
and here:
project-open-data/project-open-data.github.io#614
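
As a rough illustration of the vCard idea, the sketch below builds a contactPoint in the existing vCard-based shape ("@type", "fn", "hasEmail") and adds an ORCID under a "hasUID" key. The "hasUID" key name is an assumption borrowed from the vCard ontology's UID property; it is not confirmed as a DCAT-US field, and the name and email values are placeholders.

```python
import json


def with_uid(contact: dict, uid: str) -> dict:
    """Return a copy of a vCard-style contact with a UID added (hypothetical key)."""
    updated = dict(contact)
    updated["hasUID"] = uid  # assumed key name, not a confirmed DCAT-US field
    return updated


# Existing-style contactPoint; "fn" and "hasEmail" values are placeholders.
contact_point = {
    "@type": "vcard:Contact",
    "fn": "Example Contact",
    "hasEmail": "mailto:contact@example.gov",
}

contact_with_orcid = with_uid(contact_point, "https://orcid.org/0000-0002-0543-4268")
print(json.dumps(contact_with_orcid, indent=2))
```

Because the UID is optional, existing records without it would remain valid under this approach.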

publisher would need more work to accommodate RORs, but agree fully with the value proposition.

Other important persistent identifiers, such as DOIs and PMCIDs supporting linked open data principles, are often dumped into the "references" array with no additional context. Suggest that any URI or other persistent identifier in DCAT-US be allowed to include an associated human-readable description of what the URI represents and potentially a reference to an issuing authority as described here: https://www.w3.org/TR/vocab-dcat-3/#identifiers-type
Lots more discussion on this topic over here:
project-open-data/project-open-data.github.io#592
project-open-data/project-open-data.github.io#69
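
To make the suggestion concrete, here is a small sketch of an identifier that carries its own context. The class and field names (value, identifier_type, description, issuing_authority) are hypothetical and are not DCAT or DCAT-US terms; the DOI is a placeholder, and the description text is illustrative only.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class DescribedIdentifier:
    """A persistent identifier plus the context a reader needs to interpret it."""
    value: str                               # the URI or identifier string
    identifier_type: str                     # e.g. "DOI", "ROR"
    description: Optional[str] = None        # human-readable note on what it points to
    issuing_authority: Optional[str] = None  # who assigns identifiers of this type


def to_record(identifier: DescribedIdentifier) -> dict:
    """Serialize to a plain dict, omitting fields that were left unset."""
    return {k: v for k, v in asdict(identifier).items() if v is not None}


references = [
    DescribedIdentifier(
        value="https://ror.org/02xfw2e90",
        identifier_type="ROR",
        description="Publishing organization",
        issuing_authority="Research Organization Registry",
    ),
    # Placeholder DOI with only a type: unset fields are simply omitted.
    DescribedIdentifier(value="https://doi.org/10.5555/12345678", identifier_type="DOI"),
]

for ref in references:
    print(to_record(ref))
```

Keeping the description and issuing authority optional would let existing "references" entries migrate gradually rather than all at once.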

fellahst added the "Identifiers" label May 31, 2023
@fellahst
Collaborator

Leighton,

Thank you for highlighting the importance of incorporating persistent identifiers for organizations and individuals in the DCAT-US schema, a practice aligned with FAIR principles and NSPM-33 guidelines. Your proposal to include specific subfields for "identifier" and "identifierType" under 'publisher' and 'contactPoint' is insightful and addresses the need for unambiguous identification in digital publishing and repositories.

The FAIR principles advocate for globally unique and persistent identifiers (F1) and their retrievability through standardized protocols (A1). To this end, generating resolvable URLs in compliance with the RFC 3986 IETF standard and Linked Data best practices is crucial. This includes elements like scheme, authority, path, and local or globally unique identifiers.
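
The URI elements mentioned above can be seen by splitting one of the identifiers from this issue with Python's standard library. This is only an illustrative sketch; the describe_identifier helper and its "local_id" key are hypothetical names, and treating the last path segment as the local identifier is an assumption that holds for ROR and ORCID URLs but not for every scheme.

```python
from urllib.parse import urlsplit


def describe_identifier(uri: str) -> dict:
    """Break a resolvable identifier URI into its RFC 3986 components."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        # Assumption: the final path segment is the locally unique identifier.
        "local_id": parts.path.rsplit("/", 1)[-1],
    }


info = describe_identifier("https://ror.org/02xfw2e90")
print(info)
# {'scheme': 'https', 'authority': 'ror.org', 'path': '/02xfw2e90', 'local_id': '02xfw2e90'}
```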

URI resolution services like purl.org, w3id.org, doi.org, orcid.org, arxiv.org, and Identifiers.org play a vital role in ensuring consistent access to resources, emphasizing the need for persistent identifiers in data management.

For the 'publisher' field, your suggestion to incorporate identifiers such as SAM numbers, Crossref ids, and ROR ids aligns with these principles. Similarly, for 'contactPoint', including identifiers like ORCIDs or ResearcherID enhances the precision in identifying researchers and organizations.

Implementing these changes will not only improve the discoverability and interoperability of datasets but also bring DCAT-US in closer alignment with standard practices and federal policies.

We have addressed your requirements in the current DCAT-US 3.0 specification in a number of ways:

Your contribution is valuable to our ongoing efforts to enhance data management practices, and we look forward to getting your feedback on how we addressed your requirements in the DCAT-US 3.0 application profile.

fellahst added the "Final Review" label Dec 13, 2023
@ShaferAC

+1

@mrratcliffe

+1;
ORCID 0000-0002-2458-4675 agrees with this ;-)

@leightonlc
Author

leightonlc commented Jan 1, 2024 via email

fellahst closed this as completed Jan 2, 2024