Use Case: Use LinkML to define schemas #264

Open
glyg opened this issue May 24, 2023 · 4 comments
Labels
use-case A (potential) use-case for ROLite creation, consumption or integration

Comments


glyg commented May 24, 2023

As a research software engineer, I want to use LinkML so that I can use its associated tooling to maintain my RO-Crate specification and automate data export.

Hi, I am working on deploying services to manage microscopy images across institutions. We mainly use OMERO to manage the data, but we lack a common representation for microscopy (meta)data records that would allow interoperability with other tools.

Thus, I would like to use RO-Crate to define the record structure.
At the same time, the community is pushing to use LinkML as a schema definition tool, in the hope that it will make it easier to combine metadata recommendations from various sources.

It seems that using LinkML to define and create the RO-Crate could be a good entry point and benefit both communities.

Has anyone done that already?

I am now trying to define the RO-Crate schema in LinkML and use it to produce ro_crate_metadata.json (this is not working at the moment) with the command:

linkml-convert -s ro-crate-schema.yml -t rdf data.yml -o ro_crate_metadata.json

Here is what the inputs look like:

(Incomplete) RO-Crate schema in LinkML (ro-crate-schema.yml):

id: https://w3id.org/ro/crate/1.1
name: ro-crate-linkml
prefixes:
  linkml: https://w3id.org/linkml/
  schema: http://schema.org/
  ro_crate: https://ro/crate/1.1
  ORCID: https://orcid.org/
imports:
  - linkml:types
default_curi_maps:
  - semweb_context
default_prefix: ro_crate
default_range: string


classes:
  Thing:
    class_uri: schema:Thing
    attributes:
      id:
        range: uriorcurie
      description:
        range: string

  CreativeWork:
    is_a: Thing
    class_uri: schema:CreativeWork
    attributes:
      conformsTo:
        range: uriorcurie
      about:
        range: uriorcurie

  DataEntity:
    is_a: Thing

  Dataset:
    is_a: DataEntity
    class_uri: ro_crate:Dataset
    attributes:
      hasPart:
        range: DataEntities

  RootDataEntity:
    is_a: Dataset
    tree_root: true

  File:
    is_a: DataEntity
    class_uri: ro_crate:File
    attributes:
      name:
        range: string
      contentSize:
        range: string
      encodingFormat:
        range: string

      sdDatePublished:
        range: string # should be isoformat date

  DataEntities:
    description: >-
      A list of Datasets and Files
    attributes:
      entries:
        range: DataEntity
        multivalued: true
        inlined: true


  Person:
    class_uri: schema:Person              ## reuse schema.org vocabulary
    attributes:
      id:
        identifier: true
      full_name:
        required: true
        description:
          name of the person
        slot_uri: schema:name             ## reuse schema.org vocabulary
    id_prefixes:
      - ORCID

Example data (not working) data.yml:

- id: ro-crate-metadata.json
  type: CreativeWork
  conformsTo:
    id: https://w3id.org/ro/crate/1.1
- id: ./
  type: RootDataEntity
  hasPart:
    - id: cp7glop.ai
      is_a: File
      name: "Diagram showing trend to increase"
      contentSize: "383766"
      description: "Illustrator file for Glop Pot"
      encodingFormat: "application/pdf"
    - id: lots_of_little_files/
      is_a: Dataset
      name: "Too many files"
      description: "This directory contains many small files, that we're not going to describe in detail."
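
For comparison, RO-Crate metadata is a flat JSON-LD @graph in which hasPart holds only {"@id": ...} references, whereas the example above nests the entities inside hasPart. Here is a minimal sketch of that flattening step in plain Python. The `flatten` helper and the `nested_root` sample are hypothetical names for illustration, the keys are assumed to be already renamed to `@id` / `@type`, and none of this is LinkML's own output.

```python
from typing import Any


def flatten(entity: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the entity followed by its descendants.

    Nested hasPart entries are replaced by {"@id": ...} references and
    emitted as separate top-level entities (hypothetical helper).
    """
    flat = {key: value for key, value in entity.items() if key != "hasPart"}
    graph = [flat]
    parts = entity.get("hasPart", [])
    if parts:
        # Keep only references in the parent ...
        flat["hasPart"] = [{"@id": part["@id"]} for part in parts]
        # ... and move the full descriptions to the top level.
        for part in parts:
            graph.extend(flatten(part))
    return graph


# Sample input mirroring the nested example data (assumed key names).
nested_root = {
    "@id": "./",
    "@type": "Dataset",
    "hasPart": [
        {"@id": "cp7glop.ai", "@type": "File",
         "name": "Diagram showing trend to increase"},
        {"@id": "lots_of_little_files/", "@type": "Dataset",
         "name": "Too many files"},
    ],
}

graph = flatten(nested_root)
```

The resulting `graph` list could then be placed under "@graph" next to the metadata descriptor entity.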

Any thoughts, pointers or hints on how to achieve that are welcome! (I am referencing this in a LinkML issue)

Thanks :)
Guillaume

glyg added the use-case label (A (potential) use-case for ROLite creation, consumption or integration) on May 24, 2023
stain changed the title from "Use Case: Use [LinkML](https://linkml.io) to define schemas" to "Use Case: Use LinkML to define schemas" on May 25, 2023
Contributor

stain commented May 25, 2023

Thanks for the suggestion! This fits well with what we're proposing for profiles https://www.researchobject.org/ro-crate/1.2-DRAFT/profiles as well.

Bioschemas have tried using DDE https://github.com/BioSchemas/specifications/ which is also related.

LinkML seems quite approachable to edit compared to SHACL and ShEx. See also some thoughts on those in ResearchObject/runcrate#17

As for File, you should define it as http://schema.org/MediaObject (aka schema:MediaObject with your prefixes), since we don't have any ro_crate terms. The mapping for additional terms from our context is clarified in https://www.researchobject.org/ro-crate/1.2-DRAFT/metadata.html#additional-metadata-standards

Contributor

stain commented May 25, 2023

You should also use the filename ro-crate-metadata.json as mandated by https://www.researchobject.org/ro-crate/1.1/structure.html. I have not come across the underscore version before. Some tools may require a .jsonld suffix to generate it.
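
To make the file name and typing points concrete, here is a minimal sketch that writes a metadata file by hand with the Python standard library. It assumes the RO-Crate 1.1 JSON-LD context URL; `build_metadata` and `write_crate` are hypothetical helper names, and the root entity is left without the descriptive properties (name, description, datePublished, license) that a complete crate needs.

```python
import json
import tempfile
from pathlib import Path

METADATA_FILENAME = "ro-crate-metadata.json"  # hyphens, not underscores
CONTEXT = "https://w3id.org/ro/crate/1.1/context"


def build_metadata(files):
    """Build a bare-bones RO-Crate 1.1 document (hypothetical helper)."""
    descriptor = {
        "@id": METADATA_FILENAME,
        "@type": "CreativeWork",
        "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
        "about": {"@id": "./"},
    }
    root = {
        "@id": "./",
        "@type": "Dataset",
        "hasPart": [{"@id": f["@id"]} for f in files],
    }
    # "File" is the RO-Crate context alias for http://schema.org/MediaObject.
    entities = [{"@type": "File", **f} for f in files]
    return {"@context": CONTEXT, "@graph": [descriptor, root, *entities]}


def write_crate(directory, files):
    """Write the metadata file into `directory` and return its path."""
    path = Path(directory) / METADATA_FILENAME
    path.write_text(json.dumps(build_metadata(files), indent=2), encoding="utf-8")
    return path


# Usage: write into a temporary directory and read the result back.
with tempfile.TemporaryDirectory() as tmp:
    written = write_crate(
        tmp, [{"@id": "cp7glop.ai", "name": "Diagram showing trend to increase"}]
    )
    loaded = json.loads(written.read_text(encoding="utf-8"))
```

Comparing such a hand-written file with whatever a LinkML conversion produces may help spot where the schema mapping diverges.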

Author

glyg commented May 25, 2023

Thanks a lot for the feedback, @stain. I'll fix those (the underscores are a typo on my part). I'll try to make some progress and come back here.

Author

glyg commented Dec 7, 2023

Hi, sorry for the stale issue. I posted a brief review of my attempt on forum.image.sc.

It's OK with me to close this if you feel it clutters your repo :). I still feel something should be done, but I have no clue how...


2 participants