
FAIRer rose, clarifying the semantics of data matrices version 2.0

@proccaserra proccaserra released this 03 Dec 15:46

FAIRer rose, clarifying the semantics of data matrices.

A FAIRification project: Rose scent metabolite profiles - Nature Genetics, June 2018 and Science, July 2015

The FAIRification process follows a principled approach rooted in the design of experiments. It uses the Statistics Ontology (STATO) to specify the semantics of the data matrices by identifying the independent and dependent variables, as well as the quantitation types (sample mean and standard error) held in the original documents.
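
As a rough illustration of what annotating a data matrix this way can look like, the sketch below builds a minimal Tabular Data Package descriptor in plain Python and tags each column with its role in the experimental design. The column names, the `role` key, and the `http://example.org/...` IRIs are all assumptions made for this example; they are not taken from the released package, and `role` is a custom property rather than part of the Table Schema specification (`rdfType` is).

```python
import json

# Hypothetical descriptor: column names, the custom "role" key, and the
# example.org IRIs are placeholders, not values from the released package.
descriptor = {
    "profile": "tabular-data-package",
    "name": "rose-scent-example",
    "resources": [
        {
            "name": "metabolite-profiles",
            "path": "data.csv",
            "profile": "tabular-data-resource",
            "schema": {
                "fields": [
                    {"name": "cultivar", "type": "string", "role": "independent"},
                    {
                        "name": "compound",
                        "type": "string",
                        "role": "independent",
                        "rdfType": "http://example.org/chemical-entity",
                    },
                    {
                        "name": "sample_mean",
                        "type": "number",
                        "role": "dependent",
                        "rdfType": "http://example.org/sample-mean",
                    },
                    {
                        "name": "sem",
                        "type": "number",
                        "role": "dependent",
                        "rdfType": "http://example.org/standard-error",
                    },
                ]
            },
        }
    ],
}


def fields_by_role(package, role):
    """Return the names of all fields carrying the given (custom) role."""
    return [
        field["name"]
        for resource in package["resources"]
        for field in resource["schema"]["fields"]
        if field.get("role") == role
    ]


independent = fields_by_role(descriptor, "independent")
dependent = fields_by_role(descriptor, "dependent")
print(json.dumps({"independent": independent, "dependent": dependent}, indent=2))
```

In the actual project the semantic types would point at ontology terms (for example from STATO or ChEBI) rather than placeholder IRIs.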

Both datasets were used to showcase how data can be compared efficiently once the data matrices have been made FAIR.

This Data Science project is available on GitHub with all necessary information, code and Jupyter notebooks, organized according to a Cookiecutter Data Science template.

This release is related to the following documents:

  1. Original Excel Table:
    Available as supplementary material and now made available via Zenodo.
  2. Frictionless Tabular Data Package:
    Resulting from the transformation of the Excel document to a Tabular Data Package, available via Zenodo.
  3. RDF Linked Data graph:
    New version, which fixes an issue related to the use of literals (strings) instead of IRIs and adds prefixes for the relations used as predicates (we thank @JervenBolleman for reporting the issue). Resulting from the conversion of the Frictionless Tabular Data Package to a semantic model using OBO Foundry resources such as STATO, ChEBI and the Plant Ontology, as well as the NCBI Organismal Taxonomy, available via Zenodo.
  4. Dataset comparison as Frictionless Data Package:
    Metabolites measured in two distinct experiments, published in Science (2015) and Nature Genetics (2018), made available as a Tabular Data Package via Zenodo.
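
The literal-versus-IRI issue mentioned for the RDF graph can be illustrated with a small sketch. It serializes the same statement twice in N-Triples syntax, once with the object written as a string literal and once as an IRI. All IRIs below are `example.org` placeholders invented for this illustration and the helper functions are hypothetical; this is not the conversion code used in the project.

```python
def iri(value):
    """Serialize a value as an N-Triples IRI term."""
    return f"<{value}>"


def literal(value):
    """Serialize a value as an N-Triples plain string literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def triple(subject, predicate, obj):
    """Join three already-serialized terms into one N-Triples statement."""
    return f"{subject} {predicate} {obj} ."


# Placeholder identifiers, assumed for the example only.
subject = iri("http://example.org/measurement/1")
predicate = iri("http://example.org/vocab/aboutCompound")
compound = "http://example.org/compound/1"

# As a literal, the object is just a string that happens to look like a URL;
# it cannot be followed or joined against other statements about the compound.
before = triple(subject, predicate, literal(compound))

# As an IRI, the object is a node in the graph that other triples can share.
after = triple(subject, predicate, iri(compound))

print(before)
print(after)
```

Only the second form lets a query engine treat the compound as a resource that links this graph to other datasets describing the same entity.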