Guoqian Jiang edited this page Apr 8, 2020 · 13 revisions

SPARQL Query Examples

COVID-19 PICO Ontology

Analysis of the CORD-19 dataset

CORD-19 dataset content

  • Commercial use subset (includes PMC content) -- 9000 papers, 186 MB
  • Non-commercial use subset (includes PMC content) -- 1973 papers, 36 MB
  • PMC custom license subset -- 1426 papers, 19 MB
  • bioRxiv/medRxiv subset (pre-prints that are not peer reviewed) -- 803 papers, 13 MB
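
To get a rough sense of how much material the pipeline has to process, the subset figures above can be tallied. The following is a minimal sketch, not part of the original analysis; it assumes the listed sizes are megabytes, and the names `SUBSETS` and `corpus_totals` are hypothetical.

```python
# Subset figures copied from the list above: (label, papers, size in MB).
# Assumption: the sizes on the page are megabytes.
SUBSETS = [
    ("Commercial use", 9000, 186),
    ("Non-commercial use", 1973, 36),
    ("PMC custom license", 1426, 19),
    ("bioRxiv/medRxiv", 803, 13),
]


def corpus_totals(subsets):
    """Return (total papers, total size in MB) across all subsets."""
    papers = sum(count for _, count, _ in subsets)
    size_mb = sum(size for _, _, size in subsets)
    return papers, size_mb


total_papers, total_mb = corpus_totals(SUBSETS)
print(f"{total_papers} papers, {total_mb} MB")
```

Under these figures the four subsets add up to 13202 papers and about 254 MB of input.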

Proposed approach:

  1. Run the documents through the NLP2FHIR pipeline, producing FHIR R4 resource descriptions.
  2. Convert the FHIR R4 resources to RDF using the FHIR to RDF converter.
  3. Load the resulting RDF into a SPARQL endpoint (target host at the moment: