Home

Guoqian Jiang edited this page Apr 8, 2020 · 13 revisions

SPARQL Query Examples

COVID-19 PICO Ontology

CPICO Documentation

Analysis of the CORD-19 dataset

CORD-19 dataset content

Commercial use subset (includes PMC content) -- 9000 papers, 186 MB
- sample
Non-commercial use subset (includes PMC content) -- 1973 papers, 36 MB
- sample
PMC custom license subset -- 1426 papers, 19 MB
- sample
bioRxiv/medRxiv subset (pre-prints that are not peer reviewed) -- 803 papers, 13 MB
- sample

Proposed approach:

Run the documents through the NLP2FHIR pipeline, producing FHIR R4 resource descriptions.
Convert the FHIR R4 resources to RDF using the FHIR to RDF converter.
Load the resulting RDF into a SPARQL endpoint (target host at the moment: https://graph.fhircat.org/graphdb).
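
The load and query steps could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: it assumes the target host exposes the RDF4J-style REST layout (`/repositories/{id}` for queries, `/repositories/{id}/statements` for adding data), the repository id `cord19` is a made-up placeholder, and the sample triple is illustrative rather than real NLP2FHIR output. The functions only build the HTTP requests; nothing is sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Host taken from the proposed approach; the repository id is hypothetical.
GRAPHDB_BASE = "https://graph.fhircat.org/graphdb"
REPOSITORY_ID = "cord19"  # assumption: the real repository name is not stated


def build_load_request(turtle_text, base=GRAPHDB_BASE, repo=REPOSITORY_ID):
    """Build (but do not send) a POST that adds Turtle triples to a repository.

    Assumes an RDF4J-style endpoint: POST {base}/repositories/{repo}/statements
    """
    url = f"{base.rstrip('/')}/repositories/{repo}/statements"
    return Request(
        url,
        data=turtle_text.encode("utf-8"),
        headers={"Content-Type": "text/turtle"},
        method="POST",
    )


def build_query_request(sparql, base=GRAPHDB_BASE, repo=REPOSITORY_ID):
    """Build (but do not send) a SPARQL 1.1 Protocol query via form-encoded POST."""
    url = f"{base.rstrip('/')}/repositories/{repo}"
    return Request(
        url,
        data=urlencode({"query": sparql}).encode("ascii"),
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


# Illustrative triple only; real input would come from the FHIR to RDF converter.
SAMPLE_TURTLE = (
    "<http://example.org/Condition/1> a <http://hl7.org/fhir/Condition> .\n"
)
SAMPLE_QUERY = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }"

load_req = build_load_request(SAMPLE_TURTLE)
query_req = build_query_request(SAMPLE_QUERY)
print(load_req.get_method(), load_req.full_url)
print(query_req.get_method(), query_req.full_url)
```

To actually submit a request, it could be passed to `urllib.request.urlopen`, most likely with credentials added, since write access to the endpoint is presumably restricted.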