
This is a follow-up effort from work done within Waagmeester et al 2020 eLife and Mayers et al 2022 Bioinformatics


SuLab/Wikidata_Biomedical-Subgraph


Wikidata Biomedical Subgraph

Contributors and Acknowledgements (alphabetical)

This research is funded under the currently tabled Gene Wiki Project.
Andra Waagmeester (@andrawaag), Andrew I Su (@andrewsu), Carolina Gonzalez-Cavazos (@Carolina1396), Jose Emilio Labra Gayo (@labra), Kat Thornton (@emulatingkat), Lynn Schriml (@lschriml), Michael D Mayers (@mmayers), Sabah Ul-Hasan (@sabahzero), Sai Siddhartha (@saisiddu), Seyed Amir Hosseini Beghaeiraveri (@seyedahbr), Tyler Bettilyon (@tebba-von-mathenstein)

Overview

This code is the current approach for accessing and using the Wikidata biomedical subgraph in downstream analyses, such as identifying repurposable drug candidates. The commit history of older versions of this pipeline can be found here as WRP; see the repository's 'Issues' section for potentially relevant task items.

Examples of previous applications:

Pipeline

This subgraph is retrieved from the Wikidata archive of January 3rd, 2022, using the Wikibase Dump Filter (WDF) JSON dump tool. Parallel efforts that include RDF dump approaches can be found here from Biohackathon 2021 and Biohackathon 2022.
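As a rough illustration of what filtering a JSON dump by type involves, here is a minimal Python sketch. It is not WDF itself and not this repository's code. It assumes the standard Wikidata JSON dump layout (one entity per line inside a JSON array) and filters on P31 (instance of). The function names, the toy entity IDs, and the choice of Q12136 (disease) and Q7187 (gene) as wanted types are assumptions made for the example.

```python
import json
from typing import Iterable, Iterator, Set


def iter_dump_entities(lines: Iterable[str]) -> Iterator[dict]:
    """Yield entities from a Wikidata-style JSON dump, one entity per line."""
    for line in lines:
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue  # skip the array brackets and blank lines
        yield json.loads(line)


def instance_of_ids(entity: dict) -> Set[str]:
    """Collect the item IDs used as values of P31 (instance of)."""
    ids = set()
    for claim in entity.get("claims", {}).get("P31", []):
        snak = claim.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue  # "novalue" / "somevalue" snaks carry no datavalue
        ids.add(snak["datavalue"]["value"]["id"])
    return ids


def filter_by_type(lines: Iterable[str], wanted_types: Set[str]) -> Iterator[dict]:
    """Keep only entities that are an instance of at least one wanted type."""
    for entity in iter_dump_entities(lines):
        if instance_of_ids(entity) & wanted_types:
            yield entity


def _toy_entity(entity_id: str, type_id: str) -> str:
    """Build one dump line for a made-up entity (illustration only)."""
    claim = {"mainsnak": {"snaktype": "value",
                          "datavalue": {"value": {"id": type_id}}}}
    return json.dumps({"id": entity_id, "claims": {"P31": [claim]}}) + ","


# Toy stand-in for a dump file; the entity IDs are placeholders, not real items.
SAMPLE_DUMP = [
    "[",
    _toy_entity("Q_TOY_DISEASE", "Q12136"),  # instance of: disease
    _toy_entity("Q_TOY_PERSON", "Q5"),       # instance of: human
    "]",
]

kept = list(filter_by_type(SAMPLE_DUMP, {"Q12136", "Q7187"}))
print([entity["id"] for entity in kept])
```

For a real dump, the list of lines would be replaced by an open file handle (for example from `gzip.open(path, "rt")`), so the dump is streamed rather than loaded into memory.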

Raw files, from the January 3rd, 2022 .json dump through the .csv output, can be found in the avalanche HPC folder sulhasan/Wikidata_Biomedical-Subgraph. This folder sits alongside code forked from the WD-rephetio-analysis GitHub repository.

There are 18 node types and 41 edge types in this subgraph. Whether these categories are still relevant is open for discussion the next time the subgraph is used.
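One way to check such counts against the .csv output is to tally the distinct values of the type column in the node and edge files. The sketch below is a minimal example under assumptions: the column names (`label` for nodes, `type` for edges) and the toy rows are hypothetical and may not match the actual files.

```python
import csv
import io
from collections import Counter


def count_types(csv_file, column: str) -> Counter:
    """Count how many rows carry each distinct value in the given column."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


# Toy stand-ins for the node and edge files; column names are assumptions.
NODES_CSV = io.StringIO(
    "id,name,label\n"
    "Q_A,toy disease,Disease\n"
    "Q_B,toy gene,Gene\n"
    "Q_C,another toy gene,Gene\n"
)
EDGES_CSV = io.StringIO(
    "start_id,end_id,type\n"
    "Q_B,Q_A,associated_with\n"
    "Q_C,Q_A,associated_with\n"
)

node_counts = count_types(NODES_CSV, "label")
edge_counts = count_types(EDGES_CSV, "type")
print(len(node_counts), "node types;", len(edge_counts), "edge types")
```

Run against the real files (opened with `open(path, newline="")`), the two lengths would be expected to match the node and edge type counts stated above.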

Reproducibility

Relevant code is available here. All other code in this repository serves as a point of reference that may be useful downstream or for producing output more efficiently.


License CC0
