Add SBGN-ML importer #439


@cthoyt cthoyt commented May 16, 2020

Closes #423

This PR introduces the pybel.io.sbgnml I/O module, which works in two steps: it first parses SBGN-ML into a simple JSON-like data structure (which is generally reusable on its own), then reasons over that structure and converts it to BEL.

  • SBGN parser
  • BEL converter
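As a rough illustration of the first step (SBGN-ML to a JSON-like structure), here is a minimal sketch using only the standard library. It is not the code in this PR: the function name `sketch_parse`, the output keys, and the tiny example document are assumptions made up for illustration. It relies only on the `glyph`, `label`, and `arc` elements and their `id`, `class`, `text`, `source`, and `target` attributes from SBGN-ML, and it flattens nested glyphs rather than preserving hierarchy.

```python
import xml.etree.ElementTree as ET

# A made-up, minimal SBGN-ML document (labels "A" and "B" are placeholders).
EXAMPLE = """
<sbgn xmlns="http://sbgn.org/libsbgn/0.2">
  <map language="process description">
    <glyph id="g1" class="macromolecule"><label text="A"/></glyph>
    <glyph id="p1" class="process"/>
    <glyph id="g2" class="macromolecule"><label text="B"/></glyph>
    <arc id="a1" class="consumption" source="g1" target="p1"/>
    <arc id="a2" class="production" source="p1" target="g2"/>
  </map>
</sbgn>
"""


def _local(tag):
    """Strip the '{namespace}' prefix so the sketch works across libSBGN versions."""
    return tag.rsplit('}', 1)[-1]


def sketch_parse(xml_text):
    """Hypothetical helper: collect glyphs and arcs into plain dicts and lists."""
    root = ET.fromstring(xml_text)
    glyphs, arcs = [], []
    for element in root.iter():
        name = _local(element.tag)
        if name == 'glyph':
            # The label, when present, is a direct child carrying a 'text' attribute.
            label = next((c for c in element if _local(c.tag) == 'label'), None)
            glyphs.append({
                'id': element.get('id'),
                'class': element.get('class'),
                'label': label.get('text') if label is not None else None,
            })
        elif name == 'arc':
            arcs.append({
                'id': element.get('id'),
                'class': element.get('class'),
                'source': element.get('source'),
                'target': element.get('target'),
            })
    return {'glyphs': glyphs, 'arcs': arcs}


result = sketch_parse(EXAMPLE)
```

The result contains only dicts, lists, strings, and `None`, so it can be passed straight to `json.dump`, which is what makes an intermediate structure like this reusable outside of BEL.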

The code also has some grounding mechanisms built in to deal with the identifiers that are often missing from the COVID-Pathways content, but I might remove these to make the tool more generally reusable. I'm not really sure who else generates SBGN-ML if not from CellDesigner (and more specifically, coming out of MINERVA).
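To make the idea of grounding concrete, here is a minimal sketch of one possible fallback strategy: use an identifier if the glyph has one, otherwise look the normalized label up in a table. Everything here is hypothetical (the `ground` function, the `identifier` key, and the `example:` CURIEs are placeholders) and does not reflect how this PR implements grounding.

```python
def ground(glyph, label_lookup):
    """Hypothetical helper: return an identifier for a glyph dict, or None."""
    # Prefer an identifier that came with the source content.
    identifier = glyph.get('identifier')
    if identifier:
        return identifier
    # Otherwise fall back to a case-insensitive lookup on the label.
    label = (glyph.get('label') or '').strip().lower()
    return label_lookup.get(label)


# Placeholder lookup table; the CURIEs are invented for this example.
lookup = {'a': 'example:0001', 'b': 'example:0002'}

grounded = ground({'label': ' A '}, lookup)
kept = ground({'label': 'A', 'identifier': 'example:9999'}, lookup)
missing = ground({'label': 'unknown thing'}, lookup)
```

Keeping the lookup table as an argument rather than a module-level constant would make it easier to drop or swap the COVID-specific grounding later.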

DEMO NOTEBOOK

Example SBGN-ML Content

For an example, see the content posted by @cannin at https://cannin.github.io/covid19-sbgn. Use the SBGN-ML files there (disregard the SIF and SIF SVG, since those are derived resources).

He notes on his site that these SBGN-ML files were converted from CellDesigner using https://github.com/sbgn/cd2sbgnml. It's clear that some information was lost in the cd2sbgnml conversion (or there are strange pockets of incredibly low-quality curation), so a future step will be to write a CellDesigner importer (see #440).

For example, after installing this branch in development mode, the parser can be run with the following code in the Python REPL:

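import json
from urllib.request import urlretrieve
from pybel.io.sbgnml.sbgnml import parse

# Download one of the example SBGN-ML files to the working directory
url = 'https://cannin.github.io/covid19-sbgn/ER_Stress_Cov19.xml.sbgn'
path = 'ER_Stress_Cov19.xml'
urlretrieve(url, path)

# Parse the SBGN-ML into the intermediate JSON-like structure
rv = parse(path)

# Write the intermediate structure to disk for inspection
with open('ER_Stress_Cov19.json', 'w') as file:
    json.dump(rv, file, indent=2)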


codecov bot commented May 29, 2020

Codecov Report

Merging #439 into master will decrease coverage by 2.51%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master     #439      +/-   ##
==========================================
- Coverage   79.10%   76.59%   -2.52%     
==========================================
  Files         171      176       +5     
  Lines        9039     8962      -77     
  Branches     1307     1333      +26     
==========================================
- Hits         7150     6864     -286     
- Misses       1608     1832     +224     
+ Partials      281      266      -15     
Impacted Files | Coverage Δ
src/pybel/io/sbgnml/__init__.py | 0.00% <0.00%> (ø)
src/pybel/io/sbgnml/constants.py | 0.00% <0.00%> (ø)
src/pybel/io/sbgnml/convert.py | 0.00% <0.00%> (ø)
src/pybel/io/sbgnml/parse.py | 0.00% <0.00%> (ø)
src/pybel/io/sbgnml/utils.py | 0.00% <0.00%> (ø)
src/pybel/tokens.py | 88.40% <0.00%> (-2.64%) ⬇️
src/pybel/struct/mutation/expansion/upstream.py | 88.88% <0.00%> (-2.03%) ⬇️
src/pybel/utils.py | 87.05% <0.00%> (-1.84%) ⬇️
src/pybel/struct/mutation/induction/utils.py | 80.00% <0.00%> (-1.82%) ⬇️
...el/struct/mutation/deletion/protein_rna_origins.py | 82.75% <0.00%> (-1.62%) ⬇️
... and 79 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4ab0a6a...13436d1.

Successfully merging this pull request may close these issues.

SBGN Importer