Recently I wanted to update one of our slides that depends on counting the number of protein-coding genes, and was surprised to see that we don't have the `type` field populated for human genes.
One approach would be to use an explicit SO term from a different HGNC source (like Alliance). I don't think we have a reasonable way to capture provenance at the field level, but we could definitely use a koza map to make the Alliance HGNC gene record available alongside the one that comes directly from HGNC, and look up the SO term within the Alliance doc.
The other approach would be to map from hgnc_complete_set.txt. That's pyobo's approach:
https://github.com/biopragmatics/pyobo/blob/dc7b4736f2bbf943084e8f8a95e1293c2717c566/src/pyobo/sources/hgnc.py#L110-L145
We could even consider importing this, though that might depend on how heavy the dependency ends up being.
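If the dependency turns out to be too heavy, a rough sketch of the second approach without pyobo could look like the following. It assumes `hgnc_complete_set.txt` is tab-delimited with `hgnc_id` and `locus_type` columns. The mapping dict is deliberately tiny and illustrative, not a copy of pyobo's table, and the names `LOCUS_TYPE_TO_SO`, `DEFAULT_SO_TERM`, and `so_terms_by_hgnc_id` are made up for this example. Falling back to the generic `gene` term is also just an assumption about what we'd want for unmapped locus types.

```python
import csv
import io
from typing import Dict, Iterable

# Illustrative subset only: a real mapping would need an entry for every
# locus_type value that appears in the file.
LOCUS_TYPE_TO_SO = {
    "gene with protein product": "SO:0001217",  # protein_coding_gene
    "pseudogene": "SO:0000336",  # pseudogene
}

# Assumed fallback for locus types we have not mapped yet.
DEFAULT_SO_TERM = "SO:0000704"  # gene


def so_terms_by_hgnc_id(lines: Iterable[str]) -> Dict[str, str]:
    """Map each hgnc_id to an SO term ID based on its locus_type column."""
    reader = csv.DictReader(lines, delimiter="\t")
    return {
        row["hgnc_id"]: LOCUS_TYPE_TO_SO.get(row["locus_type"], DEFAULT_SO_TERM)
        for row in reader
    }


# Small in-memory stand-in for the real file, with only the columns used here.
sample = (
    "hgnc_id\tsymbol\tlocus_type\n"
    "HGNC:5\tA1BG\tgene with protein product\n"
    "HGNC:37133\tA1BG-AS1\tRNA, long non-coding\n"
)
so_terms = so_terms_by_hgnc_id(io.StringIO(sample))
```

With the real file, the same function could be called on an open file handle, and the resulting dict used to fill in `type` during the gene ingest.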