Add SO type to HGNC genes #649

Open
kevinschaper opened this issue Mar 21, 2024 · 0 comments

Recently I wanted to update one of our slides that depends on counting the number of protein-coding genes, and was surprised to see that we don't have the type field populated for human genes.

One approach would be to use an explicit SO term from a different HGNC source (like Alliance). I don't think we have a reasonable way to capture provenance at the field level, but we could definitely do a koza map to make the Alliance HGNC gene record available alongside the one that comes directly from HGNC, and look up the SO term within the Alliance doc.

The other approach would be to map from hgnc_complete_set.txt. That's pyobo's approach:

https://github.com/biopragmatics/pyobo/blob/dc7b4736f2bbf943084e8f8a95e1293c2717c566/src/pyobo/sources/hgnc.py#L110-L145

We could even consider importing this, though that might depend on how heavy the dependency ends up being.
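
For reference, a minimal sketch of the hgnc_complete_set.txt approach without pulling in pyobo, assuming the file is tab-separated with `hgnc_id` and `locus_type` columns. The mapping below is only an illustrative subset and is not copied from pyobo; the names `LOCUS_TYPE_TO_SO`, `so_type_for`, and `read_gene_types` are hypothetical, the sample rows are stand-ins, and the SO IDs should be double-checked against the Sequence Ontology before use.

```python
import csv
import io

# Illustrative subset only (assumption): verify each ID against the
# Sequence Ontology and extend to cover all HGNC locus types.
LOCUS_TYPE_TO_SO = {
    "gene with protein product": "SO:0001217",  # protein_coding_gene
    "pseudogene": "SO:0000336",                 # pseudogene
    "RNA, long non-coding": "SO:0002127",       # lncRNA_gene
    "RNA, micro": "SO:0001265",                 # miRNA_gene
}
FALLBACK_SO = "SO:0000704"  # generic "gene" term for unmapped locus types


def so_type_for(locus_type):
    """Return the SO term for an HGNC locus_type, or the generic gene term."""
    return LOCUS_TYPE_TO_SO.get(locus_type.strip(), FALLBACK_SO)


def read_gene_types(handle):
    """Map hgnc_id -> SO term from a tab-separated HGNC-style file handle."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["hgnc_id"]: so_type_for(row["locus_type"]) for row in reader}


# Stand-in rows shaped like hgnc_complete_set.txt (most columns omitted).
SAMPLE = (
    "hgnc_id\tsymbol\tlocus_group\tlocus_type\n"
    "HGNC:5\tA1BG\tprotein-coding gene\tgene with protein product\n"
    "HGNC:37133\tA1BG-AS1\tnon-coding RNA\tRNA, long non-coding\n"
    "HGNC:0000\tEXAMPLE\tother\tunknown\n"
)

gene_types = read_gene_types(io.StringIO(SAMPLE))
protein_coding_count = sum(1 for t in gene_types.values() if t == "SO:0001217")
```

Falling back to the generic gene term keeps every record typed even when a locus type is missing from the map, which would make gaps in the mapping easy to spot by counting `SO:0000704` entries.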

kevinschaper added this to the 2024-04 Release milestone Mar 21, 2024
kevinschaper self-assigned this Apr 2, 2024
kevinschaper removed this from the 2024-06 Release milestone May 20, 2024