Conversion of seqlevel styles #7

Open
lauradmartens opened this issue Apr 26, 2023 · 2 comments
Labels
enhancement New feature or request P2🏝 Low priority

Comments

Collaborator

lauradmartens commented Apr 26, 2023

Description of feature

Add functionality that allows translation between different chromosome sequence naming conventions (e.g., "chr1" versus "1").

This could be similar to the seqlevelsStyle function in the R package GenomeInfoDb:

seqlevelsStyle(gr_obj) <- "UCSC"
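
For illustration, a rough Python sketch of what such a conversion could look like for the primary chromosomes only. The function name `convert_seqlevels_style` and the style labels are hypothetical and not part of any existing API; the sketch assumes the only differences between styles are the "chr" prefix and the mitochondrial name ("chrM" versus "MT"), which does not hold for unplaced or alternate scaffolds.

```python
def _to_ucsc(name: str) -> str:
    """Convert a primary-chromosome name to UCSC style (e.g. "1" -> "chr1")."""
    if name.startswith("chr"):
        return name  # already UCSC style
    if name == "MT":
        return "chrM"  # the mitochondrial genome is named differently
    return "chr" + name


def _to_ncbi(name: str) -> str:
    """Convert a primary-chromosome name to NCBI style (e.g. "chr1" -> "1")."""
    if name == "chrM":
        return "MT"
    if name.startswith("chr"):
        return name[len("chr"):]
    return name  # already NCBI style


# Hypothetical style registry; a real implementation would need
# per-assembly tables to handle scaffolds correctly.
_CONVERTERS = {"UCSC": _to_ucsc, "NCBI": _to_ncbi}


def convert_seqlevels_style(names, style):
    """Return a new list of sequence names converted to the given style."""
    try:
        convert = _CONVERTERS[style]
    except KeyError:
        raise ValueError(f"Unknown style {style!r}; expected one of {sorted(_CONVERTERS)}")
    return [convert(name) for name in names]


ucsc_names = convert_seqlevels_style(["1", "X", "MT"], "UCSC")
ncbi_names = convert_seqlevels_style(["chr1", "chrX", "chrM"], "NCBI")
```

A prefix rule like this is easy to reason about but breaks down quickly once non-primary sequences are involved.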

@lauradmartens lauradmartens added the enhancement New feature or request label Apr 26, 2023
@ivirshup ivirshup added the P2🏝 Low priority label Apr 26, 2023
Collaborator

nvictus commented Apr 3, 2024

In bioframe, we started doing this by providing an alias dictionary that maps all variants (including GenBank IDs) to a single canonical name. Keeping track of naming "styles" for each provider and each species gets unwieldy, especially when ancillary scaffolds (unlocalized, unplaced, alt) are considered.

https://bioframe.readthedocs.io/en/latest/guide-io.html#curated-genome-assembly-build-information
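
As a minimal sketch of the alias-dictionary idea (not bioframe's actual implementation or API), the mapping can be inverted into a flat lookup from any known variant to one canonical name. The table contents and the helper names `build_alias_lookup` and `canonicalize` are hypothetical; the accession entries are illustrative and should be checked against the assembly report before use.

```python
# Toy alias table: canonical name -> known variants.
# Entries are illustrative only; a curated table would come from
# assembly metadata rather than being hand-written.
CANONICAL_ALIASES = {
    "chr1": ["1", "CM000663.2", "NC_000001.11"],
    "chrX": ["X"],
    "chrM": ["MT", "M"],
}


def build_alias_lookup(canonical_aliases):
    """Invert {canonical: [aliases]} into {any_name: canonical}."""
    lookup = {}
    for canonical, aliases in canonical_aliases.items():
        # The canonical name maps to itself so mixed input works.
        for name in (canonical, *aliases):
            if name in lookup and lookup[name] != canonical:
                raise ValueError(f"Alias {name!r} maps to more than one canonical name")
            lookup[name] = canonical
    return lookup


def canonicalize(names, lookup):
    """Map each name to its canonical form, failing loudly on unknown names."""
    unknown = [name for name in names if name not in lookup]
    if unknown:
        raise KeyError(f"No canonical name known for: {unknown}")
    return [lookup[name] for name in names]


ALIAS_LOOKUP = build_alias_lookup(CANONICAL_ALIASES)
canonical_names = canonicalize(["1", "MT", "chrX", "NC_000001.11"], ALIAS_LOOKUP)
```

Compared with per-provider "styles", this needs only one table per assembly, and unknown names surface as errors instead of being silently rewritten.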

Member

ivirshup commented Apr 5, 2024

@nvictus, you investigated this a bunch during the hackathon. It sounded like we ended up at:

GenomeInfoDb probably has the info we want, but doesn't really make it accessible

Right?

What did GenomeInfoDb provide that bioframe doesn't? I would imagine you've covered some of the most common cases already.

ensembldb lets the user set the seqlevels style like this: seqlevelsStyle(edb) <- "UCSC". Maybe we could do something similar via bioframe's assembly info?

EnsemblDB(connection, seq_style=bioframe.assembly_info(...))
