Output contig locations bed file #56

Open
soungalo opened this issue Mar 31, 2020 · 6 comments
Labels
enhancement New feature or request

Comments

soungalo commented Mar 31, 2020

This is a feature request. As of v1.1, the order of the contigs in each pseudomolecule is given in a separate TSV file. I think it would be more useful if, instead (or in addition), the software output a BED file containing the contig locations on the pseudomolecules (thus also accounting for gap padding). In fact, two BED files could be created: one using the coordinates of the generated pseudomolecules and the other using the coordinates of the reference pseudomolecules.

Meanwhile, can you suggest a way to extract the reference location information? It is not always trivial to extract it from the PAF file...
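
To make the requested output concrete, here is a minimal sketch of how contig coordinates on a generated pseudomolecule could be derived from an ordered contig list plus gap padding. It is not RaGOO code: the function name `contigs_to_bed`, the input tuple layout, and the `gap_size` parameter are all hypothetical, and it assumes a fixed-size gap between every pair of adjacent contigs. BED coordinates are 0-based and half-open.

```python
from typing import List, Tuple

# Hypothetical input: contigs in pseudomolecule order as (name, length, strand).
Contig = Tuple[str, int, str]
BedRow = Tuple[str, int, int, str, int, str]


def contigs_to_bed(chrom: str, contigs: List[Contig], gap_size: int) -> List[BedRow]:
    """Place contigs end to end on `chrom`, separated by `gap_size` bases of padding."""
    rows: List[BedRow] = []
    start = 0
    for i, (name, length, strand) in enumerate(contigs):
        if i > 0:
            start += gap_size  # assumed: one fixed-size gap before every contig but the first
        end = start + length
        rows.append((chrom, start, end, name, 0, strand))
        start = end
    return rows


if __name__ == "__main__":
    example = [("ctg_a", 100, "+"), ("ctg_b", 50, "-")]
    for row in contigs_to_bed("chr1_scaffold", example, gap_size=10):
        print("\t".join(map(str, row)))
```

The score column is set to 0 as a placeholder; it could carry a confidence value if one is available.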

@malonge malonge added the enhancement New feature or request label Apr 1, 2020
Owner

malonge commented Apr 1, 2020

Hi there,

You are right - that would be a better intermediate output format and that is the plan for v2 of RaGOO. Thanks for the suggestion.

In the meantime, you can use the script ragoo_utilities/get_contig_borders.py. It's rough, but hopefully it will serve you until v2 comes out.

Thank you

@malonge malonge closed this as completed Apr 1, 2020
@malonge malonge reopened this Apr 1, 2020
Author

soungalo commented Apr 1, 2020

Thanks. Can this script also extract mapping coordinates on the reference? If not, how can I extract them from the PAF? How should I treat cases of incomplete mapping and/or multiple mappings?

Owner

malonge commented Apr 1, 2020

Hi there,

Do you mean all of the mapping coordinates between a query sequence and the reference? Or just those alignments that informed scaffolding somehow?

Author

soungalo commented Apr 2, 2020

Is there a way to tell which alignments informed scaffolding?

Owner

malonge commented Apr 2, 2020

Unfortunately, there is no automatic way to obtain such alignments from RaGOO. I would encourage you to read the paper to see which alignments are used for which steps. There are certain steps that rely on the longest alignment between a contig and its assigned reference sequence, and those might be somewhat straightforward to pull out yourself.
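
As a rough illustration of pulling out the longest alignment per contig, the sketch below reads PAF lines and writes one reference-coordinate BED row per query. It is not part of RaGOO. It assumes the standard 12 mandatory PAF columns, and it approximates the "assigned reference sequence" as the target with the largest summed alignment span for that contig, which may differ from the rule RaGOO actually applies (check the paper). The function name `longest_alignments_to_bed` is hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

BedRow = Tuple[str, int, int, str, int, str]


def longest_alignments_to_bed(paf_lines: Iterable[str]) -> List[BedRow]:
    """Return one BED row (reference coordinates) per query contig."""
    by_query = defaultdict(list)
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        query, strand, target = fields[0], fields[4], fields[5]
        t_start, t_end = int(fields[7]), int(fields[8])  # 0-based, end-exclusive
        by_query[query].append((target, t_start, t_end, strand))

    rows: List[BedRow] = []
    for query, alns in by_query.items():
        # Assumption: the contig is "assigned" to the target it covers most.
        covered = defaultdict(int)
        for target, start, end, _ in alns:
            covered[target] += end - start
        assigned = max(covered, key=covered.get)
        # Keep only the single longest alignment to the assigned target.
        target, start, end, strand = max(
            (a for a in alns if a[0] == assigned), key=lambda a: a[2] - a[1]
        )
        rows.append((target, start, end, query, 0, strand))
    return sorted(rows)


if __name__ == "__main__":
    import sys

    for row in longest_alignments_to_bed(sys.stdin):
        print("\t".join(map(str, row)))
```

Because only one alignment is kept per contig, contigs with multiple or partial mappings are reduced to a single interval; the reported span covers just that alignment, not the whole contig.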

Author

soungalo commented Apr 2, 2020

Thanks - I'll give it a try.
It could be a nice addition to the output (as requested in my original post).
