SPAdes + RaGOO #67

Open
francicco opened this issue May 22, 2020 · 3 comments

@francicco

Hi,

I'm setting up a pipeline to assemble low-coverage short-read data using a reference assembly of a very closely related species (genome size: ~400 Mb). Without going into details, I have a step where I assemble the short-read dataset with SPAdes. The idea now is to use RaGOO to sort and assemble those short contigs with the reference. My first question is whether RaGOO is a good fit for this, and if so, how to run it.
I gave it a small test, running it like this:

ragoo.py -t 20 deNovo_Superblocks_200.fa Herd.Pilon.Heet.fasta

It finished quickly, but the result looked odd: the assembly size was very large (~1 Gb).

Any advice or suggestions?

Thanks a lot
Francesco

@malonge
Owner

malonge commented May 26, 2020

Hi there,

With respect to the assembly size, the RaGOO scaffolds should just be an ordering and orienting of the input contigs. What was the total assembly size for the contigs? In theory, if there are a lot of contigs (and therefore gaps) the gap size may add up. Maybe you can check to see what percentage of the scaffolds is gap sequence. And it would be helpful to know the general assembly stats for your input assembly.
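A minimal sketch of the check suggested above, assuming plain-text (uncompressed) FASTA files and assuming that every `N` or `n` character counts as gap sequence (which will also count any ambiguous bases inside the contigs themselves). The helper names `read_fasta`, `n50`, and `assembly_stats` are hypothetical and are not part of RaGOO.

```python
from typing import Dict, Iterator, List, Tuple


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from a plain-text FASTA file."""
    name = None
    chunks: List[str] = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                fields = line[1:].split()
                name = fields[0] if fields else ""
                chunks = []
            elif name is not None:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def n50(lengths: List[int]) -> int:
    """Length of the sequence at which the running total, taken from
    longest to shortest, first reaches half of the total length."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0


def assembly_stats(path: str) -> Dict[str, float]:
    """Summarize one FASTA file: sequence count, total size, N50, gap content."""
    lengths: List[int] = []
    gap_bases = 0
    for _, seq in read_fasta(path):
        lengths.append(len(seq))
        # Assumption: gaps are encoded as runs of N (upper or lower case).
        gap_bases += seq.upper().count("N")
    total = sum(lengths)
    return {
        "sequences": len(lengths),
        "total_bp": total,
        "n50": n50(lengths),
        "gap_bp": gap_bases,
        "gap_pct": 100.0 * gap_bases / total if total else 0.0,
    }
```

Running `assembly_stats("deNovo_Superblocks_200.fa")` on the input contigs and again on the scaffold FASTA written by RaGOO gives two summaries to compare. If `total_bp - gap_bp` for the scaffolds is close to `total_bp` for the contigs, the extra size is probably padding between many small contigs rather than duplicated sequence.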

One shortcoming of RaGOO v1 is that it was really designed for contiguous long read assemblies. RaGOO v2, which will be beta-released in the next few weeks, should be much better suited for fragmented short-read assemblies. When it is released, I will be sure to notify you here and perhaps that will serve you better.

@francicco
Author

Hi @malonge,

I see your point. I think I also need to make sure that those fragments (contigs) are not still overlapping, which I think is the case. So far I have used Nucmer + Amos, but there are too many fragments, so it takes too long, and sometimes Amos just can't handle it.
Any suggestions?

Thanks, and I'll try RaGOO v2 for sure!
F

@malonge
Owner

malonge commented May 26, 2020

As a general rule, I would do what the VGP is doing (github). I believe they use a tool called purge_dups.

That said, these methods may not work as well on fragmented short-read assemblies. But perhaps they can point you in the right direction.

Thanks
