Circlator seems to run forever #173

Open
tnn111 opened this issue Feb 4, 2021 · 2 comments

Comments

@tnn111
tnn111 commented Feb 4, 2021

In a number of cases, circlator appears to run forever. It's spades-bwa that's doing it. Has anyone else noticed this? Is that a reason to run SPAdes 3.7?

I tried switching to using canu as an assembler, but it fails because it can't find a GFA file towards the end. Circlator is great as a tool, but it's difficult to get things to work at times.

@schmittel
Same here. I ran Circlator on a PacBio assembly and it took about 2 hours to complete. I ran it again on another sample (same genome, same size and depth) and it's still running after 36 hours. It is also stuck on spades-bwa.

@lauralwd
lauralwd commented Jul 30, 2021

I'm running into this too. In my case spades-bwa is allocated multiple threads but uses only one. I'm running SPAdes 3.13.0 (I couldn't get 3.7.1 to run in the same environment as Circlator). What version of SPAdes are you running?

Edit: not using SPAdes 3.7.1 should not be an issue; see more info in this thread: #72

Some extra details.

Long story short, this is likely a SPAdes issue rather than a Circlator issue.

Circlator runs several assemblies, one for each of several k-mer values. In my case most assemblies take just a couple of minutes, but then a certain one gets stuck, seemingly forever. Looking more closely at what spades-bwa is doing, I find that a proper SAM file is created; you can find it at 03.assemble*/tmp/corrector*/*/*.sam
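To tell a hung spades-bwa from one that is merely slow, it may help to check whether the SAM file is still being written. Below is a minimal Python sketch, not part of Circlator or SPAdes, that globs for the SAM files using the path pattern above and reports their size and the time since their last modification. The function name `sam_status` and the command-line handling are hypothetical, and the output directory is whatever you passed to Circlator.

```python
import glob
import os
import sys
import time


def sam_status(outdir, now=None):
    """Return (path, size_bytes, seconds_since_last_write) per SAM file.

    Uses the pattern 03.assemble*/tmp/corrector*/*/*.sam relative to the
    Circlator output directory, as described in this thread.
    """
    now = time.time() if now is None else now
    pattern = os.path.join(outdir, "03.assemble*", "tmp", "corrector*", "*", "*.sam")
    status = []
    for path in sorted(glob.glob(pattern)):
        st = os.stat(path)
        status.append((path, st.st_size, now - st.st_mtime))
    return status


if __name__ == "__main__":
    # Usage: python sam_status.py <circlator_output_dir>
    for path, size, idle in sam_status(sys.argv[1]):
        print(f"{path}\t{size} bytes\tlast written {idle:.0f} s ago")
```

A file whose size has not changed and whose last write was hours ago, while spades-bwa is still running, would be consistent with the hang described here. It does not prove it, so treat the result as a hint only.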

The last two lines of my SPAdes log file look like this:

[main] Real time: 0.924 sec; CPU: 0.393 sec
  0:00:00.998     4M / 4M    INFO   DatasetProcessor         (dataset_processor.cpp     : 173)   Running bwa mem ...:/home/laura/miniconda3/envs/ciclator/share/spades-3.12.0-2/bin/spades-bwa mem  -v 1 -t 6 /stor/azolla_mitochondrium/assembly/laura/circlator/subset2-dedup1_circlator_v1/03.assemble.tmp.spades.97.r3mz5_pn/misc/assembled_contigs.fasta /stor/azolla_mitochondrium/assembly/laura/circlator/subset2-dedup1_circlator_v1/02.bam2reads.fasta  > /stor/azolla_mitochondrium/assembly/laura/circlator/subset2-dedup1_circlator_v1/03.assemble.tmp.spades.97.r3mz5_pn/tmp/corrector_7cxabnsk/lib0_BOqZpn/tmp.sam

Waiting for this to finish seems pointless: the SAM file looks fine, but spades-bwa hangs for some reason. I'll try some different SPAdes versions to see if that resolves the issue.

  • spades=3.15.3=h95f258a_0 (the most recent version in conda) does not resolve the issue.

My ugly but pragmatic fix:

For me, spades-bwa only seems to get stuck on certain assemblies, not on others.
Remember that Circlator runs several assemblies at several k-mer values and then chooses the one with the highest N50.
You can avoid the k-mer values for which the assemblies are problematic by either:

  • using the first assembly Circlator completes: --assemble_spades_use_first
  • excluding the problematic assemblies by specifying your own values for k: --assemble_spades_k k1,k2,k3,...
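As an illustration of the two options above, here is a small Python sketch that builds a `circlator all` command line with the problematic k-mer values left out. The helper `build_circlator_cmd`, the file names, and the choice to skip k=97 are hypothetical. The `DEFAULT_KMERS` list is an assumption about Circlator's default and should be checked against `circlator all --help` for your version.

```python
import shlex

# Assumption: Circlator's default SPAdes k-mer list; verify with `circlator all --help`.
DEFAULT_KMERS = (127, 117, 107, 97, 87, 77)


def build_circlator_cmd(assembly, reads, outdir, skip_kmers=(),
                        use_first=False, kmers=DEFAULT_KMERS):
    """Build an argument list for `circlator all`, dropping unwanted k values."""
    skip = set(skip_kmers)
    kept = [k for k in kmers if k not in skip]
    if not kept:
        raise ValueError("all k-mer values were skipped")
    cmd = ["circlator", "all"]
    if len(kept) != len(kmers):
        # Only pass an explicit list when something was actually removed.
        cmd += ["--assemble_spades_k", ",".join(str(k) for k in kept)]
    if use_first:
        cmd.append("--assemble_spades_use_first")
    cmd += [assembly, reads, outdir]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names; k=97 is skipped purely as an example.
    cmd = build_circlator_cmd("assembly.fasta", "reads.fasta", "circlator_out",
                              skip_kmers=[97])
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Note that Circlator expects the output directory not to exist yet, so a previous partial run has to be moved or removed first.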
