Error - report not written and empty samples #262

Open
tristanpwdennis opened this issue Mar 12, 2020 · 1 comment

tristanpwdennis commented Mar 12, 2020

Hi,
We are running Nullarbor on a mixture of samples we sequenced ourselves, and some downloaded from ENA.
When we run it, we don't get a final report. Looking more closely, it looks like, for the samples we downloaded from ENA, the assembled contigs (fasta) and SNPs (vcf) are empty (the fasta is all gaps and the vcf is just the header). The nohup.out says there has been some error with Java; I've compressed it and attached it below.

I ran the offending samples through a workflow I use for mapping/SNP calling (bwa mem, samtools, freebayes) and found that bwa mem throws an error due to orphan reads in the split paired read files generated by the ENA toolkit 'fastq-dump' command. When I download the .fastq files as a single interleaved file, it seems to work fine. I'm not sure whether this is contributing to the problem.
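For readers hitting the same orphan-read problem, here is a minimal Python sketch of one way to drop reads whose mate is missing from split R1/R2 files. It is not what Nullarbor or bwa does internally. It assumes four-line FASTQ records, read names that match between files after stripping an optional `/1` or `/2` suffix, and files small enough to hold in memory; `read_fastq` and `drop_orphans` are hypothetical helper names.

```python
def read_fastq(lines):
    """Yield (name, record) pairs from an iterable of FASTQ lines.

    Assumes four lines per record; a trailing partial record is ignored.
    """
    it = iter(lines)
    for header, seq, plus, qual in zip(it, it, it, it):
        # Assumption: the read name is the first whitespace-delimited token.
        name = header[1:].split()[0]
        if name.endswith(("/1", "/2")):
            name = name[:-2]
        yield name, (header, seq, plus, qual)


def drop_orphans(r1_lines, r2_lines):
    """Return (kept_r1, kept_r2, orphans), keeping only reads present in both files.

    Kept records follow R1 order, so both outputs stay in sync.
    """
    r1 = dict(read_fastq(r1_lines))
    r2 = dict(read_fastq(r2_lines))
    shared = [name for name in r1 if name in r2]
    orphans = sorted(set(r1) ^ set(r2))
    kept_r1 = [line for name in shared for line in r1[name]]
    kept_r2 = [line for name in shared for line in r2[name]]
    return kept_r1, kept_r2, orphans
```

The two kept lists can be written back out as new R1 and R2 files, and the orphan names give a quick sense of how many reads were affected.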

The command we ran was:
Command 1: nullarbor2.pl --name ancientA --ref /home/ubuntu/volume_sdb/Anthrax/AncientA/CZC5_NZ_AP018443.1.fasta --input ./ancientA.tab --outdir ancientA
Command 2: nohup nice make -j 2 -C /home/ubuntu/volume_sdb/Nullarbor/AncientA/ancientA &

Any assistance would be greatly appreciated, thank you so much!

nohup.out.gz

tseemann self-assigned this Mar 13, 2020

tseemann (Owner) commented Mar 13, 2020

If you don't have proper paired read files then Nullarbor will fail, as all the tools expect the same number of reads in R1 and R2. I don't check this in Nullarbor itself, as we assume people have done basic QC on their data first, but it would be a good idea to add this feature.
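The read-count check described above could be done before running the pipeline with something like the following Python sketch. It is not part of Nullarbor. It assumes plain or gzip-compressed FASTQ with four lines per record, and `count_reads` and `pairs_match` are hypothetical helper names.

```python
import gzip
from pathlib import Path


def open_fastq(path):
    """Open a FASTQ file as text, transparently handling .gz compression."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_reads(path):
    """Count records, assuming the common four-lines-per-record FASTQ layout."""
    with open_fastq(path) as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def pairs_match(r1_path, r2_path):
    """Return (ok, n_r1, n_r2), where ok is True when both counts agree."""
    n_r1 = count_reads(r1_path)
    n_r2 = count_reads(r2_path)
    return n_r1 == n_r2, n_r1, n_r2
```

Equal counts are necessary but not sufficient for proper pairing: the reads also need to appear in the same order in both files.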

I note that you are using some quite old versions of tools (e.g. shovill 0.9).
What version of Nullarbor are you using?

I also notice assemblies with 1000s of tiny contigs. This usually means contamination of some sort.
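A quick way to spot such assemblies is to count contigs and see how many are very short. The Python sketch below is an illustration only, not how Nullarbor reports assembly statistics. The 500 bp cutoff for "tiny" is an arbitrary assumption, and `contig_lengths` and `summarise` are hypothetical helper names.

```python
def contig_lengths(fasta_lines):
    """Return a list of sequence lengths, one per FASTA record."""
    lengths = []
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            lengths.append(0)          # start a new contig
        elif line and lengths:
            lengths[-1] += len(line)   # sequence may span several lines
    return lengths


def summarise(lengths, tiny_bp=500):
    """Summarise contig counts; tiny_bp is an arbitrary illustrative cutoff."""
    tiny = sum(1 for n in lengths if n < tiny_bp)
    return {"contigs": len(lengths), "tiny": tiny, "total_bp": sum(lengths)}
```

For example, `summarise(contig_lengths(open("contigs.fa")))` (with a placeholder file name) returns the counts for one assembly, and samples with far more tiny contigs than the rest are candidates for a closer look.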

I strongly suggest running the make preview command first and viewing the report to check for any outliers. Then remove those and run it again until it looks OK. Then run the full pipeline.
