Got broken pipeline when running freebayes-parallel #561
Comments
When I had problems with freebayes, it was because I didn't have the latest version, which uses Python 3 rather than Python 2. Try updating freebayes and using the newest version of snippy, and check which path freebayes resolves to (which freebayes). Maybe this can help? |
Your freebayes is too old, and snippy too. The newest version has a corrected version of vcffirstheader. |
Their snippy is the latest version, 4.6.0, and freebayes is probably the correct version too, given that I am getting the same error. From what I have learned, vcflib got updated and that breaks snippy. Try specifying this when installing with conda: |
Ignore the previous suggestion; it didn't work, at least not for me. I have managed to fix it by using the following recipe (in a .yaml file).
|
Also, you need to be careful with bcftools and snpeff. When I had bcftools 1.17 and snpeff 5.0, snippy would detect 0 SNPs across all samples. I have now tested bcftools 1.15.0 and snpeff 5.1, and it works. |
I have exported the environment list to a recipe that should work in case you are still having this issue. Just rename the file from .txt to .yaml and it should be fine. |
Also tried: which builds an environment but errors out during the run. Seems that Snippy is properly in dependency hell by now :( |
Yes, I have seen that my build doesn't work for everyone... I also used mamba. I specified exactly those dependencies and it worked for me... My freebayes is 1.3.6, samtools 1.17 and bwa 0.7.17, in case it helps. I'd suggest playing around with the different versions. My colleague also couldn't make my version work but managed to install it in the end; I will ask for their dependencies. |
Okay, so my above-mentioned env actually calls the SNPs when working with a .fa instead of a .gbff.
I'm not too proficient with bcftools, so no idea why the sample isn't specified in my vcf. Adding the deps you just posted didn't fix it, sadly. At least these runs yield the |
No guys, it's because some parts of the pipeline use scripts like vcffirstheader which are old and expect Python 2 syntax. You have to get the latest version of each component, or go manually into the file that gives the error and change the syntax so it can work with snippy. I got the same error a few months ago and solved it by uninstalling the packages and reinstalling them. I think the issue comes from you not using the correct versions, or from your paths not being properly set between your virtual environments.
What I suggest is to try different things. Create a new conda environment and download an up-to-date snippy (the snippy dependencies will work with Python 3). Try uninstalling the different packages and reinstalling them if necessary. Use pip install when possible. You should use the latest version of each package, install them manually, and add them to your PATH yourself.
You can see that the error says vcffirstheader can't strip the line; that is because it is using an old version of Python, and nowadays the function is written differently. So you need to update your vcffirstheader or change this part of the code. Also, for the previous comments: please uninstall each package/library from your system and from your conda environment, and reinstall them.
Some commands:
To see which copy of a program you are using: which python3
To list every copy of a program installed on the system: which -a python3
To uninstall them: sudo apt remove (or something like that)
To install Perl packages, please use cpan.
Once you have installed the software, you need to add it to your PATH. You can add it to .bashrc or to .profile:
nano ~/.profile
nano ~/.bashrc
You can ask ChatGPT for help.
Something like PATH=$PATH:/your/path/to/the/new/packages
Then you need to source the file for the changes to take effect in the current shell:
source ~/.profile
Also, you have to read the logs. For instance, for the vcffirstheader error, the logs point to line.strip().
Snippy is fine and working well. What probably happened is that you installed old versions of a dependency in the base environment of conda or in the root environment, or that your paths are not properly set, hence the problem. Also be careful, because python and python3 can both be installed, and depending on the script some will call python and some will call python3. These can be the same interpreter but, from my understanding, they can also be two different installations. So you have to give the correct path too. |
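The PATH advice above can be sketched as a small shell helper. This is only an illustration: path_prepend is a made-up name, and the example directory is an assumption about where the tools were unpacked.

```shell
# Sketch: put a directory at the front of PATH without adding it twice.
# path_prepend is a hypothetical helper, not part of snippy or conda.
path_prepend() {
    pp_dir=$1
    case ":$PATH:" in
        *":$pp_dir:"*) ;;              # already on PATH: leave it unchanged
        *) PATH="$pp_dir:$PATH" ;;     # otherwise prepend so it is found first
    esac
    export PATH
}

# Example (the directory is an assumption; use wherever your tools live):
# path_prepend "$HOME/snippy-4.6.0/bin"
# command -v snippy    # prints the snippy that now resolves first
```

Putting the call in ~/.profile or ~/.bashrc makes it apply to new shells; in the current shell, source the file afterwards.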
Could you please write the full command that you used for running snippy, and also the version of each package in the environment? |
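For anyone answering this kind of request, a rough sketch of how to collect where each tool resolves from. The helper name report_tools is hypothetical, and the tool list is only a guess based on the programs named in this thread.

```shell
# Sketch: print where each pipeline tool resolves on PATH, or "not found".
# report_tools is a hypothetical helper; adjust the tool list to your setup.
report_tools() {
    for rt_tool in "$@"; do
        if rt_location=$(command -v "$rt_tool" 2>/dev/null); then
            printf '%s\t%s\n' "$rt_tool" "$rt_location"
        else
            printf '%s\tnot found\n' "$rt_tool"
        fi
    done
}

report_tools snippy freebayes freebayes-parallel vcffirstheader \
             vcfstreamsort vcfuniq bcftools samtools bwa snpEff python3
```

Inside an activated conda environment, conda list additionally prints the installed package versions, which is usually what maintainers want to see alongside the paths.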
Your wall of text boils down to: It is not obvious which program has an incompatible version
|
I also suspect .gbff is not supported, but .gbk, which it expects, is deprecated by NCBI :< |
Yes, it is very possible that it is caused by the input files. Can you try using a .fna as a genomic assembly reference and see if it works? Also, can you run this command, please: snippy version |
That is a separate problem when using .gbff. As per my command above, you can see I already do this, and run into the bcftools consensus error
|
What I suggest is simply to download the snippy binary release from GitHub: github.com/tseemann/snippy/releases
Download "Source code (tar.gz)", then extract it and put the snippy folder in your home folder or anywhere else. Then add the snippy executable to your PATH. For instance, if you have the folder snippy-4.6.0 in your home, you can do:
export PATH=$HOME/snippy-4.6.0/bin:$PATH
You have to do this every time you run a script in the console, or write it at the beginning of your script. Or you can write this command directly in your ~/.profile. Then do
The idea is that you have already installed most dependencies in your conda environment, but you will change the path for snippy. Because snippy points to its own libraries within its own folder, using the binary release may give you something that works. |
Can you please write the following in your console: which -a snippy |
Respectfully, that doesn't make sense at all. That file is exactly the one I am running right now, it's just here: And it runs the same dependencies again, so nothing changed. |
Could you just post the versions of the dependencies that you are using? |
If I were you, I would try converting your GenBank file to FASTA using any2fasta, then run snippy on it and see if it works. Here are my versions: https://github.com/tseemann/snippy Torsten Seemann (snippy)
quentin@quentin-B450-AORUS-ELITE:~$ snippy version
[22:41:24] This is snippy 4.6.0
[22:41:24] Written by Torsten Seemann
[22:41:24] Obtained from https://github.com/tseemann/snippy
[22:41:24] Detected operating system: linux
[22:41:24] Enabling bundled linux tools.
[22:41:24] Found bwa - /home/quentin/snippy/binaries/linux/bwa
[22:41:24] Found bcftools - /home/quentin/snippy/binaries/linux/bcftools
Found samtools - /home/quentin/snippy/binaries/linux/samtools
[22:41:24] Found java - /home/quentin/Downloads/openjdk-11 linux-x64_bin/jdk-11/bin/java
Found snpEff - /home/quentin/snippy/binaries/noarch/snpEff
[22:41:24] Found seqtk - /home/quentin/snippy/binaries/linux/seqtk
[22:41:24] Found freebayes - /home/quentin/snippy/binaries/linux/freebayes
[22:41:24] Found vcfuniq - /home/quentin/snippy/binaries/linux/vcfuniq
[22:41:24] Found vcffirstheader - /home/quentin/snippy/binaries/noarch/vcffirstheader
[22:41:24] Found gzip - /usr/bin/gzip
[22:41:24] Found vt - /home/quentin/snippy/binaries/linux/vt
[22:41:24] Found snippy-vcf_to_tab - /home/quentin/snippy/bin/snippy-vcf_to_tab
Found snippy-vcf_report - /home/quentin/snippy/bin/snippy-vcf_report
[22:41:24] Checking version: samtools --version is >= 1.7 ok, have 1.10
[22:41:24] Checking version: bcftools --version is >= 1.7 ok, have 1.10
[22:41:24] Checking version: freebayes --version is >= 1.1 ok, have 1.3.1
[22:41:25] Checking version: snpEff version is >= 4.3 ok, have 4.3
[22:41:25] Checking version: bwa is >= 0.7.12 ok, have 0.7.17
[22:41:25] Please supply a reference FASTA/GBK/EMBL file |
|
Ok, and now do:
which -a python
which snippy |
Maybe your problem is related to this issue.. |
The bcftools problem was related to this: my mappings had no RG tag, and letting snippy do the mapping itself fixed it, thanks! Doing that, I also discovered that snippy will fail to run bwa commands when the read data paths have spaces in them, and symlinks don't help because it resolves the paths first... I can work around that though, and all should work.
Conclusion: instead of using conda/mamba, download the release, as it has all the finicky dependencies packaged as binaries. |
Can you go into the folder where your vcf is and run something like: bcftools view snps.vcf.gz. Also, can you try to run snippy with the option --rgid "CM13"? Let me know what happens when you add this option to the snippy command. |
I tried this, did not help - the tag has to be in the .bam files, which snippy adds during mapping |
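A quick way to check whether a BAM already carries a read group is to look for an @RG line in its header. A minimal sketch, where has_read_group is a hypothetical helper name and sample.bam stands in for your own file:

```shell
# Sketch: succeed if SAM header text on stdin contains an @RG line.
# has_read_group is a hypothetical helper name.
has_read_group() {
    grep -q '^@RG'
}

# Typical use, assuming samtools is installed and sample.bam is your file:
# samtools view -H sample.bam | has_read_group && echo "RG present" || echo "no RG"
```

If no @RG line is present, that matches the situation described above, where the mappings were made outside snippy.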
Good job! https://github.com/josephryan/RyanLabUnixBestPractices
Also, it is sometimes possible to pass the argument as a string between " ". In other situations, if your file is named "file with spaces in the name", you can add escape characters: file\ with\ spaces\ in\ the\ name
export PATH=$HOME/snippy-4.6.0/bin:$PATH
Yes, using the binary release is the best way to have something stable in most cases. Have a great day. |
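To illustrate the quoting point in general shell terms, here is a small sketch with a throwaway file; count_lines is a hypothetical helper. Note that this only covers your own command line: as reported earlier in the thread, snippy may still fail internally on paths with spaces.

```shell
# Sketch: count lines in a file whose name contains spaces.
# count_lines is a hypothetical helper; the point is the quoted "$1".
count_lines() {
    wc -l < "$1" | tr -d ' '
}

# Demo with a throwaway file in a temporary directory
demo_dir=$(mktemp -d)
printf 'a\nb\nc\n' > "$demo_dir/file with spaces in the name"
count_lines "$demo_dir/file with spaces in the name"    # prints 3
rm -r "$demo_dir"
```

Without the double quotes around "$1", the shell would split the name into several words and the redirect would fail.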
I got the error below. Can anyone help me fix it?
I ran this command on an HPC:
snippy --R1 $p_trim/VN0467_1_paired.fastq.gz --R2 $p_trim/VN0467_2_paired.fastq.gz --outdir /lustre7/home/buihoangphuc412/Projects/GWAS/variant/VN0467 --prefix VN0467 --ref $p_ref/genome/CHC97.fasta --report --cpus 4 --ram 2 --basequal 15
VN0467.log
echo snippy 4.6.0
cd /lustre7/home/buihoangphuc412
/home/buihoangphuc412/miniconda3/envs/pyseer/bin/snippy --R1 /lustre7/home/buihoangphuc412/Projects/GWAS/trim/VN0467_1_paired.fastq.gz --R2 /lustre7/home/buihoangphuc412/Projects/GWAS/trim/VN0467_2_paired.fastq.gz --outdir /lustre7/home/buihoangphuc412/Projects/GWAS/variant/VN0467 --prefix VN0467 --ref /lustre7/home/buihoangphuc412/Projects/GWAS/ref/genome/CHC97.fasta --report --cpus 4 --ram 2 --basequal 15
samtools faidx reference/ref.fa
bwa index reference/ref.fa
[bwa_index] Pack FASTA... 0.00 sec
[bwa_index] Construct BWT for the packed sequence...
[bwa_index] 0.25 seconds elapse.
[bwa_index] Update BWT... 0.02 sec
[bwa_index] Pack forward-only FASTA... 0.00 sec
[bwa_index] Construct SA from BWT and Occ... 0.09 sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa index reference/ref.fa
[main] Real time: 0.420 sec; CPU: 0.368 sec
mkdir -p reference/genomes && cp -f reference/ref.fa reference/genomes/ref.fa
ln -sf reference/ref.fa .
ln -sf reference/ref.fa.fai .
mkdir -p reference/ref && gzip -c reference/ref.gff > reference/ref/genes.gff.gz
bwa mem -Y -M -R '@RG\tID:VN0467\tSM:VN0467' -t 4 reference/ref.fa /lustre7/home/buihoangphuc412/Projects/GWAS/trim/VN0467_1_paired.fastq.gz /lustre7/home/buihoangphuc412/Projects/GWAS/trim/VN0467_2_paired.fastq.gz | samclip --max 10 --ref reference/ref.fa.fai | samtools sort -n -l 0 -T /tmp --threads 1 -m 1000M | samtools fixmate -m --threads 1 - - | samtools sort -l 0 -T /tmp --threads 1 -m 1000M | samtools markdup -T /tmp --threads 1 -r -s - - > VN0467.bam
COMMAND: samtools markdup -T /tmp --threads 1 -r -s - -
READ: 907222
WRITTEN: 895475
EXCLUDED: 106405
EXAMINED: 800817
PAIRED: 752182
SINGLE: 48635
DUPLICATE PAIR: 1168
DUPLICATE SINGLE: 10579
DUPLICATE PAIR OPTICAL: 0
DUPLICATE SINGLE OPTICAL: 0
DUPLICATE NON PRIMARY: 0
DUPLICATE NON PRIMARY OPTICAL: 0
DUPLICATE PRIMARY TOTAL: 11747
DUPLICATE TOTAL: 11747
ESTIMATED_LIBRARY_SIZE: 120974295
samtools index VN0467.bam
fasta_generate_regions.py reference/ref.fa.fai 237553 > reference/ref.txt
freebayes-parallel reference/ref.txt 4 -p 2 -P 0 -C 2 -F 0.05 --min-coverage 10 --min-repeat-entropy 1.0 -q 15 -m 60 --strict-vcf -f reference/ref.fa VN0467.bam > VN0467.raw.vcf
vcfstreamsort: symbol lookup error: /lustre7/home/buihoangphuc412/miniconda3/envs/pyseer/bin/../lib/libvcflib.so.1: undefined symbol: wavefront_align
vcfuniq: symbol lookup error: /lustre7/home/buihoangphuc412/miniconda3/envs/pyseer/bin/../lib/libvcflib.so.1: undefined symbol: wavefront_align
Traceback (most recent call last):
  File "/lustre7/home/buihoangphuc412/miniconda3/envs/pyseer/bin/vcffirstheader", line 17, in <module>
    print(line.strip())
BrokenPipeError: [Errno 32] Broken pipe
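In the log above, the two "symbol lookup error" lines appear before the Python traceback. One plausible reading is that vcfstreamsort and vcfuniq failed to load libvcflib first, and vcffirstheader then hit a closed pipe as a consequence. A rough sketch for pulling the first such loader failure out of a log; first_loader_error is a hypothetical helper and the search patterns are assumptions:

```shell
# Sketch: print the first line of a log that looks like a shared-library
# loading failure. first_loader_error is a hypothetical helper name.
first_loader_error() {
    grep -E 'symbol lookup error|undefined symbol|cannot open shared object file' "$1" | head -n 1
}

# Example, assuming the log file attached to this issue:
# first_loader_error VN0467.log
```

If this prints a line naming a library such as libvcflib, the package that provides that library is a more likely culprit than the script shown in the traceback.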