Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

snp-sites do not see do deletion events #94

Open
bioinfornatics opened this issue Jul 28, 2020 · 1 comment
Open

snp-sites do not see do deletion events #94

bioinfornatics opened this issue Jul 28, 2020 · 1 comment

Comments

@bioinfornatics
Copy link

Dear,

In order to validate snp-sitesin our lab we have compared the results with msa2vcf.jar

Thus we have took two close sequences from ncbi (ref NC_045512.2 and MT007544.1)

msa2vcf

$ java -jar dist/msa2vcf.jar --consensus 'NC_045512.2 Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome' --output ../sars_cov2.2.vcf ../sars_cov2.aln
[INFO][MsaToVcf]format : Fasta
$ cat ../sars_cov2.2.vcf
##fileformat=VCFv4.2
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth; some reads may have been filtered">
##contig=<ID=chrUn,length=29903>
##msa2vcf.meta=compilation:20200728120720 githash:af51aa3 htsjdk:2.22.0 date:20200728122012 cmd:--consensus NC_045512.2 Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome --output ../sars_cov2.2.vcf ../sars_cov2.aln
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  MT007544.1 Severe acute respiratory syndrome coronavirus 2 isolate Australia/VIC01/2020, complete genome        NC_045512.2 Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome
chrUn   19065   .       T       C       .       .       DP=2    GT:DP   1/1:1   0/0:1
chrUn   22303   .       T       G       .       .       DP=2    GT:DP   1/1:1   0/0:1
chrUn   26144   .       G       T       .       .       DP=2    GT:DP   1/1:1   0/0:1
chrUn   29749   .       ACGATCGAGTG     A       .       .       DP=2    GT:DP   1/1:1   0/0:1

snp-sites

$ snp-sites -c -v -o sars_cov2.vcf  sars_cov2.aln
$ cat sars_cov2.vcf
##fileformat=VCFv4.1
##contig=<ID=1,length=29903>
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  NC_045512.2     MT007544.1
1       19065   .       T       C       .       .       .       GT      0       1
1       22303   .       T       G       .       .       .       GT      0       1
1       26144   .       G       T       .       .       .       GT      0       1

Problem

snp-sites do not report the deletion

@bioinfornatics bioinfornatics changed the title snp-sites results differ from msa2vcf.jar snp-sites do not see do deletion events Oct 9, 2020
@stitam
Copy link

stitam commented Sep 23, 2022

Hi,

I may have come across the same problem, I think snp-sites does not recognise the "-" character, for example:

seq1
ATGAAAATTGGATT
seq2
ATGAAA-TTGGATT

snp-sites -v -o test.vcf test.aln

Here there is a deletion in position 7 in seq2, but snp-sites will not record it. Note, if I replace "-" with any other character e.g. "*", snp-sites will find the site. I am using the docker from staphb (https://hub.docker.com/r/staphb/snp-sites) which contains snp-sites v2.5.1.

I am wondering though, is this behaviour intended? Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants