Incorrect zero-based VCF file #202

Open
ethering opened this issue Dec 5, 2023 · 0 comments

Hi,
I'm using SURVIVOR (v1.0.7) for the first time and trying out some test data.
I generated a parameter file with
SURVIVOR simSV test_params.param
and then simulated the SVs onto a reference genome with
SURVIVOR simSV ref.fasta test_params.param 0.1 0 simulated

Upon viewing the generated simulated.vcf file, I see that its POS values are zero-based, whereas the VCF format specifies 1-based positions.
Here are the first 16 characters of my reference sequence:

>SJAP_Chr1
CACCAAAAACCCTAAG

Here are the first 16 characters of my simulated sequence:

>SJAP_Chr1
AACAAAAAAGCCTAAA

And here are the lines in the VCF file that relate to these variants (note that the VCF header states 'source=Sniffles'):

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT
SJAP_Chr1       0       SNP0SURVIVOR    C       A       PRECISE;SVMETHOD=SURVIVOR_sim;SVLEN=1   GT:GL:GQ:FT:RC:DR:DV:RR:RV      1/1
SJAP_Chr1       3       SNP1SURVIVOR    C       A       PRECISE;SVMETHOD=SURVIVOR_sim;SVLEN=1   GT:GL:GQ:FT:RC:DR:DV:RR:RV      1/1
SJAP_Chr1       9       SNP2SURVIVOR    C       G       PRECISE;SVMETHOD=SURVIVOR_sim;SVLEN=1   GT:GL:GQ:FT:RC:DR:DV:RR:RV      1/1
SJAP_Chr1       15      SNP3SURVIVOR    G       A       PRECISE;SVMETHOD=SURVIVOR_sim;SVLEN=1   GT:GL:GQ:FT:RC:DR:DV:RR:RV

As you can see, the first variant is at position zero, and the subsequent coordinates are also zero-based. The correct POS values would be 1, 4, 10, and 16.
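To double-check the expected positions, here is a minimal Python sketch that compares the two 16-character sequences above and reports the mismatches as 1-based coordinates. It also includes a hypothetical helper (shift_pos_to_one_based, not part of SURVIVOR) that could serve as a stopgap by adding 1 to the POS column. It assumes the VCF is tab-delimited and that every record is off by exactly one, which I have only confirmed for the SNPs shown here. The plain comparison only works while the sequences are still aligned, i.e. before any simulated indel.

```python
def mismatch_positions(ref: str, sim: str) -> list[int]:
    """Return the 1-based positions where two equal-length sequences differ."""
    if len(ref) != len(sim):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (r, s) in enumerate(zip(ref, sim)) if r != s]


def shift_pos_to_one_based(vcf_line: str) -> str:
    """Add 1 to the POS column of a tab-delimited VCF data line.

    Header lines (starting with '#') and blank lines are returned unchanged.
    Assumption: every record is uniformly zero-based.
    """
    if vcf_line.startswith("#") or not vcf_line.strip():
        return vcf_line
    newline = "\n" if vcf_line.endswith("\n") else ""
    fields = vcf_line.rstrip("\n").split("\t")
    fields[1] = str(int(fields[1]) + 1)  # POS is the second column
    return "\t".join(fields) + newline


# First 16 bases of the reference and simulated sequences from this issue.
ref = "CACCAAAAACCCTAAG"
sim = "AACAAAAAAGCCTAAA"
print(mismatch_positions(ref, sim))  # [1, 4, 10, 16]
```

The printed positions match the 1-based values listed above, while the VCF reports each of them minus one.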

Based on my initial use of SURVIVOR, I have two other comments:

  1. The 'simulated.bed' file is not in BED format, so perhaps it would be better to give it a different file extension.
  2. I can't find any command-line help for the simSV method. Running SURVIVOR simSV --help prints 'Parameter file generated' to stdout and creates a file called --help.