Annotating Breakend Points (<BND>) #111

Open
ajaarma opened this issue Jun 24, 2019 · 3 comments

ajaarma commented Jun 24, 2019

Hi,
I am trying to annotate breakend (BND) structural variants called by Illumina Manta, but vcfanno does not seem to annotate them because it does not recognize the ALT-ID tags for BND types. The files are attached in vcf_error.zip. The contents of the zipped file:

test.vcf: List of example query variants
gnomad_test.bed.gz: List of example variants with their coordinates and ID that can be used to annotate the query vcf file
vcfanno_bed.conf.toml: Configuration file

In the attached test.vcf file there are two variants:
Variant-1:
1 261425 MantaBND:58922:1:10:0:0:0:1 A [chr4:190113797[GA 292 PASS SVTYPE=BND;MATEID=MantaBND:58922:1:10:0:0:0:0;SVINSLEN=1;SVINSSEQ=G;BND_DEPTH=106;MATE_BND_DEPTH=42;AC=1;AN=2;CSQT=1|AP006222.1|ENST00000441866.2|transcript_variant GT:FT:GQ:PL:PR:SR 0/1:PASS:292:342,0,999:37,1:65,19

and Variant-2:
1 261425 MantaBND:58922:1:10:0:0:0:1 A 292 PASS END=261426;SVTYPE=BND;MATEID=MantaBND:58922:1:10:0:0:0:0;SVINSLEN=1;SVINSSEQ=G;BND_DEPTH=106;MATE_BND_DEPTH=42;AC=1;AN=2;CSQT=1|AP006222.1|ENST00000441866.2|transcript_variant GT:FT:GQ:PL:PR:SR 0/1:PASS:292:342,0,999:37,1:65,19

These two variants represent breakend (BND) type events. Variant-1 is the true variant without any modification. Variant-2 is the same as Variant-1, but with the ALT-ID changed to a symbolic SV type (it can also be another type, or DUP:TANDEM, etc.) and the endpoint tag END=261426 added to the line.

I am trying to annotate with the gnomad_test.bed.gz file (attached here), which has exactly the same coordinates as this variant:
1 261425 261426 LP000Test

The configuration file is also attached: vcfanno_bed.conf.toml

I ran the command:
vcfanno -p 4 -ends -permissive-overlap vcfanno_bed.conf.toml test.vcf

The resulting annotation for Variant-1:
-- Not annotated with LP000Test

Whereas for variant-2:
-- gets annotated with LP000Test.

It seems the problem is that vcfanno cannot recognize or find the ALT-ID and END point of the BND variant type.

Is there a way this can be fixed in vcfanno? It would help a lot when I compute internal overlap within our large cohort (n>800 samples). Currently, many of these BND types are being missed because of it, which affects the overall interpretation.

Thanks for your help.
vcfanno_error.zip

Owner

brentp commented Jun 26, 2019

I haven't looked at the data yet, but just to clarify:

1 261425 261426 LP000Test

in BED format will not overlap:

1 261425 MantaBND:58922:1:10:0:0:0:1 A [chr4:190113797[GA 292 PASS 

because the VCF is 1-based and the BED is 0-based.
Does that resolve your concerns?
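
The coordinate mismatch described above can be checked with a few lines of arithmetic. This is only a minimal sketch, assuming the usual conventions (VCF POS is 1-based and inclusive, BED is 0-based and half-open) and a simple interval model; the helper names `vcf_span_to_bed` and `overlaps` are hypothetical and are not part of vcfanno.

```python
def vcf_span_to_bed(pos, end=None):
    """Convert a 1-based inclusive VCF span to a 0-based half-open BED interval.

    With no END, the record covers only the single base at POS.
    """
    return (pos - 1, end if end is not None else pos)


def overlaps(a, b):
    """True if two half-open (start, end) intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


bed_record = (261425, 261426)                 # BED line from the issue
variant_1 = vcf_span_to_bed(261425)           # POS only: (261424, 261425)
variant_2 = vcf_span_to_bed(261425, 261426)   # POS plus END: (261424, 261426)

print(overlaps(variant_1, bed_record))  # False
print(overlaps(variant_2, bed_record))  # True
```

Under these assumptions, a record with only POS=261425 does not overlap the BED line, while the same record with END=261426 does, which would be consistent with Variant-2 being annotated and Variant-1 not.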

Author

ajaarma commented Jul 1, 2019

Hi Brentp,

Thanks for the response; I agree with your argument. For that reason, I converted my BED to 1-based coordinates, as shown:
1 261425 261426 LP000Test
using the logic described here: https://www.biostars.org/p/84686/, which I rephrase as:
if (type==SNV){start=start+1; end=end;}
if (type==DEL){start=start+1; end=end;}
if (type==INS){start=start; end=end+1;}
I treat the BND as an insertion event.

But it still does not get annotated with all the BND events that have the same coordinates in our cohort. It works when I edit the ALT-ID to INS:[chr4:190113797[GA and add END=261426; then I get all the BNDs with exactly the same coordinates as are present in my cohort.

This small makeshift edit works for me, but if possible, could this be fixed in the vcfanno code?
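
The makeshift edit described above could be scripted roughly as follows. This is a hedged sketch only: it assumes tab-delimited VCF records, takes END = POS + 1 as in the example (END=261426 for POS 261425), and leaves the ALT column untouched unless a replacement is passed in. The function name `add_end_to_bnd` and its `symbolic_alt` parameter are hypothetical, and the thread does not establish whether adding END alone is enough for vcfanno to annotate the record.

```python
def add_end_to_bnd(line, symbolic_alt=None):
    """Add END=POS+1 to the INFO field of a BND record that has no END yet.

    Header lines and non-BND records are returned unchanged.
    If symbolic_alt is given, the ALT column is replaced as well.
    """
    line = line.rstrip("\n")
    fields = line.split("\t")
    if line.startswith("#") or len(fields) < 8:
        return line
    info = fields[7].split(";")
    keys = {item.split("=", 1)[0] for item in info}
    if "SVTYPE=BND" in info and "END" not in keys:
        end = int(fields[1]) + 1  # assumption: END is one base after POS
        fields[7] = ";".join([f"END={end}"] + info)
        if symbolic_alt is not None:
            fields[4] = symbolic_alt
    return "\t".join(fields)


# Shortened version of Variant-1 from the issue, joined with tabs.
record = "\t".join([
    "1", "261425", "MantaBND:58922:1:10:0:0:0:1", "A",
    "[chr4:190113797[GA", "292", "PASS",
    "SVTYPE=BND;MATEID=MantaBND:58922:1:10:0:0:0:0",
])
print(add_end_to_bnd(record))
```

Records that already carry an END tag are left as they are, so the function can be applied to a whole file line by line without modifying other SV types.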

Owner

brentp commented Jul 1, 2019

can you post a 1 line vcf (with header) and a 1 line bed that demonstrate the problem?
