Error when working with --ins-info-key #78

Open
nvnieuwk opened this issue Feb 28, 2023 · 1 comment

Comments

@nvnieuwk

Hi, I'm trying to run Paragraph on VCFs that have been merged from different callers. To do that, I had to create a uniform INFO field for all callers, which I named SVINSSEQ (the key Manta uses). When I run the tool with this key, the sequence is somehow read as a tuple, which causes the following error:

  Traceback (most recent call last):
    File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 121, in worker
      result = (True, func(*args, **kwds))
    File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
      return list(map(*args))
    File "/usr/local/lib/python3/grm/vcf2paragraph/__init__.py", line 286, in run_vcf2paragraph
      alt_paths=params["alt_paths"])
    File "/usr/local/lib/python3/grm/vcf2paragraph/__init__.py", line 86, in convert_vcf
      ref, indexed_vcf.name, ins_info_key, chrom, start, end, ref_node_padding, allele_graph)
    File "/usr/local/lib/python3/grm/vcfgraph/vcfgraph.py", line 128, in create_from_vcf
      graph.add_record(record, allele_graph, varId, ins_info_key)
    File "/usr/local/lib/python3/grm/vcfgraph/vcfgraph.py", line 179, in add_record
      ins_seq = vcf.info[ins_info_key].upper()
  AttributeError: 'tuple' object has no attribute 'upper'
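
The last frame calls `.upper()` directly on the INFO value, which only works when that value is a plain string. As a minimal sketch of a defensive alternative (this is not Paragraph's code; `normalize_ins_seq` is a hypothetical helper introduced only for illustration), the value could be unwrapped when the VCF parser hands back a one-element tuple:

```python
def normalize_ins_seq(value):
    """Return an upper-case insertion sequence.

    Hypothetical helper: accepts either a plain string or a
    one-element tuple/list of strings, as a VCF parser may return
    for an INFO field that is not declared as single-valued.
    """
    if isinstance(value, str):
        return value.upper()
    if len(value) != 1:
        # More than one value is ambiguous for an insertion sequence.
        raise ValueError(f"expected exactly one sequence, got {len(value)}")
    return value[0].upper()
```

For example, `normalize_ins_seq(("aagatg",))` and `normalize_ins_seq("aagatg")` both give the same upper-case string, while a tuple with several entries raises a `ValueError` instead of failing with an `AttributeError`.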

The command run is:

multigrmpy.py \
    --input PosCon4.vcf.gz \
    --manifest PosCon4.tsv \
    --output PosCon4_genotyped \
    --reference GCA_000001405.15_GRCh38_full_plus_hs38d1_analysis_set.fna \
    --threads 1 \
    --ins-info-key SVINSSEQ

The VCF contains one variant (insertion) which looks like this:

#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	PosCon4
chrX	153903418	1_INS00000000	T	<INS>	480	PASS	PRECISE;SVTYPE=INS;SVMETHOD=JASMINE;END=153903418;SVLEN=35;PE=0;MAPQ=0;CT=NtoN;CIPOS=-14,14;CIEND=-14,14;SRMAPQ=60;INSLEN=35;HOMLEN=13;SR=8;SRQ=1;SVINSSEQ=AAGATGCGGGGTGTGATGTGCACCTGTGTGTGCTGCGGGTGTGTGCGTGTGTGGTGTTGGCTGTGCGTATGTGGTGTGGTATGGTGTGCAGGTGCATGCAGGTGCGTGGTGTGTATGGCTGTGTGGTGGGTACATGTGTGGGTGTGTGGCGTATGGGAGTGTGTGATGTGTGCATGTGTGTGGTGTG;CE=1.66556;STARTVARIANCE=0.000000;ENDVARIANCE=0.000000;AVG_LEN=35.000000;AVG_START=153903418.000000;AVG_END=153903419.000000;VARCALLS=1;ALLVARS_EXT=(INS00000000);SUPP_VEC_EXT=010;IDLIST_EXT=INS00000000;SUPP_EXT=1;SUPP_VEC=010;SUPP=1;IDLIST=INS00000000;INTRASAMPLE_IDLIST=INS00000000	GT:GL:GQ:FT:RCL:RC:RCR:RDCN:DR:DV:RR:RV	1/1:-125.994,-10.8314,0:108:PASS:12619:21001:8382:2:0:0:0:36

(The header lines are omitted here to keep this short.)

What can I do to solve this?
Many thanks in advance
-Nicolas

@nvnieuwk

nvnieuwk commented Mar 1, 2023

Apparently the INFO header line describing SVINSSEQ should look like this:

##INFO=<ID=SVINSSEQ,Number=1,Type=String,Description="Sequence of insertion">

And the error happens when the header looks like this:

##INFO=<ID=SVINSSEQ,Number=.,Type=String,Description="Sequence of insertion">
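
For anyone hitting the same error, here is a minimal sketch of rewriting the offending header line in Python. It assumes the header has been exported to text first; `force_single_value` is a hypothetical helper name, and the sketch only touches the `Number=` entry of the INFO line whose ID matches the key.

```python
import re


def force_single_value(header_line, key):
    """Set Number=1 on the ##INFO header line whose ID equals `key`.

    Lines that do not declare this key are returned unchanged.
    """
    pattern = r"^(##INFO=<ID=" + re.escape(key) + r",Number=)[^,]+"
    # \g<1> keeps the matched prefix; the old Number value is replaced by 1.
    return re.sub(pattern, r"\g<1>1", header_line)


broken = '##INFO=<ID=SVINSSEQ,Number=.,Type=String,Description="Sequence of insertion">'
fixed = force_single_value(broken, "SVINSSEQ")
```

Applying this to every header line and writing the result back (for example with `bcftools reheader`) should leave the records untouched while changing how the field is declared.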
