intrahost.py and vphaser errors #1012

Open
elasekness opened this issue Nov 8, 2023 · 0 comments

I am using the containerized version of viral-ngs (quay.io/broadinstitute/viral-ngs) to run intrahost.py.

I have a coordinate-sorted BAM file (generated with bwa mem) that contains only properly paired reads. I also added a read group, although I have only one library per sample.
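
The properties described above (coordinate sorting, proper pairing, a read group) can be double-checked from a SAM text dump of the BAM. The following is a minimal sketch, not part of viral-ngs: `summarize_sam` is a hypothetical helper name, and it assumes its input is SAM text such as the output of `samtools view -h`. It also collects the `PL` tag of each read group (the vphaser2 log further down reports `0 platform(s) found`) and tallies CIGAR operations and secondary/supplementary flags. Whether any of these relates to the vphaser2 failure is not established here.

```python
from collections import Counter

# Flag bits as defined in the SAM specification.
FLAG_PROPER_PAIR = 0x2
FLAG_UNMAPPED = 0x4
FLAG_SECONDARY = 0x100
FLAG_SUPPLEMENTARY = 0x800


def summarize_sam(lines):
    """Summarize header tags, flags, and CIGAR operations of SAM text lines.

    Hypothetical helper for illustration; `lines` is any iterable of strings.
    """
    summary = {
        "sort_order": None,    # SO value from the @HD line, if present
        "read_groups": [],     # ID values from @RG lines
        "platforms": [],       # PL values from @RG lines
        "flags": Counter(),
        "cigar_ops": Counter(),
    }
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("@"):
            # Header line: parse TAG:VALUE pairs after the record type.
            tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
            if fields[0] == "@HD":
                summary["sort_order"] = tags.get("SO")
            elif fields[0] == "@RG":
                summary["read_groups"].append(tags.get("ID"))
                if "PL" in tags:
                    summary["platforms"].append(tags["PL"])
            continue
        # Alignment line: column 2 is FLAG, column 6 is CIGAR.
        flag = int(fields[1])
        cigar = fields[5]
        summary["flags"]["total"] += 1
        if flag & FLAG_PROPER_PAIR:
            summary["flags"]["proper_pair"] += 1
        if flag & FLAG_UNMAPPED:
            summary["flags"]["unmapped"] += 1
        if flag & FLAG_SECONDARY:
            summary["flags"]["secondary"] += 1
        if flag & FLAG_SUPPLEMENTARY:
            summary["flags"]["supplementary"] += 1
        # Count reads containing each CIGAR operation at least once.
        for op in "MIDNSHP=X":
            if op in cigar:
                summary["cigar_ops"][op] += 1
    return summary
```

One way to use it would be to pipe `samtools view -h CTP10.rg.sorted.bam` into a script that passes `sys.stdin` as `lines` and prints the returned dictionary.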

When I run intrahost.py with the following command:
docker run --rm -v $(pwd):/data -w /data quay.io/broadinstitute/viral-ngs intrahost.py vphaser_one_sample CTP10.rg.sorted.bam CTP10.fasta CTP10.tsv

I get the error "get_column_x: fail to identify mapping".

Full error message:
2023-11-08 20:36:05,207 - cmd:197:main_argparse - INFO - software version: v1.25.0-8-ge144969, python version: 3.6.7 | packaged by conda-forge | (default, Jul 2 2019, 02:18:42)
[GCC 7.3.0]
2023-11-08 20:36:05,207 - cmd:199:main_argparse - INFO - command: /opt/viral-ngs/source/intrahost.py vphaser_one_sample inBam=CTP10.rg.sorted.bam inConsFasta=CTP10.fasta outTab=CTP10.tsv vphaserNumThreads=None minReadsEach=0 maxBias=10 removeDoublyMappedReads=False loglevel=INFO
2023-11-08 20:36:12,698 - vphaser2:48:execute - ERROR - b'\n--------------------------------------------------------\nProgram runs with the following Parameter setting:\n\n\tinput BAM file\t=\tCTP10.rg.sorted.bam\n\toutput Directory\t=\t/tmp/tmpbdyy4c1_vphaser2\n\terrModel\t\t=\tpileup + phase\n\talpha\t\t=\t0.05\n\tignoreBases \t=\t0\n\t(var_matepair, var_cycle, var_dt, var_qt)\t=\t1,1,1,20\n\tpSample\t\t=\t30%\n\twindowSz\t=\t500\n\tdelta\t=\t2\n\n--------------------------------------------------------\n\n\n\t1 bam file(s) found: \n\t\tCTP10.rg.sorted.bam\n\n\nParse bam header: get refSeq info & sanity check\n\n\tCTP10 len =11029\n\n\t1 ref sequence(s) found: \n\t\tName: CTP10\n\t\t\tBamfileID = 0\tRefID = 0\n\n\n\t0 platform(s) found: \n\n\nGet maxQ, minQ, maxReadLen, avgFragSz, stdFragSz from bam files ...\n\n\tTotal Reads = 80726\n\t# Mapped Reads = 80726\n\t# Reads used for checking Q scores = 24132\n\tminQ = 35\tmaxQ=69\t\tmaxRL = 149\n\t(avgfragSz, std) = 205\t156\n\nGenerate qual -> quantile map ... \n\n\nSet up paired read map arrays ... \n\n\t# total mapped reads: 80726\n\t# mapped mate-pairs = 40363\n\nPrepare aln columns file...\n\n Ref: CTP10 , len = 11029\n\n\n\t\tcreate file: /tmp/tmpbdyy4c1_vphaser2/CTP10.0.499.region\n\n\t\tcreate file: /tmp/tmpbdyy4c1_vphaser2/CTP10.500.999.region\n[EXIT]: get_column_x: fail to identify mapping\n'
Traceback (most recent call last):
  File "/opt/viral-ngs/source/intrahost.py", line 1203, in <module>
    util.cmd.main_argparse(__commands__, __doc__)
  File "/opt/viral-ngs/source/util/cmd.py", line 224, in main_argparse
    ret = args.func_main(args)
  File "/opt/viral-ngs/source/util/cmd.py", line 106, in _main
    mainfunc(**args2)
  File "/opt/viral-ngs/source/intrahost.py", line 150, in vphaser_one_sample
    for row in libraryFilteredIter:
  File "/opt/viral-ngs/source/intrahost.py", line 226, in compute_library_bias
    for row in isnvs:
  File "/opt/viral-ngs/source/intrahost.py", line 163, in filter_strand_bias
    for row in isnvs:
  File "/opt/viral-ngs/source/tools/vphaser2.py", line 61, in iterate
    self.execute(inBam, outdir, numThreads)
  File "/opt/viral-ngs/source/tools/vphaser2.py", line 45, in execute
    subprocess.check_output(cmd, env=envCopy, stderr=subprocess.STDOUT)
  File "/opt/miniconda/envs/viral-ngs-env/lib/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/opt/miniconda/envs/viral-ngs-env/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/opt/miniconda/envs/viral-ngs-env/bin/vphaser2', '-i', 'CTP10.rg.sorted.bam', '-o', '/tmp/tmpbdyy4c1_vphaser2']' returned non-zero exit status 1.
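
The traceback shows the tool output being captured with `subprocess.check_output(..., stderr=subprocess.STDOUT)`, which is why the vphaser2 log appears above as a single `b'...'` bytes literal with escaped newlines. A minimal sketch of how such output can be decoded into readable text and reduced to its final message follows. The helper names `run_and_show` and `last_nonblank_line` are hypothetical, and the demo command is a stand-in that only mimics a tool printing a message and exiting with status 1; it is not vphaser2.

```python
import subprocess
import sys


def run_and_show(cmd):
    """Run cmd with stderr merged into stdout; return (exit_status, decoded_text)."""
    try:
        out = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
        status = 0
    except subprocess.CalledProcessError as err:
        # On a non-zero exit, the captured bytes are available on the exception.
        out = err.output
        status = err.returncode
    return status, out.decode("utf-8", errors="replace")


def last_nonblank_line(text):
    """Return the last non-empty line of text, stripped, or '' if there is none."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return lines[-1] if lines else ""


# Stand-in command (an assumption for illustration): prints a message, exits 1.
demo_cmd = [
    sys.executable,
    "-c",
    "import sys; print('[EXIT]: example failure'); sys.exit(1)",
]
status, text = run_and_show(demo_cmd)
print(status, last_nonblank_line(text))
```

Applied to the log above, the last non-blank line would be the `[EXIT]: get_column_x: fail to identify mapping` message, printed right after the `CTP10.500.999.region` file is created.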

Because the error seems to originate in vphaser2, I also tried running it directly.
Command:
docker run --rm -v $(pwd):/data -w /data quay.io/broadinstitute/viral-ngs vphaser2 -i CTP10.rg.sorted.bam -o vphaser_output
Error message:
create file: vphaser_output/CTP10.0.499.region
[EXIT]: create_output_file: can't open file vphaser_output/CTP10.0.499.region
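
One possible explanation for this second error, which is an assumption and not confirmed anywhere in this report, is that the `vphaser_output` directory did not exist when vphaser2 tried to write into it. In the intrahost.py run the output directory was a temporary directory (`/tmp/tmpbdyy4c1_vphaser2`) and the first region file was created without complaint. A minimal sketch that creates the directory on the host before building the same docker command is shown below; `prepare_vphaser2_run` is a hypothetical helper, and file permissions on the mounted volume would be another thing to rule out.

```python
import os


def prepare_vphaser2_run(bam, outdir, image="quay.io/broadinstitute/viral-ngs"):
    """Create outdir if it is missing and return the docker command as a list.

    Hypothetical helper. `outdir` should be relative to the current working
    directory, because that directory is what gets mounted at /data.
    """
    os.makedirs(outdir, exist_ok=True)
    return [
        "docker", "run", "--rm",
        "-v", "{}:/data".format(os.getcwd()),
        "-w", "/data",
        image,
        "vphaser2", "-i", bam, "-o", outdir,
    ]
```

The returned list could then be passed to `subprocess.run`, for example `subprocess.run(prepare_vphaser2_run("CTP10.rg.sorted.bam", "vphaser_output"))`.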

There is read coverage over the regions named in these errors (CTP10 positions 0-499 and 500-999).
