Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Activating snv in sars tutorial leads to error #146

Open
schlaepferp opened this issue Jun 13, 2023 · 2 comments
Open

Activating snv in sars tutorial leads to error #146

schlaepferp opened this issue Jun 13, 2023 · 2 comments
Assignees

Comments

@schlaepferp
Copy link

I'm not sure if this is intentional (read: I miss something that I should provide somewhere), or if this is a bug.
I tried the same thing with other SARS reads, same behavior.

What I did:

I set up the SARS tutorial.
It runs through fine.

I afterwards modified the config.yaml and changed the line
snv: false
to
snf: true

Result: It breaks the code.

Expected behavior: Either an error message telling me what is missing, or running through smoothly.

Current dump

Activating conda environment: .snakemake/conda/2db3c1462cfa3088aba362563d569612_
usage: shorah [options] shotgun [-h] [-v] -b BAM -f REF
[-a FLOAT] [-r chrm:start-stop]
[-R INT] [-x INT] [-S FLOAT] [-I]
[-p FLOAT]
[-of {csv,vcf} [{csv,vcf} ...]]
[-c INT] [-w INT] [-s INT] [-k]
[-t INT]
shorah [options] shotgun: error: argument -r/--region: expected one argument
usage: shorah [options] shotgun [-h] [-v] -b BAM -f REF
[-a FLOAT] [-r chrm:start-stop]
[-R INT] [-x INT] [-S FLOAT] [-I]
[-p FLOAT]
[-of {csv,vcf} [{csv,vcf} ...]]
[-c INT] [-w INT] [-s INT] [-k]
[-t INT]
shorah [options] shotgun: error: argument -r/--region: expected one argument
[Tue Jun 13 14:44:29 2023]
Error in rule snv:
jobid: 20
input: /home/usr1/vpipe_test_dir/V-pipe/workflow/../resources/sars-cov-2/NC_045512.2.fasta, results/SRR10903401/20200102/alignments/REF_aln.bam, results/SRR10903401/20200102/variants/coverage_intervals.tsv
output: results/SRR10903401/20200102/variants/SNVs/snvs.csv, results/SRR10903401/20200102/variants/SNVs/snvs.vcf
log: results/SRR10903401/20200102/variants/SNVs/shorah.out.log, results/SRR10903401/20200102/variants/SNVs/shorah.err.log (check log file(s) for error details)
conda-env: /home/usr1/vpipe_test_dir/work/.snakemake/conda/2db3c1462cfa3088aba362563d569612_
shell:

    let "WINDOW_SHIFTS=(250 * 4/5 + 3) / 3"
    let "WINDOW_LEN=WINDOW_SHIFTS * 3"

    echo "Windows are shifted by: ${WINDOW_SHIFTS} bp" > results/SRR10903401/20200102/variants/SNVs/shorah.out.log
    echo "The window length is: ${WINDOW_LEN} bp" >> results/SRR10903401/20200102/variants/SNVs/shorah.out.log

    # Get absolute path for input files
    CWD=${PWD}
    BAM=${PWD}/results/SRR10903401/20200102/alignments/REF_aln.bam
    REF=/home/usr1/vpipe_test_dir/V-pipe/workflow/../resources/sars-cov-2/NC_045512.2.fasta; [[ ${REF} =~ ^/ ]] || REF=${PWD}/${REF}
    OUTFILE=${PWD}/results/SRR10903401/20200102/variants/SNVs/shorah.out.log
    ERRFILE=${PWD}/results/SRR10903401/20200102/variants/SNVs/shorah.err.log
    WORK_DIR=${PWD}/results/SRR10903401/20200102/variants/SNVs

    if [[ -n "" ]]; then
        # put input files in localscratch
        rsync -aq "${BAM}" "${REF}" "/"
        BAM="/${BAM##*/}"
        REF="/${REF##*/}"
    fi

    # Run ShoRAH in each of the predetermined regions (regions with sufficient coverage)
    LINE_COUNTER=0
    FILES=( )
    FILES_VCF=( )
    while read -r region || [[ -n ${region} ]]
    do
        echo "Running ShoRAH on region: ${region}" >> $OUTFILE
        (( ++LINE_COUNTER ))
        # Create directory for running ShoRAH in a corresponding region (if doesn't exist)
        DIR=${WORK_DIR}/REGION_${LINE_COUNTER}
        if [[ ! -d "${DIR}" ]]; then
            echo "Creating directory ${DIR}" >> $OUTFILE
            mkdir -p ${DIR}
        else
            # Results from previous runs
            if [[ false == "true" ]]; then
                DIR_DST=${WORK_DIR}/old
                echo "Moving results from a previous run to ${DIR_DST}" >> $OUTFILE
                rm -rf ${DIR_DST}/REGION_${LINE_COUNTER}
                mkdir -p ${DIR_DST}
                mv -f ${DIR} ${DIR_DST}
                mkdir -p ${DIR}
            fi
        fi
        # Change to the directory where ShoRAH is to be executed
        if [[ -z "" ]]; then
            cd ${DIR}
        else
            # special case: go in local scratch instead
            rsync -aq "${DIR}" "/"
            mkdir -p "/REGION_${LINE_COUNTER}"
            cd "/REGION_${LINE_COUNTER}"
        fi

        # NOTE: Execution command for ShoRAH2 valid from v1.99.0 and above
        shorah shotgun -t 1 -a 0.1 -w ${WINDOW_LEN} -x 100000  -p 0.9 -c 0 -r ${region} -R 42 -b ${BAM} -f ${REF} >> $OUTFILE 2> >(tee -a $ERRFILE >&2)
        if [[ -n "" ]]; then
            # copyback from localscratch
            rsync -auq "/REGION_${LINE_COUNTER}" "${WORK_DIR}"
            cd ${DIR}
        fi
        if [[ -s ${DIR}/reads.fas && -f ${DIR}/snv/SNVs_0.010000_final.csv ]]; then
            # Non empty reads: run had enough data and should have produced SNVs
            bcftools view ${DIR}/snv/SNVs_0.010000_final.vcf -Oz -o ${DIR}/snv/SNVs_0.010000_final.vcf.gz
            bcftools index ${DIR}/snv/SNVs_0.010000_final.vcf.gz
            FILES+=("${DIR}/snv/SNVs_0.010000_final.csv")
            FILES_VCF+=("${DIR}/snv/SNVs_0.010000_final.vcf.gz")
        elif (( 50 == 0 && LINE_COUNTER == 1 )) && [[ -f ${DIR}/reads.fas && ( ! -s ${DIR}/reads.fas ) ]]; then
            # if we have disabled coverage intervales entirely, the first and only line might have no reads
            # (e.g.: in negative controls )

            echo "No reads while coverage intervals disabled (possible negative control sample)" 2> >(tee -a $ERRFILE >&2)
            cd ${CWD}
            (( --LINE_COUNTER )) || true # Strict mode : (( 0 )) = fail
            break
        else
            echo "ERROR: unsuccesful execution of ShoRAH" 2> >(tee -a $ERRFILE >&2)
            exit 1
        fi

        # Change back to working directory
        cd ${CWD}
    done < results/SRR10903401/20200102/variants/coverage_intervals.tsv

    # Aggregate results from different regions
    if (( ${#FILES[@]} )); then
        echo "Intermediate csv files: ${FILES[*]}" >> results/SRR10903401/20200102/variants/SNVs/shorah.out.log
        echo "Intermediate vcf files: ${FILES_VCF[*]}" >> results/SRR10903401/20200102/variants/SNVs/shorah.out.log
        (head -n 1 "${FILES[0]}"; tail -q -n +2 "${FILES[@]}" | sort -t, -nk2) > results/SRR10903401/20200102/variants/SNVs/snvs.csv
        bcftools concat -o ${WORK_DIR}/snvs_tmp.vcf "${FILES_VCF[@]}"
        bcftools sort ${WORK_DIR}/snvs_tmp.vcf  -o results/SRR10903401/20200102/variants/SNVs/snvs.vcf
        rm -f ${WORK_DIR}/snvs_tmp.vcf
    elif (( LINE_COUNTER )); then
        echo "ERROR: unsuccesful execution of ShoRAH" 2> >(tee -a results/SRR10903401/20200102/variants/SNVs/shorah.err.log >&2)
        exit 1
    else
        echo "No alignment region reports sufficient coverage" >> results/SRR10903401/20200102/variants/SNVs/shorah.out.log
        touch results/SRR10903401/20200102/variants/SNVs/snvs.csv
        touch results/SRR10903401/20200102/variants/SNVs/snvs.vcf
    fi

    (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

[Tue Jun 13 14:44:29 2023]
Error in rule snv:
jobid: 18
input: /home/usr1/vpipe_test_dir/V-pipe/workflow/../resources/sars-cov-2/NC_045512.2.fasta, results/SRR10903402/20200102/alignments/REF_aln.bam, results/SRR10903402/20200102/variants/coverage_intervals.tsv
output: results/SRR10903402/20200102/variants/SNVs/snvs.csv, results/SRR10903402/20200102/variants/SNVs/snvs.vcf
log: results/SRR10903402/20200102/variants/SNVs/shorah.out.log, results/SRR10903402/20200102/variants/SNVs/shorah.err.log (check log file(s) for error details)
conda-env: /home/usr1/vpipe_test_dir/work/.snakemake/conda/2db3c1462cfa3088aba362563d569612_
shell:

    let "WINDOW_SHIFTS=(250 * 4/5 + 3) / 3"
    let "WINDOW_LEN=WINDOW_SHIFTS * 3"

    echo "Windows are shifted by: ${WINDOW_SHIFTS} bp" > results/SRR10903402/20200102/variants/SNVs/shorah.out.log
    echo "The window length is: ${WINDOW_LEN} bp" >> results/SRR10903402/20200102/variants/SNVs/shorah.out.log

    # Get absolute path for input files
    CWD=${PWD}
    BAM=${PWD}/results/SRR10903402/20200102/alignments/REF_aln.bam
    REF=/home/usr1/vpipe_test_dir/V-pipe/workflow/../resources/sars-cov-2/NC_045512.2.fasta; [[ ${REF} =~ ^/ ]] || REF=${PWD}/${REF}
    OUTFILE=${PWD}/results/SRR10903402/20200102/variants/SNVs/shorah.out.log
    ERRFILE=${PWD}/results/SRR10903402/20200102/variants/SNVs/shorah.err.log
    WORK_DIR=${PWD}/results/SRR10903402/20200102/variants/SNVs

    if [[ -n "" ]]; then
        # put input files in localscratch
        rsync -aq "${BAM}" "${REF}" "/"
        BAM="/${BAM##*/}"
        REF="/${REF##*/}"
    fi

    # Run ShoRAH in each of the predetermined regions (regions with sufficient coverage)
    LINE_COUNTER=0
    FILES=( )
    FILES_VCF=( )
    while read -r region || [[ -n ${region} ]]
    do
        echo "Running ShoRAH on region: ${region}" >> $OUTFILE
        (( ++LINE_COUNTER ))
        # Create directory for running ShoRAH in a corresponding region (if doesn't exist)
        DIR=${WORK_DIR}/REGION_${LINE_COUNTER}
        if [[ ! -d "${DIR}" ]]; then
            echo "Creating directory ${DIR}" >> $OUTFILE
            mkdir -p ${DIR}
        else
            # Results from previous runs
            if [[ false == "true" ]]; then
                DIR_DST=${WORK_DIR}/old
                echo "Moving results from a previous run to ${DIR_DST}" >> $OUTFILE
                rm -rf ${DIR_DST}/REGION_${LINE_COUNTER}
                mkdir -p ${DIR_DST}
                mv -f ${DIR} ${DIR_DST}
                mkdir -p ${DIR}
            fi
        fi
        # Change to the directory where ShoRAH is to be executed
        if [[ -z "" ]]; then
            cd ${DIR}
        else
            # special case: go in local scratch instead
            rsync -aq "${DIR}" "/"
            mkdir -p "/REGION_${LINE_COUNTER}"
            cd "/REGION_${LINE_COUNTER}"
        fi

        # NOTE: Execution command for ShoRAH2 valid from v1.99.0 and above
        shorah shotgun -t 1 -a 0.1 -w ${WINDOW_LEN} -x 100000  -p 0.9 -c 0 -r ${region} -R 42 -b ${BAM} -f ${REF} >> $OUTFILE 2> >(tee -a $ERRFILE >&2)
        if [[ -n "" ]]; then
            # copyback from localscratch
            rsync -auq "/REGION_${LINE_COUNTER}" "${WORK_DIR}"
            cd ${DIR}
        fi
        if [[ -s ${DIR}/reads.fas && -f ${DIR}/snv/SNVs_0.010000_final.csv ]]; then
            # Non empty reads: run had enough data and should have produced SNVs
            bcftools view ${DIR}/snv/SNVs_0.010000_final.vcf -Oz -o ${DIR}/snv/SNVs_0.010000_final.vcf.gz
            bcftools index ${DIR}/snv/SNVs_0.010000_final.vcf.gz
            FILES+=("${DIR}/snv/SNVs_0.010000_final.csv")
            FILES_VCF+=("${DIR}/snv/SNVs_0.010000_final.vcf.gz")
        elif (( 50 == 0 && LINE_COUNTER == 1 )) && [[ -f ${DIR}/reads.fas && ( ! -s ${DIR}/reads.fas ) ]]; then
            # if we have disabled coverage intervales entirely, the first and only line might have no reads
            # (e.g.: in negative controls )

            echo "No reads while coverage intervals disabled (possible negative control sample)" 2> >(tee -a $ERRFILE >&2)
            cd ${CWD}
            (( --LINE_COUNTER )) || true # Strict mode : (( 0 )) = fail
            break
        else
            echo "ERROR: unsuccesful execution of ShoRAH" 2> >(tee -a $ERRFILE >&2)
            exit 1
        fi

        # Change back to working directory
        cd ${CWD}
    done < results/SRR10903402/20200102/variants/coverage_intervals.tsv

    # Aggregate results from different regions
    if (( ${#FILES[@]} )); then
        echo "Intermediate csv files: ${FILES[*]}" >> results/SRR10903402/20200102/variants/SNVs/shorah.out.log
        echo "Intermediate vcf files: ${FILES_VCF[*]}" >> results/SRR10903402/20200102/variants/SNVs/shorah.out.log
        (head -n 1 "${FILES[0]}"; tail -q -n +2 "${FILES[@]}" | sort -t, -nk2) > results/SRR10903402/20200102/variants/SNVs/snvs.csv
        bcftools concat -o ${WORK_DIR}/snvs_tmp.vcf "${FILES_VCF[@]}"
        bcftools sort ${WORK_DIR}/snvs_tmp.vcf  -o results/SRR10903402/20200102/variants/SNVs/snvs.vcf
        rm -f ${WORK_DIR}/snvs_tmp.vcf
    elif (( LINE_COUNTER )); then
        echo "ERROR: unsuccesful execution of ShoRAH" 2> >(tee -a results/SRR10903402/20200102/variants/SNVs/shorah.err.log >&2)
        exit 1
    else
        echo "No alignment region reports sufficient coverage" >> results/SRR10903402/20200102/variants/SNVs/shorah.out.log
        touch results/SRR10903402/20200102/variants/SNVs/snvs.csv
        touch results/SRR10903402/20200102/variants/SNVs/snvs.vcf
    fi

    (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2023-06-13T144427.023150.snakemake.log

@DrYak
Copy link
Member

DrYak commented Jun 26, 2023

Something is going wrong here.

Could you please provide the file:
results/SRR10903401/20200102/variants/coverage_intervals.tsv ?

I will try to reproduce your errors to see where it come from.
(Normally SNV should work: their are tested as part of our CI/CD).

@DrYak DrYak self-assigned this Jun 26, 2023
@GoncheDanesh
Copy link

Hi,

I get the same error when running V-pipe with snv: true in the config file.
Did you find why ?

Thank you,
Best

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants