Error during DNA-Mapping workflow: bowtie2.index.err: No output file specified #861

Open
jurummel opened this issue Oct 28, 2022 · 2 comments

@jurummel

Hi everyone,

I am trying to analyse ATAC-seq data with snakePipes and am currently facing an error in the DNA-mapping workflow. Beforehand, I created an index for GRCm38. My samples come from a cross between CAST_EiJ and C57BL_6NJ, so I use the allelic-mapping mode and pass a SNP file to the pipeline:

DNA-mapping -i /projects/allelespecificchromatinmouse/work/ATAC/X204SC21092819-Z01-F001_01/raw_data/F2022YB/ -o /projects/allelespecificchromatinmouse/work/MouseSeqData/results/DNA-Mapping/F2022YB/ --local --mode allelic-mapping --VCFfile /projects/allelespecificchromatinmouse/work/MouseSeqData/help_data/mgp.v5.merged.snps_all.dbSNP142.vcf --strains 'CAST_EiJ,C57BL_6NJ' --ext .fq.gz --reads '_1' '_2' GRCm38_105_Mapping

I get the following error message:

rule bowtie2_index:
    input: snp_genome/CAST_EiJ_C57BL_6NJ_dual_hybrid.based_on_GRCm38_105_Mapping_N-masked
    output: snp_genome/bowtie2_Nmasked/Genome.1.bt2
    log: snp_genome/bowtie2_Nmasked/bowtie2.index.out, snp_genome/bowtie2_Nmasked/bowtie2.index.err
    jobid: 7
    threads: 5
    resources: tmpdir=/projects/allelespecificchromatinmouse/work/MouseSeqData/temp

Activating conda environment: /home/jrummel/anaconda3/envs/97bf6a3bf6520594fcbd63a07735fa20
[Tue Oct 18 17:05:16 2022]
Error in rule bowtie2_index:
    jobid: 7
    output: snp_genome/bowtie2_Nmasked/Genome.1.bt2
    log: snp_genome/bowtie2_Nmasked/bowtie2.index.out, snp_genome/bowtie2_Nmasked/bowtie2.index.err (check log file(s) for error message)
    conda-env: /home/jrummel/anaconda3/envs/97bf6a3bf6520594fcbd63a07735fa20
    shell:
        bowtie2-build --threads 5  snp_genome/bowtie2_Nmasked/Genome > snp_genome/bowtie2_Nmasked/bowtie2.index.out 2> snp_genome/bowtie2_Nmasked/bowtie2.index.err
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Job failed, going on with independent jobs.
Exiting because a job execution failed. Look above for error message
Complete log: /projects/allelespecificchromatinmouse/work/MouseSeqData/results/DNA-Mapping/F2022YB/.snakemake/log/2022-10-18T160358.600796.snakemake.log

 !!! ERROR in DNA mapping workflow! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

bowtie2.index.err:

No output file specified!
      Bowtie 2 version 2.3.5.1 by Ben Langmead (langmea@cs.jhu.edu, www.cs.jhu.edu/~langmea)
      Usage: bowtie2-build [options]* <reference_in> <bt2_index_base>
          reference_in            comma-separated list of files with ref sequences
          bt2_index_base          write bt2 data to files with this dir/basename
      *** Bowtie 2 indexes work only with v2 (not v1).  Likewise for v1 indexes. ***
      Options:
          -f                      reference files are Fasta (default)
          -c                      reference sequences given on cmd line (as
                                  <reference_in>)
          --large-index           force generated index to be 'large', even if ref
                                  has fewer than 4 billion nucleotides
          --debug                 use the debug binary; slower, assertions enabled
          --sanitized             use sanitized binary; slower, uses ASan and/or UBSan
          --verbose               log the issued command
          -a/--noauto             disable automatic -p/--bmax/--dcv memory-fitting
          -p/--packed             use packed strings internally; slower, less memory
          --bmax <int>            max bucket sz for blockwise suffix-array builder
          --bmaxdivn <int>        max bucket sz as divisor of ref len (default: 4)
          --dcv <int>             diff-cover period for blockwise (default: 1024)
          --nodc                  disable diff-cover (algorithm becomes quadratic)
          -r/--noref              don't build .3/.4 index files
          -3/--justref            just build .3/.4 index files
          -o/--offrate <int>      SA is sampled every 2^<int> BWT chars (default: 5)
          -t/--ftabchars <int>    # of chars consumed in initial lookup (default: 10)
          --threads <int>         # of threads
          --seed <int>            seed for random number generator
          -q/--quiet              verbose output (for debugging)
          -h/--help               print detailed description of tool and its options
          --usage                 print this usage message
          --version               print version information and quit

bowtie2.index.out is empty.

Do you have any idea how to fix this? If you need more information, just let me know.

Thanks a lot for your help :)

(I have seen issue #517 regarding the same problem. Unfortunately that did not help.)

Best
Julian

@katsikora
Contributor

Hi Julian,

Thanks for reporting this issue.
It looks like the N-masked FASTA file required for the bowtie2 index is missing. It should have been generated in the previous step. Did rule create_snpgenome produce any errors?
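
One hint in the log points the same way: the failing shell command has two spaces between "--threads 5" and the index base, which is what an empty reference list would produce, leaving bowtie2-build with a single positional argument. Below is a minimal sketch of that effect. The function name and the example FASTA file names are hypothetical, and the way the command is assembled here is an assumption for illustration, not the actual snakePipes rule.

```python
import shlex


def bowtie2_build_command(fasta_files, index_base, threads=5):
    """Assemble a bowtie2-build call (hypothetical helper, for illustration only).

    bowtie2-build expects <reference_in> as a comma-separated list of files,
    followed by <bt2_index_base>.
    """
    reference_in = ",".join(fasta_files)
    return f"bowtie2-build --threads {threads} {reference_in} {index_base}"


index_base = "snp_genome/bowtie2_Nmasked/Genome"

# An empty file list leaves a gap where <reference_in> should be, so the
# shell passes only one positional argument to bowtie2-build.
empty_case = bowtie2_build_command([], index_base)
print(repr(empty_case))
print(shlex.split(empty_case))

# With files present (names made up here), both positional arguments exist.
ok_case = bowtie2_build_command(["chr1.fa", "chr2.fa"], index_base)
print(shlex.split(ok_case))
```

If the real rule collects its input files in a comparable way, the thing to check is why that collection comes back empty.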

Best,

Katarzyna

@jurummel
Author

jurummel commented Nov 4, 2022

Hi Katarzyna,

Thanks for the quick reply. I don't get an error message for rule create_snpgenome.

rule create_snpgenome:
    input: /projects/allelespecificchromatinmouse/work/MouseSeqData/Indices_Mapping/genome_fasta
    output: snp_genome/CAST_EiJ_SNP_filtering_report.txt, snp_genome/C57BL_6NJ_SNP_filtering_report.txt, snp_genome/CAST_EiJ_C57BL_6NJ_dual_hybrid.based_on_GRCm38_105_Mapping_N-masked, snp_genome/all_C57BL_6NJ_SNPs_CAST_EiJ_reference.based_on_GRCm38_105_Mapping.txt
    log: SNPsplit_createSNPgenome.out, SNPsplit_createSNPgenome.err
    jobid: 8
    resources: tmpdir=/projects/allelespecificchromatinmouse/work/MouseSeqData/temp

Both log files, SNPsplit_createSNPgenome.out and SNPsplit_createSNPgenome.err, are empty.
The directory snp_genome/CAST_EiJ_C57BL_6NJ_dual_hybrid.based_on_GRCm38_105_Mapping_N-masked contains N-masked FASTA files for all chromosomes.
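
Since the files exist, a possible next check is whether their names match what the indexing step collects. The sketch below counts the file extensions in that directory. The helper names are hypothetical, and whether the bowtie2_index rule selects files by extension is an assumption that has not been confirmed in this thread.

```python
from collections import Counter
from pathlib import Path


def file_extension(path):
    """Return the last extension, keeping a compression suffix such as '.fa.gz'."""
    suffixes = path.suffixes
    if suffixes and suffixes[-1] == ".gz":
        return "".join(suffixes[-2:])
    return path.suffix or "<none>"


def count_extensions(directory):
    """Count the regular files in `directory` by extension."""
    return Counter(
        file_extension(p) for p in Path(directory).iterdir() if p.is_file()
    )


# Directory name taken from the log above, relative to the workflow output directory.
snp_dir = Path(
    "snp_genome/CAST_EiJ_C57BL_6NJ_dual_hybrid.based_on_GRCm38_105_Mapping_N-masked"
)
if snp_dir.is_dir():
    for ext, n in sorted(count_extensions(snp_dir).items()):
        print(f"{ext}\t{n}")
else:
    print(f"{snp_dir} not found; run this from the workflow output directory")
```

If the files turn out to carry an extension other than the one the rule looks for, that mismatch would explain an empty reference list even though the directory is populated.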

Thanks again :)

Best,
Julian
