Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nanopolish returns VCF instead of FASTA files #5

Open
fanavarro opened this issue Jan 17, 2021 · 0 comments
Open

Nanopolish returns VCF instead of FASTA files #5

fanavarro opened this issue Jan 17, 2021 · 0 comments

Comments

@fanavarro
Copy link

Hi, I've been using the pipeline, concretely the option 1 (canu + nanopolish). However I got stuck in the nanopolish step because the expected file output was empty. The rule is the following one:

## Polish a genome assembly
rule nanopolish_consensus:
 input:
  reads = config['project_dir'] + '/03_subsampled_reads/{barcode}/reads.fastq.gz',
  bam = config['project_dir'] + '/05_nanopolish/{barcode}/reads.sorted.bam',
  bai = config['project_dir'] + '/05_nanopolish/{barcode}/reads.sorted.bam.bai',
  contigs = config['project_dir'] + '/05_nanopolish/{barcode}/canu.contigs.fasta',
  makerange_fp = config['nanopolish_fp'] + '/scripts/nanopolish_makerange.py',
  merge_fp = config['nanopolish_fp'] + '/scripts/nanopolish_merge.py'
 output:
  config['project_dir'] + '/05_nanopolish/{barcode}/nanopolish.contigs.fasta'
 params:
  results_fp = config['project_dir'] + '/05_nanopolish/{barcode}/nanopolish.results',
  polished_fp = config['project_dir'] + '/05_nanopolish/{barcode}/polished.results'
 threads: 16
 shell:
  """
  mkdir -p {params.polished_fp}
  python {input.makerange_fp} {input.contigs} | \
   parallel --results {params.results_fp} -P 16 \
    LD_LIBRARY_PATH="$CONDA_PREFIX/lib64" {config[nanopolish_fp]}/nanopolish variants \
    --consensus {params.polished_fp}/polished.{{1}}.fa \
    -w {{1}} -r {input.reads} -b {input.bam} -g {input.contigs} \
    -t {threads} --min-candidate-frequency 0.1 --methylation-aware=dcm,dam
  
  python {input.merge_fp} {params.polished_fp}/polished.*.fa > {output}
  """

In the shell statement, the command nanopolish variants --consensus is generating VCF files. The following command is python nanopolish_merge.py, which is expecting FASTA files, thus it do not work correctly and it generates an empty output file. Maybe this could be due to different nanopolish versions. However, in the instructions provided in the readme file no version is recommended.

I noticed that nanopolish provides a tool for converting a set of VCF into a fasta file, so I finally decided to use that instead of nanopolish_merge.py script. The resulting rule is the following (not test as I executed the command manually):

## Polish a genome assembly
rule nanopolish_consensus:
 input:
  reads = config['project_dir'] + '/03_subsampled_reads/{barcode}/reads.fastq.gz',
  bam = config['project_dir'] + '/05_nanopolish/{barcode}/reads.sorted.bam',
  bai = config['project_dir'] + '/05_nanopolish/{barcode}/reads.sorted.bam.bai',
  contigs = config['project_dir'] + '/05_nanopolish/{barcode}/canu.contigs.fasta',
  makerange_fp = config['nanopolish_fp'] + '/scripts/nanopolish_makerange.py',
  merge_fp = config['nanopolish_fp'] + '/scripts/nanopolish_merge.py'
 output:
  config['project_dir'] + '/05_nanopolish/{barcode}/nanopolish.contigs.fasta'
 params:
  results_fp = config['project_dir'] + '/05_nanopolish/{barcode}/nanopolish.results',
  polished_fp = config['project_dir'] + '/05_nanopolish/{barcode}/polished.results'
 threads: 16
 shell:
  """
  mkdir -p {params.polished_fp}
  python {input.makerange_fp} {input.contigs} | \
   parallel --results {params.results_fp} -P 16 \
    LD_LIBRARY_PATH="$CONDA_PREFIX/lib64" {config[nanopolish_fp]}/nanopolish variants \
    --consensus {params.polished_fp}/polished.{{1}}.fa \
    -w {{1}} -r {input.reads} -b {input.bam} -g {input.contigs} \
    -t {threads} --min-candidate-frequency 0.1 --methylation-aware=dcm,dam
  
  LD_LIBRARY_PATH="$CONDA_PREFIX/lib64" {config[nanopolish_fp]}/nanopolish vcf2fasta -g {input.contigs} {params.polished_fp}/polished.*.fa > {output}
  """

I hope this helps.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant