This repository has been archived by the owner on Mar 16, 2022. It is now read-only.

questions about speed and ONT reads #704

Open
mycecilia opened this issue Mar 10, 2020 · 0 comments

Comments

@mycecilia

Hi, I have a few questions about running FALCON and am wondering whether anyone has tested these scenarios.

  1. I’m assembling ONT reads using FALCON and FALCON-Unzip. Should I first correct the raw ONT reads with Canu or another correction program?

  2. Do you have any suggestions for speeding up FALCON? We have three plant genomes of 350 Mb, 3 Gb, and 17 Gb. So far, for 118x ONT reads of the 350 Mb genome, it took me two weeks to finish the 0-rawreads/las-merge-runs stage, which is far too slow.

  3. What is the lowest acceptable coverage for a diploid genome that still assembles primary contigs and haplotigs adequately? I wonder whether 50x coverage would break the assembly into small contigs, or lose some haplotigs while maintaining the assembly N50. Does anyone have experience with lower-coverage ONT reads in FALCON?

Here is what I’m planning to do to speed up the assembly:
a. Increase the DBsplit option -s from 100 to 200 to reduce the number of tan-run jobs (currently 5461 jobs with -s 100).
b. Experiment with the njob and NPROC options, although I’m unsure how they interact. My local server has 48 CPUs and 560 GB of memory.
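To make plan (a) concrete, here is a rough back-of-the-envelope sketch in Python. It rests on two assumptions that are mine, not confirmed by the FALCON docs: that the number of tan-run jobs is proportional to the number of database blocks, and that the block count is inversely proportional to the DBsplit -s value. The function names are hypothetical helpers. The pairwise figure assumes every block is eventually compared against every other block, including itself.

```python
import math


def scaled_job_count(current_jobs: int, current_s: int, new_s: int) -> int:
    """Estimate the job count after changing DBsplit -s.

    Assumption: jobs scale with block count, and block count scales
    inversely with the block size given to -s.
    """
    return math.ceil(current_jobs * current_s / new_s)


def block_pairs(n_blocks: int) -> int:
    """Number of unordered block pairs, including each block against itself."""
    return n_blocks * (n_blocks + 1) // 2


# Figure from the post: 5461 tan-run jobs with -s 100.
jobs_now = 5461
jobs_after = scaled_job_count(jobs_now, current_s=100, new_s=200)

print("estimated tan-run jobs with -s 200:", jobs_after)
print("block pairs before:", block_pairs(jobs_now))
print("block pairs after: ", block_pairs(jobs_after))
```

If these assumptions hold, doubling -s roughly halves the per-block jobs and cuts the number of block pairs to about a quarter, although each job then handles a larger block and needs more time and memory.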

Thank you in advance for any suggestions.

Here is my current run_falcon.cfg file for the 118x corrected ONT reads of the 350 Mb genome:

[General]
input_fofn = input_run1.fofn
input_type = raw 

pa_DBsplit_option = -a -x500 -s200
ovlp_DBsplit_option = -a -x500 -s200

ovlp_HPCTANmask_option = 
pa_REPmask_code = 0,300;0,300;0,300

genome_size = 350000000
seed_coverage = 80

length_cutoff = -1
length_cutoff_pr = 1500

pa_HPCdaligner_option = -v -B4 -M16
pa_daligner_option =  -e.70 -l1000 -s100 
falcon_sense_option = --output_multi --min_idt 0.70 --min_cov 4 --max_n_read 200 --n_core 12

ovlp_HPCdaligner_option = -v -B4 -M32 
ovlp_daligner_option = -h60 -e.96 -l500 -s1000

overlap_filtering_setting = --max_diff 100 --max_cov 100 --min_cov 20 --bestn 10


[job.defaults]
use_tmpdir = ./tmp
stop_all_jobs_on_failure = true
pwatcher_type = blocking
job_type = local
JOB_QUEUE=default
submit = /bin/bash -c "${JOB_SCRIPT}" > "${JOB_STDOUT}" 2> "${JOB_STDERR}"

[job.step.da]
NPROC=8

[job.step.la]
NPROC=8

[job.step.cns]
NPROC=12

[job.step.pda]
NPROC=8

[job.step.pla]
NPROC=8

[job.step.asm]
NPROC=24
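
As a rough way to reason about how NPROC interacts with a cap on simultaneous jobs, the sketch below reads the [job.step.*] sections of the config above and computes how many jobs of each step could run at once on a 48-CPU, 560 GB machine. It assumes that NPROC is the number of CPUs reserved per job and that a per-job memory figure is known; the 16 GB default used here is only an illustrative guess taken from the -M16 daligner setting, and `max_concurrent_jobs` is a hypothetical helper, not part of FALCON.

```python
import configparser

# Excerpt of the per-step sections from the config above.
CFG_TEXT = """
[job.step.da]
NPROC=8
[job.step.la]
NPROC=8
[job.step.cns]
NPROC=12
[job.step.pda]
NPROC=8
[job.step.pla]
NPROC=8
[job.step.asm]
NPROC=24
"""

TOTAL_CPUS = 48      # from the post
TOTAL_MEM_GB = 560   # from the post


def max_concurrent_jobs(cfg_text, total_cpus, total_mem_gb, mem_per_job_gb=16):
    """Return {section: job count} limited by both CPU and memory.

    Assumption: each job uses NPROC CPUs and about mem_per_job_gb of RAM.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    limits = {}
    for section in parser.sections():
        if not section.startswith("job.step."):
            continue
        nproc = parser.getint(section, "NPROC")
        by_cpu = total_cpus // nproc
        by_mem = total_mem_gb // mem_per_job_gb
        limits[section] = min(by_cpu, by_mem)
    return limits


limits = max_concurrent_jobs(CFG_TEXT, TOTAL_CPUS, TOTAL_MEM_GB)
for section, n in limits.items():
    print(f"{section}: up to {n} concurrent jobs")
```

Under these assumptions CPU count, not memory, is the binding limit for every step, so a concurrency setting above 48 // NPROC for a step would oversubscribe the machine.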