Skip to content
This repository has been archived by the owner on Mar 16, 2022. It is now read-only.

config settings #644

Open
gbdias opened this issue Jun 11, 2018 · 1 comment
Open

config settings #644

gbdias opened this issue Jun 11, 2018 · 1 comment

Comments

@gbdias
Copy link

gbdias commented Jun 11, 2018

Hey guys,

  • I have been trying to run Falcon on a single machine but the config file is somewhat complicated to understand and after 8 days (200h) running, the process is still working on read error correction. For comparison, Canu takes less than 2 days to fully complete this assembly.

  • What would be the best config settings for assembling the Drosophila genome (~180Mb) in a 48-cores 500GB RAM machine?

[General]
job_type = local

# list of fasta files
input_fofn = input.fofn

# input type, raw or pre-assembled reads (preads, error corrected reads)
input_type = raw
#input_type = preads

# The length cutoff used for seed reads used for error correction.
# "-1" indicates FALCON should calculate the cutoff using
# the user-defined genome length and coverage cut off
# otherwise, user can specify length cut off in bp (e.g. 2000)
length_cutoff = -1
genome_size = 180000000
seed_coverage = 30

# The length cutoff used for overalpping the preassembled reads 
length_cutoff_pr = 12000

## resource usage ##
# need this line even if running local
jobqueue = your_queue
# grid settings for...
# daligner step of raw reads
sge_option_da = -pe smp 5 -q %(jobqueue)s
# las-merging of raw reads
sge_option_la = -pe smp 20 -q %(jobqueue)s
# consensus calling for preads
sge_option_cns = -pe smp 12 -q %(jobqueue)s
# daligner on preads
sge_option_pda = -pe smp 6 -q %(jobqueue)s
# las-merging on preads
sge_option_pla = -pe smp 16 -q %(jobqueue)s
# final overlap/assembly 
sge_option_fc = -pe smp 24 -q %(jobqueue)s

# job concurrency settings for...
# all jobs
default_concurrent_jobs = 30
# preassembly
pa_concurrent_jobs = 30
# consensus calling of preads
cns_concurrent_jobs = 30
# overlap detection
ovlp_concurrent_jobs = 30

# daligner parameter options for...
# https://dazzlerblog.wordpress.com/command-guides/daligner-command-reference-guide/
# initial overlap of raw reads
pa_HPCdaligner_option =  -v -B4 -t16 -e.70 -l1000 -s1000
# overlap of preads
ovlp_HPCdaligner_option = -v -B4 -t32 -h60 -e.96 -l500 -s1000

# parameters for creation of dazzler database of...
# https://dazzlerblog.wordpress.com/command-guides/dazz_db-command-guide/
# raw reads
pa_DBsplit_option = -x500 -s50
# preads
ovlp_DBsplit_option = -x500 -s50

# settings for consensus calling for preads
falcon_sense_option = --output_multi --min_idt 0.70 --min_cov 4 --max_n_read 200 --n_core 6

# setting for filtering of final overlap of preads
overlap_filtering_setting = --max_diff 100 --max_cov 100 --min_cov 20 --bestn 10 --n_core 24
@pb-cdunn
Copy link

Always run on a fast test-case first.

cd FALCON-example/run/synth0
git-sym .
make
cd ../greg200k-sv2
# edit *.cfg
make falcon
make unzip

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants