How to improve the N50 and reduce contigs numbers? #200

Open
cj2jy opened this issue Mar 14, 2024 · 7 comments

Comments

@cj2jy

cj2jy commented Mar 14, 2024

Hi, I finished an assembly and the result is:

Type    Length (bp)  Count (#)
N10        22880485          3
N20        10335838         10
N30         6877938         22
N40         5222529         39
N50         3377214         63
N60         1919981        103
N70          927783        178
N80          440773        335
N90          218142        666

Min.          28326          -
Max.       57348206          -
Ave.         742827          -
Total     992417917       1336
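
For anyone who wants to recompute this kind of table from a FASTA file, here is a minimal Python sketch. It assumes the usual convention (the source does not say which tool produced the table): Nx is the length of the shortest contig in the smallest set of longest contigs that covers at least x% of the total assembly, and the count is the size of that set. The function names `fasta_lengths` and `nx_stats` are made up for this example.

```python
def fasta_lengths(lines):
    """Return sequence lengths from an iterable of FASTA lines."""
    lengths = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            lengths.append(0)              # start a new record
        elif line and lengths:
            lengths[-1] += len(line)       # sequence may span several lines
    return lengths


def nx_stats(lengths, percents=(10, 20, 30, 40, 50, 60, 70, 80, 90)):
    """Map each percent x to (Nx length, number of contigs needed)."""
    if not lengths:
        raise ValueError("no contig lengths given")
    if any(p <= 0 or p > 100 for p in percents):
        raise ValueError("percents must be in (0, 100]")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    stats = {}
    cumulative = 0
    used = 0
    for p in sorted(percents):
        # Add contigs, longest first, until they cover p% of the total.
        # Integer arithmetic avoids floating-point rounding at the boundary.
        while cumulative * 100 < total * p:
            cumulative += ordered[used]
            used += 1
        stats[p] = (ordered[used - 1], used)
    return stats


# Toy data, not the assembly from this issue.
toy = [80, 70, 50, 40, 30, 20, 10]
toy_stats = nx_stats(toy)
print(toy_stats[50])   # (70, 2)
```

For a real assembly, something like `nx_stats(fasta_lengths(open("nd.asm.fasta")))` should give numbers comparable to the table, provided the tool that produced it uses the same definition.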

run.cfg:

[General]
job_type = slurm
submit = sbatch --cpus-per-task=20 --mem-per-cpu=4g -o {out} -e {err} {script}
job_prefix = nextDenovo
task = all # 'all', 'correct', 'assemble'
rewrite = yes # yes/no
deltmp = yes
rerun = 1
parallel_jobs = 5
input_type = raw
read_type = ont
input_fofn = ./input.fofn
workdir = ./02_rundir

[correct_option]
read_cutoff = 2k
genome_size = 850M
seed_cutoff = 25000
pa_correction = 3
sort_options = -m 20g -t 18
minimap2_options_raw = -t 18
correction_options = -p 18

[assemble_option]
random_round = 20
minimap2_options_cns = -t 18 -k 23 -w 10
nextgraph_options = -a 1 -q 10

What can I do to increase the N50 and reduce the total number of contigs? I want a better input assembly for 3D-DNA.
Looking forward to your reply. Thank you.

@cj2jy cj2jy changed the title from "How" to "How to improve the N50 and reduce contigs numbers?" Mar 14, 2024
@cj2jy cj2jy closed this as completed Mar 14, 2024
@cj2jy cj2jy reopened this Mar 14, 2024
@moold
Member

moold commented Mar 15, 2024

It's hard to say; if I had a better solution, I would have set it as the default. However, you can try tuning these parameters: seed_cutoff, and -k, -w and -f in minimap2_options_raw and minimap2_options_cns. Also, make sure you are using the latest version of NextDenovo. You could also sequence more ultra-long ONT SUP reads. Finally, you can try other assemblers.

@cj2jy
Author

cj2jy commented Mar 15, 2024

Thank you, I will change those parameters and try again. But I don't know what -f means or how to optimize it. Do you have any suggestions?

@moold
Member

moold commented Mar 18, 2024

Try -f 0.0001 or less.
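
For context: in upstream minimap2, `-f FLOAT` ignores the top FLOAT fraction of the most frequent minimizers (default 0.0002), so a smaller value keeps more repetitive minimizers. This assumes the minimap2 build bundled with NextDenovo keeps the same meaning. Below is a rough Python sketch for generating a few run.cfg variants to compare. The sweep values, the `02_rundir_<tag>` naming and the helper names `edit_option` and `make_variant` are illustrative assumptions, not recommendations from the maintainers.

```python
import itertools
import re

# Excerpt of the run.cfg posted above; only the lines being changed are shown.
BASE_CFG = """\
[General]
workdir = ./02_rundir

[correct_option]
seed_cutoff = 25000
minimap2_options_raw = -t 18

[assemble_option]
minimap2_options_cns = -t 18 -k 23 -w 10
"""


def edit_option(cfg_text, key, transform):
    """Rewrite the value of the single `key = value` line using transform(old)."""
    pattern = re.compile(rf"^({re.escape(key)}[ \t]*=[ \t]*)(.*)$", re.MULTILINE)
    new_text, count = pattern.subn(
        lambda m: m.group(1) + transform(m.group(2).strip()), cfg_text
    )
    if count != 1:
        raise KeyError(f"expected exactly one '{key}' line, found {count}")
    return new_text


def make_variant(cfg_text, seed_cutoff, f_value):
    """Return (tag, cfg_text) with seed_cutoff, -f and workdir adjusted."""
    tag = f"s{seed_cutoff}_f{f_value}"
    text = edit_option(cfg_text, "seed_cutoff", lambda old: str(seed_cutoff))
    for key in ("minimap2_options_raw", "minimap2_options_cns"):
        # Append -f to whatever options are already set on that line.
        text = edit_option(text, key, lambda old: f"{old} -f {f_value}")
    # Separate workdir per variant so runs do not overwrite each other.
    text = edit_option(text, "workdir", lambda old: f"./02_rundir_{tag}")
    return tag, text


# Hypothetical sweep values; f values are strings to keep the exact formatting.
SEED_CUTOFFS = (20000, 30000)
F_VALUES = ("0.0001", "0.00005")

variants = dict(
    make_variant(BASE_CFG, s, f)
    for s, f in itertools.product(SEED_CUTOFFS, F_VALUES)
)
```

Each entry in `variants` can be written to its own file (for example `run_<tag>.cfg`) and started the same way as the original run.cfg. The resulting assemblies can then be compared on N50 and contig count.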

@cj2jy
Author

cj2jy commented Mar 21, 2024

Thank you, I ran it again and it is still running. Can I use my previous assembly result, nd.asm.fasta, as input and run the assembly step again? Would that give a better result?

@moold
Member

moold commented Mar 22, 2024

No

@DaniPaulo

Hi @cj2jy,
I'm still trying to understand how to run NextDenovo using SLURM. Could you share your script.slurm.sh?

In run.cfg you set submit = sbatch --cpus-per-task=20 --mem-per-cpu=4g. Does that mean you also set #SBATCH --cpus-per-task=20 and #SBATCH --mem-per-cpu=4g in your script?
