Describe the bug
I am running an assembly of a 1.7 Gb heterozygous genome (1.2% het rate) on a 2 TB machine. The ONT data is 50x of the highest-quality reads (selected with Filtlong, ≥5 kb and 150 Gb).
1st config file (24cpus 1TB total RAM):
[General]
job_type = local
task = all
rewrite = yes
parallel_jobs = 4
deltmp = yes
read_type = ont
input_type = raw
workdir = /WORKDIR/
input_fofn = /WORKDIR/long_reads.fofn
[correct_option]
read_cutoff = 1k
genome_size = 1.8g
seed_depth = 45
seed_cutoff = 0
blocksize = 1g
pa_correction = 4
minimap2_options_raw = -t 6 -x ava-ont
sort_options = -m 40g -t 20
correction_options = -p 6
[assemble_option]
minimap2_options_cns = -t 6 -x ava-ont -k17 -w17
minimap2_options_map = -t 6 -x ava-ont
nextgraph_options = -a 1
2nd config file (48cpus 2TB total RAM):
[General]
job_type = local
task = all
rewrite = yes
parallel_jobs = 8
deltmp = yes
read_type = ont
input_type = raw
workdir = /WORKDIR/
input_fofn = /WORKDIR/long_reads.fofn
[correct_option]
read_cutoff = 1k
genome_size = 1.8g
seed_depth = 45
seed_cutoff = 0
blocksize = 1g
pa_correction = 4
minimap2_options_raw = -t 6 -x ava-ont
sort_options = -m 40g -t 20
correction_options = -p 6
[assemble_option]
minimap2_options_cns = -t 6 -x ava-ont -k17 -w17
minimap2_options_map = -t 6 -x ava-ont
nextgraph_options = -a 1
Error message
After 10 days the assembly failed with an I/O error at the 02.cns_align step (see first config). I removed this folder and resubmitted the assembly with more memory (2nd config). It went smoothly, but now it constantly fails at the ctg_graph step. The error is this:
hostname
cd /WORKDIR/03.ctg_graph/01.ctg_graph.sh.work/ctg_graph0
time /apps/NEXTDENOVO/2.4.0/bin/nextgraph -a 1 -f /WORKDIR/03.ctg_graph/01.ctg_graph.input.seqs /WORKDIR/03.ctg_graph/01.ctg_graph.input.ovls -o nd.asm.p.fasta;
[INFO] 2021-12-03 19:11:48 Initialize graph and reading...
/WORKDIR/03.ctg_graph/01.ctg_graph.sh.work/ctg_graph0/nextDenovo.sh: line 5: 19296 Segmentation fault /apps/NEXTDENOVO/2.4.0/bin/nextgraph -a 1 -f /WORKDIR/03.ctg_graph/01.ctg_graph.input.seqs /WORKDIR/03.ctg_graph/01.ctg_graph.input.ovls -o nd.asm.p.fasta
Genome characteristics
C-value =1.7Gb
Paste here the genomescope results:
GenomeScope version 2.0
input file = jf_21mer.hist
output directory = out/21mer/
p = 2
k = 21
property                 min                 max
Homozygous (aa)          98.7068%            98.7307%
Heterozygous (ab)        1.26928%            1.29316%
Genome Haploid Length    1,208,134,973 bp    1,210,345,670 bp
Genome Repeat Length     399,334,371 bp      400,065,090 bp
Genome Unique Length     808,800,602 bp      810,280,580 bp
Model Fit                73.122%             95.132%
Read Error Rate          0.214032%           0.214032%
Input data
[Read length stat]
Types Count (#) Length (bp)
N10 266461 29793
N20 648378 23529
N30 1113845 19774
N40 1660889 16968
N50 2295837 14643
N60 3032994 12575
N70 3896295 10664
N80 4925021 8844
N90 6190301 7021
Types Count (#) Bases (bp) Depth (X)
Raw 7860332 100000021650 55.56
Filtered 0 0 0.00
Clean 7860332 100000021650 55.56
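For reference, the Depth (X) column is the total bases divided by the configured genome size. Below is a minimal sketch that reproduces the 55.56 figure from `genome_size = 1.8g` and, for comparison, computes the depth against the GenomeScope haploid length estimate reported above. The function and constant names are made up for illustration, and the sketch does not say which genome size is the right one.

```python
# Depth = total bases / genome size (numbers copied from this report).
CLEAN_BASES = 100_000_021_650            # "Clean" row, Bases (bp)
CONFIG_GENOME_SIZE = 1_800_000_000       # genome_size = 1.8g in the config
GENOMESCOPE_HAPLOID_MIN = 1_208_134_973  # GenomeScope "Genome Haploid Length" (min)


def depth(total_bases, genome_size):
    """Return sequencing depth (X) for a given genome size in bp."""
    return total_bases / genome_size


depth_config = depth(CLEAN_BASES, CONFIG_GENOME_SIZE)
depth_genomescope = depth(CLEAN_BASES, GENOMESCOPE_HAPLOID_MIN)

print(f"depth with genome_size=1.8g:          {depth_config:.2f}x")
print(f"depth with GenomeScope haploid length: {depth_genomescope:.2f}x")
```

With the configured 1.8g the result matches the 55.56x in the table; against the smaller GenomeScope estimate the same data corresponds to a higher depth.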
Config file
Last config used was:
[General]
job_type = local
task = all
rewrite = yes
parallel_jobs = 8
deltmp = yes
read_type = ont
input_type = raw
workdir = /WORKDIR/
input_fofn = /WORKDIR/long_reads.fofn
[correct_option]
read_cutoff = 1k
genome_size = 1.8g
seed_depth = 45
seed_cutoff = 0
blocksize = 1g
pa_correction = 4
minimap2_options_raw = -t 6 -x ava-ont
sort_options = -m 40g -t 40
correction_options = -p 6
[assemble_option]
minimap2_options_cns = -t 6 -x ava-ont -k17 -w17
minimap2_options_map = -t 6 -x ava-ont
nextgraph_options = -a 1
Operating system
LSB Version: :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch
Distributor ID: RedHatEnterpriseServer
Description: Red Hat Enterprise Linux Server release 6.7 (Santiago)
Release: 6.7
Codename: Santiago
GCC
gcc version 6.3.0 (GCC)
Python
Python 3.8.6
NextDenovo
nextDenovo v2.4.0
To Reproduce (Optional)
Steps to reproduce the behavior. Providing a minimal test dataset on which we can reproduce the behavior will generally lead to quicker turnaround time!
Additional context (Optional)
I made three attempts and the error is always: line 5: 19296 Segmentation fault /apps/NEXTDENOVO/2.4.0/bin/nextgraph
Any idea what the problem could be?
I'll be happy to check some intermediate files.
The files listed in 01.ctg_graph.input.ovls are not empty; their sizes range from 43M to 195M (in the folder 02.cns_align/*.cns.filt.dovt.ovl).
The input seqs are also there:
for i in $(cat 03.ctg_graph/01.ctg_graph.input.seqs); do ls -sh $i; done
4.3G 02.cns_align/01.seed_cns.sh.work/seed_cns0/cns.fasta
4.4G 02.cns_align/01.seed_cns.sh.work/seed_cns1/cns.fasta
4.4G 02.cns_align/01.seed_cns.sh.work/seed_cns2/cns.fasta
2.7G 02.cns_align/01.seed_cns.sh.work/seed_cns3/cns.fasta
4.4G 02.cns_align/01.seed_cns.sh.work/seed_cns4/cns.fasta
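The same check can be scripted for both list files (`01.ctg_graph.input.seqs` and `01.ctg_graph.input.ovls`). This is a minimal sketch, assuming each list file holds one path per line; `check_fofn` is a hypothetical helper, not part of NextDenovo. It only reports missing or zero-size files, so it would not detect a truncated or corrupted file.

```python
import os


def check_fofn(fofn_path, base_dir="."):
    """Return (entry, problem) pairs for listed files that are missing or empty.

    Assumes one path per line; relative paths are resolved against base_dir.
    """
    problems = []
    with open(fofn_path) as handle:
        for line in handle:
            entry = line.strip()
            if not entry:
                continue  # skip blank lines
            path = entry if os.path.isabs(entry) else os.path.join(base_dir, entry)
            if not os.path.isfile(path):
                problems.append((entry, "missing"))
            elif os.path.getsize(path) == 0:
                problems.append((entry, "empty"))
    return problems
```

For example, `check_fofn("03.ctg_graph/01.ctg_graph.input.ovls")` run from the work directory returns an empty list when every listed overlap file exists and has a non-zero size.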
Any ideas or suggestions on how to fix this problem are welcome!
Thanks