
Errors with nextPolish.sh.e #87

Open
Gerlex89 opened this issue Dec 3, 2021 · 8 comments

Comments


Gerlex89 commented Dec 3, 2021

Hi all. I'm trying to polish a Flye assembly of nanopore reads. However, I cannot get the command to work: I receive errors whose source I cannot determine, and I don't know how to proceed. The same errors also appear with the test data (nextPolish test_data/run.cfg).

Operating system
Linux Mint 20.2 Uma (Ubuntu Focal base)

GCC
gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)

Python
Python 3.9.7

NextPolish
nextPolish v1.4.0

Input files
(https://nextpolish.readthedocs.io/en/latest/TUTORIAL.html#polishing-using-long-reads-only)

lgs.fofn:
I have only one FASTQ file because I'm testing the tool in order to adapt it to CWL; I generated the file with ls /path/to/fastq_runid.fastq > lgs.fofn
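For anyone reproducing this: a fofn ("file of file names") is just a plain-text list of read-file paths, one per line. A minimal Python sketch equivalent to the ls redirect above (the reads/ directory and the *.fastq pattern are placeholders for your own layout):

```python
from pathlib import Path

# A .fofn is simply one absolute file path per line.
# "reads" and "*.fastq" are placeholder names; adjust to your layout.
fastqs = sorted(str(p.resolve()) for p in Path("reads").glob("*.fastq"))
Path("lgs.fofn").write_text("\n".join(fastqs) + "\n")
```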

run.cfg:
(modified parts are commented below)

[General]
job_type = local
job_prefix = nextPolish
task = best
rewrite = yes
rerun = 3
parallel_jobs = 6
multithread_jobs = 5
genome = ./assembly.fasta # modified
genome_size = auto
workdir = ./01_rundir # modified, tried with different
polish_options = -p {multithread_jobs}

[lgs_option]
lgs_fofn = ./lgs.fofn
lgs_options = -min_read_len 1k -max_depth 100
lgs_minimap2_options = -x map-ont

Log

[183854 INFO] 2021-12-03 10:52:24 NextPolish start...
[183854 INFO] 2021-12-03 10:52:24 version:v1.4.0 logfile:pid183854.log.info
[183854 WARNING] 2021-12-03 10:52:24 Delete task: 5 due to missing lgs_fofn.
[183854 WARNING] 2021-12-03 10:52:24 Delete task: 5 due to missing lgs_fofn.
[183854 WARNING] 2021-12-03 10:52:24 Delete task: 6 due to missing hifi_fofn.
[183854 WARNING] 2021-12-03 10:52:24 Delete task: 6 due to missing hifi_fofn.
[183854 INFO] 2021-12-03 10:52:24 scheduled tasks:
[1, 2, 1, 2]
[183854 INFO] 2021-12-03 10:52:24 options: 
[183854 INFO] 2021-12-03 10:52:24 
rerun:                        3
rewrite:                      0
kill:                         None
cleantmp:                     0
use_drmaa:                    0
submit:                       None
job_type:                     local
sgs_unpaired:                 0
sgs_rm_nread:                 1
lgs_read_type:                
parallel_jobs:                6
align_threads:                5
check_alive:                  None
task:                         [1, 2, 1, 2]
job_id_regex:                 None
genome_size:                  18910
sgs_max_depth:                100
lgs_max_depth:                100
multithread_jobs:             5
lgs_max_read_len:             0
hifi_max_depth:               100
lgs_block_size:               500M
lgs_min_read_len:             1k
hifi_max_read_len:            0
polish_options:               -p 5
hifi_block_size:              500M
hifi_min_read_len:            1k
job_prefix:                   nextPolish
sgs_use_duplicate_reads:      0
lgs_minimap2_options:         -x map-ont
hifi_minimap2_options:        -x map-pb
sgs_block_size:               315166.6666666667
sgs_align_options:            bwa mem -p  -t 5
workdir:                      path/to/NextPolish.backup0
genome:                       path/to/flye-output/assembly.fasta
sgs_fofn:                     path/to/NextPolish.backup0/test.fofn
snp_phase:                    path/to/NextPolish.backup0/%02d.snp_phase
snp_valid:                    path/to/NextPolish.backup0/%02d.snp_valid
lgs_polish:                   path/to/NextPolish.backup0/%02d.lgs_polish
kmer_count:                   path/to/NextPolish.backup0/%02d.kmer_count
hifi_polish:                  path/to/NextPolish.backup0/%02d.hifi_polish
score_chain:                  path/to/NextPolish.backup0/%02d.score_chain
[183854 WARNING] 2021-12-03 10:52:24 mv path/to/NextPolish.backup0 to path/to/NextPolish.backup0.backup0
[183854 INFO] 2021-12-03 10:52:24 step 0 and task 1 start:
[183854 INFO] 2021-12-03 10:52:29 Total jobs: 3
[183854 INFO] 2021-12-03 10:52:29 Submitted jobID:[183883] jobCmd:[path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh] in the local_cycle.
[183883 CRITICAL] 2021-12-03 10:52:29 Command '/bin/sh path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh > path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.e' returned non-zero exit status 127, error info: .
Traceback (most recent call last):
  File "path/to/NextPolish.backup0/./nextPolish", line 515, in <module>
  File "path/to/NextPolish.backup0/./nextPolish", line 369, in main
  File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 347, in start
    self._start()
  File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 371, in _start
    self.submit(job)
  File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 255, in submit
    _, stdout, _ = self.run(job.cmd)
  File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 291, in run
    log.critical("Command '%s' returned non-zero exit status %d, error info: %s." % (cmd, p.returncode, stderr))
  File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1493, in critical
    self._log(CRITICAL, msg, args, **kwargs)
  File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1589, in _log
    self.handle(record)
  File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1599, in handle
    self.callHandlers(record)
  File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1661, in callHandlers
    hdlr.handle(record)
  File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 952, in handle
    self.emit(record)
  File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/kit.py", line 42, in emit
    raise Exception(record.msg)
Exception: Command '/bin/sh path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh > path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.o 2> path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.e' returned non-zero exit status 127, error info: .
[183854 INFO] 2021-12-03 10:52:29 Submitted jobID:[183889] jobCmd:[path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh] in the local_cycle.
[183889 CRITICAL] 2021-12-03 10:52:29 Command '/bin/sh path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh > path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh.o 2> path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh.e' returned non-zero exit status 127, error info: .
Traceback (most recent call last):
  File "path/to/NextPolish.backup0/./nextPolish", line 515, in <module>
  File "path/to/NextPolish.backup0/./nextPolish", line 369, in main
  File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 347, in start
    self._start()
  File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 371, in _start
    self.submit(job)
  File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 255, in submit
    _, stdout, _ = self.run(job.cmd)
  File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 291, in run
    log.critical("Command '%s' returned non-zero exit status %d, error info: %s." % (cmd, p.returncode, stderr))
  File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1493, in critical
    self._log(CRITICAL, msg, args, **kwargs)
  File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1589, in _log
    self.handle(record)
  File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1599, in handle
    self.callHandlers(record)
  File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1661, in callHandlers
    hdlr.handle(record)
  File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 952, in handle
    self.emit(record)
  File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/kit.py", line 42, in emit
    raise Exception(record.msg)
Exception: Command '/bin/sh path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh > path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh.o 2> path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh.e' returned non-zero exit status 127, error info: .
[183854 INFO] 2021-12-03 10:52:30 Submitted jobID:[183895] jobCmd:[path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh] in the local_cycle.
[183895 CRITICAL] 2021-12-03 10:52:30 Command '/bin/sh path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh > path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh.o 2> path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh.e' returned non-zero exit status 127, error info: .
Traceback (most recent call last):
  File "path/to/NextPolish.backup0/./nextPolish", line 515, in <module>
  File "path/to/NextPolish.backup0/./nextPolish", line 369, in main
  File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 347, in start
    self._start()
  File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 371, in _start
    self.submit(job)
  File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 255, in submit
    _, stdout, _ = self.run(job.cmd)
  File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 291, in run
    log.critical("Command '%s' returned non-zero exit status %d, error info: %s." % (cmd, p.returncode, stderr))
  File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1493, in critical
    self._log(CRITICAL, msg, args, **kwargs)
  File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1589, in _log
    self.handle(record)
  File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1599, in handle
    self.callHandlers(record)
  File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1661, in callHandlers
    hdlr.handle(record)
  File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 952, in handle
    self.emit(record)
  File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/kit.py", line 42, in emit
    raise Exception(record.msg)
Exception: Command '/bin/sh path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh > path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh.o 2> path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh.e' returned non-zero exit status 127, error info: .
[183854 ERROR] 2021-12-03 10:52:37 db_split failed: please check the following logs:
[183854 ERROR] 2021-12-03 10:52:37 path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.e
[183854 ERROR] 2021-12-03 10:52:37 path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh.e
[183854 ERROR] 2021-12-03 10:52:37 path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh.e

Regards,
Alex


moold commented Dec 6, 2021

Hi, could you paste the content of path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.e here?


Gerlex89 commented Dec 6, 2021

Hi,

path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.e:

hostname
+ hostname
cd path/to/NextPolish/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1
+ cd path/to/NextPolish/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1
time path/to/NextPolish/NextPolish.backup0/bin/seq_split -d path/to/NextPolish/NextPolish.backup0 -m 315166.6666666667 -n 6 -t 5 -i 1 -s 1891000 -p input.sgspart path/to/NextPolish/NextPolish.backup0/test.fofn
+ time path/to/NextPolish/NextPolish.backup0/bin/seq_split -d path/to/NextPolish/NextPolish.backup0 -m 315166.6666666667 -n 6 -t 5 -i 1 -s 1891000 -p input.sgspart path/to/NextPolish/NextPolish.backup0/test.fofn
time: cannot run path/to/NextPolish/NextPolish.backup0/bin/seq_split: No such file or directory
Command exited with non-zero status 127
0.00user 0.00system 0:00.00elapsed ?%CPU (0avgtext+0avgdata 1020maxresident)k
0inputs+0outputs (0major+25minor)pagefaults 0swaps
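Exit status 127 is the shell's code for "command not found / not executable", which is consistent with the missing seq_split binary in the log above. A quick Python reproduction (the /no/such/binary path is deliberately nonexistent):

```python
import subprocess

# /bin/sh exits with status 127 when the requested command does not exist,
# the same status that `time` reported for the missing seq_split binary.
result = subprocess.run(["/bin/sh", "-c", "/no/such/binary"],
                        capture_output=True)
print(result.returncode)  # 127
```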


moold commented Dec 6, 2021

As the log says, the seq_split executable is missing, so follow the installation instructions here to reinstall.
BTW, do not forget to run make after downloading.


Gerlex89 commented Dec 7, 2021

I reinstalled, but now I get the log below (from the nextPolish.sh.e file) when running the test data. However, a clean installation on a server runs successfully, so the problem clearly points to my local Python or Anaconda installation. I would be glad if you have an idea of what could be causing this issue. For now, this question can be closed.

Thanks!

hostname
+ hostname
cd path/to/NextPolish/test_data/01_rundir/00.lgs_polish/04.polish.ref.sh.work/polish_genome1
+ cd path/to/NextPolish/test_data/01_rundir/00.lgs_polish/04.polish.ref.sh.work/polish_genome1
time /path/to/anaconda3/bin/python path/to/NextPolish/lib/nextpolish2.py -sp -p 1 -g path/to/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta -b path/to/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta.blc -i 0 -l path/to/NextPolish/test_data/./01_rundir/00.lgs_polish/lgs.sort.bam.list -r ont -o genome.nextpolish.part000.fasta
+ time /path/to/anaconda3/bin/python path/to/NextPolish/lib/nextpolish2.py -sp -p 1 -g path/to/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta -b path/to/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta.blc -i 0 -l path/to/NextPolish/test_data/./01_rundir/00.lgs_polish/lgs.sort.bam.list -r ont -o genome.nextpolish.part000.fasta
[110589 INFO] 2021-12-07 11:22:42 Corrected step options:
[110589 INFO] 2021-12-07 11:22:42 
split:                        0
process:                      1
auto:                         True
read_type:                    1
block_index:                  0
window:                       5000000
uppercase:                    False
alignment_score_ratio:        0.8
alignment_identity_ratio:     0.8
out:                          genome.nextpolish.part000.fasta
genome:                       path/to/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta
bam_list:                     path/to/NextPolish/test_data/./01_rundir/00.lgs_polish/lgs.sort.bam.list
block:                        path/to/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta.blc
[110589 WARNING] 2021-12-07 11:22:42 Adjust -p from 1 to 0, -w from 5000000 to 5000000, logical CPUs:4, available RAM:~6G, use -a to disable automatic adjustment.
Traceback (most recent call last):
  File "path/to/NextPolish/lib/nextpolish2.py", line 260, in <module>
    main(args)
  File "path/to/NextPolish/lib/nextpolish2.py", line 192, in main
    pool = Pool(args.process, initializer=start)
  File "/path/to/anaconda3/lib/python3.9/multiprocessing/context.py", line 119, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,
  File "/path/to/anaconda3/lib/python3.9/multiprocessing/pool.py", line 205, in __init__
    raise ValueError("Number of processes must be at least 1")
ValueError: Number of processes must be at least 1
Command exited with non-zero status 1
0.08user 0.00system 0:00.09elapsed 100%CPU (0avgtext+0avgdata 16528maxresident)k
0inputs+8outputs (0major+2520minor)pagefaults 0swaps
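The traceback boils down to multiprocessing refusing a zero-sized worker pool after NextPolish's automatic resource adjustment lowered -p from 1 to 0 on the low-memory machine. A minimal reproduction of just the Python-level failure:

```python
from multiprocessing import Pool

# multiprocessing rejects a zero-process pool outright, which is what
# happens once the -p option has been auto-adjusted down to 0.
try:
    pool = Pool(processes=0)
except ValueError as err:
    print(err)  # Number of processes must be at least 1
```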


moold commented Dec 7, 2021

The RAM is too small


Gerlex89 commented Dec 9, 2021

Is there a proper way to increase it?

According to the NextPolish FAQ, it should be possible to increase it from the default 3G for Paralleltask, but changing it in the cluster.cfg file has no effect, and I'm not sure where exactly the submit parameter is supposed to be used. I still receive the same error.

Also, how can the memory not be enough when only 3G is requested?


moold commented Dec 9, 2021

The compute node you submitted to only has ~6 GB of memory; you cannot change that by adjusting parameters. You need to run the job on a different compute node.

EDITED:
Maybe you just forgot to change job_type = local to job_type = sge (or another scheduler) if you want to submit your job to a computer cluster.


Gerlex89 commented Dec 9, 2021

Good to know. At least it's not a problem with my installation or with the files I submitted.

These errors came from a local test; when the tool is run on a cluster, it works perfectly.

I was trying to figure out why the behavior was so different, but if it's due to a hardware restriction, then there's nothing to be done for now.

Thanks.
