Problems handling errors in reference genomes #4

Open
snayfach opened this issue Dec 17, 2021 · 1 comment
snayfach commented Dec 17, 2021

I'm trying to run PHIST on a single viral contig against a large database of gzipped bacterial genomes. The program finishes without errors, but it stops after the first ~715 reference genomes and reports only these in the *_common_kmers.csv output file. Additionally, the number of reference genomes processed is non-deterministic: depending on the run, it stops after 715, 720, or 730 genomes.

Update: I unzipped the reference genomes. Now the program prints a warning for certain genomes and then stalls in what appears to be an infinite loop. After CTRL+C, here's the traceback:

      File "PHIST/phist.py", line 107, in
        subprocess.run(cmd)
      File "/usr/lib64/python3.6/subprocess.py", line 425, in run
        stdout, stderr = process.communicate(input, timeout=timeout)
      File "/usr/lib64/python3.6/subprocess.py", line 855, in communicate
        self.wait()
      File "/usr/lib64/python3.6/subprocess.py", line 1477, in wait
        (pid, sts) = self._try_wait(0)
      File "/usr/lib64/python3.6/subprocess.py", line 1424, in _try_wait
        (pid, sts) = os.waitpid(self.pid, wait_flags)
    KeyboardInterrupt

Turns out these genomes were empty files. After uncompressing and deleting these, the program finished without errors.

It would be great if the program handled these corrupted reference genomes more gracefully: either a warning that skips over them, or an error with an informative message.
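Until something like that exists, the workaround described above (finding and removing the empty files before running PHIST) can be sketched as a small pre-check. This is not part of PHIST; the function names `has_sequence` and `split_references` are made up for illustration, and the sketch assumes the references are FASTA files, optionally gzipped with a `.gz` suffix, sitting in a single directory.

```python
import gzip
import zlib
from pathlib import Path


def has_sequence(path):
    """Return True if the FASTA file at `path` has at least one sequence line."""
    path = Path(path)
    # Assumption: gzipped references are recognisable by a .gz suffix.
    opener = gzip.open if path.suffix == ".gz" else open
    try:
        with opener(path, "rt") as handle:
            for line in handle:
                line = line.strip()
                # Any non-blank line that is not a header counts as sequence.
                if line and not line.startswith(">"):
                    return True
    except (OSError, EOFError, zlib.error, UnicodeDecodeError):
        # Truncated, corrupted, or otherwise unreadable file.
        return False
    return False


def split_references(ref_dir):
    """Partition the files in `ref_dir` into (usable, unusable) lists."""
    usable, unusable = [], []
    for path in sorted(Path(ref_dir).iterdir()):
        if path.is_file():
            (usable if has_sequence(path) else unusable).append(path)
    return usable, unusable
```

Calling `split_references` on the reference directory returns the files that look fine and the ones that are empty, header-only, or unreadable, so the second list can be reviewed and moved out of the way before the run. It only checks that some sequence is present; it does not validate the FASTA content itself.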

@snayfach snayfach changed the title from "PHIST doesn't compare query vs all reference genomes" to "Problems handling errors in reference genomes" Dec 18, 2021
@agudys agudys self-assigned this Dec 20, 2021
@agudys agudys added the enhancement New feature or request label Dec 20, 2021
agudys (Member) commented Dec 20, 2021

Hello!
Thank you for the update. We will try to make the next release of PHIST more error resistant.

Adam
