Snakemake with --cluster-status flag failing #1001

Open
dorks81 opened this issue May 17, 2021 · 2 comments
Labels
bug Something isn't working

Comments

@dorks81

dorks81 commented May 17, 2021

Hello,

I am trying to get a snakemake workflow to run on LSF using the --cluster-status flag. I created a status1.py file, and snakemake crashes while trying to execute it. It appears that snakemake is trying to execute the command ./status1.py Job <615515> is submitted to queue <short>. and fails because the shell treats the unquoted < and > characters as redirection operators. Is there a workaround?

Thank you for the help

/bin/sh: -c: line 0: syntax error near unexpected token `615515'
/bin/sh: -c: line 0: `./status1.py Job <615515> is submitted to queue <short>.'
Exception in thread Thread-1:
Traceback (most recent call last):
  File "tools/Anaconda3/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 1036, in job_status
    subprocess.check_output(
  File "tools/Anaconda3/lib/python3.8/subprocess.py", line 415, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "tools/Anaconda3/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command './status1.py Job <615515> is submitted to queue <short>.' returned non-zero exit status 1.
 
During handling of the above exception, another exception occurred:
 
Traceback (most recent call last):
  File "tools/Anaconda3/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "tools/Anaconda3/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "tools/Anaconda3/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 1083, in _wait_for_jobs
    status = job_status(active_job)
  File "tools/Anaconda3-2019.10/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 1055, in job_status
    raise WorkflowError(
snakemake.exceptions.WorkflowError: Failed to obtain job status. See above for error message.
dorks81 added the bug label on May 17, 2021
@ImNotaGit

ImNotaGit commented Jul 10, 2021

I ran into the same issue. On many systems a cluster job submission command (e.g. qsub) prints just a job ID string, and snakemake seems to rely on this behavior to pass the job ID to the --cluster-status script. On some systems, however, the message printed by e.g. qsub is more than a job ID. On my system it looks like 'Your job xxxxx ("xxx.sh") has been submitted'. The entire message is appended as an argument to the --cluster-status script without quoting, which leads to this error.

I have tried the temporary workaround below, which seems to work, although I'm not sure it is correct.

I modified the following snippet in <snakemake_install_dir>/snakemake/executors/__init__.py, at the line where the exception was raised:

Original code:

subprocess.check_output(
    "{statuscmd} {jobid}".format(
        jobid=job.jobid, statuscmd=self.statuscmd
    ),
    shell=True,
)

Modified code, with single quotes added around {jobid}:

subprocess.check_output(
    "{statuscmd} '{jobid}'".format(
        jobid=job.jobid, statuscmd=self.statuscmd
    ),
    shell=True,
)
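One caveat with hard-coded single quotes is that they break again if the submission message itself contains a single quote. A minimal sketch of a more robust variant, assuming the command is still run through the shell, uses `shlex.quote` from the standard library. The helper names `build_status_command` and `run_status_command` are hypothetical and not part of snakemake, and `printf` merely stands in for a real status script:

```python
import shlex
import subprocess


def build_status_command(statuscmd: str, jobid: str) -> str:
    """Build the shell command line, quoting the job ID as one safe argument."""
    return "{statuscmd} {jobid}".format(
        statuscmd=statuscmd, jobid=shlex.quote(str(jobid))
    )


def run_status_command(statuscmd: str, jobid: str) -> str:
    """Run the status command through the shell and return its stdout."""
    return subprocess.check_output(
        build_status_command(statuscmd, jobid), shell=True, text=True
    )


# Stand-in for a real status script: printf echoes its single argument back.
message = "Job <615515> is submitted to queue <short>."
echoed = run_status_command("printf '%s\\n'", message)
print(echoed, end="")
```

With this quoting the whole message reaches the status script as a single argument, so the `<` and `>` characters are no longer interpreted by the shell.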

Then in the --cluster-status Python script, do something like:

import sys

message = sys.argv[1]
# Parse the job ID out of the submission message; adjust this for your system.
jobid = message.split(' ')[2]
print(check_status(jobid))  # check_status is a placeholder for your own status lookup

Edit: I noticed that the {jobid} quoting is now fixed in #1459
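
The parsing step in the status script above depends on where the job ID sits in the message, which differs between schedulers. A minimal sketch of a more tolerant parser follows, assuming the ID is either wrapped in angle brackets (as in the LSF message from the original report) or is the first run of digits in the message (as in the SGE-style message quoted above). The function names are hypothetical, and mapping every LSF state other than DONE and EXIT to "running" is an assumption. snakemake expects the status script to print one of success, running, or failed.

```python
import re

# LSF job states mapped to the words snakemake expects on stdout.
# Assumption: any state other than DONE or EXIT is reported as "running".
LSF_TO_SNAKEMAKE = {"DONE": "success", "EXIT": "failed"}


def extract_job_id(message: str) -> str:
    """Return the job ID from a bare ID or from a full submission message."""
    # Prefer an ID in angle brackets; otherwise take the first run of digits.
    match = re.search(r"<(\d+)>", message) or re.search(r"(\d+)", message)
    if match is None:
        raise ValueError("no job ID found in: {!r}".format(message))
    return match.group(1)


def to_snakemake_status(scheduler_state: str) -> str:
    """Translate a scheduler state string into success, failed, or running."""
    return LSF_TO_SNAKEMAKE.get(scheduler_state.strip().upper(), "running")


print(extract_job_id("Job <615515> is submitted to queue <short>."))
print(extract_job_id('Your job 12345 ("xxx.sh") has been submitted'))
print(to_snakemake_status("PEND"))
```

In a real status script, the scheduler state would come from a query command on your cluster, and the script would print `to_snakemake_status(state)` for the ID returned by `extract_job_id(sys.argv[1])`.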

@iamh2o
Contributor

iamh2o commented Oct 16, 2021

Hello, a little late, but this might be useful to someone down the line. I was having similar problems with SGE. qsub was returning a full message string to snakemake, and snakemake was passing that string to my cluster-status script, which made for an initially hard-to-debug situation. Ultimately, for qsub, all you need to do is specify --cluster " qsub -terse " so that only the job ID is returned on submission. In my initial manual testing -terse worked, but when snakemake took over it was as if the flag was not set at all. That turned out to be another sort of bug: snakemake was reusing job scripts it had already composed rather than my new command-line options. I had to rm -rf .snakemake before -terse would take effect.
BUT before I figured that out, I used a hack that should work for most cases like this. I created a simple script that wrapped qsub, parsed its output, and returned just the job ID, which worked like a charm.

I modified my qsub version for the bsub output you list above (I also did a quick search, and it seems there is no bsub option to return just the job ID). Here it is:

  1. Create a new script, perhaps called snakesub, somewhere visible in your PATH, and add the following to it:
#!/bin/sh
# Submit via bsub, forwarding all arguments unchanged.
ret=$(bsub "$@")
# bsub prints e.g.: Job <615515> is submitted to queue <short>.
# Keep only the text between the first "<" and the following ">", i.e. the job ID.
echo "$ret" | cut -d "<" -f 2 | cut -d ">" -f 1
exit 0
  2. Make it executable: chmod a+x snakesub

  3. Pass it to snakemake, i.e. snakemake --cluster " snakesub ", and specify --cluster-status ./somescript as the docs describe. Things should now work :-)

This is basically just a pass-through: ALL of the parameters given to snakesub are passed directly to bsub via "$@". This is a special parameter in sh/bash; if you use a different shell, you'll need to look up the analogous pattern. It is critical that the $@ be wrapped in "double quotes". With double quotes, the argument list given to snakesub is passed to bsub unchanged. Without them, the arguments are split again on whitespace, so an argument that contains spaces breaks into several pieces, which might work but is also likely to cause difficult bugs.

  4. Test it manually before trusting snakemake with it: snakesub ./test.sh ... you should get just the integer job ID back.
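
The same wrapper idea can also be sketched in Python, for anyone who prefers to avoid shell quoting altogether. This is a minimal sketch, not a tested replacement for snakesub: the function name `submit_and_get_job_id` is hypothetical, the angle-bracket pattern assumes the bsub message format shown in the original report, and a small Python one-liner stands in for bsub so the example runs anywhere. Passing the arguments as a list plays the role of "$@": each argument reaches the submit command unchanged, even if it contains spaces.

```python
import re
import subprocess
import sys


def submit_and_get_job_id(submit_cmd, args):
    """Run the submit command with args passed through unchanged; return the job ID."""
    # A list of arguments (no shell) keeps each argument intact, like "$@" in sh.
    result = subprocess.run(
        [*submit_cmd, *args], capture_output=True, text=True, check=True
    )
    match = re.search(r"<(\d+)>", result.stdout)
    if match is None:
        raise RuntimeError("no job ID found in: {!r}".format(result.stdout))
    return match.group(1)


# Stand-in for bsub: prints a bsub-style message whose "job ID" is the number
# of arguments it received, to show that "a b" arrives as a single argument.
fake_bsub = [
    sys.executable,
    "-c",
    "import sys; print('Job <%d> is submitted to queue <short>.' % len(sys.argv[1:]))",
]
print(submit_and_get_job_id(fake_bsub, ["a b", "c"]))
```

For real use, the call would be `print(submit_and_get_job_id(["bsub"], sys.argv[1:]))`. Unlike the shell version, which always exits with status 0, `check=True` makes this sketch raise an error if the submit command itself fails.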
