Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move allocatecmd and deallocatecmd into their own plugin #800

Closed
ribab opened this issue Feb 27, 2019 · 2 comments
Closed

Move allocatecmd and deallocatecmd into their own plugin #800

ribab opened this issue Feb 27, 2019 · 2 comments

Comments

@ribab
Copy link
Contributor

ribab commented Feb 27, 2019

Here is the code I am using to use srun in an salloc with Open MPI. Starts around line 443 of SLURM.py. Will this interfere with anyone else's use case? It also makes SLURM.py more specific rather than making it more general. @ribab

    # Add support for srun in --no-shell allocation
    if self.allocated and (cmds['command'] == 'srun') and ("--job-name" in cmds['allocate_cmd']
        parseAllocate_cmd = cmds['allocate_cmd'].split()
        for word in parseAllocate_cmd:
            word = word.strip()
            if '--job-name' in word:
                jobname = word[11:]
        jobid = int(subprocess.check_output(['squeue', '--noheader', '--format', '%i', '--name'
        cmdargs.append('--jobid=%d' % (jobid))

Originally posted by @DebRez in #771 (comment)

This will only impact others if they are using SLURM plugin with a command that doesn't support "--jobid" switch, and are using allocate_cmd. In this case they will have a syntax error that might be difficult to debug. I'm not sure if this corner case would ever come up, but it might

Maybe it makes more sense if we do allocate_cmd and deallocate_cmd in separate stages that use a plugin that parses the jobid, and reference them using the log like so:

[TestRun:allocatecmd]
plugin = SallocNoShell
allocate = true
job_name = thisjob
# returns ${LOG:TestRun_allocatecmd.jobid}

[TestRun:slurm]
plugin = Slurm
...
options = --jobid=`${LOG:TestRun_allocatecmd.jobid}`

[TestRun:deallocatecmd]
plugin = SallocNoShell
deallocate = true
jobid = ${LOG:TestRun_allocatecmd.jobid}

What do you think? @DebRez

@DebRez
Copy link
Contributor

DebRez commented Feb 28, 2019

The current code requires that the command used is srun which does support --jobid so I think it should be ok the way it is.
if self.allocated and (cmds['command'] == 'srun') and ("--job-name" in cmds['allocate_cmd']
parseAllocate_cmd = cmds['allocate_cmd'].split()

The deallocate scancel uses the --job-name so that is not an issue.

@ribab

@ribab
Copy link
Contributor Author

ribab commented Mar 5, 2019

@DebRez

Maybe if the code checks whether --jobid= is not already in cmdargs. That might satisfy the corner case where someone using MTT wants to use --jobid in their srun options.

Like so:

# Add support for srun in --no-shell allocation
if self.allocated and (cmds['command'] == 'srun') and ("--job-name" in cmds['allocate_cmd'] \
            and '--jobid=' not in ' '.join(cmdargs):

    parseAllocate_cmd = cmds['allocate_cmd'].split()
    for word in parseAllocate_cmd:
        word = word.strip()
        if '--job-name' in word:
            jobname = word[11:]
    jobid = int(subprocess.check_output(['squeue', '--noheader', '--format', '%i', '--name'
    cmdargs.append('--jobid=%d' % (jobid))

DebRez added a commit to DebRez/mtt that referenced this issue Mar 11, 2019
SLURM.py in 'Add support for srun in --no-shell allocation' added to
boolean ' and '--jobid-' not in ' '.join(cmdargs):' to avoid a possible
corner case issue.

Fixes open-mpi#800

Signed-off-by: Deb Rezanka <rezanka@lanl.gov>
emallove pushed a commit to emallove/mtt that referenced this issue Dec 7, 2019
SLURM.py in 'Add support for srun in --no-shell allocation' added to
boolean ' and '--jobid-' not in ' '.join(cmdargs):' to avoid a possible
corner case issue.

Fixes open-mpi#800

Signed-off-by: Deb Rezanka <rezanka@lanl.gov>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants