
Add support for nvidia-docker or singularity. #271

Open
medcelerate opened this issue Mar 3, 2020 · 8 comments

@medcelerate

It would be great to be able to either configure a custom ami to be used where different container runtimes can be defined or be able to run custom scripts to install other utilities.

@SooLee
Member

SooLee commented Mar 12, 2020

@medcelerate Sorry for getting back to you late. Custom AMIs are not currently supported, though we may add support in the long run. Did you specifically need nvidia-docker or singularity? We can definitely look into more specific options.

@medcelerate
Author

nvidia-docker in this case.

@medcelerate
Author

Or run non-dockerized tools

@SooLee
Member

SooLee commented Mar 16, 2020

@medcelerate Running non-dockerized tools is possible by installing tools on the fly (using the shell option of Tibanna), but that would be inside something like an Ubuntu Docker container. I'm not sure whether this could work with GPU-specific tools (is that what you're looking for?). I can certainly look into adding nvidia-docker support in the next few days.
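
For reference, a shell-option job would look roughly like this (a sketch only; field names such as `language`, `container_image`, and `command` are my reading of the execution JSON docs, and the bucket names are placeholders, so please check the Tibanna documentation before relying on it):

```python
# Rough sketch of running a non-dockerized tool via the shell option. Field
# names reflect my reading of the execution JSON docs and should be verified
# there; bucket names are placeholders.
import json
from tibanna.core import API

job = {
    "args": {
        "language": "shell",                       # run a plain shell command instead of CWL/WDL
        "container_image": "ubuntu:20.04",         # the command still executes inside this container
        "command": "apt-get update && apt-get install -y samtools && samtools --version",
        "input_files": {},
        "output_S3_bucket": "my-tibanna-output-bucket",
    },
    "config": {
        "instance_type": "t3.large",
        "ebs_size": 20,
        "log_bucket": "my-tibanna-log-bucket",
    },
}

with open("shell_job.json", "w") as f:
    json.dump(job, f, indent=2)

API().run_workflow(input_json="shell_job.json")    # submit to the Tibanna step function
```

The command runs inside whatever container image you specify, so anything you can `apt-get install` or download there can be used without building a dedicated Docker image.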

@nhartwic

Seems like Tibanna was updated to allow configuring the AMI, at least based on the docs:

https://tibanna.readthedocs.io/en/latest/ami.html?highlight=ami#amazon-machine-image

...I think that in principle I could 'fork' your current AMI and add the appropriate NVIDIA drivers to enable the use of GPUs within Tibanna jobs, but I have little experience with any of this and wanted to check in before I burn a week trying to figure it out.

Also, this issue could maybe be closed, since it now seems possible to set the AMI, though GPU support still seems wanting.
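
For concreteness, I was imagining something like this in the job's config section (a sketch only; I'm assuming the override key is `ami_id` based on that docs page, so it needs verifying):

```python
# Sketch of pointing a Tibanna job at a custom (forked) AMI; "ami_id" is my
# reading of the AMI docs linked above, not something I've verified.
config = {
    "instance_type": "g4dn.xlarge",          # a GPU instance type
    "ami_id": "ami-xxxxxxxxxxxxxxxxx",       # placeholder: the forked AMI with NVIDIA drivers added
    "ebs_size": 30,
    "log_bucket": "my-tibanna-log-bucket",   # placeholder bucket
}
```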

@willronchetti
Member

This response is somewhat long-winded and not necessarily related to the original issue, but feel free to respond on this one and we can close it or create a new one as needed.

Long story short, I think you will find it is not possible to add GPU support to Tibanna without fundamentally changing how it works. Full disclosure, though, my knowledge of GPU integration in the cloud is somewhat limited, but I do believe it is analogous to any other EC2-style instance (AWS recommends https://aws.amazon.com/ec2/instance-types/g5/), meaning whatever you put on it must run on the GPU natively.

So you could add NVIDIA drivers to our AMI, but it wouldn't do you any good, because I don't think you can attach GPUs to standard EC2 instances (the service that seems to implement this is EOL: https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/elastic-graphics.html). You also wouldn't be able to launch the AMI on a GPU instance at all because of the differing underlying architecture. If you're aware of a way to attach GPUs to AWS cloud instances, then you can disregard what I'm saying and let us know what you think can be done, and we may consider it. Otherwise, what follows is, I think, a significant undertaking we just can't justify right now, as standard instances are powerful enough for us.

There may be a path to accomplish this, but it will be very complex. Probably you'd need to update the AMI selection code to pick a custom GPU-based AMI and provide your job to Tibanna as a shell command. Right now Tibanna typically uses CWL + Docker to do this; I'm not sure what would be cleanest in the GPU context. But roughly speaking, if you wanted to attempt it, I'd follow the steps below:

  1. Create a GPU-compatible AMI
  2. Debug until you can successfully launch into it (see the launch sketch at the end of this comment)
  3. Replicate the behavior of the Tibanna Docker container (i.e. job tracking) in this file: https://github.com/4dn-dcic/tibanna/blob/master/awsf3-docker/run.sh
  4. Figure out a way to reasonably pass jobs to it, as I don't think Docker will work in this case. This is probably where you will run into the most problems, since most jobs require specialized software and you don't want to put that into the AMI.

I think 1-3 can be accomplished with some legwork, but 4 will prove quite difficult. This is why Tibanna implements the "Docker within a Docker" model, so you can package arbitrary things into your jobs and Tibanna doesn't need to know about it.
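
For step 2, a bare-bones launch test might look something like this (a sketch only, outside of Tibanna entirely; the AMI ID and key pair are placeholders, and the point is just to confirm the forked image boots on a GPU instance type):

```python
# Minimal launch test: confirm a (hypothetical) GPU-compatible AMI actually boots
# on a GPU instance type, independent of Tibanna. AMI ID, key pair, and region
# are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
response = ec2.run_instances(
    ImageId="ami-xxxxxxxxxxxxxxxxx",   # the forked GPU-compatible AMI (placeholder)
    InstanceType="g5.xlarge",          # GPU instance family from the link above
    MinCount=1,
    MaxCount=1,
    KeyName="my-debug-keypair",        # placeholder key pair for SSH debugging
)
print(response["Instances"][0]["InstanceId"])
```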

@nhartwic

nhartwic commented Mar 13, 2024

My original plan was to make a GPU-compatible AMI with the relevant Tibanna dependencies, and run my jobs as shell scripts containing Singularity commands so that my jobs could access the GPU on the host machine. Looking through your run script and thinking about it more, that clearly won't work.

Instead, I'd probably need to fork Tibanna and modify the run script so that GPUs are passed (potentially optionally) through the Docker calls using something like this, which will further complicate the requirements on the AMI. Getting this to work with Snakemake (which is my preferred method of launching jobs) will likely require significant updates to Snakemake as well. At a minimum, I'll need to make a GPU-compatible version of the Snakemake Docker container, at which point non-containerized Snakemake jobs should be able to access the GPU.
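
Concretely, the Docker call in the modified run script would need GPU passthrough, roughly like this (shown as a Python snippet for illustration; it assumes the NVIDIA driver and nvidia-container-toolkit are already on the AMI so that Docker's `--gpus` flag works, and the image name is a placeholder):

```python
# Illustration of GPU passthrough on the Docker call. Assumes the host AMI has
# the NVIDIA driver and nvidia-container-toolkit installed; the image name is a
# placeholder.
import subprocess

subprocess.run(
    [
        "docker", "run", "--rm",
        "--gpus", "all",              # expose all host GPUs inside the container
        "-v", "/data:/data",          # mount the job's working data as usual
        "my-gpu-image:latest",        # placeholder image with matching CUDA user-space libraries
        "nvidia-smi",                 # quick check that the GPU is visible inside the container
    ],
    check=True,
)
```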

IDK, it seems doable to me. Whether it's worth doing personally, I'll have to consider.

@willronchetti
Member

I was actually looking at the same article!

Looking into this more, it looks like I did have some misunderstanding of how the GPU instances work. They are in fact standard (x86 or AMD) host machines with GPU attachments, as evidenced by NVIDIA-driver-compatible Ubuntu AMIs that are publicly available and will launch on their "GPU" instances (for example ami-0ef3e9355268c4dbc). So this may actually not be such a heavy lift. You may in fact be able to package the NVIDIA drivers onto our existing AMI and launch it directly on the AWS GPU instances. Worth a try, I'd say. The Snakemake thing may still be an issue, though.
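
If you do try it, the AMI-side setup would be roughly along these lines (a sketch only; package and repository details are assumptions from memory, so follow NVIDIA's and Ubuntu's current installation instructions rather than this snippet):

```python
# Rough sketch of the driver setup to bake into a forked AMI (run as root on an
# Ubuntu base image). Package and repository details are assumptions; defer to
# NVIDIA's and Ubuntu's current installation instructions.
import subprocess

def sh(cmd: str) -> None:
    # Run a shell command and fail loudly, as an AMI build step should.
    subprocess.run(cmd, shell=True, check=True)

sh("apt-get update")
sh("apt-get install -y ubuntu-drivers-common")
sh("ubuntu-drivers autoinstall")  # installs the recommended NVIDIA driver
# The NVIDIA container toolkit (needed for `docker run --gpus`) is distributed
# from NVIDIA's own apt repository, which has to be configured first per their docs.
sh("apt-get install -y nvidia-container-toolkit")
# After a reboot, `nvidia-smi` on the host should list the GPU.
```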
