
Support for GPU scheduling with Slurm #4308

Closed
adamnovak opened this issue Jan 5, 2023 · 9 comments

@adamnovak
Member

adamnovak commented Jan 5, 2023

As noted in ComparativeGenomicsToolkit/cactus#887, people want to use Cactus with GPU support on Slurm, but Toil doesn't yet know how to ask for GPUs on Slurm, and we don't have a GPU Slurm cluster to test with yet.

We can probably just try throwing --gres=gpu:<count> into the submission commands, and hope that all Slurm clusters with GPUs use that name. I think they might, because despite the "generic resource" name of GRES, the documentation talks about some pretty tight integration that Slurm has with e.g. NVIDIA's CUDA.

┆Issue is synchronized with this Jira Story
┆Issue Number: TOIL-1257

@oneillkza

It might be worth looking at the discussions the Nextflow folks had around this four years ago: nextflow-io/nextflow#997

@oneillkza

And yep, I think it may just be as simple as throwing --gres=gpu:<count> into the submission commands.

@oneillkza

I'd be happy to give this a try on our cluster -- I have a test run of Cactus all ready to go.

I guess in the meantime I'll have to try submitting to a chunk of a node and having Cactus/Toil use the singleMachine batchSystem.

@oneillkza

I believe @thiagogenez also has a GPU cluster at the EBI, and is interested in this functionality.

@thiagogenez

thiagogenez commented Jan 9, 2023

thanks @oneillkza for letting me know about this issue.

Yes, I'm interested to see this functionality working with Cactus.

So far, I have run Cactus on a Slurm cluster without Toil's scheduling capabilities. I set Toil to use the singleMachine approach and schedule GPU jobs onto GPU nodes using a script acting as an external job scheduler.

@thiagogenez

> And yep, I think it may just be as simple as throwing --gres=gpu:<count> into the submission commands.

Hi @adamnovak
Same here with me. Just adding --gres=gpu:4 to grab a GPU-enabled worker.

Ex: srun --gres=gpu:4 --mem 200gb -t 30 --pty bash

@adamnovak
Member Author

It sounds like there's a lot of appetite to get this working outside UC.

If someone wanted to do a PR for this I could make sure to review it and get it merged.

To implement this, the SlurmBatchSystem would need an implementation of _check_accelerator_request() that overrides the default and rejects (by raising InsufficientSystemResources) any job that requests an accelerator with a type other than "gpu". It would work a lot like the Kubernetes version:

def _check_accelerator_request(self, requirer: Requirer) -> None:
    for accelerator in requirer.accelerators:
        if accelerator['kind'] != 'gpu' and 'model' not in accelerator:
            # We can only provide GPUs or things with a model right now
            raise InsufficientSystemResources(requirer, 'accelerators', details=[
                f'The accelerator {accelerator} could not be provided.',
                'The Toil Kubernetes batch system only knows how to request gpu accelerators or accelerators with a defined model.'
            ])
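
For concreteness, a minimal sketch of what the Slurm version might look like (the names mirror the Kubernetes snippet above; this is an assumption about the eventual implementation, not merged code):

def _check_accelerator_request(self, requirer: Requirer) -> None:
    # Hypothetical Slurm version: --gres=gpu:<count> only covers GPUs,
    # so reject anything else up front.
    for accelerator in requirer.accelerators:
        if accelerator['kind'] != 'gpu':
            raise InsufficientSystemResources(requirer, 'accelerators', details=[
                f'The accelerator {accelerator} could not be provided.',
                'The Toil Slurm batch system only knows how to request gpu accelerators.'
            ])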

Then we'd have to change the SlurmBatchSystem.Worker's prepareSbatch() to take an argument reflecting the number of GPUs to request, and make it generate the --gres flag in the command line it prepares.
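
As a rough illustration of the prepareSbatch() side (the function and argument names here are made up for the sketch, not the real Toil signature):

from typing import List, Optional

def prepare_sbatch_line(cpus: int, mem_mb: int, gpus: Optional[int] = None) -> List[str]:
    # Hypothetical stand-in for SlurmBatchSystem.Worker.prepareSbatch(): build the
    # sbatch argument list and tack on --gres when GPUs were requested.
    args = ['sbatch', f'--cpus-per-task={cpus}', f'--mem={mem_mb}']
    if gpus:
        # Assumes the cluster exposes GPUs under the generic "gpu" GRES name.
        args.append(f'--gres=gpu:{gpus}')
    return args

# prepare_sbatch_line(4, 200000, gpus=2)
# -> ['sbatch', '--cpus-per-task=4', '--mem=200000', '--gres=gpu:2']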

Then we'd need to manage to actually supply that argument to prepareSbatch(). We'd need to thread the argument through prepareSubmission(), and because that is a method from the base AbstractBatchSystem.Worker class, we'd need to change its interface to allow the GPU information to come through it. We'd also need to change the place where prepareSubmission() is called so that it can pass the GPU information through, which means we'd need to extend the tuples we store in the AbstractGridEngineBatchSystem.Worker.waitingJobs list and in the inter-thread newJobs queue that appears at AbstractGridEngineBatchSystem.newJobs and AbstractGridEngineBatchSystem.Worker.newJobs. That could be accomplished by pulling out the right information from jobDesc.accelerators when we put a tuple into the inter-thread queue.
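
The "pulling out the right information" step could be as small as something like this (assuming accelerators are dicts with 'kind' and 'count' keys, as in the check above; the helper name is hypothetical):

from typing import Any, Dict, List

def count_requested_gpus(accelerators: List[Dict[str, Any]]) -> int:
    # Hypothetical helper: total up the GPU counts from jobDesc.accelerators so a
    # single integer can ride along in the tuple pushed onto the newJobs queue.
    return sum(acc.get('count', 1) for acc in accelerators if acc.get('kind') == 'gpu')

# count_requested_gpus([{'kind': 'gpu', 'count': 2}])  -> 2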

Then we'd just need to get other AbstractGridEngineBatchSystem.Worker implementations to tolerate the new argument to their prepareSubmission() implementations, and everything ought to start working.
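
One low-friction way to keep those other batch systems happy would be to give the new parameter a default so non-GPU-aware workers can simply ignore it (sketch only; the surrounding signature is simplified and not the real one):

from typing import List, Optional

class ExampleGridEngineWorker:
    # Hypothetical, simplified worker: prepareSubmission() accepts the new gpus
    # argument, but a batch system with no GPU support just drops it.
    def prepareSubmission(self, cpu: int, memory: int, jobID: int, command: str,
                          jobName: str, gpus: Optional[int] = None) -> List[str]:
        return [command]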

@thiagogenez

thanks @adamnovak

I'm interested in proposing a PR to solve this issue. I'll have a look. Cheers

@adamnovak
Member Author

We fixed this in #4350.
