Support Container Scheduling with Azure Batch #335

Open
jmchilton opened this issue Aug 11, 2023 · 0 comments

It looks like the Azure Batch API supports preparation and completion tasks alongside container tasks.

This should mean that, by setting up a task dependency between the Pulsar container and the tool container (a biocontainer that does not contain Pulsar code), we can mimic the TES support pretty directly. Documentation for how this works is available on some level:

Diagrams without and with MQ available:


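To make the task dependency idea concrete, here is a rough sketch of what the three task payloads might look like. The field names (`containerSettings`, `imageName`, `dependsOn`, `taskIds`) follow my understanding of the Azure Batch REST task schema and should be checked against the Azure docs; the helper name, the task id suffixes, and the staging/collection commands are all made up for illustration.

```python
from typing import Any, Dict, List


def build_azure_task_specs(
    job_id: str,
    pulsar_image: str,
    tool_image: str,
    staging_command: str,
    tool_command: str,
    collect_command: str,
) -> List[Dict[str, Any]]:
    """Build task payloads: Pulsar staging -> tool -> Pulsar collection.

    Hypothetical helper, not part of Pulsar. The id suffixes are assumptions.
    """
    setup_id = f"{job_id}-pulsar-setup"
    tool_id = f"{job_id}-tool"
    finish_id = f"{job_id}-pulsar-finish"
    return [
        {
            # Pulsar container stages inputs before the tool runs.
            "id": setup_id,
            "commandLine": staging_command,
            "containerSettings": {"imageName": pulsar_image},
        },
        {
            # Tool container (e.g. a biocontainer) has no Pulsar code in it,
            # so it only runs the tool command once staging is done.
            "id": tool_id,
            "commandLine": tool_command,
            "containerSettings": {"imageName": tool_image},
            "dependsOn": {"taskIds": [setup_id]},
        },
        {
            # Pulsar container collects outputs after the tool finishes.
            "id": finish_id,
            "commandLine": collect_command,
            "containerSettings": {"imageName": pulsar_image},
            "dependsOn": {"taskIds": [tool_id]},
        },
    ]
```

As far as I know the Azure Batch job itself also has to be created with task dependencies enabled for `dependsOn` to be honored, so that would need to be part of whatever creates the job.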
The TES runner was added in #302 and a similar pattern should work for Azure. Most of the relevant code is in client.py - for instance https://github.com/galaxyproject/pulsar/blob/master/pulsar/client/client.py#L687.

I think the idea would be implementing a

class LaunchesAzureContainersMixin(CoexecutionLaunchMixin):

that mirrors LaunchesTesContainersMixin(CoexecutionLaunchMixin).

And then mirror the TES job clients:

class TesPollingCoexecutionJobClient(BasePollingCoexecutionJobClient, LaunchesTesContainersMixin):
    """A client that co-executes pods via GA4GH TES and depends on amqp for status updates."""

    def __init__(self, destination_params, job_id, client_manager):
        super().__init__(destination_params, job_id, client_manager)
        self._setup_tes_client_properties(destination_params)


class TesMessageCoexecutionJobClient(BaseMessageCoexecutionJobClient, LaunchesTesContainersMixin):
    """A client that co-executes pods via GA4GH TES and doesn't depend on amqp for status updates."""

    def __init__(self, destination_params, job_id, client_manager):
        super().__init__(destination_params, job_id, client_manager)
        self._setup_tes_client_properties(destination_params)

but setting up the relevant Azure properties instead.
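A minimal sketch of what the Azure versions could look like, assuming the same shape as the TES clients. The base classes here are stubs so the snippet runs on its own (the real ones live in client.py), and `_setup_azure_client_properties` plus the `azure_batch_url`, `azure_pool_id`, and `azure_job_prefix` destination parameters are names I made up, not existing Pulsar options.

```python
from typing import Any, Dict


class CoexecutionLaunchMixin:
    """Stub standing in for Pulsar's CoexecutionLaunchMixin."""


class _BaseClientStub:
    """Stub standing in for Pulsar's base coexecution job clients."""

    def __init__(self, destination_params: Dict[str, Any], job_id: str, client_manager: Any) -> None:
        self.destination_params = destination_params
        self.job_id = job_id
        self.client_manager = client_manager


class BasePollingCoexecutionJobClient(_BaseClientStub):
    pass


class BaseMessageCoexecutionJobClient(_BaseClientStub):
    pass


class LaunchesAzureContainersMixin(CoexecutionLaunchMixin):
    """Hypothetical Azure counterpart to LaunchesTesContainersMixin."""

    def _setup_azure_client_properties(self, destination_params: Dict[str, Any]) -> None:
        # All parameter names below are assumptions for illustration.
        self.azure_batch_url = destination_params.get("azure_batch_url")
        self.azure_pool_id = destination_params.get("azure_pool_id")
        self.azure_job_prefix = destination_params.get("azure_job_prefix", "pulsar")


class AzurePollingCoexecutionJobClient(BasePollingCoexecutionJobClient, LaunchesAzureContainersMixin):
    """A client that co-executes containers via Azure Batch and polls for status."""

    def __init__(self, destination_params, job_id, client_manager):
        super().__init__(destination_params, job_id, client_manager)
        self._setup_azure_client_properties(destination_params)


class AzureMessageCoexecutionJobClient(BaseMessageCoexecutionJobClient, LaunchesAzureContainersMixin):
    """A client that co-executes containers via Azure Batch and gets status over the message queue."""

    def __init__(self, destination_params, job_id, client_manager):
        super().__init__(destination_params, job_id, client_manager)
        self._setup_azure_client_properties(destination_params)
```

The actual launch logic (submitting the tasks to Azure Batch) would go in the mixin, the same way the TES mixin owns the TES submission.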

After that is set up - I think build_client_manager in client_manager.py would need to dispatch on some relevant Azure connection properties to realize that PollingJobClientManager should be used:

def build_client_manager(**kwargs: Dict[str, Any]) -> ClientManagerInterface:
    if 'job_manager' in kwargs:
        return ClientManager(**kwargs)  # TODO: Consider more separation here.
    elif kwargs.get('amqp_url', None):
        return MessageQueueClientManager(**kwargs)
    elif kwargs.get("k8s_enabled") or kwargs.get("tes_url"):
        return PollingJobClientManager(**kwargs)
    else:
        return ClientManager(**kwargs)

MessageQueueClientManager and PollingJobClientManager would need to be updated to dispatch on these and produce the relevant clients.
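Roughly, the dispatch changes could look like the sketch below. The manager classes are stubs so this runs standalone, `azure_batch_url` is a placeholder key I picked (the real property name is undecided), and `client_class_name` is a made-up helper that just shows which client each manager would pick.

```python
from typing import Any, Dict

AZURE_KEY = "azure_batch_url"  # hypothetical connection property


class ClientManager:
    """Stub standing in for Pulsar's ClientManager."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs


class MessageQueueClientManager(ClientManager):
    """Stub: with a message queue, Azure jobs would use the message client."""

    def client_class_name(self, destination_params: Dict[str, Any]) -> str:
        if destination_params.get(AZURE_KEY):
            return "AzureMessageCoexecutionJobClient"
        return "<existing non-Azure client>"


class PollingJobClientManager(ClientManager):
    """Stub: without a message queue, Azure jobs would use the polling client."""

    def client_class_name(self, destination_params: Dict[str, Any]) -> str:
        if destination_params.get(AZURE_KEY):
            return "AzurePollingCoexecutionJobClient"
        return "<existing non-Azure client>"


def build_client_manager(**kwargs: Any) -> ClientManager:
    if "job_manager" in kwargs:
        return ClientManager(**kwargs)
    elif kwargs.get("amqp_url"):
        return MessageQueueClientManager(**kwargs)
    elif kwargs.get("k8s_enabled") or kwargs.get("tes_url") or kwargs.get(AZURE_KEY):
        # New: an Azure connection property also selects the polling manager.
        return PollingJobClientManager(**kwargs)
    else:
        return ClientManager(**kwargs)
```

The ordering keeps the existing behavior: if amqp_url is set the message queue manager still wins, and Azure only changes which client that manager hands out.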

That is all that should be strictly needed - but mirroring K8S and TES with convenience runners optimized for this configuration and test setups in Galaxy would be wonderful. See https://github.com/galaxyproject/galaxy/pull/14777/files for how to do this.
