Services setup and teardown procedures with specific scripts and warmup time #2816

Open
raffienficiaud opened this issue Apr 16, 2024 · 0 comments
Labels
enhancement New feature or request

Is your feature request related to a problem? Please describe.
It is very difficult to start and stop services when:

  • the startup takes some time: I have no way to tell snakemake how long the service may need to come up, and when the service does not create its output file almost immediately, snakemake treats this as an error;
  • the service needs to be stopped with a specific command.

For the teardown part, I tried to capture signals with a trap, but the trap seems to be executed only after snakemake terminates.
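
One bash behaviour that may be relevant to the trap timing: bash defers a trap handler until the current foreground command has finished, whereas the `wait` builtin returns as soon as a trapped signal arrives. The following is a minimal sketch of a keep-alive loop built on that, not a confirmed fix for the behaviour reported here. The names `run_until_signalled`, `stop_requested` and `sleep_pid` are hypothetical, and the sketch says nothing about whether snakemake waits for the handler to finish.

```shell
#!/usr/bin/env bash
# Sketch only: keep a service job alive and react to signals promptly.
# All names below are illustrative and not part of snakemake.

stop_requested=0
sleep_pid=""

# Usage: run_until_signalled TEARDOWN_COMMAND [ARGS...]
run_until_signalled() {
    # The handler only sets a flag; the teardown runs in the main flow.
    trap 'stop_requested=1' HUP INT QUIT TERM USR1

    while [ "$stop_requested" -eq 0 ]; do
        # Sleep in the background so that `wait` (a builtin) can be
        # interrupted by a trapped signal instead of delaying the handler.
        sleep 1 &
        sleep_pid=$!
        wait "$sleep_pid" || true
    done

    # Do not leave the background sleep behind.
    kill "$sleep_pid" 2>/dev/null || true

    # Run the teardown command passed as arguments.
    "$@"
}

# Example call (assumes the container name used in this issue):
#   run_until_signalled docker stop some-mysql
```

Inside a snakemake `shell:` block the braces of the function body would need to be doubled, as in the example in this issue.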

Describe the solution you'd like

I would like

  • to be able to start services by specifying start and stop sequences,
  • an option to specify the expected time a service needs to start.

Describe alternatives you've considered
A minimal example that starts a docker container and waits for the service to be ready looks like this:

import os
from pathlib import Path

rule all:
    input:
        "output.test"

rule start_docker_mysql:
    output:
        service("docker_mysql.socket"),
    resources:
        runtime="1h",
    shell:
        """
        set +xe

        var_docker_instance_name="some-mysql"

        echo starting mysql container '$var_docker_instance_name'
        docker run \
            --name $var_docker_instance_name \
            --rm \
            -e MYSQL_ROOT_PASSWORD=my-secret-pw \
            -p 3306:3306 \
            -d \
            mysql

        # this parses the logs content to check if the service has started properly
        echo waiting for service to be available
        check_task() {{
            echo "Checking docker start #$1 ..."
            var_running=$(docker logs $var_docker_instance_name 2>&1 | grep "MySQL init process done. Ready for start up.")
            if [[ ! -z "${{var_running}}" ]]; then
                echo "Docker container $var_docker_instance_name running: $var_running"
                /usr/bin/true
            else
                echo "Docker container $var_docker_instance_name NOT running"
                /usr/bin/false
            fi
        }}

        # looping until the service starts
        remaining_iterations=10
        check_task $remaining_iterations
        res=$?
        while [ $res != 0 ] && [ $remaining_iterations -gt 0 ] ; do
            sleep 3
            check_task $remaining_iterations
            res=$?
            remaining_iterations=$(($remaining_iterations - 1))
            echo $remaining_iterations
        done
        # fail only if the last check itself failed, not merely because the counter reached 0
        if [ $res != 0 ] ; then
            (>&2 echo "Docker failed to start after 10 iterations!")
            (>&2 docker logs $var_docker_instance_name)
            exit 1
        else
            echo "Docker started and all services up ... "
        fi

        # terminates the container
        # capturing the shutdown of the service: does not work well
        shutdown_container() {{
            set +e
            echo "Shutting down container '$var_docker_instance_name'"
            docker stop $var_docker_instance_name
            echo "Container '$var_docker_instance_name' terminated"
            exit 0
        }}

        trap "shutdown_container" HUP INT QUIT SIGTERM USR1

        # make the service available to snakemake
        touch {output}

        # infinite loop
        echo "waiting for the service to be stopped"
        while [ 'true' ] ; do
            sleep 1
            # echo .
        done
        """

rule load_dump_to_db:
    output:
        "output.test"
    input:
        docker_service="docker_mysql.socket"
    shell:
        """
        set +xe

        var_docker_instance_name="some-mysql"

        docker ps -a

        sleep 3

        echo "Creating the temporary database"
        docker exec \
            -i $var_docker_instance_name \
            sh -c 'exec mysql -uroot -p"my-secret-pw" --execute="CREATE DATABASE test_db;"'

        touch output.test
        """

The execution log shows this:

service_start % snakemake -c 2
Assuming unrestricted shared filesystem usage.
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 2
Rules claiming more threads will be scaled down.
Job stats:
job                   count
------------------  -------
all                       1
load_dump_to_db           1
start_docker_mysql        1
total                     3

Select jobs to execute...
Execute 1 jobs...
[Tue Apr 16 09:36:46 2024]

group job 264f7d13-1c86-4936-8066-5cdad43a14d4 (jobs in lexicogr. order):

    [Tue Apr 16 09:36:46 2024]
    localrule load_dump_to_db:
        input: docker_mysql.socket
        output: output.test
        jobid: 1
        reason: Missing output files: output.test; Input files updated by another job: docker_mysql.socket
        resources: tmpdir=/var/folders/gy/4fqcxthx7jlgdblgz0p1ty1m0000gp/T


    [Tue Apr 16 09:36:46 2024]
    localrule start_docker_mysql:
        output: docker_mysql.socket (service)
        jobid: 2
        reason: Missing output files: docker_mysql.socket
        resources: tmpdir=/var/folders/gy/4fqcxthx7jlgdblgz0p1ty1m0000gp/T, runtime=60

Waiting at most 5 seconds for missing files.
starting mysql container $var_docker_instance_name
11bdc337e3f71e8f7d3dbb8776f713c22758d63780f4ff49c6464f273fef81da
waiting for service to be available
Checking docker start #10 ...
Docker container some-mysql NOT running
Checking docker start #10 ...
Docker container some-mysql NOT running
9
[Tue Apr 16 09:36:51 2024]
Error in group 264f7d13-1c86-4936-8066-5cdad43a14d4:
    jobs:
        rule start_docker_mysql:
            jobid: 2
            output: docker_mysql.socket (service)
        rule load_dump_to_db:
            jobid: 1
            output: output.test

Shutting down, this might take some time.
Checking docker start #9 ...
Docker container some-mysql running: 2024-04-16 07:36:53+00:00 [Note] [Entrypoint]: MySQL init process done. Ready for start up.
8
Docker started and all services up ... 
waiting for the service to be stopped
^CTerminating processes on user request, this might take some time.
Shutting down container 'some-mysql'
Complete log: .snakemake/log/2024-04-16T093646.203842.snakemake.log
WorkflowError:
At least one job did not complete successfully.
service_start % some-mysql
Container 'some-mysql' terminated

service_start % 

which indicates that

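  1. snakemake does not wait for the service file docker_mysql.socket to appear: it reports the error 5 seconds after starting the job ("Waiting at most 5 seconds for missing files."), while the container is still starting
  2. the trap handler finishes only after snakemake has terminated ("Container 'some-mysql' terminated" is printed after the shell prompt has returned)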
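
The readiness loop in start_docker_mysql could also be written as a small retry helper that reports success or failure from the check itself rather than from the iteration counter. This is a sketch under assumptions: `retry_until` and `mysql_ready` are hypothetical names, and the log message grepped for is the one used in the example above.

```shell
#!/usr/bin/env bash
# Sketch only: retry a readiness check a bounded number of times.

# Usage: retry_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt failed.
retry_until() {
    local max_attempts="$1" delay="$2"
    shift 2
    local attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt/$max_attempts failed" >&2
        attempt=$((attempt + 1))
        # Only sleep when another attempt will follow.
        if [ "$attempt" -le "$max_attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Readiness check taken from the example: look for the init message
# in the container logs. Output goes to /dev/null instead of using
# `grep -q`, so grep reads the whole stream and the pipeline also
# behaves under `set -o pipefail`.
mysql_ready() {
    docker logs "$1" 2>&1 \
        | grep "MySQL init process done. Ready for start up." > /dev/null
}

# Example call (10 attempts, 3 seconds apart, as in the example):
#   retry_until 10 3 mysql_ready some-mysql || exit 1
```

With this shape the job fails only when the last check fails, and the attempt number shown in the messages increases by one on every try.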

Additional context
I could not find such an example anywhere, so I assume this feature does not exist yet. Note that mysql is used here only to run SQL commands and is not part of the processing itself, so we cannot use that docker image to run the scripts.

@raffienficiaud raffienficiaud added the enhancement New feature or request label Apr 16, 2024