
[BUG]: SIGTERM signal not passed to Docker task process #4548

Open · 2 of 4 tasks
natbprice opened this issue Nov 30, 2023 · 3 comments

@natbprice
What happened?

My pipeline step needs to perform some cleanup if it is terminated (e.g., cancelled or timed out). This works correctly if I run the task on the host machine or in a Docker container on my local machine. However, if I run the step inside a container in Azure Pipelines, the termination is not handled correctly. I believe this is because the process responsible for running tasks in Azure Pipelines runs as PID 1 and exits without passing the SIGTERM signal to my task.

Here is a minimal reproducible example:

trigger: none

pool:
  vmImage: 'ubuntu-latest'

container:
  image: ubuntu:22.04
  options: --init
    
steps:
- checkout: none
- bash: |
    # Function to handle SIGTERM signal
    terminate() {
      echo "SIGTERM signal received. Exiting..."
      exit 0
    }
    trap terminate SIGTERM
    echo "Waiting for SIGTERM signal..."
    while true; do
      sleep 1
    done
  timeoutInMinutes: 1
  target: host
  displayName: HostTimeout
- bash: |
    # Function to handle SIGTERM signal
    terminate() {
      echo "SIGTERM signal received. Exiting..."
      exit 0
    }
    trap terminate SIGTERM
    echo "Waiting for SIGTERM signal..."
    while true; do
      sleep 1
    done
  condition: always()
  timeoutInMinutes: 1
  displayName: DockerTimeout

The "HostTimeout" step on the host has the expected output:

Waiting for SIGTERM signal...
SIGTERM signal received. Exiting...
##[error]The task has timed out.
Finishing: HostTimeout

The "DockerTimeout" step runs in the container and exits prematurely:

Waiting for SIGTERM signal...
##[error]The task has timed out.
Finishing: DockerTimeout

I have tried running with and without the Docker --init flag and calling my script with exec, but neither resolved the issue.
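For reference, the exec variant looked roughly like this; the script name is just a placeholder for my real entrypoint, which contains the same trap logic as the inline example above:

- bash: |
    # Placeholder script containing the trap/cleanup logic shown above
    exec ./handle-sigterm.sh
  timeoutInMinutes: 1
  displayName: DockerTimeoutExec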

For this simple example I could probably add a separate pipeline step to perform the cleanup, but that doesn't work for my real use case, where the process called in the step has complex cleanup logic built in.

Versions

Agent Version 3.230.0 / Ubuntu 22.04.3 LTS

Environment type (Please select at least one environment where you face this issue)

  • Self-Hosted
  • Microsoft Hosted
  • VMSS Pool
  • Container

Azure DevOps Server type

dev.azure.com (formerly visualstudio.com)

Azure DevOps Server Version (if applicable)

No response

Operating system

No response

Version control system

No response

Relevant log output

No response

@DenisNikulin5
Contributor

Hi @natbprice, thanks for reporting! We are working on higher-priority issues at the moment, but will get back to this one soon.

@technic

technic commented Dec 17, 2023

The Azure Pipelines worker executes steps inside the Docker container with docker exec. When the step is cancelled, SIGINT is sent to the docker exec process. This does not forward the signal to the process inside the container, which keeps running in the background (moby/moby#9098), so the cancelled task is not stopped at all. The following tasks, which are configured to run with the always() condition, are started while that process is still running. Afterwards the container is removed and all processes created with docker exec inside it are killed.
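A rough way to reproduce this behaviour outside the agent (container name is arbitrary; the agent sends SIGINT, but SIGTERM to the docker exec client shows the same non-forwarding):

# Start a disposable container
docker run -d --name sig-demo ubuntu:22.04 sleep infinity

# Start a long-lived process inside it via docker exec, backgrounded on the host
docker exec sig-demo sleep 600 &
EXEC_PID=$!
sleep 2

# Signal the docker exec client process on the host, as the agent does on cancellation
kill -TERM "$EXEC_PID"
sleep 2

# The sleep 600 started via docker exec is still running inside the container (moby/moby#9098)
docker top sig-demo

# Cleanup
docker rm -f sig-demo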

@natbprice
Author

natbprice commented Dec 17, 2023

@technic I was just using always() to show the difference between a task running on the host versus in a container. Without always(), the second demo task would not run.

In my testing, removing the container at the end of the pipeline does not properly terminate running processes, in that the task does not get an opportunity to catch the signal and terminate cleanly.

You can also use always() with a cleanup step that manually stops the running process that was not properly terminated. This is my current workaround (rough sketch below); I am not sure if that is what you were suggesting.
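Roughly, the workaround looks like this; the process name pattern is a placeholder for whatever the earlier step actually starts, and it assumes procps (pkill) is available in the container image:

- bash: |
    # Stop any process the cancelled/timed-out step left running inside the container
    pkill -TERM -f 'long-running-task' || true
  condition: always()
  displayName: CleanupStrayProcesses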

As it relates to this ticket, it would be better if pipeline tasks were properly terminated without the need for a second cleanup task. I am not sure whether this is the job of the pipeline agent or the bash task. However, it doesn't seem unsolvable for the agent or the bash task to at least stop the processes it has started. I believe it already assigns a unique ID to the task, so it just needs to exec some cleanup if there is a timeout or cancellation.
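Purely as a hypothetical sketch of what I mean: instead of only signalling the docker exec client, the agent (or the bash task) could signal the process group it started inside the container. The variable names here are placeholders for identifiers the agent would have to track; this is not the agent's actual behaviour:

# Hypothetical cleanup on timeout/cancellation
# "$CONTAINER_NAME" and "$STEP_PGID" stand in for identifiers tracked by the agent
docker exec "$CONTAINER_NAME" bash -c "kill -TERM -- -$STEP_PGID"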
