Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

execStartCmd(execCmdId).exec failed to complete in docker 24.0.6 #2217

Open
leomao10 opened this issue Oct 9, 2023 · 2 comments
Open

execStartCmd(execCmdId).exec failed to complete in docker 24.0.6 #2217

leomao10 opened this issue Oct 9, 2023 · 2 comments

Comments

@leomao10
Copy link

leomao10 commented Oct 9, 2023

Version that have issue

  • docker:24.0.6-dind
  • docker-java:3.3.2
  • JAVA 11

We see the symptom with both macOS docker desktop & k8s cluster with docker:dind.

Symptom
Our application use dockerClient.execStartCmd to run some commands in the container, and wrap the logic with rxJava as syntax sugar so we can use something like this in our code:

    var execObservable = Observable.<String>create(emitter -> {
          docker.execStartCmd(execCmdId).exec(new Adapter<Frame>() {
            @Override
            public void onNext(Frame item) {
              logger.info("Exec next: {}", item);
              super.onNext(item);
              emitter.onNext(item.toString());
            }

            @Override
            public void onError(Throwable throwable) {
              logger.error("Error occurred" + throwable);
              super.onError(throwable);
              emitter.onError(throwable);
            }

            @Override
            public void onComplete() {
                logger.info("Exec completed");
                super.onComplete();
                emitter.onComplete();
            }
        });
    });

    execObservable.toList().blockingGet();

It all works fine until recently we upgraded to run the app along with docker:24.0.6-dind, the app would hang if the command we run complete fast. After debugging, we noticed that onComplete callback method never get trigger.

We also test various dind version, and they all works. Also, we don't see any similar issue if we execute slower commands(i.e. sleep 0.5). We highly suspect it is some behaviour change in latest dind, but can't find anything useful from it.

How to reproduce it
We managed to reproduce the issue with following test code:

import com.github.dockerjava.api.async.ResultCallback;
import com.github.dockerjava.api.DockerClient;
import com.github.dockerjava.api.async.ResultCallback.Adapter;
import com.github.dockerjava.api.command.CreateContainerResponse;
import com.github.dockerjava.api.model.Frame;
import com.github.dockerjava.api.model.HostConfig;
import com.github.dockerjava.api.model.PullResponseItem;
import com.github.dockerjava.core.DefaultDockerClientConfig;
import com.github.dockerjava.core.DockerClientBuilder;
import com.github.dockerjava.netty.NettyDockerCmdExecFactory;
import io.reactivex.Observable;
import org.junit.jupiter.api.Test;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.List;

public class DockerJavaWithDindTest {
  private static final Logger logger = LoggerFactory.getLogger(DockerJavaWithDindTest.class);
  @Test
  public void testExecution() throws InterruptedException {
    DefaultDockerClientConfig config = DefaultDockerClientConfig
        .createDefaultConfigBuilder()
        .withDockerHost("tcp://localhost:2375") // point to dind container instead of local dockerd
        .build();
    DockerClient docker = DockerClientBuilder
        .getInstance(config)
        .withDockerCmdExecFactory(new NettyDockerCmdExecFactory())
        .build();
    String image = "alpine:latest";

    pullImage(docker, image);

    for (int i = 0; i < 10; i++) {
      logger.info("===================Iteration {}:================", i);
      runContainer(docker, image);
    }
  }

  private void pullImage(DockerClient docker, String image) throws InterruptedException {
    docker.pullImageCmd(image).exec(new ResultCallback.Adapter<PullResponseItem>() {
        @Override
        public void onNext(PullResponseItem item) {
          logger.info(item.getStatus());
          super.onNext(item);
        }

        @Override
        public void onError(Throwable throwable) {
          logger.error("Error occurred" + throwable);
          super.onError(throwable);
        }

        @Override
        public void onComplete() {
            logger.info("Image pull completed");
            super.onComplete();
        }
    }).awaitCompletion();

    logger.info("Completed pull image: {}", image);
  }

  private void runContainer(DockerClient docker, String image) {
    var cmd = List.of(
      "/bin/sh",
      "-c",
      String.format(
          "exit $( (/usr/bin/mkfifo %s && /bin/cat %s) || /bin/echo 1)",
          "/tmp/test_result",
          "/tmp/test_result"));

    CreateContainerResponse container = docker.createContainerCmd(image)
      .withCmd(cmd)
      .withHostConfig(HostConfig.newHostConfig())
      .exec();

    docker.startContainerCmd(container.getId()).exec();

    var execCreateCmdResponse = docker
      .execCreateCmd(container.getId())
      .withCmd("/bin/sh", "-c", "echo hello; echo 0 >> /tmp/test_result")
      .withAttachStderr(true)
      .withAttachStdout(true)
      .exec();

    var execCmdId = execCreateCmdResponse.getId();

    logger.info("Created Exec with id: {} in container: {}", execCmdId, container.getId());
    var execObservable = Observable.<String>create(emitter -> {
          docker.execStartCmd(execCmdId).exec(new Adapter<Frame>() {
            @Override
            public void onNext(Frame item) {
              logger.info("Exec next: {}", item);
              super.onNext(item);
              emitter.onNext(item.toString());
            }

            @Override
            public void onError(Throwable throwable) {
              logger.error("Error occurred" + throwable);
              super.onError(throwable);
              emitter.onError(throwable);
            }

            @Override
            public void onComplete() {
                logger.info("Exec completed");
                super.onComplete();
                emitter.onComplete();
            }
        });
    });

    execObservable.toList().blockingGet();
    logger.info("Container completed with status: " + docker.waitContainerCmd(container.getId()).start().awaitStatusCode());
  }
}

With dind contianer started:

docker run --privileged -d \
-e DOCKER_TLS_CERTDIR='' \
-p 2375:2375 \
--name dind docker:24.0.6-dind

Let me know if there is other info you need me to provide.

Thanks in advance.

Leo Liang

@sirantd
Copy link

sirantd commented Oct 17, 2023

Could anyone check it please, it affects our workload

@eddumelendez
Copy link
Member

Hi, thanks for the report. Given it started failing with latest release in docker, have you considered reporting this to docker instead?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants