Add a Dockerfile to build an image #140

egli · 2017-09-27T15:16:56Z

Here's a PR that adds a Dockerfile that:

- builds the pipeline using a standard maven image
- installs the build artifacts in a second (standard openjdk) image and runs the pipeline in the foreground.

Build and run the image as follows (the multistage build needs Docker 17.05 or later):

$ docker build -t daisyorg/pipeline2 .
$ docker run -d -p 8181:8181 daisyorg/pipeline2

You should now be able to connect to the pipeline using a client. However, at the moment this probably doesn't work as the pipeline by default is set up to work locally. This needs to be changed either in the config file in the original source or by changing the config file in the docker image.

bertfrees · 2017-09-27T15:48:28Z

Looks good. Before I merge this:

- I would like to add a little test
- the configuration issue that you mention should be solved.

We can do this either by supporting a Maven property to set it at build time, like so: mvn clean package -Dpipeline.ws.localfs=false (does not work yet). Or what should also work is adding a remote argument to your ENTRYPOINT.

josteinaj · 2017-09-28T07:49:58Z

Another possibility for configuring the DP2 engine is to use sed (as in the snaekobbi/system Dockerfile). For instance:

# Bind engine to 0.0.0.0 instead of localhost
RUN sed -i 's/org.daisy.pipeline.ws.host=.*/org.daisy.pipeline.ws.host=0.0.0.0/' /opt/daisy-pipeline2/etc/system.properties

# Enable calabash debugging
RUN sed -i 's/\(com.xmlcalabash.*\)INFO/\1DEBUG/' /opt/daisy-pipeline2/etc/config-logback.xml

# Enable 4 concurrent jobs
RUN sed -i 's/.*\(org.daisy.pipeline.procs\)=[0-9]*/\1=4/' /opt/daisy-pipeline2/etc/system.properties

For remote mode, the remote argument in ENTRYPOINT sounds cleaner though.
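
Before baking substitutions like these into an image, it can help to try them on a scratch file. The following is a minimal sketch: the sample property values are made up for illustration and are not taken from the real system.properties. It writes to a new file instead of using `sed -i`, so it should also run with a non-GNU sed.

```shell
#!/bin/sh
# Scratch copy standing in for etc/system.properties (hypothetical sample values).
props=$(mktemp)
cat > "$props" <<'EOF'
org.daisy.pipeline.ws.host=localhost
#org.daisy.pipeline.procs=2
EOF

# The host and procs substitutions from the RUN lines above, applied in one pass.
sed -e 's/org.daisy.pipeline.ws.host=.*/org.daisy.pipeline.ws.host=0.0.0.0/' \
    -e 's/.*\(org.daisy.pipeline.procs\)=[0-9]*/\1=4/' \
    "$props" > "$props.new"

# Show the result so it can be inspected by eye.
cat "$props.new"
```

Note that the leading `.*` in the second expression also swallows a leading `#`, so a commented-out setting becomes active after the substitution.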

bertfrees · 2017-09-28T08:30:55Z

The sed way is a good alternative for when you don't have the source files.

egli · 2017-09-28T12:37:56Z

I'm adding a remote argument to the ENTRYPOINT, so the pipeline is now started in remote mode. However it now complains about

	bind => [1 parameter bound]   @o.d.p.p.logging.Slf4jSessionLogger:108#log
ERROR  [o.d.p.webservice.impl.PipelineWebService] 

************************************************************
WS mode authenticated but the client store is empty, exiting
please provide values for the following properties in etc/system.properties: 
-org.daisy.pipeline.ws.authentication.key    
-org.daisy.pipeline.ws.authentication.secret 
************************************************************

i.e. neither the key nor the secret is specified. I'm not sure how this should be handled.

egli · 2017-09-28T12:40:15Z

@bertfrees what kind of test did you have in mind? How can you test a docker image, other than running it manually?

egli · 2017-09-28T12:43:35Z

@josteinaj I'm not so sure I like the idea of sed-ing around in the config files. Can't we just set them up so that they just work?

bertfrees · 2017-09-28T13:14:23Z

Right, I forgot about that. If you specify remote it also enables authentication. Can you try with the environment variable PIPELINE2_LOCAL=false? Otherwise we'll have to edit the system.properties file. You don't need to sed around for that, it's preferable to filter the source file with Maven properties.

Regarding the test, I was thinking about just a shell script that starts the container and connects to it with the CLI, and possibly runs a sample file through it. Basically I was gonna copy this: https://github.com/bertfrees/benetech-docker-pipeline2/blob/test/Makefile.

P.S.: read: http://daisy.github.io/pipeline/Get-Help/User-Guide/Pipeline-as-Service

josteinaj · 2017-09-28T16:14:55Z

@egli instead of sed you could create a ./docker-resources/ directory in the repository (or maybe just docker) and put a custom system.properties etc. in there. Then COPY (or ADD, I always confuse those two) it into the image.

egli · 2017-09-29T14:34:45Z

I believe the configuration issue is now solved via environment variables.

Re the test: on Monday I'll look at the example you provided.

bertfrees · 2017-10-02T11:26:10Z

Good job. The only thing I wasn't quite happy with at first was the new environment variables. But after talking to you this morning I see a valid use case after all.

- Basically I guess I'm just not happy with the current state of configuration in general. There are sometimes three, or even four, different ways to configure something: Java system properties, environment variables, script arguments, and for Debian there is also the special "REMOTE" variable in /etc/default/daisy-pipeline2 (no idea why I did that). If anything I'd like to remove ways to configure something, not add more.
- You changed the bash script only, but the Windows batch file should ideally support the same environment variables.
- Instead of adding more environment variables in an ad-hoc way whenever someone requests them, I'd rather have a uniform way to map environment variables to Java system properties. The uniformity makes things more understandable and it can also make the shell scripts less verbose. Note that there is also the possibility to completely eliminate the Java system properties (except the ones belonging to external libs).
- "Simplifying the configuration with Docker" alone is not really a valid reason for adding more environment variables, I think. It's perfectly possible to use the system.properties file with Docker. However, org.daisy.pipeline.ws.authentication.key and org.daisy.pipeline.ws.authentication.secret are in fact an exception to this. Including these in your properties file during the build of the image is a bad idea, because they'll be stored as cleartext in the image and on GitHub, and the alternative, overwriting the properties file at run time via a volume, is clumsy.

bertfrees · 2017-10-02T11:44:23Z

The automatic mapping from environment variables to system properties could be (took some inspiration from https://github.com/weavejester/environ):

1. lowercase everything
2. translate _ to .
3. translate pipeline. to org.daisy.pipeline.

We should do this only for the variables that start with PIPELINE_ (*). The few properties that start with org.pipeline should be changed to org.daisy.pipeline...

(*): and more specifically only the ones that are actually meant for configuration; the other ones should be removed from system.properties: see also the section labeled "do not edit" in http://daisy.github.io/pipeline/wiki/Configuration-Files.
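
The three mapping steps can be sketched as a small shell function. This is only an illustration of the proposal as worded in this comment, not the code that ended up in the start script; the function name `env_to_prop` is made up.

```shell
#!/bin/sh
# env_to_prop NAME: print the Java system property name that the proposed
# mapping would assign to the environment variable NAME.
# Returns 1 for names that do not start with PIPELINE_.
env_to_prop() {
    case $1 in
        PIPELINE_*) ;;
        *) return 1 ;;
    esac
    printf '%s\n' "$1" \
        | tr '[:upper:]' '[:lower:]' \
        | tr '_' '.' \
        | sed 's/^pipeline\./org.daisy.pipeline./'
}

env_to_prop PIPELINE_WS_HOST                 # prints org.daisy.pipeline.ws.host
env_to_prop PIPELINE_WS_AUTHENTICATION_KEY   # prints org.daisy.pipeline.ws.authentication.key
```

One consequence of the second step is that a property name containing an underscore or an uppercase letter cannot be expressed this way.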

bertfrees · 2017-10-10T10:27:22Z

I've added some more stuff here: daisy/pipeline@f00f389^...docker

bertfrees · 2017-10-11T11:27:00Z

I think there is something fishy with setting HOST to 0.0.0.0, because this address also shows up in e.g. ws/scripts.

bertfrees · 2017-10-16T16:57:21Z

For curl this apparently does not matter (e.g. curl http://0.0.0.0:8181/ws/alive just works), but maybe not all tools can cope.

Also, this issue makes me wonder whether the web API should even expose the full paths. Why not just use the relative paths?

Another observation is that the CLI does not even use the href attributes, but instead generates the URLs itself based on other attributes like id. It seems that it is currently not possible to generate the download URLs for individual files, only ZIPs, but we could possibly support that too. The result would be that we wouldn't need to rely on href attributes at all. Or is that against the REST principle maybe?

@rdeltour @josteinaj Your thoughts?

bertfrees · 2017-10-16T17:16:16Z

@egli I have another request.

Currently, if you start the pipeline2 Docker service, you have to wait a few seconds before the web service is up. There is apparently a Docker feature called "health status" that can help you with that. I haven't tried it myself because my version of Docker is not new enough, but it would look something like this in docker-compose.yml:

    healthcheck:
      # Waiting for web service to be up...
      test: ["CMD", "curl", "http://localhost:8181/ws/alive"]
      interval: 10s
      timeout: 10s
      retries: 5

And in the dependent service you do:

    depends_on:
      pipeline2:
        condition: service_healthy

The healthcheck part can also be moved to the Dockerfile itself. This would be a nice addition I think. Could you look into it?

See

egli · 2017-10-26T12:32:54Z

@bertfrees I just added a health check to the pipeline2 docker image. As for the depends_on: Unfortunately version 3 of the docker-compose format no longer supports the condition form of depends_on, so your second suggestion is no longer possible. I think the argument is that the application itself (in our case the webui) knows best when and how to reconnect.

bertfrees · 2017-10-26T14:06:58Z

Huh? Then what is the point of the health check? :/

Did you read that argument somewhere, or is it yours?

egli · 2017-10-26T14:22:54Z

I read something along those lines deep down in a StackOverflow comment (on an answer that still described the depends_on: condition solution). I think it has to do with how docker swarm handles this.

bertfrees · 2017-10-27T13:22:00Z

I think Docker aren't communicating this very well. I could only find some explanation here and in this discussion.

What it comes down to I think is that they are moving away from docker-compose and towards a new approach in which apparently some concepts like depends_on do not fit anymore.

I'm not really satisfied by the explanation they give for this choice. I'm not convinced that an application that depends on a service always knows best when the service is ready and how long to wait for it. (But to be fair I have to say that I haven't read that much about the new approach yet.)

Anyway, a health check does not make sense if you can't use it, and as long as you are using docker-compose I think it makes perfect sense to implement the "waiting" with docker-compose. I suggest we just keep using the v2 format so that I can use the condition form of depends_on. Docker say that there's no reason to use the v3 format if you don't intend to use swarm services. Support for the version 3 format in docker-compose is only meant to help the transition.

egli · 2017-10-28T19:04:03Z

> Docker say that there's no reason to use the v3 format if you don't intend to use swarm services.

I read that too. I use docker-compose to quickly bring up a test instance; for that use case it is quite good. ATM I do not intend to use swarm services. But then again, when you just quickly bring up a test instance you don't really care that much about the health check. A few failed attempts of the webui to connect to the pipeline are not the end of the world.

> I suggest we just keep using the v2 format so that I can use the condition form of depends_on.

Makes sense.

bertfrees · 2017-10-29T09:12:12Z

I want to use it for other things too, like tests. With tests it's quite convenient when you can just run them and be sure the Pipeline server is running. I think it's annoying if you need to implement the waiting logic in every little test application that you write, while it can be centralized in the server, which knows best how long to wait etc. That's why I liked the idea of the health check. Until I'm convinced that the other approach is better I want to try this.
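
For comparison, the waiting logic that each test application would otherwise have to carry looks roughly like the sketch below. The helper name `wait_until` and its arguments are made up for illustration; the URL in the example comment is the one from the health check earlier in this thread.

```shell
#!/bin/sh
# wait_until RETRIES DELAY COMMAND...: run COMMAND until it succeeds,
# trying at most RETRIES times and sleeping DELAY seconds between attempts.
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt failed.
wait_until() {
    retries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$retries" ]; do
        if "$@"; then
            return 0
        fi
        # Only sleep when another attempt is still to come.
        if [ "$attempt" -lt "$retries" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Example (not run here): wait up to 5 attempts, 10 seconds apart, for the web service.
# wait_until 5 10 curl --silent --fail --output /dev/null http://localhost:8181/ws/alive
```

With a health check in the image and the condition form of depends_on, this loop moves out of the individual tests and into the compose configuration.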

bertfrees · 2017-11-02T12:26:06Z

I have pushed my version of the branch. Depends on daisy/pipeline-framework#126.

bertfrees · 2017-11-14T09:36:38Z

Should be merged into develop, not master!

rdeltour

LGTM (I'm not experienced in Docker, but having read the discussion, the proposed changes make sense to me).

Thanks @egli (and @bertfrees and @josteinaj), looks like a very useful thing to have!

bertfrees added 2 commits August 31, 2017 14:31

[maven-release-plugin] prepare for next development iteration

23e5c38

Add dtbook-to-rtf module

e3304a0

bertfrees mentioned this pull request Oct 2, 2017

Simplify configuration #141

Closed

4 tasks

egli added 8 commits October 9, 2017 19:52

Add a Dockerfile to build an image

d463250

The Dockerfile uses a multistage build to first build the artifacts using maven. Then it copies the artifacts into a final image which exposes the port and starts the pipeline.

Start the pipeline in remote mode

aaa10a0

Add a newline at the end of the file

fbced57

Allow declaration of client key and secret via env variables

36ed956

If PIPELINE2_AUTH_CLIENTKEY and/or PIPELINE2_AUTH_CLIENTSECRET are defined in the environment when starting the pipeline, use those values. This simplifies dockerization of the pipeline.

Set up the environment non-local and authenticated for docker

91a547c

Default the clientkey to the usual default

387ca2c

that is used everywhere else, for example in the default config of the pipeline cli

Expose the PIPELINE2_HOST as an env variable

f23c1d6

so that it can be set at run time for example when starting a Docker image and remove it from the system.properties (otherwise setting it as an option when starting the JVM seems to have no effect)

Add a test for the Docker image

35a4436

The test starts two containers based on the same image. One for the pipeline itself and a second one for the cli. It then starts a script from the cli.

bertfrees mentioned this pull request Oct 10, 2017

Profile to build Docker images #139

Closed

This was referenced Oct 19, 2017

Automatically shut down if it failed to start properly #143

Open

Docker support daisy/pipeline-tasks#116

Closed

bertfrees and others added 8 commits November 2, 2017 10:01

Make use of standardized environment variables

8851938

This reverts 36ed956 and f23c1d6.

Add a second Dockerfile that does not build the Pipeline inside Docker

7f8f8c1

Also add a Makefile

Fix Docker test

d8fc1e3

and add it to the Makefile

Simplify start script by using environment variables instead of system properties

770eec2

- PIPELINE2_HOME
- PIPELINE2_BASE
- PIPELINE2_DATA
- PIPELINE2_WS_LOCALFS
- PIPELINE2_WS_AUTHENTICATION

*nix only.

Remove PIPELINE2_OPTS environemnt variable

c3baadb

because you can now directly specify the Pipeline properties through environment variables. Note that this will only work for system properties that start with "org.daisy.pipeline" though.

Debian package: remove "REMOTE" variable from /etc/default

c194f2e

Instead use the PIPELINE2_WS_LOCALFS and PIPELINE2_WS_AUTHENTICATION environment variables directly.

Remove the webui profile

c445d21

see #137

Add a health check to the docker image

c946f36

See https://docs.docker.com/engine/reference/builder/#healthcheck

bertfrees force-pushed the feature/dockerimage branch from 8c41fbe to c946f36 Compare November 2, 2017 12:24

bertfrees requested a review from rdeltour November 2, 2017 12:26

bertfrees mentioned this pull request Nov 2, 2017

Add a Dockerfile daisy/pipeline-webui#113

Closed

rdeltour approved these changes Nov 15, 2017

bertfrees added a commit that referenced this pull request Nov 27, 2017

Add a Dockerfile

326176e

see #140

bertfrees closed this Nov 27, 2017

bertfrees deleted the feature/dockerimage branch November 27, 2017 10:11

bertfrees added this to the v1.11.0 milestone Jan 25, 2018
