tests to confirm pangeo binder configuration successful #10

Open
d-diaz opened this issue Jan 30, 2019 · 7 comments

Comments

d-diaz commented Jan 30, 2019

I'm working on getting a lidar point cloud processing pipeline running on Pangeo. Because the lidar software requires root privileges to install, I had to use a Dockerfile to handle the installation and configuration. I've merged all the requirements from the cookie cutter environment.yml into my own package's environment.yml and consolidated all the commands from the cookie cutter start, postBuild, and apt.txt files into my Dockerfile.

I am wondering if there's a way for me to confirm whether the pangeo configuration is actually fully working as expected once the build completes. Are there any test scripts that could be run for this?
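
One low-tech way to approach this, offered as a sketch rather than an official Pangeo test: from a terminal or notebook inside the built image, check that the packages the environment is supposed to provide can actually be found. `missing_modules` is a hypothetical helper, and the package list in the example is an assumption standing in for whatever your environment.yml lists.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be located."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed package list; replace with the contents of your environment.yml.
    wanted = ["dask", "distributed", "xarray"]
    missing = missing_modules(wanted)
    print("missing:", missing if missing else "none")
```

This only confirms that the modules are importable in principle; it does not exercise the Dask or JupyterLab configuration.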

Member

jhamman commented Jan 31, 2019

@d-diaz - you should be able to test your configuration using repo2docker. Instructions are here: https://repo2docker.readthedocs.io/en/latest/usage.html#calling-repo2docker

Author

d-diaz commented Feb 8, 2019

I'm struggling to figure out how to integrate the commands from the postBuild and start files into a Dockerfile-based build. I've tried a few things, and currently have the postBuild commands at the end of the Dockerfile and the start file called as an ENTRYPOINT in the Dockerfile. When I included the start file commands inside the Dockerfile itself, I ran into the problem that the JUPYTERHUB_USER variable was not set anywhere, so the changes to jupyterlab-workspace.json were not applied correctly (they were made with '' as the stand-in for JUPYTERHUB_USER). In its current state, when I try to launch my repo with binder, the image builds, but the server fails to launch.

Is JUPYTERHUB_USER supplied as an argument to build or run the docker image?
Do I need to include a .dask/config.yaml file in my repo somewhere?

image

Member

jhamman commented Feb 9, 2019

I'm seeing the following log message when launching your binder:

kubectl logs --namespace pangeo-binder jupyter-d-2ddiaz-2dpangeo-5flidar-2dzv5ef6av
jupyter-lab: 1: jupyter-lab: [${HOME}/binder/start]: not found

Not sure exactly why, but I'm guessing your start script is failing somehow. I'd take a close look at this script to see what could be going wrong.

Author

d-diaz commented Feb 10, 2019

Can you elaborate on what you mean by "when launching your binder" so that I can reproduce that error message? When I use the pangeo-binder site and point it at my repo, it says there's an image already built, then tries to launch the server but fails (as in the screenshot above).

UPDATE: It definitely seems related to the last line in my Dockerfile: ENTRYPOINT ['~/binder/start.sh']. If I comment that line out, the server will start, but I am still unable to start a notebook. When I try to run a notebook, it fails with "Model not defined". I can open a terminal and see that the start.sh file is where I expect it to be, and it can be run using ~/binder/start.sh to update jupyterlab-workspace.json.
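
One possible contributor to the "not found" message in the log above: Docker only treats an ENTRYPOINT argument as exec form when it parses as a JSON array, which requires double quotes. Anything else is handled as a shell-form string, and exec form itself does no shell expansion of `~` or `${HOME}`. The sketch below only approximates that rule using Python's json module; `looks_like_exec_form` is a hypothetical helper, not Docker's parser, and the absolute path in the second call is an assumed example.

```python
import json


def looks_like_exec_form(argument):
    """Approximate Docker's rule: exec form must be a JSON array of strings."""
    try:
        parsed = json.loads(argument)
    except json.JSONDecodeError:
        return False  # not valid JSON, so it would be read as shell form
    return isinstance(parsed, list) and all(isinstance(p, str) for p in parsed)


# Single quotes are not valid JSON, so this is not read as exec form.
print(looks_like_exec_form("['~/binder/start.sh']"))
# Double quotes make it a JSON array of strings (path is an assumed example).
print(looks_like_exec_form('["/bin/bash", "/home/user/binder/start.sh"]'))
```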

Member

jhamman commented Feb 10, 2019

I am an admin on Pangeo's binder cluster, which allows me to run the logs command. You won't be able to run that.

I don't have any experience using binder/repo2docker with Dockerfiles directly. I suggest working on your build/launch locally using repo2docker before trying to use binder (see my first comment in this thread).

Author

d-diaz commented Feb 11, 2019

After running repo2docker on a few working example repos and watching which docker commands were issued, I've added a bunch more commands to my Dockerfile (some of which may be superfluous). The build now seems to work locally (using jupyter-repo2docker), with one exception: the entrypoint script (start.sh) has no JUPYTERHUB_USER environment variable defined on my machine. So when I run repo2docker locally, it builds the image but then inserts an empty string into jupyterlab-workspace.json instead of a valid entry for JUPYTERHUB_USER. You can see this in the last line, where it says /user//lab does not match page_url ...:

Step 34/34 : ENTRYPOINT /bin/bash ${HOME}/binder/start.sh
 ---> Running in c43f2ae3dc9b
Removing intermediate container c43f2ae3dc9b
 ---> 40ec96c9bf3b
{"aux": {"ID": "sha256:40ec96c9bf3b994be38dfb2c5ede0894efee0b8468d12f9bed3ddf72f97bd0d7"}}Successfully built 40ec96c9bf3b
Successfully tagged r2d-2e1549846958:latest
/home/ddiaz/binder/jupyterlab-workspace.json is not a valid workspace:
/user//lab does not match page_url or start with workspaces_url.

As far as I can tell, the image build is fine otherwise. I'm still getting a server error (failure to launch) using binder, however. If this image were run from a machine that had a defined JUPYTERHUB_USER environment variable, I think it would work if that variable were passed as an environment argument to docker run (i.e., docker run -e JUPYTERHUB_USER=$JUPYTERHUB_USER ... ). I suspect that's not how docker run is invoked, but I can't confirm that from browsing the code. I've also tried including $JUPYTERHUB_USER in the Dockerfile's ENTRYPOINT command (i.e., ENTRYPOINT /bin/bash ${HOME}/binder/start.sh ${JUPYTERHUB_USER}) so that it hypothetically wouldn't need to be explicitly specified by the machine launching the docker container, but that didn't fix the server failing to launch either.
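
To make the `/user//lab` symptom concrete: JupyterHub places JUPYTERHUB_USER in the environment of each single-user server when it spawns it, so the variable exists at run time on a hub but not at image build time or in a plain local run. A start script could fall back to the un-prefixed `/lab` URL when the variable is missing. The sketch below is an assumption about what such a fallback might look like; `lab_page_url` is a hypothetical helper and is not taken from the Pangeo start script.

```python
import os


def lab_page_url(environ=None):
    """Build the JupyterLab page URL, tolerating a missing JUPYTERHUB_USER."""
    environ = os.environ if environ is None else environ
    user = environ.get("JUPYTERHUB_USER", "").strip()
    if not user:
        # No hub user (e.g. a local repo2docker run): avoid emitting /user//lab.
        return "/lab"
    return f"/user/{user}/lab"


# Local run without a hub, then a run with an assumed example user name.
print(lab_page_url({}))
print(lab_page_url({"JUPYTERHUB_USER": "example-user"}))
```

JupyterHub also exports JUPYTERHUB_SERVICE_PREFIX for single-user servers, which may be a more direct source for the URL prefix than rebuilding it from the user name.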

Author

d-diaz commented Feb 11, 2019

I've posted an issue on repo2docker to see if anyone there can help troubleshoot what's going on with the Dockerfile buildpack. I'm pretty confident now that the problem is simply that the environment variable JUPYTERHUB_USER is not being passed to the docker container when it is run and executes the start.sh script.
