Existing stopped container is not removed when newly requested image is different and remove = False #423

Open
dijksterhuis opened this issue Mar 30, 2021 · 5 comments

Comments

@dijksterhuis

Bug description

A previously stopped container 1, created from image A, is not removed when the user selects a different image B in the options form. Container 1 should be removed so that container 2 can be created from image B. Instead, container 1 is simply restarted.

Potential Fix

An additional if block can be added around here:

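# If the stopped container was created from a different image than the one
# just requested, remove it so a fresh container is created from the new image.
if obj and obj["Config"]["Image"] != image:
    self.log.warning(
        "Removing %s of previously selected image %s that should be cleaned up: %s (id: %s)",
        self.object_type,
        obj["Config"]["Image"],  # image of the existing container, not the new request
        self.object_name,
        self.object_id[:7],
    )
    await self.remove_object()
    # Reset so the code that follows creates a new container.
    obj = None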

I can put together a PR. I tested this by editing the current master branch install on the host, and it works well (although journalctl doesn't seem to be logging things correctly).
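The decision proposed above can be sketched as a small pure function that is testable without Docker. This is only an illustration under assumptions: `should_remove_existing` is a hypothetical helper, not part of dockerspawner, and the image names are made up. The `obj` dict mimics the shape of a container inspect result used in the snippet above (`obj["Config"]["Image"]`).

```python
from typing import Optional


def should_remove_existing(obj: Optional[dict], requested_image: str, remove: bool) -> bool:
    """Decide whether an existing stopped container should be removed before spawning.

    Hypothetical helper for illustration only; not dockerspawner API.
    """
    if obj is None:
        # No existing container, so there is nothing to remove.
        return False
    if remove:
        # remove=True: the old container is always discarded.
        return True
    # remove=False: discard only when the container came from a different image.
    return obj["Config"]["Image"] != requested_image


# Hypothetical inspect result for a stopped container created from image A.
stopped = {"Config": {"Image": "example/image-a:latest"}}

print(should_remove_existing(stopped, "example/image-b:latest", remove=False))
print(should_remove_existing(stopped, "example/image-a:latest", remove=False))
```

With `remove=False`, the first call corresponds to the case reported in this issue (different image requested), and the second to a plain restart of the same image.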

Expected behaviour

A user stopping their server and selecting image B should cause the existing stopped container 1 based on image A to be removed and a new container 2 created based on image B.

Actual behaviour

A user stops their server and selects image B but existing container 1 is restarted.

How to reproduce

Use c.Spawner.remove = False and c.Spawner.allowed_images = {image_name_a: image_a, image_name_b: image_b}, with the JupyterLab IDE.

  1. User clicks Start Server and selects image A in the option form.
  2. User waits for server to load.
  3. User goes to File > Hub Control Panel > Stop My Server.
  4. User clicks Start Server and selects image B in the option form.
  5. User waits for server to load.
  6. User is presented with container 1 as before (the original container 1 is not removed).
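
For reference, a minimal config fragment matching the setup described above might look like the following. It is a sketch under assumptions: the image names are hypothetical placeholders, and it uses the plain `DockerSpawner` class rather than the `SystemUserSpawner` from the full configuration below.

```python
# jupyterhub_config.py (fragment). `get_config` is injected by JupyterHub
# when it loads the config file, so this is not runnable as a standalone script.
c = get_config()  # noqa: F821

c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"

# Keep stopped containers around instead of deleting them on stop.
c.DockerSpawner.remove = False

# Offer two images in the spawn options form (hypothetical image names).
c.DockerSpawner.allowed_images = {
    "Image A": "example/image-a:latest",
    "Image B": "example/image-b:latest",
}
```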

Your personal set up

  • OS: Ubuntu 18.04
  • Version(s): On host:
$ docker --version && pip3 list | grep jupy && pip3 list | grep docker
Docker version 20.10.3, build 48d30b5
jupyter-client                6.1.3
jupyter-core                  4.6.3
jupyter-telemetry             0.1.0
jupyterhub                    1.1.0
jupyterhub-dummyauthenticator 0.3.1
jupyterlab                    2.1.3
jupyterlab-server             1.1.5
docker                        4.2.0
dockerspawner                 12.0.0
  • Full environment

Container installs (we use custom images available at dockerhub and/or GitHub):

notebook==6.1.5
jupyter==1.0.0
jupyterhub==1.1.0
jupyter_core==4.6.1
jupyterlab==1.2.1
ipython==7.9.0
jupyter-tensorboard
ipykernel==5.0.0
  • Configuration

Output of: $ grep -v '\(^#\|^[[:space:]]*$\)' /etc/jupyterhub/config.py

import os
import requests
from jupyter_client.localinterfaces import public_ips
from jupyterhub.auth import PAMAuthenticator as Auth
SRV = "/srv/jupyterhub/"
VAR = "/var/opt/jupyterhub/"
CONF = "/etc/jupyterhub"
USR = "/usr/local/bin"
c = get_config()
c.JupyterHub.allow_named_servers = True
c.JupyterHub.cookie_secret_file = os.path.join(VAR, 'jupyterhub_cookie_secret')
c.JupyterHub.cookie_max_age_days = 60.0
c.JupyterHub.pid_file = os.path.join(VAR, 'jupyterhub.pid')
c.JupyterHub.named_server_limit_per_user = 4
c.JupyterHub.admin_access = True
c.JupyterHub.db_url = os.path.join('sqlite:/', VAR, 'jupyterhub.sqlite')
c.JupyterHub.hub_ip = public_ips()[0]
c.JupyterHub.bind_url = 'http://127.0.0.1:8000/jupyterhub/'
c.JupyterHub.hub_port = 8080
c.JupyterHub.proxy_api_ip = '0.0.0.0'
secret_token = os.urandom(64)
c.ConfigurableHTTPProxy.auth_token = secret_token.hex()
c.ConfigurableHTTPProxy.pid_file = os.path.join(VAR, "jupyterhub-proxy.pid")
c.JupyterHub.logo_file = os.path.abspath(os.path.join(SRV, '<logo_file>.png'))
c.JupyterHub.authenticator_class = Auth
c.Authenticator.admin_users = {'<admin user name>'}
c.Authenticator.create_system_users = False
c.Authenticator.delete_invalid_users = True
c.JupyterHub.spawner_class = 'dockerspawner.SystemUserSpawner'
c.Spawner.debug = True
c.Spawner.remove = True
c.Spawner.name_template = "{username}-{servername}"
def get_image_allowlist(Spawner):
    """
    Load the allowlist file and provide the mappings as options to users
    """
    allowlist_file_path = os.path.join(CONF, 'image_allowlist.txt')
    with open(allowlist_file_path, 'r') as f:
        data = f.readlines()
    data = [i.rstrip('\n').split(',') for i in data if i != '\n']
    image_allowlist = {k: v.replace(' ', '') for k, v in data}
    return image_allowlist

c.Spawner.allowed_images = get_image_allowlist

# entrypoint.sh will always call start.sh from docker stacks once it's done doing any fancy business
spawn_cmd = os.environ.get('DOCKER_SPAWN_CMD', "entrypoint.sh")
c.Spawner.extra_create_kwargs.update({'command': spawn_cmd})

# need to mount and chown extra volume/bind mounts
c.Spawner.run_as_root = True
c.Spawner.mem_guarantee = 17179869184
c.Spawner.cpu_guarantee = 10.0
c.Spawner.extra_host_config = {
    'runtime': 'nvidia',
    'pid_mode': 'host',
    'shm_size': '16G',
}
c.Spawner.volumes = {
    '<host_volume_name>': '/shares/local/<mount point>',
    '<host_bind_mount_path>': '/shares/network/<mount_point>',
}
  • Logs

journalctl -u jupyterhub since last hub restart:

Mar 29 23:52:31 <servernamee> systemd[1]: Started JupyterHub.
Mar 29 23:52:31 <servernamee> jupyterhub[25964]: [W 2021-03-29 23:52:31.947 JupyterHub app:678] JupyterHub.proxy_api_ip is deprecated in JupyterHub 0.8, use ConfigurableHTTPProxy.api_url
Mar 29 23:52:31 <servernamee> jupyterhub[25964]: [I 2021-03-29 23:52:31.947 JupyterHub app:2240] Running JupyterHub version 1.1.0
Mar 29 23:52:31 <servernamee> jupyterhub[25964]: [I 2021-03-29 23:52:31.948 JupyterHub app:2271] Using Authenticator: jupyterhub.auth.PAMAuthenticator-1.1.0
Mar 29 23:52:31 <servernamee> jupyterhub[25964]: [I 2021-03-29 23:52:31.948 JupyterHub app:2271] Using Spawner: dockerspawner.systemuserspawner.SystemUserSpawner-12.0.0
Mar 29 23:52:31 <servernamee> jupyterhub[25964]: [I 2021-03-29 23:52:31.948 JupyterHub app:2271] Using Proxy: jupyterhub.proxy.ConfigurableHTTPProxy-1.1.0
Mar 29 23:52:31 <servernamee> jupyterhub[25964]: [I 2021-03-29 23:52:31.955 JupyterHub app:1349] Loading cookie_secret from /var/opt/jupyterhub/jupyterhub_cookie_secret
Mar 29 23:52:31 <servernamee> jupyterhub[25964]: [W 2021-03-29 23:52:31.969 JupyterHub configurable:168] Config option `delete_invalid_users` not recognized by `PAMAuthenticator`.
Mar 29 23:52:31 <servernamee> jupyterhub[25964]: [I 2021-03-29 23:52:31.987 JupyterHub app:1655] Not using whitelist. Any authenticated user will be allowed.
Mar 29 23:52:32 <servernamee> jupyterhub[25964]: [I 2021-03-29 23:52:32.004 JupyterHub app:2311] Initialized 0 spawners in 0.005 seconds
Mar 29 23:52:32 <servernamee> jupyterhub[25964]: [W 2021-03-29 23:52:32.006 JupyterHub proxy:643] Running JupyterHub without SSL.  I hope there is SSL termination happening somewhere else...
Mar 29 23:52:32 <servernamee> jupyterhub[25964]: [I 2021-03-29 23:52:32.006 JupyterHub proxy:646] Starting proxy @ http://127.0.0.1:8000/jupyterhub/
Mar 29 23:52:32 <servernamee> jupyterhub[25964]: 23:52:32.405 [ConfigProxy] info: Proxying http://127.0.0.1:8000 to (no default)
Mar 29 23:52:32 <servernamee> jupyterhub[25964]: 23:52:32.408 [ConfigProxy] info: Proxy API at http://0.0.0.0:8001/api/routes
Mar 29 23:52:32 <servernamee> jupyterhub[25964]: 23:52:32.877 [ConfigProxy] info: 200 GET /api/routes
Mar 29 23:52:32 <servernamee> jupyterhub[25964]: [I 2021-03-29 23:52:32.878 JupyterHub app:2556] Hub API listening on http://<ip addr>:8080/jupyterhub/hub/
Mar 29 23:52:32 <servernamee> jupyterhub[25964]: 23:52:32.879 [ConfigProxy] info: 200 GET /api/routes
Mar 29 23:52:32 <servernamee> jupyterhub[25964]: [I 2021-03-29 23:52:32.879 JupyterHub proxy:320] Checking routes
Mar 29 23:52:32 <servernamee> jupyterhub[25964]: [I 2021-03-29 23:52:32.879 JupyterHub proxy:400] Adding default route for Hub: /jupyterhub/ => http://<ip addr>:8080
Mar 29 23:52:32 <servernamee> jupyterhub[25964]: 23:52:32.880 [ConfigProxy] info: Adding route /jupyterhub -> http://<ip addr>:8080
Mar 29 23:52:32 <servernamee> jupyterhub[25964]: 23:52:32.881 [ConfigProxy] info: Route added /jupyterhub -> http://<ip addr>:8080
Mar 29 23:52:32 <servernamee> jupyterhub[25964]: 23:52:32.881 [ConfigProxy] info: 201 POST /api/routes/jupyterhub
Mar 29 23:52:32 <servername> jupyterhub[25964]: [I 2021-03-29 23:52:32.882 JupyterHub app:2631] JupyterHub is now running at http://127.0.0.1:8000/jupyterhub/
Mar 29 23:52:36 <servername> jupyterhub[25964]: [I 2021-03-29 23:52:36.567 JupyterHub log:174] 200 GET /jupyterhub/hub/home (<username>@<user IP>) 78.24ms
Mar 29 23:52:37 <servername> jupyterhub[25964]: [I 2021-03-29 23:52:37.937 JupyterHub log:174] 200 GET /jupyterhub/hub/spawn/<username> (<username>@<user IP>) 8.33ms
Mar 29 23:52:42 <servername> python3[25964]: pam_loginuid(login:session): Error writing /proc/self/loginuid: Operation not permitted
Mar 29 23:52:42 <servername> python3[25964]: pam_loginuid(login:session): set_loginuid failed
Mar 29 23:52:42 <servername> python3[25964]: pam_unix(login:session): session opened for user <username> by (uid=0)
Mar 29 23:52:42 <servername> jupyterhub[25964]: [W 2021-03-29 23:52:42.665 JupyterHub auth:956] Failed to open PAM session for <username>: [PAM Error 14] Cannot make/remove an entry for the specified session
Mar 29 23:52:42 <servername> jupyterhub[25964]: [W 2021-03-29 23:52:42.666 JupyterHub auth:957] Disabling PAM sessions from now on.
Mar 29 23:52:42 <servername> jupyterhub[25964]: [I 2021-03-29 23:52:42.837 JupyterHub dockerspawner:942] Container '<username>-' is gone
Mar 29 23:52:43 <servername> jupyterhub[25964]: [I 2021-03-29 23:52:43.261 JupyterHub dockerspawner:1167] Created container <username>- (id: 8f74cbc) from image <imageA>
Mar 29 23:52:43 <servername> jupyterhub[25964]: [I 2021-03-29 23:52:43.261 JupyterHub dockerspawner:1190] Starting container <username>- (id: 8f74cbc)

The second container start doesn't seem to be caught by the logs, and/or our server's journalctl isn't updating.

@dijksterhuis dijksterhuis added the bug Something isn't working label Mar 30, 2021
@welcome

welcome bot commented Mar 30, 2021

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

@manics
Member

manics commented Mar 30, 2021

There was a discussion on the community forum recently:
https://discourse.jupyter.org/t/dockerspawner-container-deletion-by-user-choice/8024/4
Please add your thoughts there since it's a good forum for getting input from the rest of the community.

Setting remove = True should always remove the existing container. The problem with only removing the container if a different image is requested is that a user may not know whether their container will be deleted or not. It's therefore better to always remove the container and document which directories can be used for persistent data.

@dijksterhuis
Author

Hey @manics hadn't seen that forum post so thanks for the link. Was an interesting read. Will add a new post (and link back here) as it's not exactly the same. Still think that this is weird/unexpected behaviour that should be handled if remove = False is made available as a configuration option.

@meeseeksmachine

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/dockerspawner-undocumented-behaviour-remove-false-does-not-replace-existing-image-a-with-new-image-b/8576/1

@minrk
Member

minrk commented Apr 2, 2021

Summarizing what I think we should do from the forum:

  1. better document that remove=False is really just for testing and requires manual intervention to delete data
  2. warn when using remove=False about its pitfalls, since it's not really what should be used
  3. persist a user home directory volume by default, in a manner that's easily overridden
  4. switch to remove=True as the default, once home directories are persisted

There is no 'home directory' persistence scheme that will work in general, but we can pick a default that works for jupyter docker-stacks and make it easy to override. KubeSpawner's been doing this for ages and it works great.
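
Until such a default exists, the same effect can be approximated in configuration today. The fragment below is a sketch, assuming docker-stacks images (where the user is `jovyan` and the home directory is `/home/jovyan`); the volume name template and mount path are illustrative choices, not a proposed default.

```python
# jupyterhub_config.py (fragment). `get_config` is injected by JupyterHub
# when it loads the config file, so this is not runnable as a standalone script.
c = get_config()  # noqa: F821

# Always delete the container on stop, so a newly selected image takes effect.
c.DockerSpawner.remove = True

# Persist user files in a per-user named Docker volume that outlives the container.
# The path assumes a docker-stacks image; adjust it for custom images.
notebook_dir = "/home/jovyan/work"
c.DockerSpawner.notebook_dir = notebook_dir
c.DockerSpawner.volumes = {"jupyterhub-user-{username}": notebook_dir}
```

With this arrangement, only what is written under the mounted directory survives a server restart; anything else in the container filesystem is discarded together with the container.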

@minrk minrk added enhancement and removed bug Something isn't working labels Apr 30, 2021