Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for postBuildAdmin #1009

Open
g-braeunlich opened this issue Jan 28, 2021 · 24 comments · May be fixed by #1010
Open

Add support for postBuildAdmin #1009

g-braeunlich opened this issue Jan 28, 2021 · 24 comments · May be fixed by #1010

Comments

@g-braeunlich
Copy link
Contributor

Proposed change

In addition to postBuild script (userspace) also process postBuildAdmin scripts (system space).

Alternative options / Who would use this feature?

See discussions in #487 and https://gitter.im/jupyterhub/binder?at=5fce35b6fb7f155587ad481a

@welcome
Copy link

welcome bot commented Jan 28, 2021

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other other community members to contribute more effectively.
welcome
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

@g-braeunlich g-braeunlich linked a pull request Jan 28, 2021 that will close this issue
@betatim
Copy link
Member

betatim commented Jan 29, 2021

My initial reaction is that we shouldn't support this because it is not a frequently used feature and it adds another "magic" file.

repo2docker itself tries to support frequently used community practice workflows. And making it as effortless as possible for the user if they follow the community practices. This means we choose the trade-off in favour of the frequent and against the rare. For rare or expert use cases there is the escape hatch of providing your own Dockerfile.

The magic files we already have in repo2docker are one of the things we regret the most because they are totally custom. Which means they make things more complicated and are less well thought out. So we are trying to not add new ones unless there really, really is no other way.

This means: can we find more use cases where a user needs this feature and there is no alternative? For example the user in the linked gitter chat seemed happy to use a Dockerfile. Are there some specific cases in #487 that we could cover with this? Asking because in my mind "I want to change the base image" is often about "I don't want Ubuntu but X" which we can't solve with a postAdminBuild.

@trybik
Copy link

trybik commented Jan 29, 2021

This is the use-case that started this thread:

We want users (effectively jovian user), to be able to modify runtime environment; in particular, to run sudo apt-get update/install ... over Terminal in the Notebook Server session but also maybe install other scientific libraries which don't really support easily user installation.

#487 (comment)

The postBuildAdmin solution was suggested and briefly discussed w/ @manics in the #487 thread (cf. #487 (comment) ).

In both this and ImageMagic cases it's not about change of the actual base image but about admin-only config.

@manics
Copy link
Member

manics commented Jan 29, 2021

I think part of the issue is that repo2docker allows you to install operating system packages using apt, and since it's at the OS level that means it's often only configurable by root. As an aside, there's another future problem here- if the base image is upgraded from

FROM buildpack-deps:bionic

to the next LTS then since packages may have been dropped or undergone breaking changes this could result in binder repositories failing to build or run.

Compare this with most (maybe all?) of the other buildpacks which can be run as a non-admin.

@betatim
Copy link
Member

betatim commented Jan 31, 2021

My view on using sudo hasn't changed from what Yuvi wrote in #192 (comment)

As manics points out: the more we allow people to interact with the base image/distro the harder it becomes to change it. At some point bionic won't be supported any more and we will have to change. All this makes me think the answer to "I want to use distro level packages" and the resulting "I need sudo to edit system level packages" should be to reduce the need/desire to do this. For example by encouraging the use of package managers that install things in user editable directories (conda, nix, homebrew for linux?). This means I'd be excited to explore what those options would look like, how much they are used, etc.

Right now my feeling is that needing sudo is rare and that most of those wanting to use sudo shouldn't be wanting to use it. For the rest it seems like they are doing advanced/custom/one off stuff and it is Ok to point them to use a Dockerfile. I use "seems like", "appears like" and "my feeling" because I don't have any data to point to and I don't know how we could get some either. My feeling comes from thinking about support requests/PRs I've seen.

@trybik
Copy link

trybik commented Feb 1, 2021

Right, in principle apt.txt falls into the same bucket as the proposed here postBuildAdmin. I guess the main incentive to use either of these, at least in these "advanced/custom/one off" solutions, would be to stay up to date with r2d-generated Dockerfile. In that sense, the potentially breaking changes, like LTS update, would be a desired part of the general toolchain migration task (incl. discovery of repos that need update as well).

Considering alternatives for being able to change live runtime env by user (except maybe for some hard-core sci packages, that anyway require sudo installations) the most obvious one, other than a custom Dockerfile, would be to always use conda, right? Even to the extent that conda is actually always seamlessly used in the r2d-generated Dockerfile. The downsides I can potentially see is that conda introduces noticeable overhead on both image size and build time (*), but I bet you've discussed this before. Could you weigh-in on that?

(*) possibly less of an issue with mamba

@trybik
Copy link

trybik commented Feb 1, 2021

One more remark: to solve an issue of mass breaking build of repos in central solution like mybinder.org on r2d update (with potentially build breaking changes), an option would be to add spec of r2d version to use for repo build. (This could also include allowing to use e.g. r2d forks, though it seems somehow a dangerous idea.)

@minrk
Copy link
Member

minrk commented Feb 3, 2021

From a technical perspective, if we do support this feature, I think it should be through making sudo work in post-build, not adding a new file (as discussed in #192). That changes the discussion of the issue somewhat because it eliminates the 'new r2d-specific file' topic, and more to a capabilities discussion: should arbitrary root commands be allowed.

I'm +/- 0 at this point: I don't think there's a high cost to allowing it. Since we do allow custom Dockerfiles, no security argument can be made about it.

@trybik allowing request of a repo2docker version is absolutely a goal, see #490. Specifying a fork less so, and at least wouldn't be supported on mybinder.org for the reasons you mentioned.

@minrk
Copy link
Member

minrk commented Feb 3, 2021

Could you weigh-in on that?

Speed and size costs of using conda definitely vary a lot. Mamba has no effect on size, but does significantly improve dependency resolution time. imagemagick is a bit of a worst case scenario because:

  • it has lots of dependencies
  • most of those dependencies are not in the base environment, so need to be downloaded
  • it is actually already on the system (I'm not sure why, this surprised me), so adding it to apt.txt costs ~nothing and ~all dependencies are redundant

As a result, on my laptop adding:

dependencies:
- imagemagick

took an extra 90 seconds to build, and increased image size from 2.1GB to 3.2GB, an increase of 50% for one package.

@trybik
Copy link

trybik commented Feb 3, 2021

@minrk that's great input, thanks

So for practical (mostly image size) reasons it seems like supporting apt.txt makes perfect sense (I'm all for it). Then, it would also make sense to allow sudo to be able to at least configure whatever is provided via apt-get; or, in more general whatever other system settings need to be adjusted (like /etc/sudoers as in our case). To both ends /etc write access should suffice, but I guess technically it's equivalent to allowing arbitrary root commands.

making sudo work in post-build

So e.g. add $NBUSER to /etc/sudoers, run postBuild, rm $NBUSER from /etc/sudoers? If the last one would be skip, then it would sort out our issue: user being able to modify system during runtime.

@trybik
Copy link

trybik commented Feb 5, 2021

Again, one more follow-up remark:

should arbitrary root commands be allowed

If to be allowed, IMVHO postBuildAdmin is the approach to take rather than making sudo work in postBuild.

It creates a (micro-scale) separation of concerns. Here, separating all potentially image-breaking specifications (image-breaking because of either not being future-proof wrt r2d changes, or by being just bad usage practices, where a clean non-sudo solution should be employed). It's easier to put a big warning sign on postBuildAdmin saying, as @manics nicely put it in the other thread:

post-build-admin has the same support level as Dockerfile, i.e. not much.

Note: such warning is currently missing from the PR, but there seem no to be an equivalent one on Dockerfile too (at least not in config_files.rst).

@minrk
Copy link
Member

minrk commented Feb 8, 2021

separating all potentially image-breaking specifications

Given how much we install in the user-space, I don't think that's true. It's really easy to create a broken image with postBuild already. So it's more a question of which parts you can break, and there is a lot more that's relevant and required by r2d and breakable by the user than there is that requires root access.

post-build-admin has the same support level as Dockerfile, i.e. not much.

I'd say postBuild is already at this level of support. We've yet to make any real guarantees about what's available or going to be available in this scope, other than "it's run at the end of whatever else r2d does". It's very possible to break large chunks of functionality with conda install commands (or even pip).

@manics
Copy link
Member

manics commented Feb 8, 2021

Based on the above discussion I'm in favour of a postBuildAdmin script:

  • It seems like we'll need OS level setup such as as apt.txt for the forseeable future, so it seems reasonable to allow a script to be run as root for configuring a system
  • A separate script is easier than supporting sudo inside postBuild. Using
     USER root
     RUN postBuildAdmin
     USER ${NB_USER}
    
    is cleaner than enabling then disabling sudo, especially since sudo isn't installed by default

@minrk
Copy link
Member

minrk commented Feb 11, 2021

I still think that if this needs a new special repo2docker-only file, it's not a feature we should add.

is cleaner than enabling then disabling sudo

I also think we should not do that. If we allow sudo, it should be allowed, both in build and run. If we disallow it, it should be disallowed in both. That is:

  1. if sudo is allowed, add $NB_USER to sudoers
  2. post-build

that's all.

@trybik
Copy link

trybik commented Feb 16, 2021

It's really easy to create a broken image with postBuild already.

What I meant is "all potentially image-breaking specifications" on inevitable update or change of the base image. That's very likely also not true statement, although I guess the "almost all" affected specs would be in apt.txt or postBuildAdmin.

I still think that if this needs a new special repo2docker-only file, it's not a feature we should add. [...] If we allow sudo, it should be allowed, both in build and run.

It seems more safe to go about it in steps. If the postBuildAdmin does not work out in practice as a useful generalisation of apt.txt spec, you stay on the safe side and could deprecate it. Otherwise, you could deprecate apt.txt and take a step further by allowing sudo in postBuild (and runtime) and eventually deprecating also postBuildAdmin.

@g-braeunlich
Copy link
Contributor Author

Hey guys.
Quick reminder to this PR.
@betatim It is still controversial, I guess?

@manics
Copy link
Member

manics commented Mar 15, 2021

@g-braeunlich Unfortunately most of us are lacking in time (as always....). Since there are differing views amongst the maintainers we'll need to come to a consensus. This should eventually happen, but if you're interested in pushing this forward we have monthly JupyterHub meetings in alternating timezones that are open to everyone.

The next one is on Thursday 18 March 17:00 UTC: jupyterhub/team-compass#378
If you want to join please add your name and a point under the agenda!

@g-braeunlich
Copy link
Contributor Author

Hey @manics
Thank you very much for the invitation!
Unfortunately I wont be able to join. But maybe this will get sorted out somehow. If not this time, maybe the next time.

@jabbera
Copy link

jabbera commented Mar 23, 2021

Use at your own risk. I cherry picked @g-braeunlich PR into current master: jabberaa/repo2docker-admin:latest

@manics
Copy link
Member

manics commented Mar 24, 2021

If anyone wants to talk about this at the next meeting it's on Thursday 15 April 08:00 UTC: jupyterhub/team-compass#388

@jabbera
Copy link

jabbera commented Mar 24, 2021

5 am my time. Unless my kids are a total nightmare I'll just say that something like this is required and leave it at that.

@jabbera
Copy link

jabbera commented Mar 27, 2021

I'm likely going to fork this and allow for base image selection as well. The images that are being minted by repo2docker for our use case are enormous with no common layers. Having the ability to use our "stock" (docker-stack + special sauce) base images will greatly decrease pull/startup time as each user gets a dedicated node that comes with our base images.

@ctr26
Copy link

ctr26 commented Aug 12, 2021

I'd be very interested in this feature. For instance nvidia drivers and for linking folders outside of the home dir.

@consideRatio
Copy link
Member

consideRatio commented Jun 16, 2023

To add any new repo2docker configuration file adds work for sure, but I think supporting postBuildAsRoot or similar would be easy to implement, document, test, and maintain over time, and be a powerful and useful escape hatch.

I've hesitated on using repo2docker because the current limitation. Both because of concern that I would end up needing the root escape hatch, and because I knew I had to compromise not installing something such as gcloud.

With this in mind, I'm in clear support of adding a new file like postBuild but that is run as root. If there is support for this, I'm volunteering to implement it with docs and tests etc.

With regards to the discussion of either adding a new file or installing sudo and granting sudo rights, I currently think of adding a new file as a more robust and less invasive option. A key reasons for me appreciating having a dedicated file is that inspecting a folder that repo2docker will build allows you to quickly see if something is to be done as root with a dedicated file.

If there is agreement on adding support for such file, I suggest postBuildAsRoot as a file name as I think that describes what it is quite well and also helps convey that postBuild may be as non-root normal user (jovyan). Thinking of how I would write the docs for this file, I'd want to write something like "This is like the postBuild file, but executed as the root user after postBuild has executed."

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants