Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FR: Ability to configure the app fully via environment variables #1073

Open
samip5 opened this issue Feb 27, 2023 · 8 comments
Open

FR: Ability to configure the app fully via environment variables #1073

samip5 opened this issue Feb 27, 2023 · 8 comments

Comments

@samip5
Copy link

samip5 commented Feb 27, 2023

What is the feature about?

Ability to configure the container in a stateless manner

What would it require?

It would require the ability to specify all the things in the config file via environment variables.

Extra context

This would be a huge improvement for a Kubernetes environment.

@joeskeen
Copy link
Contributor

joeskeen commented Dec 8, 2023

This feels like a wider scope than #1048 but I think it gets at the heart of the same thing.

I would like to run this in a container without needing to mount a config volume or baking settings into my own image (it's easier to trust an app that you know is unmodified, running from the official image), and the way to do that in the container world would be environment variables.

As far as doing it in a backwards-compatible way (as mentioned by @elrido here), I've looked through how the config is applied and I'll have to do more testing but I think I can implement this in the following way:

  • Every setting starts out set to the value defined in the default configuration
  • If the custom config file cfg/conf.php exists, it's read/parsed as INI file (as it currently is)
    • for each setting set in that file, it overwrites the setting populated by the default
  • Then the environment is checked for each setting, which settings would be named by convention (for example, _configuration['main']['name'] could translate to the environment variable PRIVATEBIN_MAIN_NAME. If the environment variable is set, it would overwrite the setting populated by either the defaults.

That way, all your settings would effectively be the environment variables, with a fallback of the custom config, with a fallback of the defaults.

I'd be happy to implement this and submit a PR.

@elrido
Copy link
Contributor

elrido commented Dec 9, 2023

Serious question: Could you point me at some articles that explain benefits of using environment variables in a container environment over files?

I was always told that using config files attached read-only to a container are considered safer than using environment variables that may influence all kinds of services before reaching your web app, can't be read in a thread-safe way on Linux and expose you to additional risks like injecting commands or buffer overflows. A read-only file would not be able to get manipulated from an exploited application (while env vars can be set at run-time) and at least we only open ourselves to attacks possible due to flaws of PHP's ini parser. And in Kubernetes, either env vars or files can be attached to a pod using secrets, so AFAIK they are both equally (in-)conveniant. Am I overthinking this or simply not aware of some crucial benefit of env vars over read-only-files?

@joeskeen
Copy link
Contributor

joeskeen commented Dec 9, 2023

I'm not sure I can authoritatively prove that there are any benefits of using one over the other, but for people that have built their systems around environment variable configuration, it can be painful if there isn't a way to use them. That's why I was hoping my proposal could equally accommodate either use case.

@elrido
Copy link
Contributor

elrido commented Dec 10, 2023

I feel there is a learning opportunity for me here. As a more concrete example, here is how I have run the PB image with a custom config on k3s environment (I did this to test the darkmode config in issue 167):

---
apiVersion: v1
kind: Secret
metadata:
  name: privatebin-config
data:
  conf.php: |
    ;<?php http_response_code(403); /*
    [main]
    template = "bootstrap-dark"
    syntaxhighlightingtheme = "sons-of-obsidian"
    [model]
    [model_options]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: privatebin-deployment
  labels:
    app: privatebin
spec:
  volumes:
    - name: cfg
      secret:
        secretName: privatebin-config
[...]
  replicas: 3
  selector:
    matchLabels:
      app: privatebin
  template:
    metadata:
      labels:
        app: privatebin
    spec:
      securityContext:
        runAsUser: 65534
        runAsGroup: 82
        fsGroup: 82
      containers:
      - name: privatebin
        image: privatebin/nginx-fpm-alpine:stable
        ports:
        - containerPort: 8080
[...]
        volumeMounts:
        - mountPath: /srv/cfg
          name: cfg
[...]

Hypothetically speaking, if we'd had env var support, here is how I assume the k8s YAML would look like:

---
apiVersion: v1
kind: Secret
metadata:
  name: privatebin-config
data:
  template: "bootstrap-dark"
  syntaxhighlightingtheme: "sons-of-obsidian"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: privatebin-deployment
  labels:
    app: privatebin
spec:
  volumes:
[...]
  replicas: 3
  selector:
    matchLabels:
      app: privatebin
  template:
    metadata:
      labels:
        app: privatebin
    spec:
      securityContext:
        runAsUser: 65534
        runAsGroup: 82
        fsGroup: 82
      containers:
      - name: privatebin
        image: privatebin/nginx-fpm-alpine:stable
        ports:
        - containerPort: 8080
        env:
        - name: TEMPLATE
          valueFrom:
            secretKeyRef:
              name: privatebin-config
              key: template
        - name: SYNTAXHIGHLIGHTINGTHEME
          valueFrom:
            secretKeyRef:
              name: privatebin-config
              key: syntaxhighlightingtheme
        - name: TZ
          value: Antarctica/South_Pole
        - name: PHP_TZ
          value: Antarctica/South_Pole
[...]

Is this what you would imagine doing or do you configure/apply your env vars differently?

Using a ConfigMap instead of a Secret would not make much difference syntactically, but since the config might contain a DB credential, a secret probably makes more sense to use, here.

To me, both of these are just a single YAML file in my config management (possibly a templated one, for injecting values from outside the role/module). The env var also seems like a more complicated way of using these, hence I assume I may be using them wrong or in a non-efficient way and therefore my question.

@rugk
Copy link
Member

rugk commented Dec 13, 2023

Hmm AFAIK env configuration is best practise for containers. Theoretically it may be worse, but well… everyone does it. So I fully support this proposal.

It is also not mentioned on the OWASP guide:
https://cheatsheetseries.owasp.org/cheatsheets/Docker_Security_Cheat_Sheet.html

Regarding the risks you mentioned:

  • injecting commands would be the worse. Your link however applies to setuid binaries, which we don't are. Also, it just depends on how the env is handled in the application.
  • Buffer overflows generally, while interesting, also only apply to C software or similar. And of course they could also happen (see example 2 there) when you read a (config) file or whatever.

So IMHO these are no arguments against env vars.
Generally, any config or env vars should be treated as untrusted as they come from "outside". But same with config files…

A read-only file would not be able to get manipulated from an exploited application (while env vars can be set at run-time)

While theoretically yes, what is the threat model here? If an attacker has compromised the container, they can modify env vars, yeah likely and can also restart the PHP etc. But they already have access to all the data inside the container and can make network calls to the DB etc.

attacks possible due to flaws of PHP's ini parser

IMHO kind of out-of-scope. We have to trust our dependencies are somewhat secure against such things like buffer overflows, and here I think it really is (PHP is a big project and ini parsing should not be that complex).

And in Kubernetes, either env vars or files can be attached to a pod using secrets, so AFAIK they are both equally (in-)conveniant.

Docker Swarm and Docker compose have similar ways to handle secrets.
Also some images like MySQL have _FILE suffix, where they read a env var from a file instead. For secrets, I'd agree, this is useful as env vars can be accessed more easily than files usually. (Every application in the container gets access to it.)

But for non-secret config vars, the risk of data manipulation is IMHO irrelevant. After all, an attacker could possibly modify the config or such things, but they can do worse. If they can accidentally read an env var somehow (as they have non-persistent code execution or something), I agree this would be a risk if that contains a sensitive value. (though realistically also that is not a big risk)

@rugk
Copy link
Member

rugk commented Dec 13, 2023

Regarding the benefit: I guess it is with Docker-compose you have flexible ways for setting your env vars. You can separate the envs in an env file etc., create files overwriting etc. This is useful to git-commit non-sensitive files and have sensitive config being included in a file on your .gitignore file. This roughly follows the "everything as a code" mantra, so you have your config versioned and can automate deployments etc. (Without having secrets in your git repository, which is not recommend).

It just makes configuration much more flexible and is really best practice. It could also be included in the docker image, where an entrypoint script takes the envs and writes them to the config file. So PrivateBin PHPs code would not have to be adjusted. (I have seen many other images doing that, also some only offer this as "initial configuration", so no change in settings is applied afterwards. E.g. IIRC Nextcloud does it like this.)

@elrido
Copy link
Contributor

elrido commented Dec 14, 2023

Just to clarify, I'm not opposed to any PRs anyone might raise to add support for env var configuration - I just try to educate myself on why these are so popular, as I always perceived them as a risk and less convenient than a simple flat file that I can template using ansible, puppet or chef.

If anyone wants to add this, I don't think it would be too difficult to do so in lib/Configuration.php. It already has an array of all the defaults:

private static $_defaults = array(
'main' => array(
'name' => 'PrivateBin',

These get iterated over, and if the value isn't found in the config file, the default is inserted - as an extra step, you'd add a call to check if a corresponding env var existed, i.e. by upper casing 'PRIVATEBIN_' . $section . '_' . $key to turn i.e. ['main']['name'] into the env var key PRIVATEBIN_MAIN_NAME or similar:

foreach (self::getDefaults() as $section => $values) {
// fill missing sections with default values
if (!array_key_exists($section, $config) || count($config[$section]) == 0) {
$this->_configuration[$section] = $values;

// check for missing keys and set defaults if necessary
else {
foreach ($values as $key => $val) {
if ($key == 'dir') {
$val = PATH . $val;
}
$result = $val;
if (array_key_exists($key, $config[$section])) {
if ($val === null) {
$result = $config[$section][$key];


On the other questions, regarding risks as I understood them:

A read-only file would not be able to get manipulated from an exploited application (while env vars can be set at run-time)

While theoretically yes, what is the threat model here?

Ok, good question, I should have established my assumptions first. From having done some kubernetes security trainings, where you break into the container, then out of the container onto the host and then learn what best-practices you can apply on each level to mitigate the attacks, the gist of my threat model is:

  1. breaking into a container using an unpatched, well-known exploit in a common library used by one of the services in the container to run a payload as an unprivileged user inside the containers context
  2. elevate your privileges
  3. use the elevated privileges to persist your access, i.e. by patching one of the running service binaries or injecting your payload into the entrypoint script if such a thing is used
  4. finally find a loophole in the k8s setup, i.e. mounting the hosts root fs into the container or using some exposed secret to talk to the k8s API and spawn a less restricted environment to break out of

For more details, I can recommend this book - I got a chance to train with one of the authors and used the book to go into further details and secure a testing k8s env at work.

Point 1 we can help mitigate by installing as little software as possible into our image and updating it diligently. Which we have partially automated using dependabot, though tagging which triggers a image build, is still manual.

Point 2: I just verified and our image doesn't ship any setuid or setgid binary (using find / -xdev -user root \( -perm -4000 -o -perm -2000 \)). We also designed them so they would run as an unprivileged user (as is also recommended by the OWASP guide).

Point 3 can be mitigated by running the image with a read-only root and also attaching config files as such. That would only leave the data directory and temporary files locations to persist into. We do not read the pastes via PHP include, which would have allowed an attacker to inject a PHP script into a paste. The read-only root means that even if an attacker finds an exploit, they can't persist and a restart of the service will load a clean copy. We don't use an entrypoint script anymore, either, but s6 or straight nginx unit, in the newer image flavour.

Point 4 is not in the projects hands either.

It is also not mentioned on the OWASP guide:
https://cheatsheetseries.owasp.org/cheatsheets/Docker_Security_Cheat_Sheet.html

Well, that is the guide on the additional specific risks when using Linux containers. Env var injection is a still a problem in this type of environment and is documented in example 3 (not even with a setuid binary) of https://owasp.org/www-community/attacks/Command_Injection

It could also be included in the docker image, where an entrypoint script takes the envs and writes them to the config file.

See the mitigations for point 3 above on why we would not want to do that in an entrypoint script. IMHO such scripts are a malpractice and should be avoided if possible. But yes, unfortunately they are very commonly used, still, and even major images such as the mysql and mariadb ones use them heavily to auto-initialize databases from attached SQL dumps or join existing galera clusters as nodes, which requires generating configuration at runtime and service restarts.

@rugk
Copy link
Member

rugk commented Dec 15, 2023

Yeah, dunno and in any case thanks for the interesting links. (Though that OWASP example 3 BTW also talks about "elevated privilege of the application" so it assumes some kind of setid or dunno other user context or whatever. Of course an attacker would have no need to inject anything otherwise hehe.

And yeah Official Docker images and if you click-through there are some guidelines and checklists, so one should expect more quality from them, but about such deep-level attacks and threat vector minimization well… it's not everyone's friend. And sometimes the usability gain for such a feature is bigger than the theoretical risk associated.
And… to close the loop, I guess this is such a case here. IMHO I would absolutely be fine with such a feature.

If we wanted to be extra secure, we could introduce a config file option that disables env configuration. (Note for obvious reason this would then have to be the only config option not configurable via the env vars.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants