tests: Revise DH param tests #1848

Merged 4 commits into nginx-proxy:main on Dec 31, 2021

Conversation

@polarathene (Contributor) commented Dec 21, 2021

A little later than I expected, but this is a follow-up to my previous DH params PR that adds the remaining test (site-specific DH params). I also made some other minor changes and refactoring (separate commit diffs).


Site-specific DH params

While developing this test, the other tests started failing because they were being served the site-specific DH params, which confused me. I inspected the generated nginx conf and the DH params were correctly scoped to their server block, so the other tests shouldn't have been affected... but apparently nginx will use that server block to respond to requests unrelated to its FQDN / VIRTUAL_HOST when no other matching server block is available.

I then came across the old PR #614 (which advocated not mimicking the existing HTTP approach in the template, and thus would break these tests anyway), but decided to resolve it with the DEFAULT_HOST feature and a second web container. I've used the existing web2.nginx-proxy.tld, as it already has a certificate file for testing.

Being explicit with DEFAULT_HOST is a workaround; personally I find this upstream behaviour unexpected for a reverse proxy, and to an admin not already familiar with it (as noted in issue #1555) it looks a bit like a bug.

I've used web2 for the site-specific VIRTUAL_HOST and copied ffdhe2048.pem to the same location, renamed so it triggers site-specific usage in nginx.tmpl. I would normally mount it via a volume instead, but there's already a volume mounted for the directory, so I've committed the copy. I don't think that'll impact any other test (all tests passed locally).
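
To make that concrete, here's a minimal sketch of the kind of assertion involved. This is not the actual test code: negotiated_dh_bits and test_site_specific_dhparam are hypothetical names, and connecting to localhost:443 assumes the proxy's HTTPS port is reachable there, which may differ from how test_ssl/test_dhparam.py drives the containers.

import subprocess

def negotiated_dh_bits(servername):
    # Negotiate a DHE cipher suite against the proxy and report the ephemeral key size.
    cmd = (
        f"echo '' | openssl s_client -connect localhost:443 -servername {servername} "
        "-tls1_2 -cipher 'EDH' 2>/dev/null | grep 'Server Temp Key'"
    )
    out = subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout
    # Example output: "Server Temp Key: DH, 2048 bits"
    return int(out.rsplit(",", 1)[-1].split()[0])

def test_site_specific_dhparam():
    # web2 has its own web2.nginx-proxy.tld.dhparam.pem (a copy of ffdhe2048.pem)...
    assert negotiated_dh_bits("web2.nginx-proxy.tld") == 2048
    # ...while other hosts should get the container-wide DH params. Without DEFAULT_HOST
    # (or an explicit match), nginx may fall back to the web2 server block and return 2048 here.
    assert negotiated_dh_bits("web5.nginx-proxy.tld") != 2048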

Misc changes + Verify chain of trust utility method

I may have been a little verbose with comments; we can reduce that if you like, I just prefer to make things as easy as possible for future maintainers to grok :)

I'm not sure how useful the final commit is; it documents how to verify the chain of trust for a certificate. The openssl command already hides the stderr output unless it fails, and the verification is unrelated to DH params. I did mention it in the previous PR, so it can be left there in case it's ever helpful, or feel free to remove it 👍

Commit Overview

Changes have been scoped into smaller commits with context provided in commit messages. Sharing that here if it helps with the review process (anything worthwhile is presently documented via comments in the relevant file too):

Commit Messages
  • chore: Refactor checksum comparisons

    • Use a DRY method instead.
    • The ENV test changed from 2048-bit to 3072-bit so it can't accidentally be confused with the 2048-bit params used elsewhere in a later test.
    • The custom DH file test comparison changed to match the other comparisons: equality against the expected DH param content.
    • Related comments revised; an additional comment for context added beside the test definition.
    • Minor white-space adjustments.
  • tests: Verify correct DH group size when negotiating
    Additionally allows for adding extra openssl params when needed.

  • tests: Verify site-specific DH params feature works correctly
    This addition requires using DEFAULT_HOST on the containers under test to ensure they don't accidentally use web2 as their default fallback (since no SNI / -servername is requested in the openssl queries); otherwise they would be testing against the wrong DH params response.

    They could alternatively request an FQDN explicitly as well, instead of relying on implicit fallback/default server selection behaviour.

    web2.nginx-proxy.tld.dhparam.pem is a copy of ffdhe2048.pem.

  • tests: Add utility method to verify TLS chain of trust
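
For reference, that utility is roughly of the following shape (a hedged sketch only, not the committed code; the helper name and arguments are illustrative). `openssl verify` exits non-zero and prints the reason when the certificate can't be validated against the given CA file:

import subprocess

def assert_trusted_cert(ca_file, cert_file):
    # Hypothetical sketch: openssl prints "cert.pem: OK" and exits 0 when the
    # certificate chains up to the given CA; otherwise it exits non-zero with the reason.
    result = subprocess.run(
        ["openssl", "verify", "-CAfile", ca_file, cert_file],
        capture_output=True,
        text=True,
    )
    assert result.returncode == 0, result.stdout + result.stderr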

@polarathene (Contributor, Author) commented Dec 28, 2021

It would be good to see a new release tagged for the image at some point btw; the last tagged release was Sep 1st. Not much has changed since 0.9.3 (39 commits, fewer if PRs had been squash merged), but it does contain the DH param changes that allow removing an unnecessary volume.


Regarding the tests: they work fine via native Python with pytest, but with the alternative pytest docker image, some network tests in the full suite fail, presumably DNS resolution related (related issue here). I'm not quite sure why, but I understand there is some code that messes around with the networking for tests?

The DH param tests all run fine in the container. I tried running the pytest container with host networking mode instead and it just quits without any error output. I also ran the container without a command and opened a shell into it, then ran pytest from there: same behaviour, I was thrown out of the shell.

I don't have time to debug that, but thought I'd mention it in case anyone else brings that experience up with running tests via Docker too.


Adding some extra notes to this PR that may help a future reader / contributor regarding tests or this specific PR subject:

Some shell commands / output for future maintainers' reference
git clone https://github.com/nginx-proxy/nginx-proxy
cd nginx-proxy

# Make any changes..

# Build the testing image `nginxproxy/nginx-proxy:test` with your changes included:
# (or if using pytest.sh, running `make test-debian` or `make test-alpine` will build the test image and run all tests).
make build-nginx-proxy-test-debian

# Test a specific test file. Current docs already cover this, but by calling the `pytest.sh` with args `-s test_ssl_test_dhparam.py` for example, that'll run the same logic,
# but if you need to troubleshoot, you can run this way and inspect files or mount extra volumes.
# Build the `nginx-proxy-tester` image first from: https://github.com/nginx-proxy/nginx-proxy/tree/main/test/requirements#nginx-proxy-tester

# From project root:
docker run --rm -it \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  --volume "${PWD}/test/:${PWD}/test/" \
  --workdir "${PWD}/test/" \
  nginx-proxy-tester test_ssl/test_dhparam.py


# =============================================================

# Alternative manual bring up for troubleshooting / inspection.
# Starting dummy web containers, build the `web` image first from: https://github.com/nginx-proxy/nginx-proxy/tree/main/test/requirements#web

# The nginx-proxy container needs to explicitly set this VIRTUAL_HOST as the DEFAULT_HOST to avoid non-matching hosts using the site-specific DH param file for `web2` (even though the nginx conf wraps it within the web2 server block, it'd be used as the fallback otherwise):
docker run --rm -itd --name "web5" \
  --expose 85 \
  --env "WEB_PORTS=85" \
  --env "VIRTUAL_HOST=web5.nginx-proxy.tld" \
  web

# This container VIRTUAL_HOST has its own DH param file at test/test_ssl_certs/web2.nginx-proxy.tld.dhparam.pem
docker run --rm -itd --name "web2" \
  --expose 85 \
  --env "WEB_PORTS=85" \
  --env "VIRTUAL_HOST=web2.nginx-proxy.tld" \
  web


# Start `with_custom_file` docker-compose equivalent container:
docker run --rm -itd --name "with_custom_file" \
  --volume "/var/run/docker.sock:/tmp/docker.sock:ro" \
  --volume "${PWD}/test/test_ssl/certs:/etc/nginx/certs:ro" \
  --volume "${PWD}/dhparam/ffdhe3072.pem:/etc/nginx/dhparam/dhparam.pem:ro" \
  --env "DEFAULT_HOST=web5.nginx-proxy.tld" \
  nginxproxy/nginx-proxy:test

# Enter container shell to run commands like openssl does for the test:
docker exec -it with_custom_file bash

# Test query that shows any `-servername` value will respond using the 3072-bit DH param override, unless it's the `web2.nginx-proxy.tld` container, which has its own custom DH params file:
echo '' | openssl s_client -connect localhost:443 -servername example.test -tls1_2 -cipher 'EDH' | grep 'Server Temp Key'

# Output:
# depth=0 CN = *.nginx-proxy.tld
# verify error:num=20:unable to get local issuer certificate
# verify return:1
# depth=0 CN = *.nginx-proxy.tld
# verify error:num=21:unable to verify the first certificate
# verify return:1
# depth=0 CN = *.nginx-proxy.tld
# verify return:1
# DONE
# Server Temp Key: DH, 3072 bits

# Whereas if you query for `web2.nginx-proxy.tld` specifically (instead of `web5.nginx-proxy.tld` or non-match like `example.test`), you'll get 2048-bit DH used instead.
# Use `-CAfile` arg to openssl command to avoid the verification errors for valid domains.
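# For example (illustrative only; point -CAfile at whichever CA / chain PEM the test certificates were signed with):
# echo '' | openssl s_client -connect localhost:443 -servername web2.nginx-proxy.tld -tls1_2 -cipher 'EDH' -CAfile /path/to/ca.pem | grep 'Server Temp Key'
# Server Temp Key: DH, 2048 bits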

# Check nginx generated config if necessary:
cat /etc/nginx/conf.d/default.conf | grep ssl_dh

@buchdag (Member) commented Dec 30, 2021

It would be good to see a new release tagged for the image at some point btw

@polarathene agreed, I'll look into this PR and tag a new release in the next few days


Regarding tests, the pytest docker image can't work at the moment for a very simple reason: the method in conftest.py that determines whether it's running inside a container is based on looking for a file that existed inside Docker containers at some point but was removed a while ago. It's not even worth investigating other issues with the pytest docker image until this very basic one has been solved.

@nginx-proxy deleted a comment from nduchon Dec 30, 2021
@polarathene (Contributor, Author) commented:

I'll look into this PR and tag a new release in the next few days

🎉

It's not even worth investigating other issues with the pytest docker image until this very basic one has been solved.

grep -q docker /proc/1/cgroup looks to be a sufficient way to check if we're running in a docker container, or the other answer there suggests just adding an ENV to the image to check against.

grep method to check (note, this may only work with cgroupv1, not cgroupv2 (eg: Debian 11 uses v2 by default)):

import subprocess

def is_running_in_docker():
  return subprocess.run(
    "grep -q docker /proc/1/cgroup",
    shell=True,
    text=True,
    stderr=subprocess.PIPE,
  ).returncode == 0

I_AM_RUNNING_INSIDE_A_DOCKER_CONTAINER = is_running_in_docker()
# True

or via ENV check:

# Give any images under test an ENV of `I_AM_RUNNING_INSIDE_A_DOCKER_CONTAINER=1` where this would be checked; it should reliably tell you whether you're in a container, and seems fine for test purposes:
import os

I_AM_RUNNING_INSIDE_A_DOCKER_CONTAINER = os.environ.get('I_AM_RUNNING_INSIDE_A_DOCKER_CONTAINER') == "1"
# True

I tried both and they're working in my environment, but the ENV is probably easier to reason with and perfectly acceptable in this context. You could choose a more suitable ENV name instead; I'm not sure which container it applies to, so I gave it to the nginx-proxy, web and pytest.sh images, but the networking issue is still present.

@polarathene (Contributor, Author) commented:

This might be a bad TLD choice btw:

with ipv6():
    nginxproxy.get("http://something.nginx-proxy.local")  # force use of IPv6
with ipv6(False):
    nginxproxy.get("http://something.nginx-proxy.local")  # legacy behavior

.local is reserved for mDNS, and resolvers are likely to handle it differently than intended. It's also somewhat invalid, in that most mDNS resolvers don't support multi-label mDNS hostnames, and those that do usually need to be configured to allow it since it's against spec IIRC.

I personally use .test, which is also a reserved TLD; public resolvers should ignore it, but local resolvers can be configured to respond to it. I've used CoreDNS (docker image, with a Corefile as simple config) when I needed to do some local testing like this, since I could provision .test certs via a Smallstep CA container as an ACME provider instead of Let's Encrypt, or provision cert files manually with step-cli for my .test FQDNs.

I'm not familiar with the test suite here, so I'm not sure if that's related to the problems at all. It may be easier to drop a bunch of that Python code if it's just meant to manipulate DNS, when something like CoreDNS is easier to add and grok/maintain for certain things (eg: *.nginx-proxy.tld routing).

@polarathene (Contributor, Author) commented Dec 31, 2021

Good news: I looked into conftest.py a bit more and found the issue with --network host to be due to remove_all_containers() comparing the container ID to the pytest container hostname. In host mode, the container shares the hostname of the host (and the host network, of course).

There is the unique /etc/machine-id, but I'm not sure how that could be matched to the container; the Docker Python API docs do have container image and name attributes though. I had trouble with the image field value (inexperienced with Python), but using name seems fine too, we just need to run the container with the --name option. With that sorted, it ran all tests via the container without issue 👍
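
A rough sketch of that idea with the Docker SDK for Python (not the actual conftest.py change; the default container name and the exact filtering are assumptions):

import docker

def remove_all_containers(test_container_name="nginx-proxy-tester"):
    # Skip the test-runner container by its --name instead of matching its ID against
    # the hostname, which breaks under --network host (the container reports the host's hostname).
    client = docker.from_env()
    for container in client.containers.list(all=True):
        if container.name == test_container_name:
            continue
        container.remove(force=True)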

It still has trouble running in a container without using the host network, however. (UPDATE: I've resolved that too, networks were not being connected properly due to a missing API param when listing networks; I'll create a separate PR with these fixes for you 😎)

@buchdag (Member) commented Dec 31, 2021

Regarding the default fallback with HTTPS: I agree that the behaviour should be corrected, I'll look again into the issue and PR you mentioned. Feel free to tag me in any other issue or PR on this topic that you think is of interest.

I'm not sure how useful the final commit is; it documents how to verify the chain of trust for a certificate. The openssl command already hides the stderr output unless it fails, and the verification is unrelated to DH params. I did mention it in the previous PR, so it can be left there in case it's ever helpful

At some point we're probably going to merge acme-companion into this repo, so yep it'll probably be useful in the future (and before that for migrating acme-companion tests to pytest).

I may have been a little verbose with comments; we can reduce that if you like, I just prefer to make things as easy as possible for future maintainers to grok :)

I'd rather have verbose comments than terse or no comments, that's fine 👍

Adding some extra notes to this PR that may help a future reader / contributor regarding tests or this specific PR subject:

This should probably be added as a README.md file under test/test_ssl, I'll do it.

PR looks good to me, merging. I'll answer the other topics in a separate message.

@buchdag merged commit a96135b into nginx-proxy:main on Dec 31, 2021
@buchdag (Member) commented Dec 31, 2021

the ENV is probably easier to reason with and perfectly acceptable in this context

Agreed! 👍

Same thing regarding DNS: the .local TLD is a bad idea, and the very contrived Python DNS-manipulating code should be dumped in favor of using a local DNS resolver for tests. I don't have any experience with CoreDNS, but if you're vouching for it that's enough for me.

If you manage to entirely fix the tests with the pytest Docker image I think we should move the GitHub Actions tests to this method and promote it as the default test method, rather than running pytest with a local Python environment.

Many thanks for the excellent work! 🙏

@buchdag (Member) commented Dec 31, 2021

It would be good to see a new release tagged for the image at some point btw

Release drafted and planned for tomorrow, just for the sake of having a new minor release on the first day of the year 😎

@polarathene (Contributor, Author) commented:

If you manage to entirely fix the tests with the pytest Docker image I think we should move the GitHub Actions tests to this method and promote it as the default test method, rather than running pytest with a local Python environment.

Yep, I think I finally resolved them all. At least the results now match local pytest results (minus 5 skipped for IPv6 unless running in host networking mode). Should be good to use in GitHub Actions once merged.


Same thing regarding DNS, the .local TLD is a bad idea and the very contrived Python DNS manipulating code should be dumped in favor of using a local DNS resolver for tests.

It looks like you can change the .local to .test and it works fine (at least it did for test_composev2.py after changing that and the associated .yml config).

As for the DNS resolver, I'm not sure now that it'll be much cleaner. You'd be resolving the FQDN to the nginx-proxy container IP, but for that you'd need the IP or to configure it in each docker-compose .yml test file; and the current logic provides some extra handling and additional logs in the tests that can be helpful, so it might not be worth the effort? (I don't think I'll have time to contribute more for a while.)

I don't have any experience with CoreDNS but if you're vouching for it that's enough for me.

There are a few options; I don't do anything extensive, but ended up using CoreDNS as its config file met my needs. It'd look something like this:

# coredns docker-compose service ...
# Publish both TCP and UDP DNS ports, and have other containers use this DNS gateway IP as their `dns` option:
    ports:
      - "${DNS_GATEWAY_IP}:53:53/udp"
      - "${DNS_GATEWAY_IP}:53:53"
    volumes:
      - ${PWD}/coredns/Corefile:/Corefile:ro

You can then provide the Corefile with zones, either using an RFC 1035 zone file or inline via the records plugin (a bit limited, but should be fine for basic A/AAAA records):

# Loads an external RFC 1035 zone file, a portable approach for managing DNS records:
example.test {
    file /zones/example.test {
        reload 15s
    }
    log
}

# IPv4 A record via the `records` plugin, the leading `.` should let it match any FQDN with it as a suffix:
.nginx-proxy.test {
  records {
    @ 60 IN A 192.168.1.42
  }
  log
}

# Everything else that didn't match, forward to public resolvers:
. {
    forward . 8.8.8.8 9.9.9.9
    log
}

Many thanks for the excellent work!

You're welcome :) Happy to give back where I could for a project that was quite helpful to me over the years!
