DNS operation timeout with DNSroboCert 3.25.0 #1101

Open
Vertganti opened this issue Nov 29, 2023 · 4 comments

Comments

@Vertganti

A fresh setup of DNSroboCert 3.25.0 on a new host fails with a DNS operation timeout error. The full log follows:

Log
dnsrobocert_1  | 2023-11-29 09:24:00 a7a1181186b5 dnsrobocert.core.main[1] INFO Creating missing certificates if needed (~1min for each)
dnsrobocert_1  | 2023-11-29 09:24:01 a7a1181186b5 dnsrobocert.core.certbot[1] INFO Handling the certificate for domain(s): sub.domain.example
dnsrobocert_1  | ----------
dnsrobocert_1  | 2023-11-29 09:24:01 a7a1181186b5 dnsrobocert.core.utils[1] INFO Launching command: /usr/local/bin/python3 -m dnsrobocert.core.certbot certonly -n --user-agent-comment DNSroboCert/3.25.0 --preferred-chain "ISRG Root X1" --config-dir /etc/letsencrypt --work-dir /etc/letsencrypt/workdir --logs-dir /etc/letsencrypt/logs --manual --preferred-challenges=dns --manual-auth-hook "/usr/local/bin/python3 -m dnsrobocert.core.hooks -t auth -c \"/tmp/tmptklzdm7l/dnsrobocert-runtime.yml\" -l \"sub.domain.example\"" --manual-cleanup-hook "/usr/local/bin/python3 -m dnsrobocert.core.hooks -t cleanup -c \"/tmp/tmptklzdm7l/dnsrobocert-runtime.yml\" -l \"sub.domain.example\"" --expand --deploy-hook "/usr/local/bin/python3 -m dnsrobocert.core.hooks -t deploy -c \"/tmp/tmptklzdm7l/dnsrobocert-runtime.yml\" -l \"sub.domain.example\"" --server https://acme-v02.api.letsencrypt.org/directory --cert-name sub.domain.example --force-renew --key-type rsa -d sub.domain.example
dnsrobocert_1  | Saving debug log to /etc/letsencrypt/logs/letsencrypt.log
dnsrobocert_1  | Requesting a certificate for sub.domain.example
dnsrobocert_1  | Hook '--manual-auth-hook' for sub.domain.example reported error code 1
dnsrobocert_1  | Hook '--manual-auth-hook' for sub.domain.example ran with output:
dnsrobocert_1  |  Executing auth hook for domain sub.domain.example, lineage sub.domain.example.
dnsrobocert_1  | Hook '--manual-auth-hook' for sub.domain.example ran with error output:
dnsrobocert_1  |  Error while executing the auth hook:
dnsrobocert_1  |  The resolution lifetime expired after 5.402 seconds: Server Do53:127.0.0.11@53 answered The DNS operation timed out.; Server Do53:127.0.0.11@53 answered The DNS operation timed out.; Server Do53:127.0.0.11@53 answered The DNS operation timed out.
dnsrobocert_1  |  Traceback (most recent call last):
dnsrobocert_1  |    File "/usr/local/lib/python3.11/site-packages/dnsrobocert/core/hooks.py", line 40, in main
dnsrobocert_1  |      globals()[parsed_args.type](dnsrobocert_config, parsed_args.lineage)
dnsrobocert_1  |    File "/usr/local/lib/python3.11/site-packages/dnsrobocert/core/hooks.py", line 61, in auth
dnsrobocert_1  |      txt_challenge(certificate, profile, token, domain, action="create")
dnsrobocert_1  |    File "/usr/local/lib/python3.11/site-packages/dnsrobocert/core/challenge.py", line 52, in txt_challenge
dnsrobocert_1  |      with Client(ConfigResolver().with_dict(config_dict)) as operations:
dnsrobocert_1  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
dnsrobocert_1  |    File "/usr/local/lib/python3.11/site-packages/lexicon/client.py", line 106, in __init__
dnsrobocert_1  |      zone_name = dns.resolver.zone_for_name(domain)
dnsrobocert_1  |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
dnsrobocert_1  |    File "/usr/local/lib/python3.11/site-packages/dns/resolver.py", line 1706, in zone_for_name
dnsrobocert_1  |      answer = resolver.resolve(
dnsrobocert_1  |               ^^^^^^^^^^^^^^^^^
dnsrobocert_1  |    File "/usr/local/lib/python3.11/site-packages/dns/resolver.py", line 1321, in resolve
dnsrobocert_1  |      timeout = self._compute_timeout(start, lifetime, resolution.errors)
dnsrobocert_1  |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
dnsrobocert_1  |    File "/usr/local/lib/python3.11/site-packages/dns/resolver.py", line 1075, in _compute_timeout
dnsrobocert_1  |      raise LifetimeTimeout(timeout=duration, errors=errors)
dnsrobocert_1  |  dns.resolver.LifetimeTimeout: The resolution lifetime expired after 5.402 seconds: Server Do53:127.0.0.11@53 answered The DNS operation timed out.; Server Do53:127.0.0.11@53 answered The DNS operation timed out.; Server Do53:127.0.0.11@53 answered The DNS operation timed out.
dnsrobocert_1  |
dnsrobocert_1  | Certbot failed to authenticate some domains (authenticator: manual). The Certificate Authority reported these problems:
dnsrobocert_1  |   Domain: sub.domain.example
dnsrobocert_1  |   Type:   dns
dnsrobocert_1  |   Detail: DNS problem: NXDOMAIN looking up TXT for _acme-challenge.sub.domain.example - check that a DNS record exists for this domain
dnsrobocert_1  |
dnsrobocert_1  | Hint: The Certificate Authority failed to verify the DNS TXT records created by the --manual-auth-hook. Ensure that this hook is functioning correctly and that it waits a sufficient duration of time for DNS propagation. Refer to "certbot --help manual" and the Certbot User Guide.
dnsrobocert_1  |
dnsrobocert_1  | Hook '--manual-cleanup-hook' for sub.domain.example reported error code 1
dnsrobocert_1  | Hook '--manual-cleanup-hook' for sub.domain.example ran with output:
dnsrobocert_1  | Hook '--manual-cleanup-hook' for sub.domain.example ran with error output:
dnsrobocert_1  |  Error while executing the cleanup hook:
dnsrobocert_1  |  The resolution lifetime expired after 5.402 seconds: Server Do53:127.0.0.11@53 answered The DNS operation timed out.; Server Do53:127.0.0.11@53 answered The DNS operation timed out.; Server Do53:127.0.0.11@53 answered The DNS operation timed out.
dnsrobocert_1  |  Traceback (most recent call last):
dnsrobocert_1  |    File "/usr/local/lib/python3.11/site-packages/dnsrobocert/core/hooks.py", line 40, in main
dnsrobocert_1  |      globals()[parsed_args.type](dnsrobocert_config, parsed_args.lineage)
dnsrobocert_1  |    File "/usr/local/lib/python3.11/site-packages/dnsrobocert/core/hooks.py", line 123, in cleanup
dnsrobocert_1  |      txt_challenge(certificate, profile, token, domain, action="delete")
dnsrobocert_1  |    File "/usr/local/lib/python3.11/site-packages/dnsrobocert/core/challenge.py", line 52, in txt_challenge
dnsrobocert_1  |      with Client(ConfigResolver().with_dict(config_dict)) as operations:
dnsrobocert_1  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
dnsrobocert_1  |    File "/usr/local/lib/python3.11/site-packages/lexicon/client.py", line 106, in __init__
dnsrobocert_1  |      zone_name = dns.resolver.zone_for_name(domain)
dnsrobocert_1  |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
dnsrobocert_1  |    File "/usr/local/lib/python3.11/site-packages/dns/resolver.py", line 1706, in zone_for_name
dnsrobocert_1  |      answer = resolver.resolve(
dnsrobocert_1  |               ^^^^^^^^^^^^^^^^^
dnsrobocert_1  |    File "/usr/local/lib/python3.11/site-packages/dns/resolver.py", line 1321, in resolve
dnsrobocert_1  |      timeout = self._compute_timeout(start, lifetime, resolution.errors)
dnsrobocert_1  |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
dnsrobocert_1  |    File "/usr/local/lib/python3.11/site-packages/dns/resolver.py", line 1075, in _compute_timeout
dnsrobocert_1  |      raise LifetimeTimeout(timeout=duration, errors=errors)
dnsrobocert_1  |  dns.resolver.LifetimeTimeout: The resolution lifetime expired after 5.402 seconds: Server Do53:127.0.0.11@53 answered The DNS operation timed out.; Server Do53:127.0.0.11@53 answered The DNS operation timed out.; Server Do53:127.0.0.11@53 answered The DNS operation timed out.
dnsrobocert_1  |  Executing cleanup hook for domain sub.domain.example, lineage sub.domain.example.
dnsrobocert_1  | Some challenges have failed.
dnsrobocert_1  | Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /etc/letsencrypt/logs/letsencrypt.log or re-run Certbot with -v for more details.
dnsrobocert_1  | ----------
dnsrobocert_1  | 2023-11-29 09:24:16 a7a1181186b5 dnsrobocert.core.certbot[1] ERROR An error occurred while processing certificate config {'domains': ['sub.domain.example'], 'force_renew': True, 'profile': 'some_profile_name'}:
dnsrobocert_1  | Command '['/usr/local/bin/python3', '-m', 'dnsrobocert.core.certbot', 'certonly', '-n', '--user-agent-comment', 'DNSroboCert/3.25.0', '--preferred-chain', 'ISRG Root X1', '--config-dir', '/etc/letsencrypt', '--work-dir', '/etc/letsencrypt/workdir', '--logs-dir', '/etc/letsencrypt/logs', '--manual', '--preferred-challenges=dns', '--manual-auth-hook', '/usr/local/bin/python3 -m dnsrobocert.core.hooks -t auth -c "/tmp/tmptklzdm7l/dnsrobocert-runtime.yml" -l "sub.domain.example"', '--manual-cleanup-hook', '/usr/local/bin/python3 -m dnsrobocert.core.hooks -t cleanup -c "/tmp/tmptklzdm7l/dnsrobocert-runtime.yml" -l "sub.domain.example"', '--expand', '--deploy-hook', '/usr/local/bin/python3 -m dnsrobocert.core.hooks -t deploy -c "/tmp/tmptklzdm7l/dnsrobocert-runtime.yml" -l "sub.domain.example"', '--server', 'https://acme-v02.api.letsencrypt.org/directory', '--cert-name', 'sub.domain.example', '--force-renew', '--key-type', 'rsa', '-d', 'sub.domain.example']' returned non-zero exit status 1.
dnsrobocert_1  | 2023-11-29 09:24:16 a7a1181186b5 dnsrobocert.core.certbot[1] INFO Revoke and delete certificates if needed

Running dig against dns.hetzner.com (we use Hetzner DNS) and acme-v02.api.letsencrypt.org returned correct results within a few milliseconds, both on the host and in the container. We restarted the Docker service, which also restarted the container, but the issue persisted. Since all our other hosts work perfectly with DNSroboCert 3.24.2, we downgraded to that version, which fixed the issue.
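The traceback fails inside dns.resolver.zone_for_name, which looks up SOA records, whereas a default dig run asks for address records. Whether SOA queries behave differently on this host is only a hypothesis. A minimal standard-library sketch for sending one SOA query to the resolver named in the log (127.0.0.11) could look like this. The function names `build_query`, `parse_header`, and `probe` are made up for illustration and are not part of DNSroboCert, Lexicon, or dnspython.

```python
import socket
import struct

SOA = 6  # DNS record type number for SOA
IN = 1   # DNS class number for Internet


def build_query(name: str, qtype: int = SOA, query_id: int = 0x1234) -> bytes:
    """Encode a single-question DNS query with the 'recursion desired' flag set."""
    # Header: id, flags (0x0100 = RD), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = [label for label in name.rstrip(".").split(".") if label]
    # Each label is length-prefixed; a zero byte terminates the name.
    qname = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    return header + qname + b"\x00" + struct.pack("!HH", qtype, IN)


def parse_header(response: bytes) -> dict:
    """Extract the response code, answer count, and truncation flag from a reply."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return {
        "rcode": flags & 0x000F,            # 0 = NOERROR, 3 = NXDOMAIN
        "answers": ancount,
        "truncated": bool(flags & 0x0200),  # TC bit: reply did not fit in UDP
    }


def probe(name: str, server: str = "127.0.0.11", timeout: float = 5.0) -> dict:
    """Send one SOA query over UDP; raises socket.timeout if nothing comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        data, _addr = sock.recvfrom(4096)
    return parse_header(data)
```

Calling `probe("sub.domain.example")` and `probe("domain.example")` inside the container would show whether SOA queries to the embedded resolver are answered at all. A `socket.timeout` here would point at the resolver or the network path rather than at dnspython.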

The issue seems to come from the DNS zone name resolution update. I looked through the sources a bit, and I assume that adding the "resolve_zone_name": True option to the config_dict passed to ConfigResolver().with_dict causes the error, as it leads to the later call to dns.resolver.zone_for_name. However, I can't find any hint as to why the dnspython call to dns.resolver.zone_for_name would time out when normal DNS queries work perfectly.
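For context, dns.resolver.zone_for_name does not issue a single lookup. It asks for the SOA record of the full name and, if the answer's owner name does not match, strips the leftmost label and tries again until it reaches the root. The sketch below is a simplified model of that walk, not dnspython's code. `candidate_zones`, `find_zone`, and the `soa_owner` callback are hypothetical names, and the callback stands in for a real SOA query so the example runs without network access.

```python
from typing import Callable, List, Optional


def candidate_zones(name: str) -> List[str]:
    """List the names tried in order: the full name, each parent, then the root."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return [".".join(labels[i:]) + "." for i in range(len(labels))] + ["."]


def find_zone(name: str, soa_owner: Callable[[str], Optional[str]]) -> str:
    """Return the first candidate whose SOA answer is owned by that same name.

    soa_owner(candidate) is a stand-in for an SOA lookup: it returns the owner
    name of the SOA answer, or None when there is no answer for the candidate.
    """
    for candidate in candidate_zones(name):
        if soa_owner(candidate) == candidate:
            return candidate
    raise LookupError(f"no SOA record found for {name!r} or any parent")


# Usage with a fake lookup table in place of real DNS traffic.
fake_soa = {"domain.example.": "domain.example."}
print(candidate_zones("sub.domain.example"))
print(find_zone("sub.domain.example", fake_soa.get))
```

In this model, resolving the zone for sub.domain.example takes two SOA lookups, and each one has to succeed or fail cleanly. A resolver that silently drops one of them would stall the whole walk, even if ordinary lookups for other names return quickly.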

@adferrand
Owner

Hello @Vertganti!

Indeed, I made a change in Lexicon to resolve the actual DNS zone name more intelligently, using requests to DNS servers. Your analysis is correct: the timeout occurs when dnspython makes the request. Sadly, I do not have a clue either about why this would cause a problem here...

While investigating, I think I should in any case add a way to disable the DNS zone name resolution entirely when it is not needed and causes problems. I still think resolution should be the default, but opting out should be possible.
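A rough sketch of what such an opt-out could look like follows. The key name `resolve_zone_name` comes from the analysis above, and its default of True mirrors the preference stated here. Everything else is an assumption for illustration: `determine_zone` and `naive_zone` are hypothetical helpers, not Lexicon's API, and the last-two-labels fallback is deliberately simplistic.

```python
from typing import Callable, Mapping, Optional


def naive_zone(domain: str) -> str:
    """Hypothetical fallback: keep the last two labels of the domain.

    This is wrong for multi-label public suffixes such as co.uk and is only
    meant to show a code path that needs no DNS traffic.
    """
    labels = [label for label in domain.rstrip(".").split(".") if label]
    return ".".join(labels[-2:])


def determine_zone(
    domain: str,
    config: Mapping[str, object],
    resolve: Optional[Callable[[str], str]] = None,
) -> str:
    """Use DNS-based resolution unless the config opts out of it."""
    # Resolution stays the default; only an explicit False disables it.
    if config.get("resolve_zone_name", True) and resolve is not None:
        return resolve(domain)
    return naive_zone(domain)


def failing_resolver(domain: str) -> str:
    """Stand-in for a DNS-based resolver that times out, as in this issue."""
    raise TimeoutError("The DNS operation timed out.")


# With the opt-out set, the failing resolver is never called.
print(determine_zone("sub.domain.example", {"resolve_zone_name": False}, failing_resolver))
```

With the option left at its default, the same call would raise the timeout, which matches the behavior reported in this issue.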

I will try to work on the issue in the next few days. Sorry for the inconvenience.

@Vertganti
Author

Thank you, a config option to opt-out sounds like a good solution.

@DotOnedotNL

FWIW, I see the same timeout with TransIP after upgrading to 3.25. Downgrading to 3.24 resolved the issue.

@jhomer-hscl

I have what I think is the same issue with OVH. I'm running 3.24.1, which works fine; if I move to anything newer, it all goes wrong.

Happy to test any proposed fixes, changes, or future versions.
