[Bug]: DNS time-to-live not respected #2859

Open
6 tasks done
barryprice opened this issue Sep 27, 2023 · 4 comments

Comments

@barryprice

Before you file a bug report

Mattermost Desktop Version

5.5.0 commit: 4f266a3

Operating System

Ubuntu Linux 22.04 LTS x64 (but seen across various series)

Mattermost Server Version

7.8.0

Steps to reproduce

We migrated a production Mattermost server instance between data centres earlier today.

During the downtime period we intentionally took both source and target instances offline (in such a way that users would receive a 503 error) to avoid skew between the two installations during the sync.

Prior to and during this period, the DNS TTL was reduced to 60s.

Once migration was complete, we restored service on the target instance but intentionally kept the source instance offline.

Connecting to the migrated service via web browser (as well as e.g. matterircd) worked fine at this point, but trying to use an already-running mattermost-desktop app just showed 503 errors, confirmed by several users.

We tried logging out and logging in again, but this didn't make any difference. Further investigation revealed that the app was still trying to connect to the intentionally-down source service.

It appears that the app does a DNS lookup on startup/login and then caches that result for far longer than expected, possibly indefinitely.

Fully stopping and relaunching the app resolved the problem.

Expected behavior

The app should have noticed that the target IP changed, and attempted to reconnect to the new target.

If not immediately, then certainly at the logout/login step.
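
To make the expectation concrete, here is a minimal sketch of a cache that honours the TTL and can be flushed at logout/login. It is not how mattermost-desktop, Electron or Chromium resolve names; `TtlDnsCache`, `Resolver` and the injected clock are hypothetical names used only for illustration. In Node, the resolver could wrap `dns.promises.resolve4(hostname, { ttl: true })`.

```typescript
// One resolved address plus the TTL (in seconds) reported for it.
interface ResolvedRecord {
  address: string;
  ttl: number;
}

// Hypothetical resolver signature (an assumption for this sketch).
type Resolver = (hostname: string) => Promise<ResolvedRecord[]>;

interface CacheEntry {
  addresses: string[];
  expiresAt: number; // epoch milliseconds
}

class TtlDnsCache {
  private entries = new Map<string, CacheEntry>();
  private resolve: Resolver;
  private now: () => number;

  constructor(resolve: Resolver, now: () => number = () => Date.now()) {
    this.resolve = resolve;
    this.now = now;
  }

  // Returns cached addresses while they are fresh, otherwise re-resolves.
  async lookup(hostname: string): Promise<string[]> {
    const cached = this.entries.get(hostname);
    if (cached && this.now() < cached.expiresAt) {
      return cached.addresses;
    }
    const records = await this.resolve(hostname);
    if (records.length === 0) {
      this.entries.delete(hostname);
      return [];
    }
    // Expire on the shortest TTL in the answer so no record outlives its TTL.
    const minTtl = Math.min(...records.map((r) => r.ttl));
    const entry: CacheEntry = {
      addresses: records.map((r) => r.address),
      expiresAt: this.now() + minTtl * 1000,
    };
    this.entries.set(hostname, entry);
    return entry.addresses;
  }

  // Drops a hostname immediately, e.g. at the logout/login step.
  invalidate(hostname: string): void {
    this.entries.delete(hostname);
  }
}
```

With a 60s TTL, a lookup made 60 seconds or more after the previous resolution triggers a fresh query, and calling `invalidate` at logout/login forces one regardless of the remaining TTL.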

Observed behavior

The stale IP from the source service was cached for many times longer than the set TTL, while the local machine's resolver was well aware of the new one.

Log Output

main.log shows nothing relevant from before the restart that fixed the issue, just repeats of this:

[2023-09-27 08:23:21.408] [info]  [App.Config] config.autostart has been configured: false
[2023-09-27 08:24:35.590] [info]  [App.Config] config.autostart has been configured: false
[2023-09-27 08:49:52.024] [info]  [App.Config] config.autostart has been configured: false
[2023-09-27 08:50:56.047] [info]  [App.Config] config.autostart has been configured: false
[2023-09-27 08:54:47.388] [info]  [App.Config] config.autostart has been configured: false
[2023-09-27 08:55:24.986] [info]  [App.Config] config.autostart has been configured: false
[2023-09-27 09:41:03.193] [info]  [App.Config] config.autostart has been configured: false
[2023-09-27 09:41:43.804] [info]  [App.Config] config.autostart has been configured: false
[2023-09-27 09:42:06.971] [info]  [App.Config] config.autostart has been configured: false
[2023-09-27 09:43:04.916] [info]  [App.Config] config.autostart has been configured: false
[2023-09-27 09:43:11.380] [info]  [App.Config] config.autostart has been configured: false
[2023-09-27 09:43:42.473] [info]  [App.Config] config.autostart has been configured: false
[2023-09-27 09:46:17.176] [info]  [App.Config] config.autostart has been configured: false
[2023-09-27 09:47:08.429] [info]  [App.Config] config.autostart has been configured: false

Additional Information

No response

@devinbinnie
Member

@barryprice This seems like it might be an issue with Electron/Chromium, since it's the one managing the DNS lookups. Is this reproducible in the browser?

@barryprice
Author

That was my assumption, but I am not at all familiar with Electron - so I'm unsure whether it's something that can be potentially tweaked via options in the mattermost-desktop build, or whether this needs to be targeted upstream to Electron itself (or some component(s) thereof).

Several users tested with various browsers while we were seeing this issue with the app (Firefox, standalone Chrome/Chromium), and all reported that the DNS TTL was respected in those cases with no issues.
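
On the question of whether this can be tweaked from the app side: Electron's `Session` exposes `clearHostResolverCache()` and `closeAllConnections()`, so one possible mitigation is to call both at logout/login or after repeated 5xx responses. The sketch below is untested against this bug and rests on the assumption that a stale host cache or pooled keep-alive connections are involved; `resetNetworkState` and `SessionLike` are hypothetical names, not part of mattermost-desktop.

```typescript
// Minimal subset of Electron's Session that this sketch relies on.
interface SessionLike {
  clearHostResolverCache(): Promise<void>;
  closeAllConnections(): Promise<void>;
}

// Hypothetical helper: drop cached DNS answers first, then close pooled
// sockets so that new requests must resolve and connect again.
// Note that closing all connections also fails any requests in flight.
async function resetNetworkState(ses: SessionLike): Promise<void> {
  await ses.clearHostResolverCache();
  await ses.closeAllConnections();
}
```

In an Electron main process, `session.defaultSession` (or whichever session or partition the app's views actually use) would satisfy `SessionLike`.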

@devinbinnie
Member

I would wager this would be something we'd want to file upstream to Electron. I can try and reproduce the issue using Electron Fiddle, but it will be tough and take some time. Let me get back to you.

@devinbinnie devinbinnie changed the title [Bug]: [Bug]: DNS time-to-live not respected Oct 3, 2023
@devinbinnie
Member

@barryprice I actually was able to reproduce this on Chrome myself on macOS with a CNAME entry with a TTL of 60s. It took a restart of the app to get it to update, not even a Hard Reload would jig it.

This seems like an issue with Chromium itself.

@devinbinnie devinbinnie added the Electron label Jan 3, 2024