Fix FusionInventory Agent DNS error handling on Windows #1033

Open
cmeh opened this issue Nov 21, 2022 · 7 comments

Comments

@cmeh

cmeh commented Nov 21, 2022

-- For the French version, please scroll down --

First of all thank you very much for the FusionInventory Agent - it's a really great tool and your work is highly appreciated.

Bug report: We would like to report a problem with DNS handling in the current version 2.6 (Windows). Some of our Windows 10 Enterprise clients do not respond when an inventory is forced via the FusionInventory for GLPI button. The reason appears to be an error in resolving the server's FQDN (anonymized here):

[Tue Oct 25 12:55:24 2022][info] FusionInventory Agent service starting
[Tue Oct 25 12:55:24 2022][error] unable to get address for 'inventory.sub.domain.de': Der angegebene Host ist unbekannt. 
[Tue Oct 25 12:55:24 2022][error] unable to get address for 'inventory.sub.domain.de': Der angegebene Host ist unbekannt.
[Tue Oct 25 12:55:24 2022][debug] Trusted target ip: 
[Tue Oct 25 12:55:24 2022][debug] Trusted client ip: 127.0.0.1/32, ::ffff:7f00:1/128, ::1/128

("Der angegebene Host ist unbekannt." is German and means 'The host is unknown').

The apparent cause is that, on the affected clients, the FusionInventory Agent Windows service starts shortly before DNS resolution becomes available.

As far as I understand, the trusted IP addresses are filled in at line 88 of lib/FusionInventory/Agent/HTTP/Server.pm (setTrustedAddresses). This happens only twice: once on launch (line 41) and once on relaunch (line 558). compile and resolve from lib/FusionInventory/Agent/Tools/Network.pm are used for that.

_isTrusted (starting at line 417) then only checks against the static (!) values. In other words: If - for whatever reason - DNS resolution is not working right on the agent's service launch, the client will never contact the server again, until the service is relaunched. Furthermore, the problem is probably very relevant for notebooks in a home office scenario that start to query internal DNS servers over a VPN tunnel which has been established long after the FusionInventory Agent's service start.

Request: Would it be possible to call setTrustedAddresses once again within _isTrusted? You could even limit that to the 'The host is unknown' scenario above. The loss of performance should be negligible.
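
To make the request concrete, here is a minimal sketch of the idea, written in Python rather than the agent's Perl. It is not the agent's code: `TrustChecker`, `resolve_host` and the injectable `resolver` parameter are hypothetical names used only for illustration. The lookup is repeated inside the trust check only when the launch-time resolution produced no addresses.

```python
import socket
from typing import Callable, Iterable, Set


def resolve_host(name: str) -> Set[str]:
    """Return the IP addresses for name, or an empty set if resolution fails."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return set()
    return {info[4][0] for info in infos}


class TrustChecker:
    """Hypothetical stand-in for the agent's trusted-address check."""

    def __init__(self, server_name: str,
                 resolver: Callable[[str], Iterable[str]] = resolve_host):
        self.server_name = server_name
        self.resolver = resolver
        # Resolved once at launch, as the agent does today.
        self.trusted = set(resolver(server_name))

    def is_trusted(self, client_ip: str) -> bool:
        # Retry only in the "host unknown" case: the earlier lookup
        # produced no addresses, so there is nothing to compare against.
        if not self.trusted:
            self.trusted = set(self.resolver(self.server_name))
        return client_ip in self.trusted
```

While resolution keeps failing, this sketch repeats the lookup on every check. Once a lookup succeeds, the result stays fixed until the next relaunch.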

@cmeh
Author

cmeh commented Nov 21, 2022

-- French version (please scroll up for the English version) --

First of all, many thanks for the FusionInventory Agent. It is a great tool and your work is truly appreciated!

Bug report: We would like to report a problem with DNS handling in the current version 2.6 (Windows). We have noticed that some of our Windows 10 Enterprise clients do not respond to requests to create an inventory via the "Forcer l'inventaire" (force inventory) button. The reason appears to be an error in resolving the server's FQDN (anonymized below):

[Tue Oct 25 12:55:24 2022][info] FusionInventory Agent service starting
[Tue Oct 25 12:55:24 2022][error] unable to get address for 'inventory.sub.domain.de': Der angegebene Host ist unbekannt.
[Tue Oct 25 12:55:24 2022][error] unable to get address for 'inventory.sub.domain.de': Der angegebene Host ist unbekannt.
[Tue Oct 25 12:55:24 2022][debug] Trusted target ip: 
[Tue Oct 25 12:55:24 2022][debug] Trusted client ip: 127.0.0.1/32, ::ffff:7f00:1/128, ::1/128

("Der angegebene Host ist unbekannt." is German for 'unknown host').

The error is most likely caused by the FusionInventory Agent service being started, on the affected clients, before DNS resolution is fully available.

If I understand correctly, the trusted IP addresses are set starting at line 88 of lib/FusionInventory/Agent/HTTP/Server.pm (setTrustedAddresses). This happens only twice: once when the service starts (line 41) and once when the service restarts (line 558). compile and resolve from lib/FusionInventory/Agent/Tools/Network.pm are invoked for that.

_isTrusted (starting at line 417) only checks against static values (!). In other words: if, for whatever reason, DNS resolution does not work right away when the agent's service starts, the client will never contact the server until the service is restarted. In addition, the issue seems very relevant to us for laptops in a home office that start contacting internal DNS servers through a VPN tunnel established long after the FusionInventory Agent service has started.

Request: Would it perhaps be possible to invoke setTrustedAddresses again within _isTrusted? This could even be limited to the "unknown host" scenario. The performance loss should be negligible.

g-bougard added a commit to glpi-project/glpi-agent that referenced this issue Nov 21, 2022
@g-bougard
Contributor

Hi @cmeh
you're right, there's an issue, but not only on Windows.
The problem with _isTrusted() is that it can be called many times. This can lead to a denial of service on the agent if we try to resolve the trusted host on each call.
I developed a solution for GLPI-Agent, which is a FusionInventory agent fork.
The solution can easily be back-ported by the FusionInventory maintainer if he wants. In the meantime, you can try the next GLPI-Agent nightly build yourself to validate that it works for you. You should know that GLPI-Agent is compatible with the current FusionInventory plugin.
My solution is to cache the trusted addresses with a one-minute expiration, see glpi-project/glpi-agent@edc3b50.
Thank you for reminding me of this issue, which I ran into a long time ago.
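
For readers who don't want to open the commit, the caching idea can be sketched roughly as follows. This is an illustrative Python sketch, not the code from glpi-project/glpi-agent@edc3b50: the class name `TrustedAddressCache` and the `resolver`, `ttl` and `clock` parameters are assumptions made for the example. Caching failed (empty) lookups for the same period is also an assumption of the sketch.

```python
import time
from typing import Callable, Iterable, Optional, Set


class TrustedAddressCache:
    """Cache resolved server addresses and refresh them after ttl seconds."""

    def __init__(self, server_name: str,
                 resolver: Callable[[str], Iterable[str]],
                 ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._server_name = server_name
        self._resolver = resolver
        self._ttl = ttl
        self._clock = clock
        self._addresses: Set[str] = set()
        self._expires_at: Optional[float] = None  # None forces the first lookup

    def addresses(self) -> Set[str]:
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Refresh at most once per ttl, whether or not the lookup
            # succeeds, so a failing DNS cannot slow down every request.
            self._addresses = set(self._resolver(self._server_name))
            self._expires_at = now + self._ttl
        return self._addresses

    def is_trusted(self, client_ip: str) -> bool:
        return client_ip in self.addresses()
```

With this shape the number of DNS lookups is bounded by one per expiration period, no matter how often the trust check runs.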

@cmeh
Author

cmeh commented Nov 22, 2022

Hi @g-bougard ,

thank you very much for your fast reaction and for adding the DNS cache to GLPI-Agent.

To be honest, I haven't tested the GLPI-Agent yet (we've currently deployed the FusionInventory Agent on some 250+ Windows 10 Enterprise clients for deploy and inventory tasks and I'll have to figure out first to what degree the GLPI-Agent is compatible with our FusionInventory Agent / FusionInventory for GLPI configuration after the fork).

Just as a thought: From the code it seems to me that both agents still resolve their trusted server's FQDN only once on their service launch (please feel free to correct me if I'm wrong). So if the user has to change the IP address of his/her trusted inventory server e.g. as part of let's say a server migration, the clients won't notice that at all: They'll still take the trusted IP established by their DNS query at launch time, right? (assuming glpi-project/glpi-agent@edc3b50 only covers the 'FQDN couldn't be resolved at all' scenario?)

You could certainly work around that, e.g. by relaunching the agent service on the clients or by simply using fixed IPs without any DNS resolution at all. But I wonder whether the agents' current approach of resolving the trusted server name only once during runtime isn't a bit contrary to the purpose of using DNS resolution in a client. I would at least expect a client with DNS resolution to notice changes to its server's FQDN.

Thank you once again!

@ddurieux
Member

I will check the setup of the agent to add a dependency on the DNS client service.

@g-bougard
Contributor

Hi @cmeh

with my fix, the server DNS resolution is also redone on cache expiration. This really means your problem should be fixed:
I renamed setTrustedAddresses() to _handleTrustedAddressesCache(). _handleTrustedAddressesCache() is called during start-up but also on each _isTrusted() check. If the expiration has been reached at the time of the check (that is, after one minute), _handleTrustedAddressesCache() redoes the server DNS resolution, as well as the one for the httpd-trust option, and caches the result for only one minute.
I tested manually by using a server name which fails to resolve and without setting localhost in the trusted IPs: if I reach http://localhost:62354/now to ask to run tasks now, I get a "Forbidden" answer. Within that minute, I update my /etc/hosts file, mapping the server name to 127.0.0.1. I still get a failure if I send the request before the first minute after the service start has passed. After one minute, I'm authorized and the tasks run.
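
The manual check above can be scripted roughly like this. It is a hedged Python sketch: the helper names `request_status` and `wait_until_authorized` are made up for the example, and it assumes the "Forbidden" answer is an HTTP 403 status. Note that every successful request to /now asks the agent to run its tasks.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def request_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Assumption: an untrusted caller is answered with 403 Forbidden.
        return err.code


def wait_until_authorized(url: str,
                          fetch: Callable[[str], int] = request_status,
                          interval: float = 10.0,
                          attempts: int = 12,
                          sleep: Callable[[float], None] = time.sleep
                          ) -> Optional[int]:
    """Poll url and return the 1-based attempt that was no longer 403.

    Returns None if every attempt was answered with 403.
    """
    for attempt in range(1, attempts + 1):
        if fetch(url) != 403:
            return attempt
        if attempt < attempts:
            sleep(interval)
    return None
```

Calling `wait_until_authorized("http://localhost:62354/now")` after fixing the name resolution should return once the cached addresses have expired and been refreshed. Connection errors, such as the agent not listening at all, are not caught and will raise.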

For your information, GLPI-Agent remains fully compatible with current FusionInventory for GLPI plugin, and this remains a constraint for us on the current development.

We should use a cache in setTrustedAddresses() because, if DNS is failing, the resolution takes time on every request. That can become a problem in some cases: without a cache, it could become an attack vector.

@ddurieux
Member

@cmeh you can add a dependency on the DNS service after installing the FusionInventory agent with this command:

sc config FusionInventory-Agent depend=Dnscache

If you confirm it's working, I will integrate it into the installer.

@g-bougard
Contributor

Anyway, this doesn't fix the problem if the server IP changes while the service is running, or if DNS resolution fails for any reason outside the Windows system. And of course, this doesn't fix the problem on other operating systems.
@ddurieux you should also backport my fix.
