Healthcheck sometimes fails, on a retry it passes #439

Closed
TheReptile opened this issue Jul 19, 2022 · 5 comments

@TheReptile

  • Passbolt Version: 3.6.0.
  • Platform and Target:
    -- Operating system: Ubuntu 20.04.04
    -- PHP: 7.4
    -- Web server: Nginx 1.18.0
    -- Database server: MariaDB 10.3.34

What you did

I created a cron job that writes the health check output to a file for monitoring purposes.
Basically this command: ./bin/cake passbolt healthcheck > /data/flusso/passbolt/output/passbolt_healthcheck.txt

What happened

Every now and then, there are errors in the output of the health-check. The errors only occur temporarily and when I retry, the errors are gone.
These are the 2 errors shown:

 [FAIL] The private key cannot be used to decrypt and verify a message
 [FAIL] The public key cannot be used to verify a signature.

Our Passbolt installation is working fine, so I assume the health-check is sometimes wrong.

What you expected to happen

I would expect the health check to give consistent results.

@stripthis
Member

Hi @TheReptile, these checks rely on functionality provided by php-gnupg. This could mean you have an issue with GnuPG on your system. It could come from either a clock issue (can you check the server time?) or an entropy issue (in a virtualized environment you can use haveged or rngtools).

@TheReptile
Author

@stripthis That's strange: all the VMs we use have ntp and haveged installed.

# ps wauxxx | grep -e ntp -e haveged
root         396  0.0  0.2   8296  4772 ?        Ss   Jul15   0:16 /usr/sbin/haveged --Foreground --verbose=1 -w 1024
ntp          532  0.0  0.2  74632  4044 ?        Ssl  Jul15   0:37 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 110:115

Also, this problem almost seems to be a race condition: once it fails, if I retry immediately, the test passes.
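
To put a number on how often the check flips, a small loop can run it several times and count the runs that print a [FAIL] line. This is only a sketch: the function name `count_failing_runs` is hypothetical, and it assumes a run counts as failing whenever its output contains the literal text `[FAIL]`, as in the output quoted in this issue.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and count the runs whose
# output contains a "[FAIL]" line.
# Usage: count_failing_runs N COMMAND [ARGS...]
count_failing_runs() {
    runs=$1
    shift
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Capture stdout and stderr; ignore the command's exit status.
        out=$("$@" 2>&1 || true)
        case $out in
            *'[FAIL]'*) failed=$((failed + 1)) ;;
        esac
        i=$((i + 1))
    done
    printf '%s/%s runs reported a failure\n' "$failed" "$runs"
}

# Example (run from the Passbolt directory):
# count_failing_runs 20 ./bin/cake passbolt healthcheck
```

A failure rate that stays roughly constant across back-to-back runs would point away from a one-off external cause such as a clock jump.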

@stripthis
Member

Can you check the entropy pool size when it fails? Using /proc/sys/kernel/random/entropy_avail I think.

I'm not sure what the issue could be, but I would be very grateful if you could help us narrow it down. Can you check whether there is any additional information on the GnuPG side (https://www.gnupg.org/documentation/manuals/gpgme/Debugging.html)? Do you have any particular setup filesystem-wise? Something that would prevent GnuPG from reading or writing on the file system, like concurrent access or latency issues (network disk?).

Thanks for your help.
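
The entropy check suggested above could be wrapped so that the pool size is recorded only when a run fails. The following is a minimal sketch, not part of Passbolt: `check_and_report` and the `ENTROPY_FILE` variable are hypothetical names, and it assumes failures show up as lines containing `[FAIL]`.

```shell
#!/bin/sh
# Kernel entropy estimate; overridable so the helper can be tried elsewhere.
ENTROPY_FILE="${ENTROPY_FILE:-/proc/sys/kernel/random/entropy_avail}"

# Hypothetical helper: run a check command and, if its output contains
# [FAIL] lines, print a timestamp, those lines, and the entropy estimate.
# Returns 1 when a failure was reported, 0 otherwise.
check_and_report() {
    output=$("$@" 2>&1 || true)
    failures=$(printf '%s\n' "$output" | grep -F '[FAIL]' || true)
    if [ -n "$failures" ]; then
        date +'%Y%m%d %H:%M:%S'
        printf '%s\n' "$failures"
        printf 'Entropy: %s\n' "$(cat "$ENTROPY_FILE" 2>/dev/null || echo unknown)"
        return 1
    fi
    return 0
}

# Example (run from the Passbolt directory):
# check_and_report ./bin/cake passbolt healthcheck
```

Note that the entropy value is read just after the check finishes, so it is only an approximation of what was available while the check was running.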

@TheReptile
Author

I managed to quickly reproduce this:

# echo `date +'%Y%m%d %H:%M:%S'`;/data/scripts/passbolt/passbolt_healthcheck.sh; grep FAIL passbolt_healthcheck.txt; echo -n "Entropy: "; cat  /proc/sys/kernel/random/entropy_avail 
20220719 15:29:39
 [FAIL] The private key cannot be used to decrypt and verify a message
 [FAIL] The public key cannot be used to verify a signature.
 [FAIL] 2 error(s) found. Hang in there!
Entropy: 2711
# echo `date +'%Y%m%d %H:%M:%S'`;/data/scripts/passbolt/passbolt_healthcheck.sh; grep FAIL passbolt_healthcheck.txt; echo -n "Entropy: "; cat  /proc/sys/kernel/random/entropy_avail 
20220719 15:29:42
Entropy: 2722

# echo `date +'%Y%m%d %H:%M:%S'`;/data/scripts/passbolt/passbolt_healthcheck.sh; grep FAIL passbolt_healthcheck.txt; echo -n "Entropy: ";cat /proc/sys/kernel/random/entropy_avail 
20220719 15:35:22
 [FAIL] The public key cannot be used to verify a signature.
 [FAIL] 1 error(s) found. Hang in there!
Entropy: 2916

This is a pretty default 20.04 VPS from Hetzner. It's using local storage.

@stripthis
Member

Can you try to set

GPGME_DEBUG=9:/home/user/mygpgme.log

And see if any information shows when the operation is failing?
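
One way to apply this only to the health check, rather than to the whole session, is to set the variable for that single command. This is a sketch under assumptions: the helper name `run_with_gpgme_debug` and the example log path are placeholders, and the log file has to be writable by whichever user runs the command.

```shell
#!/bin/sh
# Hypothetical helper: run a command with GPGME debug logging enabled
# for that command only.
# Usage: run_with_gpgme_debug LOGFILE COMMAND [ARGS...]
run_with_gpgme_debug() {
    logfile=$1
    shift
    # Level 9 and the "level:path" form are taken from the suggestion above.
    env GPGME_DEBUG="9:$logfile" "$@"
}

# Example (log path is a placeholder; run from the Passbolt directory):
# run_with_gpgme_debug /tmp/gpgme.log ./bin/cake passbolt healthcheck
```

Because the variable is scoped to one invocation, the cron job and other GnuPG users on the machine are not affected.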
