Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Supervisor cannot Handle NFS Mounts with Stale File Handles #4985

Open
AngellusMortis opened this issue Mar 27, 2024 · 7 comments
Open

Supervisor cannot Handle NFS Mounts with Stale File Handles #4985

AngellusMortis opened this issue Mar 27, 2024 · 7 comments
Labels
bug network-storage Network storage related bugs

Comments

@AngellusMortis
Copy link

AngellusMortis commented Mar 27, 2024

Describe the issue you are experiencing

Occasionally (usually at least once a day), my NFS backup mount will just disconnect, and I will get a repair saying it failed. Reloading does not work, and it says check the logs. The logs say it is a stale file handle. The only way to resolve the issue is to reboot the host for HAOS. 

I know I am using Unraid and Unraid has been shown to not have the best performance, but I have 9 machines (7 Linux and 2 Windows) connected to this server via NFS and Home Assistant is the only one that that just breaks regularly so there seems to be something wrong in how HA Supervisor is managing the NFS mount.

It is rather annoying needing to reboot HA potentially multiple times a day.

What type of installation are you running?

Home Assistant OS

Which operating system are you running on?

Home Assistant Operating System

Steps to reproduce the issue

  1. Setup NFS Mount
  2. ???
  3. It Breaks

Anything in the Supervisor logs that might be useful for us?

Supervisor logs:

24-03-27 08:15:30 INFO (MainThread) [supervisor.mounts.manager] Reloading mount: backup
24-03-27 08:15:30 ERROR (MainThread) [supervisor.mounts.mount] Reloading backup did not succeed. Check host logs for errors from mount or systemd unit mnt-data-supervisor-mounts-backup.mount for details.
24-03-27 08:15:38 INFO (MainThread) [supervisor.mounts.manager] Reloading mount: backup
24-03-27 08:15:38 ERROR (MainThread) [supervisor.mounts.mount] Reloading backup did not succeed. Check host logs for errors from mount or systemd unit mnt-data-supervisor-mounts-backup.mount for details.

Host logs:

Mar 27 12:15:30 home systemd[1]: Reloading Supervisor nfs mount: backup...
Mar 27 12:15:30 home mount[2768611]: mount.nfs: Stale file handle
Mar 27 12:15:30 home systemd[1]: mnt-data-supervisor-mounts-backup.mount: Mount process exited, code=exited, status=1/FAILURE
Mar 27 12:15:30 home systemd[1]: Reload failed for Supervisor nfs mount: backup.
Mar 27 12:15:37 home systemd[1]: run-docker-runtime\x2drunc-moby-ff86bbb670854cdfd1f1814e02525b7fd2c179dc9087c182208732cd9b618704-runc.0pBNcO.mount: Deactivated successfully.
Mar 27 12:15:38 home systemd[1]: Reloading Supervisor nfs mount: backup...
Mar 27 12:15:38 home mount[2769226]: mount.nfs: Stale file handle
Mar 27 12:15:38 home systemd[1]: mnt-data-supervisor-mounts-backup.mount: Mount process exited, code=exited, status=1/FAILURE
Mar 27 12:15:38 home systemd[1]: Reload failed for Supervisor nfs mount: backup.

System Health information

System Information

version core-2024.3.2
installation_type Home Assistant OS
dev false
hassio true
docker true
user root
virtualenv false
python_version 3.12.2
os_name Linux
os_version 6.6.20-haos
arch x86_64
timezone America/New_York
config_dir /config
Home Assistant Community Store
GitHub API ok
GitHub Content ok
GitHub Web ok
GitHub API Calls Remaining 4885
Installed Version 1.34.0
Stage running
Available Repositories 1406
Downloaded Repositories 21
Home Assistant Cloud
logged_in true
subscription_expiration December 31, 2017 at 7:00 PM
relayer_connected false
relayer_region null
remote_enabled false
remote_connected false
alexa_enabled false
google_enabled false
remote_server null
certificate_status null
instance_id 6d1891b515664ee79d1010b60c891d36
can_reach_cert_server ok
can_reach_cloud_auth ok
can_reach_cloud ok
Home Assistant Supervisor
host_os Home Assistant OS 12.1
update_channel stable
supervisor_version supervisor-2024.03.0
agent_version 1.6.0
docker_version 24.0.7
disk_total 109.3 GB
disk_used 67.6 GB
healthy true
supported true
board generic-x86-64
supervisor_api ok
version_api ok
installed_addons Studio Code Server (5.15.0), Mosquitto broker (6.4.0), Advanced SSH & Web Terminal (17.2.0), Promtail (2.2.0), Node-RED (17.0.10), Z-Wave JS UI (3.4.1), Github Actions Runner (3), Zigbee2MQTT (1.36.0-1), SQLite Web (4.1.2), Whisper (2.0.0), Piper (1.5.0), Silicon Labs Multiprotocol (2.4.4), Matter Server (5.4.1), openWakeWord (1.10.0), ESPHome (2024.3.1)
Dashboards
dashboards 3
resources 17
views 35
mode yaml
Recorder
oldest_recorder_run March 17, 2024 at 7:48 PM
current_recorder_run March 27, 2024 at 8:23 AM
estimated_db_size 6373.67 MiB
database_engine sqlite
database_version 3.44.2

Supervisor diagnostics

config_entry-hassio-56582080815fc62cb9d95187042d8c10.json

Additional information

No response

@agners
Copy link
Member

agners commented Apr 8, 2024

I know I am using Unraid and Unraid has been shown to not have the best performance, but I have 9 machines (7 Linux and 2 Windows) connected to this server via NFS and Home Assistant is the only one that that just breaks regularly so there seems to be something wrong in how HA Supervisor is managing the NFS mount.

Are those other Linux systems also constantly connected?

Occasionally (usually at least once a day), my NFS backup mount will just disconnect, and I will get a repair saying it failed.

I guess this is kinda the root of the problem: The question is why does it disconnect?

Home Assistant is doing a mount manager reload every 15 minutes, which leads to a check. My guess is that this check fails every now and then. And from that fail onwards, the (re-)mount is not possible.

The StackExchange thread mount.nfs: Stale file handle error - cannot umount has some information why remounting fails in this case.

I guess we can't really do something client side. Is the server constantly on?

@agners agners added the network-storage Network storage related bugs label Apr 8, 2024
@AngellusMortis

This comment was marked as abuse.

@AngellusMortis

This comment was marked as abuse.

@agners
Copy link
Member

agners commented Apr 9, 2024

For Home Assistant network storage, the Supervisor creates a systemd mount unit on Home Assistant OS, which in turn calls the operating system mount commands (in your case the unit name is mnt-data-supervisor-mounts-backup.mount). In the end, this is not much different from a fstab entry.

The host logs you shared are from the point when you try to reload the mount using the repair, correct?

Is there maybe some log entry before that, when it potentially breaks? Can you also check the Supervisor logs, if they have something when the potential breakage happens?

I suspect that the 15 minutes mount reload causes havoc somehow, in your case.

@AngellusMortis

This comment was marked as abuse.

Copy link

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates.
Please make sure to update to the latest version and check if that solves the issue. Let us know if that works for you by adding a comment 👍
This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label May 13, 2024
@AngellusMortis

This comment was marked as abuse.

@github-actions github-actions bot removed the stale label May 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug network-storage Network storage related bugs
Projects
None yet
Development

No branches or pull requests

2 participants