
Zpool shows permanent errors but does not point to any files #16158

Open
develroo opened this issue May 3, 2024 · 17 comments
Labels
Type: Defect (Incorrect behavior, e.g. crash, hang)

Comments

@develroo

develroo commented May 3, 2024

Distribution Name Debian
Distribution Version Testing
Kernel Version 6.6.15-amd64
Architecture amd64
OpenZFS Version

zfs-2.2.3-1
zfs-kmod-2.2.3-1

So I did a regular monthly scrub and it reported permanent errors. I checked the disks' SMART status and found one whose failed-write count was climbing over the threshold. It had not failed completely yet, so I failed the device out of the pool, replaced it with a new disk, and re-ran the scrub. The errors persisted. Running zpool clear did nothing.
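
For reference, the sequence I ran was roughly the following (device names are placeholders, reconstructed from memory):

zpool status -v mediapool-z1                           # scrub results showed the permanent errors
smartctl -a /dev/sdX                                   # SMART showed one disk with write failures over the threshold
zpool offline mediapool-z1 ata-OLD_DISK                # fail the suspect device out of the pool
zpool replace mediapool-z1 ata-OLD_DISK ata-NEW_DISK   # resilver onto the new disk
zpool scrub mediapool-z1                               # re-run the scrub: errors persisted
zpool clear mediapool-z1                               # did nothing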

zpool status -v
  pool: mediapool-z1
 state: ONLINE
status: One or more devices has experienced an error resulting in data
	corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
	entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 12:37:16 with 3 errors on Thu May  2 21:45:49 2024
config:

	NAME                                          STATE     READ WRITE CKSUM
	mediapool-z1                                  ONLINE       0     0     0
	  raidz1-0                                    ONLINE       0     0     0
	    ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N5VUN75D  ONLINE       0     0     0
	    ata-ST3000VN007-2AH16M_ZGY7N50P           ONLINE       0     0     0
	    sdc                                       ONLINE       0     0     0
	    sdd                                       ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        <0x931>:<0x348c>

What is going on here, and why can I not reset the error flag if there are indeed no files failing?

Any thoughts?

@develroo develroo added the Type: Defect Incorrect behavior (e.g. crash, hang) label May 3, 2024
@neurotensin

Hi,

I think I have some more data. I suspect the errors are caused by creating and removing snapshots when there is contention. On my system I use zfs-autosnap to create snapshots and syncoid to copy them to target systems (syncoid does not create snapshots itself).

Sometimes syncoid fails because it cannot read the snapshot:

warning: cannot send 'rootvol_nvme######@zfs-auto-snap_frequent-2024-05-03-0600': Input/output error

If I remove the snapshot manually (zfs destroy), syncoid will then run correctly.

I have put a semaphore lock on zfs-autosnap and syncoid (using a file in /tmp/ as a test) in the hope of preventing contention, but clearly something else is going on here. It happens frequently, but the errors can still be cleared with a double scrub.
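
The lock itself is nothing fancy; roughly this, with paths, schedules and job names being illustrative rather than my exact setup:

#!/bin/sh
# zfs-with-lock.sh - wrapper so snapshot creation and syncoid runs never overlap.
# flock(1) blocks until /tmp/zfs-snapshot.lock is free, then runs the wrapped command.
exec flock /tmp/zfs-snapshot.lock "$@"

# crontab entries using the wrapper:
*/15 * * * *  /usr/local/bin/zfs-with-lock.sh zfs-autosnap
*/20 * * * *  /usr/local/bin/zfs-with-lock.sh syncoid tank/data backup@remote:tank/data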

Any ideas out there?

@develroo
Author

develroo commented May 9, 2024

Well, I have cleaned up snapshots, yes, but I never saw an error while doing it. As things stand now, everything seems fine. I opened every file I could just to verify it would open, but still could not trace an error.

Perhaps I did not make myself clear: this persists through multiple scrubs, both before I changed out the failing disk (it had not failed yet, but its write-failure count was over the threshold) and afterwards. I had hoped that a scrub would clear it, but it does not, and neither does zpool clear.

It would help if the feedback given by the command were a little less obtuse. Are these genuine errors, or some metadata mismatch, or something else?

@aerusso
Contributor

aerusso commented May 10, 2024

@develroo Is this an encrypted dataset?

@develroo
Author

@develroo Is this an encrypted dataset?

No. It is a RAIDZ-1 running docker and a VM. In theory, the VM could have caused some sort of locking issue, but honestly the server has been running for many years now, and I have never had this problem before.

If there is any more debug info I can give, just let me know. Would zdb help me drill down to these errors? If so, how? Because this seems a bit like voodoo at this stage.
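
From what I can gather, the two hex numbers are meant to be the objset (dataset) ID and the object ID, so I am guessing something like this would map them back to a dataset and object, though I have not verified it:

zdb -d mediapool-z1 | grep 'ID 2353'       # 0x931 = 2353: find which dataset/snapshot has this objset ID
zdb -dddd mediapool-z1/<dataset> 13452     # 0x348c = 13452: dump that object (type, and path if it still exists)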

Thanks.

@neurotensin

neurotensin commented May 10, 2024

@aerusso for mine, yes, it is encrypted. I described the mechanism above - I suspect it is the interaction between zfs-autosnapshot and syncoid. Each is doing the right thing, but even though I put a lock in place (to prevent them both running simultaneously - I have just checked and expanded it and will report back...), there is clearly some break in atomic behaviour.

These are the system specs:
Distribution Name: Kubuntu
Distribution Version: 24.04 LTS
Kernel Version: 6.8.0-31-lowlatency
Architecture: x86_64
OpenZFS Version:
zfs-2.2.2-0ubuntu9
zfs-kmod-2.2.2-0ubuntu9

@neurotensin

Looking around, I suspect https://github.com/openzfs/zfs/issues/15474 is a related issue. The core problem, I suspect, is a race condition between zfs destroy and zfs send -I: the snapshot is removed between the time the zfs send identifies its range and the time the send actually starts.

In my case there is a lock on snapshot creation (all cron jobs create/remove a lock file in /tmp/), and syncoid (which issues the zfs send) does not run until the lock is given up. The intervals are offset by 5 minutes (15/30/45/60 vs 20/40/60). I should probably make them relatively prime to push out how often they coincide, but you get the idea...
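
Schematically, the race I mean looks like this (snapshot and pool names are made up):

zfs send -I tank/ds@snapA tank/ds@snapD | ssh backup zfs recv backuppool/ds   # the send picks its snapshot range here...
zfs destroy tank/ds@snapB                                                     # ...and if a destroy lands before the send reaches snapB, the send trips over the missing snapshot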

@develroo
Author

Looking around, I suspect https://github.com/openzfs/zfs/issues/15474 is a related issue. The core problem, I suspect, is a race condition between zfs destroy and zfs send -I: the snapshot is removed between the time the zfs send identifies its range and the time the send actually starts.

In my case there is a lock on snapshot creation (all cron jobs create/remove a lock file in /tmp/), and syncoid (which issues the zfs send) does not run until the lock is given up. The intervals are offset by 5 minutes (15/30/45/60 vs 20/40/60). I should probably make them relatively prime to push out how often they coincide, but you get the idea...

Well, that is a theory, I guess, though I was not sending snapshots anywhere. I did batch-delete snapshots, which took about a minute to complete, so in theory it could have been trying to take a snapshot then. Either way, because no actual files are involved, there seems to be no way to clear the error.

Is there any way to find out what the errors actually are and if necessary clear them?

@neurotensin

@aerusso Some more debugging: I have found a relevant zfs event...

May 11 2024 09:20:09.121469698 ereport.fs.zfs.authentication
class = "ereport.fs.zfs.authentication"
ena = 0xf9162f4003200c01
detector = (embedded nvlist)
version = 0x0
scheme = "zfs"
pool = 0x99d944fa950e7d8b
(end detector)
pool = "tank"
pool_guid = 0x99d944fa950e7d8b
pool_state = 0x0
pool_context = 0x0
pool_failmode = "wait"
zio_objset = 0xbc74
zio_object = 0x0
zio_level = 0x0
zio_blkid = 0x1
time = 0x663f7089 0x73d7b02
eid = 0x2a3ff

The zio_objset matches the zero-length error (<0xbc74>:<0x0>) given in the status message. Is this the correct way to interpret it?
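
For what it is worth, this is how I cross-checked the ID (assuming I am reading the fields correctly):

zpool events -v | grep -A1 zio_objset   # pull the objset IDs out of the authentication ereports
printf '%d\n' 0xbc74                    # 48244 - decimal form of the objset ID
zdb -d tank | grep 'ID 48244'           # see which dataset or snapshot carries that objset ID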

@neurotensin

zpool events -v

@develroo
Author

develroo commented May 11, 2024

Hmm, well, I ran this:

 zdb -c mediapool-z1

Traversing all blocks to verify metadata checksums and verify nothing leaked ...

loading concrete vdev 0, metaslab 173 of 174 ...
8.35T completed ( 847MB/s) estimated time remaining: 0hr 00min 00sec          
	No leaks (block sum matches space maps exactly)

	bp count:              52531792
	ganged count:           1892148
	bp logical:       6697248716800      avg: 127489
	bp physical:      6646730882560      avg: 126527     compression:   1.01
	bp allocated:     9185567629312      avg: 174857     compression:   0.73
	bp deduped:                   0    ref>1:      0   deduplication:   1.00
	bp cloned:                    0    count:      0
	Normal class:     9185563574272     used: 77.26%
	Embedded log class        1695744     used:  0.00%

	additional, non-pointer bps of type 0:     141716
	Dittoed blocks on same vdev: 1337489

space map refcount mismatch: expected 225 != actual 189

zpool events -v just listed a whole lot of snapshots with no reference to the above numbers.

zpool events -v | grep 0x43ad

Anyone know how I can drill down to the errors in the first comment?

EDIT:

Hey.. I just noticed the numbers have changed?!

root@zfsforn:~# zpool status -v 
  pool: mediapool-z1
 state: ONLINE
status: One or more devices has experienced an error resulting in data
	corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
	entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 12:48:20 with 3 errors on Fri May 10 09:21:10 2024
config:

	NAME                                          STATE     READ WRITE CKSUM
	mediapool-z1                                  ONLINE       0     0     0
	  raidz1-0                                    ONLINE       0     0     0
	    ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N5VUN75D  ONLINE       0     0     6
	    ata-ST3000VN007-2AH16M_ZGY7N50P           ONLINE       0     0     6
	    sdc                                       ONLINE       0     0     6
	    sdd                                       ONLINE       0     0     6

errors: Permanent errors have been detected in the following files:

        <0x43ad>:<0x348c>

@neurotensin

zdb -c tank
zdb_blkptr_cb: Got error 52 reading <85099, 1204, 1, 0> -- skipping
zdb_blkptr_cb: Got error 52 reading <85099, 0, 1, 2> -- skipping
err == ENOENT (0x34 == 0x2)
ASSERT at module/zfs/dsl_dataset.c:383:load_zfeature()
Aborted (core dumped)

FYI - system is still accessible so the core dump was not in the kernel.

@develroo
Author

Yes... but I am asking about my pool. I don't think your issue is related to mine, because you have actual file handles to look at, whereas I do not.

@danielb2

Also experiencing the above.

errors: Permanent errors have been detected in the following files:

        <0x1de80>:<0x0>
        <0x143fc>:<0x0>
        <0x143ff>:<0x0>

I've been using syncoid to make backups with snapshots.

@develroo
Author

OK, I am adding another post here, because this is officially weird behaviour.

So I did another scrub yesterday and the errors were still there (though each time I scrubbed and checked, the numbers would change?). I then offlined and onlined the disks one at a time, letting each one resilver. Now the error has cleared, and I am back to a clean pool?

zpool status -v
  pool: mediapool-z1
 state: ONLINE
  scan: resilvered 56.7G in 00:45:19 with 0 errors on Sun May 26 12:30:35 2024
config:

	NAME                                          STATE     READ WRITE CKSUM
	mediapool-z1                                  ONLINE       0     0     0
	  raidz1-0                                    ONLINE       0     0     0
	    ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N5VUN75D  ONLINE       0     0     0
	    ata-ST3000VN007-2AH16M_ZGY7N50P           ONLINE       0     0     0
	    sdc                                       ONLINE       0     0     0
	    sdd                                       ONLINE       0     0     0

errors: No known data errors

So I have no idea what happened or how I fixed it, really. But maybe this will help someone else?
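
For anyone who wants to repeat this, the per-disk sequence was roughly as follows (device name is a placeholder; I waited for each resilver to finish before moving on):

zpool offline mediapool-z1 ata-DISK    # take one member offline
zpool online mediapool-z1 ata-DISK     # bring it back; a resilver starts automatically
zpool status mediapool-z1              # wait until the resilver completes, then do the next disk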

@danielb2

danielb2 commented May 26, 2024

The fix for me was to initiate a scrub and then stop it with zpool scrub -s <pool>. A subsequent scrub did not show any errors anymore.
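
In full, what I did was roughly this (pool name is a placeholder):

zpool scrub tank       # start a scrub
zpool scrub -s tank    # stop it shortly afterwards
zpool scrub tank       # the next full scrub no longer listed the errors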

@develroo
Author

Hmm interesting.

@GregorKopka
Contributor

Maybe related to #16147?
