Kernel version (if applicable): 6.3.8-100.fc37.x86_64
Expected behavior
With the smart plugin enabled, the num_err_log_entries counter of a Seagate FireCuda 530 NVMe drive should not increase.
Actual behavior
With the smart plugin enabled, the num_err_log_entries value increments by one every minute.
Steps to reproduce
I have a few drives in my server, including a Samsung SSD 980 PRO 2T. collectd is configured with the smart plugin and has worked perfectly fine with the Samsung NVMe.
Today I added a new Seagate FireCuda 530. After a reboot, I was curious to see whether the smart plugin would pick it up. It did; however, I noticed that num_err_log_entries was increasing.
I found https://www.osso.nl/blog/kioxia-nvme-num-err-log-entries-0xc004-smartctl/, which describes a similar problem, but there smartctl was used directly, and that smartctl bug was fixed a long time ago. This pointed me in the direction of the collectd smart plugin, so I started testing.
Without a <Plugin "smart"> block (which I assume means auto-detect), or with the drive explicitly listed, num_err_log_entries keeps increasing. The only way to stop the increase is to exclude the drive from monitoring:
<Plugin "smart">
  Disk "sda"
  Disk "sdb"
  Disk "sdc"
  Disk "sdd"
  Disk "sde"
  # Disk "nvme0n1"
  Disk "nvme1n1"
  IgnoreSelected false
</Plugin>
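To check whether a given configuration really stops the counter from growing, the value can be sampled outside collectd. The following is a minimal sketch, not part of collectd. It assumes nvme-cli is installed, that `nvme smart-log <device> -o json` reports the counter under the key `num_err_log_entries`, and that the script runs with enough privileges to query the device. The helper names `read_error_count`, `counter_growth`, and `poll` are hypothetical.

```python
import json
import subprocess
import time


def read_error_count(smart_log_json: str) -> int:
    """Extract num_err_log_entries from `nvme smart-log -o json` output.

    Assumption: the JSON key is named "num_err_log_entries". int() accepts
    the value whether it is emitted as a number or as a numeric string.
    """
    return int(json.loads(smart_log_json)["num_err_log_entries"])


def counter_growth(samples):
    """Return the increase between each pair of consecutive samples."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


def poll(device="/dev/nvme0n1", interval=60, count=5):
    """Sample the counter `count` times, `interval` seconds apart."""
    samples = []
    for i in range(count):
        result = subprocess.run(
            ["nvme", "smart-log", device, "-o", "json"],
            check=True,
            capture_output=True,
            text=True,
        )
        samples.append(read_error_count(result.stdout))
        if i < count - 1:
            time.sleep(interval)
    return samples


if __name__ == "__main__":
    values = poll()
    print("samples:", values)
    print("growth per interval:", counter_growth(values))
```

Note that every run of this script itself issues a smart-log query to the drive. If the growth is all zeros with the drive excluded from the smart plugin and non-zero with it included, that would be consistent with the plugin's queries being the trigger.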
The error reported is:
# nvme error-log /dev/nvme0n1
Error Log Entries for device:nvme0n1 entries:63
.................
Entry[ 0]
.................
error_count : 52
sqid : 0
cmdid : 0x9010
status_field : 0x2002(Invalid Field in Command: A reserved coded value or an unsupported value in a defined field)
phase_tag : 0
parm_err_loc : 0x4
lba : 0
nsid : 0xffffffff
vs : 0
trtype : The transport type is not indicated or the error is not transport related.
cs : 0
trtype_spec_info: 0
.................
Entry[ 1]
.................
error_count : 51
sqid : 0
cmdid : 0xa014
status_field : 0x2002(Invalid Field in Command: A reserved coded value or an unsupported value in a defined field)
phase_tag : 0
parm_err_loc : 0x4
lba : 0
nsid : 0xffffffff
vs : 0
trtype : The transport type is not indicated or the error is not transport related.
cs : 0
trtype_spec_info: 0
.................
Entry[ 2]
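To see whether all entries share the same cause, the text output above can be grouped by status code, parameter error location, and namespace ID. This is a rough sketch that assumes the `nvme error-log` text layout shown in this report (an `Entry[ N]` header followed by `key : value` lines); other nvme-cli versions may format the output differently. The names `parse_error_log` and `summarize` are hypothetical.

```python
import re
from collections import Counter

# Matches headers such as "Entry[ 0]" and captures the entry index.
ENTRY_RE = re.compile(r"^Entry\[\s*(\d+)\]")
# Matches "key : value" lines; the key is a single word, so only the
# first colon on the line is treated as the separator.
FIELD_RE = re.compile(r"^(\w+)\s*:\s*(.*)$")


def parse_error_log(text):
    """Parse `nvme error-log` text output into a list of dicts."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = ENTRY_RE.match(line)
        if header:
            current = {"entry": int(header.group(1))}
            entries.append(current)
            continue
        field = FIELD_RE.match(line)
        if field and current is not None:
            current[field.group(1)] = field.group(2).strip()
    return entries


def summarize(entries):
    """Count entries per (status code, parm_err_loc, nsid) combination."""
    return Counter(
        (e["status_field"].split("(", 1)[0], e.get("parm_err_loc"), e.get("nsid"))
        for e in entries
        if "status_field" in e  # skip truncated entries without fields
    )


# Two entries copied from the output in this report (some fields omitted).
SAMPLE = """\
Error Log Entries for device:nvme0n1 entries:63
.................
Entry[ 0]
.................
error_count : 52
sqid : 0
cmdid : 0x9010
status_field : 0x2002(Invalid Field in Command: A reserved coded value or an unsupported value in a defined field)
parm_err_loc : 0x4
nsid : 0xffffffff
.................
Entry[ 1]
.................
error_count : 51
sqid : 0
cmdid : 0xa014
status_field : 0x2002(Invalid Field in Command: A reserved coded value or an unsupported value in a defined field)
parm_err_loc : 0x4
nsid : 0xffffffff
"""

if __name__ == "__main__":
    for key, count in summarize(parse_error_log(SAMPLE)).items():
        print(key, count)
```

For the two entries shown, both fall into the same group (status 0x2002, parm_err_loc 0x4, nsid 0xffffffff), which suggests that one repeated command is being rejected rather than several unrelated failures.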
viulian added a commit to viulian/collectd that referenced this issue Jul 18, 2023