collectdctl return error "Connection refused" #4142

Open
Rohlik opened this issue Sep 27, 2023 · 2 comments
Labels: Bug (A genuine bug)

Comments

Rohlik commented Sep 27, 2023

  • Version of collectd: 5.9.0 and 5.12.0
  • Operating system / distribution: CentOS Stream 8/9
  • Kernel version (if applicable): 6.1.27 and 6.5.2

Expected behavior

Running collectdctl listval -s /var/run/collectd-sock should return a list of values.

Actual behavior

# Running the command as root
collectdctl listval -s /var/run/collectd-sock
ERROR: Failed to connect to daemon at unix:/var/run/collectd-sock: Connection refused.

Related configuration:

<LoadPlugin unixsock>
  Globals false
</LoadPlugin>

<Plugin unixsock>
  SocketFile  "/var/run/collectd-sock"
  SocketGroup "root"
  SocketPerms "0770"
  DeleteSocket "false"
</Plugin>

strace output:

execve("/bin/collectdctl", ["collectdctl", "listval", "-s", "/var/run/collectd-sock"], 0x7fffad805118 /* 61 vars */) = 0
brk(NULL)                               = 0x55c072c24000
arch_prctl(0x3001 /* ARCH_??? */, 0x7ffe2c86d5b0) = -1 EINVAL (Invalid argument)
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib64/glibc-hwcaps/x86-64-v3/libcollectdclient.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/glibc-hwcaps/x86-64-v3", 0x7ffe2c86c7b0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib64/glibc-hwcaps/x86-64-v2/libcollectdclient.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/glibc-hwcaps/x86-64-v2", 0x7ffe2c86c7b0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib64/tls/x86_64/x86_64/libcollectdclient.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/tls/x86_64/x86_64", 0x7ffe2c86c7b0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib64/tls/x86_64/libcollectdclient.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/tls/x86_64", 0x7ffe2c86c7b0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib64/tls/x86_64/libcollectdclient.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/tls/x86_64", 0x7ffe2c86c7b0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib64/tls/libcollectdclient.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/tls", {st_mode=S_IFDIR|0555, st_size=4096, ...}) = 0
openat(AT_FDCWD, "/usr/lib64/x86_64/x86_64/libcollectdclient.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/x86_64/x86_64", 0x7ffe2c86c7b0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib64/x86_64/libcollectdclient.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/x86_64", 0x7ffe2c86c7b0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib64/x86_64/libcollectdclient.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/x86_64", 0x7ffe2c86c7b0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib64/libcollectdclient.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\300\"\0\0\0\0\0\0"..., 832) = 832
lseek(3, 34552, SEEK_SET)               = 34552
read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
fstat(3, {st_mode=S_IFREG|0755, st_size=52864, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f94eb650000
lseek(3, 34552, SEEK_SET)               = 34552
read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
mmap(NULL, 2134040, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f94eb000000
mprotect(0x7f94eb009000, 2093056, PROT_NONE) = 0
mmap(0x7f94eb208000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x8000) = 0x7f94eb208000
close(3)                                = 0
openat(AT_FDCWD, "/usr/lib64/tls/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`\256\3\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=2089984, ...}) = 0
lseek(3, 808, SEEK_SET)                 = 808
read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
mmap(NULL, 3950816, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f94eac00000
mprotect(0x7f94eadbb000, 2097152, PROT_NONE) = 0
mmap(0x7f94eafbb000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bb000) = 0x7f94eafbb000
mmap(0x7f94eafc1000, 14560, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f94eafc1000
close(3)                                = 0
openat(AT_FDCWD, "/usr/lib64/tls/libm.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib64/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 \305\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1598840, ...}) = 0
mmap(NULL, 3674432, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f94ea800000
mprotect(0x7f94ea980000, 2097152, PROT_NONE) = 0
mmap(0x7f94eab80000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x180000) = 0x7f94eab80000
close(3)                                = 0
openat(AT_FDCWD, "/usr/lib64/tls/libgcrypt.so.20", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib64/libgcrypt.so.20", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\300\272\0\0\0\0\0\0"..., 832) = 832
lseek(3, 1143856, SEEK_SET)             = 1143856
read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
fstat(3, {st_mode=S_IFREG|0755, st_size=1187312, ...}) = 0
lseek(3, 1143856, SEEK_SET)             = 1143856
read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
mmap(NULL, 3268552, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f94ea400000
mprotect(0x7f94ea518000, 2093056, PROT_NONE) = 0
mmap(0x7f94ea717000, 28672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x117000) = 0x7f94ea717000
close(3)                                = 0
openat(AT_FDCWD, "/usr/lib64/tls/libdl.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib64/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\16\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=19128, ...}) = 0
mmap(NULL, 2109600, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f94ea000000
mprotect(0x7f94ea003000, 2093056, PROT_NONE) = 0
mmap(0x7f94ea202000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7f94ea202000
close(3)                                = 0
openat(AT_FDCWD, "/usr/lib64/tls/libgpg-error.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib64/libgpg-error.so.0", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0PH\0\0\0\0\0\0"..., 832) = 832
lseek(3, 125832, SEEK_SET)              = 125832
read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
fstat(3, {st_mode=S_IFREG|0755, st_size=145984, ...}) = 0
lseek(3, 125832, SEEK_SET)              = 125832
read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
mmap(NULL, 2228800, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f94e9c00000
mprotect(0x7f94e9c1f000, 2097152, PROT_NONE) = 0
mmap(0x7f94e9e1f000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1f000) = 0x7f94e9e1f000
close(3)                                = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f94eb64e000
arch_prctl(ARCH_SET_FS, 0x7f94eb64f500) = 0
mprotect(0x7f94eafbb000, 16384, PROT_READ) = 0
mprotect(0x7f94e9e1f000, 4096, PROT_READ) = 0
mprotect(0x7f94ea202000, 4096, PROT_READ) = 0
mprotect(0x7f94ea717000, 8192, PROT_READ) = 0
mprotect(0x7f94eab80000, 4096, PROT_READ) = 0
mprotect(0x7f94eb208000, 4096, PROT_READ) = 0
mprotect(0x55c072c03000, 4096, PROT_READ) = 0
mprotect(0x7f94eb62e000, 4096, PROT_READ) = 0
getrandom("\x34\xf4\x50\x9e\x29\x67\x51\x98", 8, GRND_NONBLOCK) = 8
brk(NULL)                               = 0x55c072c24000
brk(0x55c072c45000)                     = 0x55c072c45000
access("/etc/system-fips", F_OK)        = -1 ENOENT (No such file or directory)
access("/etc/gcrypt/fips_enabled", F_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/proc/sys/crypto/fips_enabled", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
read(3, "0\n", 1024)                    = 2
close(3)                                = 0
socket(AF_UNIX, SOCK_STREAM, 0)         = 3
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/collectd-sock"}, 110) = -1 ECONNREFUSED (Connection refused)
close(3)                                = 0
write(2, "ERROR: Failed to connect to daem"..., 87ERROR: Failed to connect to daemon at unix:/var/run/collectd-sock: Connection refused.
) = 87
exit_group(1)                           = ?
+++ exited with 1 +++

stat output:

stat /var/run/collectd-sock
  File: /var/run/collectd-sock
  Size: 0               Blocks: 0          IO Block: 4096   socket
Device: 18h/24d Inode: 14032455    Links: 1
Access: (0770/srwxrwx---)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:var_run_t:s0
Access: 2023-09-22 10:41:01.474309510 +0200
Modify: 2023-09-15 13:14:23.423802028 +0200
Change: 2023-09-15 13:14:23.423802028 +0200
 Birth: 2023-09-15 13:14:23.423802028 +0200
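
The strace and stat output above show that the socket file exists but connect() fails with ECONNREFUSED. On Linux that combination usually means nothing is currently listening on the path. The following is a minimal sketch of the same check in Python; the function name probe_unix_socket and the result strings are illustrative and not part of collectd.

```python
import errno
import socket


def probe_unix_socket(path: str) -> str:
    """Try to connect to a Unix stream socket and classify the outcome."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
    except OSError as exc:
        if exc.errno == errno.ECONNREFUSED:
            # The path exists, but no process is accepting connections on it.
            return "stale"
        if exc.errno == errno.ENOENT:
            # There is no file at this path at all.
            return "missing"
        # Anything else (for example EACCES) is reported by its errno name.
        return "error: " + errno.errorcode.get(exc.errno, str(exc.errno))
    else:
        return "listening"
    finally:
        sock.close()
```

Calling probe_unix_socket("/var/run/collectd-sock") on an affected node would be expected to return "stale", matching the ECONNREFUSED in the trace. It does not explain why the daemon stopped listening.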

We are able to work around it by running these commands:

rm -f /var/run/collectd-sock
systemctl restart collectd.service
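
A more cautious variant of the rm step above deletes the socket file only when it refuses connections, so that a working socket is never removed by accident. This is a sketch that assumes a stale socket file is the problem, which this thread has not confirmed. The helper name remove_stale_socket is hypothetical, and the daemon still has to be restarted afterwards as shown above.

```python
import errno
import os
import socket
import stat


def remove_stale_socket(path: str) -> bool:
    """Delete path only if it is a socket file that refuses connections.

    Returns True if the file was removed, False otherwise.
    """
    try:
        mode = os.lstat(path).st_mode
    except FileNotFoundError:
        return False
    if not stat.S_ISSOCK(mode):
        return False  # never delete something that is not a socket

    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        probe.connect(path)
    except OSError as exc:
        if exc.errno != errno.ECONNREFUSED:
            return False  # e.g. permission denied: leave the file alone
    else:
        return False  # something accepted the connection: the socket is live
    finally:
        probe.close()

    # Only reached when connect() failed with ECONNREFUSED.
    os.unlink(path)
    return True
```

There is a small window between the probe and the unlink in which a daemon could start listening, so this is best run while collectd is stopped or about to be restarted.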

Steps to reproduce

  • Sadly, I don't have a reproducer for this issue. It occurs at random on our nodes and persists until the commands above are executed.

eero-t (Contributor) commented Dec 1, 2023

Is the collectd daemon that it is trying to connect to otherwise still working?

If so, the next time the issue happens, please also run strace -f on the collectd daemon it is trying to connect to, to see what the daemon is doing.

octo (Member) commented Dec 6, 2023

The output of

lsof -p $(< /var/run/collectd.pid)

would also be very useful.
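
As a complement to the lsof output, /proc/net/unix on Linux lists every Unix socket the kernel knows about, including whether it is in the listening state. The sketch below parses that file's text and reports the sockets bound to a given path. The function name listeners_for and the returned dictionary keys are illustrative. It assumes the usual column layout (Num, RefCount, Protocol, Flags, Type, St, Inode, Path) and that the path appears exactly as the daemon passed it to bind().

```python
SO_ACCEPTCON = 0x10000  # flag bit the Linux kernel sets on listening sockets


def listeners_for(proc_net_unix: str, path: str) -> list:
    """Return the sockets in /proc/net/unix text that are bound to path."""
    found = []
    # The first line is the column header, so skip it.
    for line in proc_net_unix.splitlines()[1:]:
        fields = line.rstrip().split(None, 7)
        # Unbound sockets have no Path column and therefore only 7 fields.
        if len(fields) < 8 or fields[7] != path:
            continue
        found.append({
            # Kernel socket inode, not the filesystem inode printed by stat.
            "inode": int(fields[6]),
            "listening": bool(int(fields[3], 16) & SO_ACCEPTCON),
        })
    return found
```

For example, listeners_for(open("/proc/net/unix").read(), "/var/run/collectd-sock") returning an empty list while the socket file still exists would suggest that the daemon is no longer bound to that path.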

octo added the "Bug" (A genuine bug) label on Dec 6, 2023