rrdcached daemon stops responding in batch mode #1188

robbykrlos opened this issue Sep 7, 2022 · 2 comments

@robbykrlos

Bug description

I'm using rrdtool 1.7.0 on Red Hat 8. I start rrdcached as a daemon with these parameters:

rrdcached -p /path/to/pid -j /path/to/journals -b /path/to/base/ -B -R -l unix:///path/to/socket

The backend, a PHP application, talks to the daemon through PHP's sockets extension and a socket client library (https://github.com/MitinSany/rrdcached-php).

My PHP script hangs indefinitely after ~623 UPDATE commands or ~143 CREATE commands while in BATCH mode. This happens only in BATCH mode and only when I exceed these numbers: a batch update with ~200 commands completes very fast. Without BATCH mode, it handles more commands (>1000) without freezing.
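Since batches of ~200 commands complete quickly, one possible workaround while the cause is unknown is to split a long command list into several smaller batches. The sketch below is in Python rather than PHP so it does not depend on the client library; `frame_batches` and the `.rrd` file names are hypothetical. It assumes the rrdcached wire format in which a batch starts with a `BATCH` line and ends with a line holding a single dot. The client is still expected to read the daemon's reply after `BATCH` and after the closing dot before sending the next frame.

```python
from typing import Iterable, Iterator, List


def frame_batches(commands: Iterable[str], max_per_batch: int = 200) -> Iterator[str]:
    """Yield BATCH frames containing at most `max_per_batch` commands each.

    Assumed framing: "BATCH" line, one command per line, then "." on its own line.
    """
    batch: List[str] = []
    for cmd in commands:
        batch.append(cmd.rstrip("\n"))
        if len(batch) == max_per_batch:
            yield "BATCH\n" + "\n".join(batch) + "\n.\n"
            batch = []
    if batch:
        # Flush the final, possibly shorter, batch.
        yield "BATCH\n" + "\n".join(batch) + "\n.\n"


# Hypothetical workload: 650 UPDATE commands, more than the ~623 that hang.
updates = [f"UPDATE host{i}.rrd 1662537600:42" for i in range(650)]
frames = list(frame_batches(updates, max_per_batch=200))
print(len(frames), "frames")
```

This only sidesteps the limit observed above; it does not explain why a single large batch stalls.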

I traced the freeze down to a single call: socket_write() (https://www.php.net/manual/en/function.socket-write.php)

Debug code:

echo "==>";
$this->write('UPDATE ' . $fileName . ' ' . implode(':', $options) . PHP_EOL);
echo ">==" . PHP_EOL;
...

public function write($buffer)
{
    echo ">>>@socket_write-- ";
    $ret = @socket_write($this->resource, $buffer);
    echo $ret, " --@socket_write>>>" . PHP_EOL;
    if ($ret === false) {
        var_dump(socket_last_error($this->resource));
        throw Exception::createFromSocketResource($this->resource);
    }
    return $ret;
}

Output:

[RrdCached] --> Sending batch command "update" #622
==> >>>@socket_write-- 128 --@socket_write>>> >==
[RrdCached] --> Sending batch command "update" #623
==> >>>@socket_write-- 

So my conclusion is that the freeze happens somewhere outside my control, either in the PHP sockets extension or in the rrdcached daemon.
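A write that never returns and never reports an error is what a blocking stream socket does when the peer stops reading and the kernel send buffer fills up. Whether that is what happens here is an assumption, not something the output above proves. The following self-contained Python sketch (not the PHP client; `count_writes_until_stall` is a hypothetical helper) reproduces the symptom with a local socket pair and shows how a send timeout turns the silent hang into a visible error:

```python
import socket


def count_writes_until_stall(line: bytes, timeout: float = 0.2, limit: int = 100_000):
    """Send `line` repeatedly to a peer that never reads.

    Returns (completed_writes, stalled). `stalled` is True when a send
    timed out, i.e. it would have blocked forever without the timeout.
    """
    sender, idle_peer = socket.socketpair()  # connected local stream sockets
    sender.settimeout(timeout)  # a blocked send now raises instead of hanging
    count = 0
    stalled = False
    try:
        while count < limit:
            sender.sendall(line)
            count += 1
    except socket.timeout:
        stalled = True
    finally:
        sender.close()
        idle_peer.close()
    return count, stalled


# 128-byte payloads, matching the size reported by socket_write() above.
count, stalled = count_writes_until_stall(b"x" * 128)
print(f"completed {count} writes, stalled: {stalled}")
```

The number of completed writes depends on the platform's socket buffer sizes, so it will not match the ~623 seen above. The useful part is the pattern: with a timeout on the write, the client can log how far it got and what the daemon had (or had not) sent back, instead of freezing.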

I have monitored the rrdcached daemon and its child worker processes, and I saw that rrdcached keeps renewing child processes periodically, with an average of ~3-5% CPU activity.

Initially the backend opened BATCH mode too early and then had to wait ~30 seconds for an API to answer, so I thought a timeout might be the reason. But I've removed this wait, and even when BATCH mode has just been started, it still fails after ~600 commands.

  • OS: Red Hat Enterprise Linux 8.6
  • RRDtool 1.7.0

If you have any ideas or suggestions, let me know.

Thanks!

@robbykrlos

robbykrlos commented Sep 7, 2022

PS: No error logs are generated by rrdcached (even with -V LOG_DEBUG). No errors are generated in the PHP code. I also removed the "@" error suppression from socket_write(), and still no errors are displayed.

@robbykrlos

Some later feedback after some more tweaks and tests:

  • The issue is not influenced by -j (journal).
  • The issue is not influenced by -a 2 (alloc).
  • The issue is not influenced by -z 5.
