Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error while reading line from the server #32

Closed
abesto opened this issue Jun 7, 2011 · 28 comments
Closed

Error while reading line from the server #32

abesto opened this issue Jun 7, 2011 · 28 comments

Comments

@abesto
Copy link

abesto commented Jun 7, 2011

CommunicationExceptions are thrown with the message 'Error while reading line from the server' from time to time. The Predis part of an example stack trace:

Message: Error while reading line from the server
Thrown at: /var/www/lib/redis/Predis.php:1568
Trace:
#0 /var/www/lib/redis/Predis.php(1636): Predis\Connection->onCommunicationException('Error while rea...')
#1 /var/www/lib/redis/Predis.php(724): Predis\Connection->readLine()
#2 /var/www/lib/redis/Predis.php(1577): Predis\FastResponseReader->read(Object(Predis\Connection))
#3 /var/www/lib/redis/Predis.php(1586): Predis\Connection->readResponse(Object(Predis\Commands\Auth))
#4 /var/www/lib/redis/Predis.php(225): Predis\Connection->executeCommand(Object(Predis\Commands\Auth))
#5 /var/www/lib/redis/RedisProxy.php(21): Predis\Client->__call('auth', Array)
#6 /var/www/lib/redis/RedisProxy.php(21): Predis\Client->auth('password')

We couldn't find anything in common in the affected scripts. Some errors are from a local Redis server, some from a remote one. An example that sometimes generates this error:

The script is run by Apache, browsers call it directly by URL, so there can be a lot of instances running at once. The following Redis commands are run via Predis (in this order; a, b , c and d are different tables):

  • AUTH
  • SELECT a
  • EXISTS
  • (if a condition is met) SETEX
  • SELECT b
  • RPUSH
  • if a condition is met
    • SELECT b
    • INCR
    • SELECT c
    • GET
    • SELECT d
    • RPUSH

The bold commands are those that have ever generated the error. The majority (99%) is on AUTH.

Environment:

$ redis-server -v
Redis server version 2.2.4 (00000000:0)

Predis version 0.6.6

$ php -v
PHP 5.3.3-0.dotdeb.1 with Suhosin-Patch (cli) (built: Oct  1 2010 08:49:29) 
Copyright (c) 1997-2009 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2010 Zend Technologies
    with Xdebug v2.1.0, Copyright (c) 2002-2010, by Derick Rethans

$ /usr/sbin/apache2ctl -v
Server version: Apache/2.2.9 (Debian)
Server built:   Apr 20 2010 15:40:17

@nrk
Copy link
Contributor

nrk commented Jun 7, 2011

Your issue is quite strange. That kind of exception is usually thrown when the socket (which is already connected) reaches a timeout during read or write operations, but the fact that you are seeing these kind of errors even when dealing with a local instance of Redis makes it indeed weird. Did you use a custom (and low) value for the read_write_timeout parameter to configure the connection to Redis?

@abesto
Copy link
Author

abesto commented Jun 7, 2011

No, we didn't. The only configuration parameter passed to the client is connection_timeout with a value of 30 (which is higher than the default).

This may not be interesting, but I hacked the client a bit to try to read again at the moment the exception occurs, and append the read string to the exception text. I didn't take too many samples, but the read string looked to be a few characters of valid output (didn't actually verify). Anyway, it wasn't empty.

Edit. Another data point: there's usually a whole bunch of error instances within about one second, but nothing before or after it. I realize this could point towards a problem in Redis; is there a way to rule out Predis as the culprit (other than rewriting the calls to use another language/client)?

@nrk
Copy link
Contributor

nrk commented Jun 7, 2011

Can you try to wrap the calls that usually end up generating the exception in a try ... catch block with a code that looks like the following one?

<?php
try {
    // Redis commands
}
catch (Predis\CommunicationException $exception) {
    $stream = $exception->getConnection()->getSocket();
    $metadata = var_export(stream_get_meta_data($stream), true);
    // log the $metadata string somewhere so that you can retrieve it
    throw $exception;
}

With $metadata we should get some useful details about the state of the underlying PHP stream to start guessing what's happening on your server when those exceptions are thrown.

By the way the value of connection_timeout has effects only on the connect operation of the client socket stream to a remote endpoint. Instead read_write_timeout is used to set the timeout of read and write operations over a connected socket (the default should be 30 seconds if I remember correctly). You could try to set an higher value for read_write_timeout just to see if you notice some changes, but do that only after having captured a few samples of the stream metadata like described above.

@abesto
Copy link
Author

abesto commented Jun 7, 2011

Very well, we'll try both. Work starts about 12 hours from now (we're on CET), I'll report back with the results.

Thank you.

@nrk
Copy link
Contributor

nrk commented Jun 8, 2011

Sorry, I just noticed that the current behaviour of Predis doesn't play nice with that snippet of code. You should temporarily modify this line of code to return false so that the underlying stream resource is not closed before the exception is allowed to propagate out.

@abesto
Copy link
Author

abesto commented Jun 8, 2011

OK, done. There was no exception yet.

@abesto
Copy link
Author

abesto commented Jun 8, 2011

Got a few errors, here's an example (they all look the same):

Message: Predis socket metadata dump: array (
  'stream_type' => 'tcp_socket/ssl',
  'mode' => 'r+',
  'unread_bytes' => 0,
  'seekable' => false,
  'timed_out' => true,
  'blocked' => true,
  'eof' => false,
)

The timed_out => true confirms your theory, and raising read_write_timeout should fix the problem, right? (or at least make it go away for now)

@nrk
Copy link
Contributor

nrk commented Jun 8, 2011

Raising the value of read_write_timeout might help alleviate the issue, but it really depends on the source of the timeouts and the default value is 60 seconds anyway (verified right now) which already is a very high value. Also, the fact that your scripts can connect successfully to Redis leaves out problems related to ephemeral TCP/IP port exhaustion.

How many concurrent client are connected to your Redis instances on average? Would it be possible for you to use UNIX domain sockets instead of TCP/IP to connect to the Redis instances that run on the locahost and see if you still get those random errors out of them?

@abesto
Copy link
Author

abesto commented Jun 9, 2011

On two Redis servers with the issue there are 17 +/- 5 connections. One had a maximal spike of 30, the other has a daily spike of 60-100 (both last for a few minutes, and they don't match the times of the errors). I'll try using UNIX sockets for one of the servers.

We got a very small number of errors similar to the following:

Message: Connection timed out
Thrown at: /var/www/lib/redis/Predis.php:1568
Trace:
#0 /var/www/lib/redis/Predis.php(1518): Predis\Connection->onCommunicationException('Connection time...', 110)
#1 /var/www/lib/redis/Predis.php(1494): Predis\Connection->redisStreamInitializer(Object(Predis\ConnectionParameters))
#2 /var/www/lib/redis/Predis.php(1648): Predis\Connection->connect()
#3 /var/www/lib/redis/Predis.php(1599): Predis\Connection->getSocket()
#4 /var/www/lib/redis/Predis.php(1573): Predis\Connection->writeBytes('*2??$4??AUTH??$...')
#5 /var/www/lib/redis/Predis.php(1582): Predis\Connection->writeCommand(Object(Predis\Commands\Auth))
#6 /var/www/lib/redis/Predis.php(225): Predis\Connection->executeCommand(Object(Predis\Commands\Auth))
#7 /var/www/lib/redis/RedisProxy.php(21): Predis\Client->__call('auth', Array)

This could be just because of this change, but maybe it has some extra info for you.

@nrk
Copy link
Contributor

nrk commented Jun 9, 2011

Well, seeing that you get timeouts also during connect() operations is somewhat reassuring, it would have been totally weird otherwise :-) Anyway, 100 simultaneous connections to Redis shouldn't pose any problem (at least on Redis-side).

By the way I just noticed that you are running PHP with Xdebug loaded. Aside from the fact that Xdebug can slow your PHP code down quite a bit for a production environment, I don't know if it could mess with PHP streams or something under load. You should try to disable it, if possible, and if even doing this doesn't solve the problem... well, I'm at loss for words.

From how things are turning out this doesn't seem a bug of Predis, and the only things left I can think of are:

  • connectivity issues (but this wouldn't make much sense for connections on the localhost)
  • some sort of misconfiguration of the PHP + Apache environment (I don't use Apache since years, so I can't be of any help)

@abesto
Copy link
Author

abesto commented Jun 9, 2011

Very well, if disabling Xdebug doesn't solve the problem, then I'll try to reproduce this with another client library. Hopefully that will narrow things down enough for the Redis folks to figure this out. We'll use unix sockets for loopback connections, even if they don't solve this (speedup is good for the health), so thanks for that pointer as well.

Thank you for your time :)

@abesto abesto closed this as completed Jun 9, 2011
@nrk
Copy link
Contributor

nrk commented Jun 9, 2011

Let me know how things go, now I'm curious :-)

@abesto
Copy link
Author

abesto commented Jun 9, 2011

Sure, I'll write here with the results. There weren't any errors since I disabled Xdebug, I hope it'll stay that way.

@nrk
Copy link
Contributor

nrk commented Jun 16, 2011

Just adding a link to an ongoing discussion on this matter (timeouts) from the Redis mailing list since it has some interesting tips.

@abesto
Copy link
Author

abesto commented Jun 16, 2011

Actually, I wrote the OP of that thread :) It was still very nice of you to post the link.

@abesto
Copy link
Author

abesto commented Jul 7, 2011

The cause of the issue was a faulty Realtek driver (r8169 before kernel 2.6.30 is buggy, have to use r8168). Nothing to do with Redis or Predis.

@nrk
Copy link
Contributor

nrk commented Jul 7, 2011

Thanks for the update, this might be a valuable information for others. Debugging these kind of weird issues can be hard and time consuming when the cause is somewhere at a lower level.

@rookie7799
Copy link

Abesto, sorry to bother - but we are experiencing the same issue i was wondering how did u go about find out that "The cause of the issue was a faulty Realtek driver" ??

Thanks!

@abesto
Copy link
Author

abesto commented Feb 5, 2012

@rookie7799 No problem at all. /var/log/kernel.log contained "interface up" messages at the same times that the errors occured; the link was down for some time before these messages.

Mention of the problem and a detailed solution at Hetzner: http://wiki.hetzner.de/index.php/Installation_des_r8168-Treibers/en

Sorry for taking so long to answer.

@terry8416
Copy link

I got the same issue when I use command "brpop bar 0" in code. setting read_write_timeout to -1 can solve it but...are there any better solutions? or can this solution cause other unknown issues?by the way,for our system, 60s timeout is not enough:(

@abesto
Copy link
Author

abesto commented Mar 14, 2012

You could do blocking operations with a non-zero second argument. This releases the block once that "timeout" is reached. You can then run some checks, and issue the blocking command again. This way you also avoid having to configure the server for not dropping very long connections.

@terry8416
Copy link

It's a good idea, thanks for your answer:)

@razvanphp
Copy link

Sorry for bringing this up again after so much time, but I think I have a different reason why I get this error message. I tried debugging the stream as mentioned earlier and got this:

array(7) {
  ["stream_type"]=>
  string(14) "tcp_socket/ssl"
  ["mode"]=>
  string(2) "r+"
  ["unread_bytes"]=>
  int(0)
  ["seekable"]=>
  bool(false)
  ["timed_out"]=>
  bool(false)
  ["blocked"]=>
  bool(true)
  ["eof"]=>
  bool(false)
}

read_write_timeout is set to -1, and we use one master and one slave with replication => true. Judging by the IP address, the error is triggered on the master.

Any idea about blocking mode connection? I saw other PHP Redis library has this configurable.

We just switched from phpredis and this is a cron job, the error happens exactly after 60 seconds.

Thank you!

@pravindahal
Copy link

pravindahal commented May 16, 2016

This is how I managed to capture the debug info:

          catch (Predis\CommunicationException $exception) {
                $stream = $exception->getConnection()->getResource();
                $errorMessage = var_export(stream_get_meta_data($stream), true);
                //Log::info($errorMessage);
                continue;
            }

I get the following:

array (
  'timed_out' => false,
  'blocked' => true,
  'eof' => false,
  'stream_type' => 'tcp_socket/ssl',
  'mode' => 'r+',
  'unread_bytes' => 0,
  'seekable' => false,
)

I have set read_write_timeout to -1, timeout in redis.conf to 0, second param to blpop as 0. Has anyone managed to find out more about this?

More info:
Using redis 3 (docker image, alpine variant) with Laravel 5.1 in a long running artisan command.

@pravindahal
Copy link

pravindahal commented Nov 30, 2016 via email

@bencromwell
Copy link

bencromwell commented Nov 30, 2016 via email

@bencromwell
Copy link

I find myself back here almost a year later and feel compelled to comment.

Composer doesn't work (install/update packages) with that socket setting changed.

No idea why the package doesn't seem to respect the configuration settings and is unable to override PHP's default. Leaving this comment here in case anyone is reading this and knows the solution!

@ZekiJohn
Copy link

ZekiJohn commented Dec 31, 2021

default_socket_timeout

In my case configuring Redis was not enough setting default_socket_timeout=-1 in php.ini makes it work.
thanks @bencromwell.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

8 participants