Connection trouble with 0.10.0 #89

Open
miccolis opened this issue Jun 4, 2016 · 14 comments

Comments

@miccolis
Contributor

miccolis commented Jun 4, 2016

After updating to 0.10.0 we started seeing connection issues that were hard to diagnose. We've downgraded back to 0.9.0 and it doesn't look like we're seeing them anymore.

What would happen was simply that a process would start spewing `Error: socket timed out waiting on response.` ~10 times a minute and be unable to re-establish a connection to the memcache server (we're using AWS ElastiCache). New processes would connect with no issue.

I wish I had more details to report, but so far I've been unable to reproduce this outside of a production environment. I'll update this post as I learn more.

@alevy
Member

alevy commented Jun 4, 2016

OK, thanks for the report. 0.10 changed a specific way in which connections are handled so that's enough information to look.

I wish there were a way to mark versions as beta / not-yet-well-tested to try to catch these kinds of issues before they get into production...

@dterei
Collaborator

dterei commented Jun 6, 2016

@miccolis Do you mind trying out 0.9.1 please and letting us know if everything works fine? We made some changes to timeout handling in both 0.9.1 and 0.10.0, so telling us if 0.9.1 works or fails will help narrow this down a lot for us.

@mikemorris

I just tested 0.9.1 and hit a bunch of the socket timed out error messages (I haven't seen that in 0.9.0), plus a whole bunch of

(node) warning: possible EventEmitter memory leak detected. 11 timeout listeners added. Use emitter.setMaxListeners() to increase limit.
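
For readers unfamiliar with this warning: Node prints it when more than 10 listeners (the default limit) for the same event name are attached to one emitter. The following is a minimal sketch of how that can happen when a `timeout` listener is added per request and never removed. It uses only Node's `events` module and is not memjs's actual code; `leakyRequest` and `tidyRequest` are hypothetical names.

```javascript
const { EventEmitter } = require('events');

// Stand-in for a long-lived socket shared by many requests.
const socket = new EventEmitter();

// Leaky pattern: every request adds a 'timeout' listener and never removes it.
function leakyRequest(sock) {
  sock.on('timeout', () => {});
}

// Tidy pattern: the listener is removed as soon as the request completes.
function tidyRequest(sock) {
  const onTimeout = () => {};
  sock.on('timeout', onTimeout);
  // ... in a real client the response would arrive here ...
  sock.removeListener('timeout', onTimeout);
}

// Eleven requests on the leaky path exceed the default limit of 10,
// so Node emits the "possible EventEmitter memory leak" warning.
for (let i = 0; i < 11; i++) leakyRequest(socket);
const leakyCount = socket.listenerCount('timeout'); // 11

socket.removeAllListeners('timeout');
for (let i = 0; i < 11; i++) tidyRequest(socket);
const tidyCount = socket.listenerCount('timeout'); // 0
```

The warning itself is only a hint about listener count; whether it is harmless depends on whether the listeners are eventually released.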

@dterei
Collaborator

dterei commented Jun 7, 2016

Can you share how you have memjs configured please? Ignore the warning messages, those are expected with 0.9.1.

@dterei
Collaborator

dterei commented Jun 17, 2016

@mikemorris Any update here? I can't replicate this on my end or see any problems from staring at the code, so I won't be able to make progress without help.

@azmsrwr

azmsrwr commented Jul 9, 2016

Hi @alevy, we are using memjs version 0.8.5 in our Heroku app and we randomly get a "MemJS: Server <..> failed after (2) retries with error - This socket has been ended by the other party" error. The issue starts with the daily recycling of a server and goes away when we restart the server manually.

As you mentioned, 0.10.0 handles connections differently. Should we update our app to use 0.10.0, and will that help with this error, given that you pointed out 0.10.0 is not yet well tested?

@dterei
Collaborator

dterei commented Jul 15, 2016

@azmsrwr Please try out 0.10.0 and let us know your results. We aren't able to replicate this issue right now so the more feedback we can get the better.

If you run into issues with 0.10.0, then downgrade to 0.9.0.

@dterei
Collaborator

dterei commented Jul 21, 2016

Another MemCachier customer reports:

We start getting ECONNRESET errors on all cache reads/writes.
Only way to resolve is to restart all dynos.

Error details:
/app/controllers/programs.coffee:101:24
at /app/node_modules/memjs/lib/memjs/memjs.js:149:23
at Object.errorHandler (/app/node_modules/memjs/lib/memjs/memjs.js:617:25)
at [object Object].Server.error (/app/node_modules/memjs/lib/memjs/server.js:71:18)
at Socket.<anonymous> (/app/node_modules/memjs/lib/memjs/server.js:159:12)
at emitOne (events.js:77:13)
at Socket.emit (events.js:169:7)
at emitErrorNT (net.js:1269:8)
at nextTickCallbackWith2Args (node.js:442:9)
at process._tickCallback (node.js:356:17)

"msg": {
"msg": "Cache set error",
"err": {
"code": "ECONNRESET",
"errno": "ECONNRESET",
"syscall": "read"
}
}

Using memjs: 0.10.0 and node: v4.
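
The error object in this report carries a `code` of `ECONNRESET`, which an application can inspect to tell connection-level failures apart from other errors. A minimal sketch, assuming the application wants to decide for itself when a connection should be discarded; `isConnectionError` is a hypothetical helper and not part of memjs, and the message patterns are copied from the reports in this thread.

```javascript
// Node system error codes that indicate the TCP connection itself is unusable.
const CONNECTION_ERROR_CODES = new Set([
  'ECONNRESET',
  'EPIPE',
  'ETIMEDOUT',
  'ECONNREFUSED',
]);

// Hypothetical helper: true when an error suggests the connection should be dropped.
function isConnectionError(err) {
  if (!err) return false;
  if (CONNECTION_ERROR_CODES.has(err.code)) return true;
  // Message text taken from the reports in this thread; matching on text is brittle.
  return /socket timed out|socket has been ended/i.test(String(err.message || ''));
}

// An error shaped like the one in the report above.
const reported = Object.assign(new Error('read ECONNRESET'), {
  code: 'ECONNRESET',
  errno: 'ECONNRESET',
  syscall: 'read',
});

const shouldDrop = isConnectionError(reported);
```

Checking `err.code` is generally more robust than matching on message text, since messages can change between library versions.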

ursm added a commit to enishitech/fastboot-memcached-cache that referenced this issue Nov 13, 2016
@leohihimax

Any update for this issue?
I also hit this problem when the network was not that stable: memjs was unable to re-establish a connection to the memcached server.
I had to restart the node process to establish a new connection, and then it would work again.

@dterei
Collaborator

dterei commented Nov 17, 2016

@leohihimax No update sorry, hopefully Amit or I will find some time soon to sit down with this one. The problem is we aren't able to reproduce this. Any extra information you can provide to help us there would be great.

@azmsrwr

azmsrwr commented Nov 17, 2016

We were randomly getting the following issue with memjs version 0.8.5 in our Heroku app: "MemJS: Server <..> failed after (2) retries with error - This socket has been ended by the other party". We imagine this was related to some sort of connection pooling, where the server was occasionally restarted (a daily planned restart) but the connection was not renewed. The issue used to go away when we restarted the server manually.

After updating to memjs 0.10.0 we do not see this issue anymore.
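
For anyone stuck on a version where the connection is not renewed, one possible application-level workaround is to replace the whole client after repeated failures instead of restarting the process. This is only a sketch under stated assumptions: `createResilientClient`, `makeClient`, and `maxFailures` are hypothetical names, and the client is assumed to expose `get(key, callback)` and `close()`. The fake factory at the end exists purely to exercise the wrapper.

```javascript
// makeClient: hypothetical factory returning an object with get(key, cb) and close().
function createResilientClient(makeClient, maxFailures = 3) {
  let client = makeClient();
  let failures = 0;

  return {
    get(key, callback) {
      const current = client; // remember which client served this request
      current.get(key, (err, ...rest) => {
        if (!err) {
          if (current === client) failures = 0;
        } else if (current === client) {
          // Only count failures against the client that is still active.
          failures += 1;
          if (failures >= maxFailures) {
            // Drop the possibly wedged client and start over with a fresh one.
            try { current.close(); } catch (closeErr) { /* ignore close failures */ }
            client = makeClient();
            failures = 0;
          }
        }
        callback(err, ...rest);
      });
    },
  };
}

// Demonstration with a fake factory: the first client always fails, later ones work.
let created = 0;
const fakeFactory = () => {
  const id = ++created;
  return {
    get: (key, cb) => (id === 1 ? cb(new Error('ECONNRESET')) : cb(null, 'value-' + key)),
    close: () => {},
  };
};

const cache = createResilientClient(fakeFactory, 2);
const results = [];
for (let i = 0; i < 3; i++) {
  cache.get('k', (err, value) => results.push(err ? 'error' : value));
}
// results is ['error', 'error', 'value-k'] and two clients were created.
```

Recreating the client hides the underlying bug rather than fixing it, so it is best treated as a stopgap while the root cause is investigated.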

@leohihimax

@dterei
memjs version: 0.10.0
Actually, it is really hard to reproduce this under good network conditions or when the memcached server is on localhost.
My memcached server is running in a VPC, so the network is not that stable.

@dterei
Collaborator

dterei commented Nov 18, 2016

OK, I'll try to set up a fake network environment that simulates a poor network and see what happens.
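
One way such an environment could be approximated without extra tooling is a small TCP proxy that adds latency between the client and memcached. The sketch below uses only Node's `net` module; `createDelayProxy` and the port numbers are hypothetical, and it simulates added latency only, not packet loss or connection resets.

```javascript
const net = require('net');

// Hypothetical helper: forwards TCP traffic to a target server and delays
// every chunk by delayMs in both directions.
function createDelayProxy(targetPort, targetHost, delayMs) {
  return net.createServer((clientSock) => {
    const upstream = net.connect(targetPort, targetHost);

    const forward = (from, to) => {
      from.on('data', (chunk) => {
        // Timers with the same delay fire in the order they were scheduled,
        // so chunk ordering is preserved.
        setTimeout(() => {
          if (!to.destroyed) to.write(chunk);
        }, delayMs);
      });
      // Propagate a close after pending delayed writes have been flushed.
      from.on('close', () => setTimeout(() => to.destroy(), delayMs));
      from.on('error', () => to.destroy());
    };

    forward(clientSock, upstream);
    forward(upstream, clientSock);
  });
}

// Usage (not started here): listen on 11212 and forward to a local memcached
// on 11211 with 200 ms of added latency each way.
// createDelayProxy(11211, '127.0.0.1', 200).listen(11212);
```

Pointing the client at the proxy port with a timeout close to the added round-trip delay should make timeouts occur regularly, which is the condition the reports above describe.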

alloy added a commit to artsy/metaphysics that referenced this issue Jan 20, 2017
The version we were using had an issue where too many timeout handlers
were being defined: memcachier/memjs#86.

This led to many warnings like the following in the production logs:

    (node:9852) Warning: Possible EventEmitter memory leak detected.
    11 timeout listeners added. Use emitter.setMaxListeners() to
    increase limit

The logs also show many `ECONNRESET` and similar errors, which may be
related to this: memcachier/memjs#89. According to
that thread, however, it seems like version 0.10.0 could also be
suffering from issues, although others report it working for them.
@saschat
Member

saschat commented Mar 22, 2018

I tried to reproduce this by connecting to a memcached server on the other side of the pond (I think it is called the Atlantic Ocean) and setting the timeout to a value very close to the ping times I got from the server. I get occasional timeouts (which return a timeout error for all outstanding requests) and sometimes even an ECONNRESET error (which also returns an error for all outstanding requests), but in all cases memjs reconnects fine for future requests.
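
The behavior described here (fail every outstanding request, then reconnect on the next one) can be modeled in a few lines. This is an illustrative model only and not memjs's implementation; `ModelConnection` and its `connect` argument are hypothetical, and the fake socket exists just to exercise the model.

```javascript
// Hypothetical model of one server connection.
// connect() must return an object with write(payload) and destroy().
class ModelConnection {
  constructor(connect) {
    this.connect = connect;
    this.socket = null;
    this.pending = []; // callbacks for requests still awaiting a response
  }

  request(payload, callback) {
    if (!this.socket) this.socket = this.connect(); // reconnect lazily
    this.pending.push(callback);
    this.socket.write(payload);
  }

  // Called when the socket times out or emits an error such as ECONNRESET.
  fail(err) {
    const waiting = this.pending;
    this.pending = [];
    if (this.socket) this.socket.destroy();
    this.socket = null; // the next request() opens a new socket
    waiting.forEach((cb) => cb(err));
  }
}

// Demonstration with a fake socket that counts how often we connect.
let connects = 0;
const conn = new ModelConnection(() => {
  connects += 1;
  return { write() {}, destroy() {} };
});

const errors = [];
conn.request('get a', (err) => errors.push(err.message));
conn.request('get b', (err) => errors.push(err.message));
conn.fail(new Error('socket timed out waiting on response.'));
conn.request('get c', () => {});
// Both outstanding requests received the error, and the third request
// triggered a second connect.
```

The failure mode reported earlier in the thread would correspond to the socket not being reset after an error, so that later requests keep using a dead connection.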
