Duplicate change events after confirm timeout and restart #53

Open
mjq opened this issue Oct 23, 2014 · 7 comments

@mjq

mjq commented Oct 23, 2014

Here's the relevant code block to follow along.

In Feed.prototype.confirm, a request is made to check if the DB is reachable, and a timeout is set to detect a slow response from Couch. If the timeout is hit, the Feed is killed (self.die is called). But, the request object isn't destroyed. That means that if Couch responds after the timeout, the happy path callback db_response still gets called.

Normally, this isn't that noticeable, since the Feed object is dead and everything short-circuits. But if the user called restart on the feed in response to the error, dead will be false, and the Feed ends up getting set up twice (once in response to the timed-out request, and once due to restart()). This results in every change event being emitted twice.

The fix would seem to be adding destroy_req(req); here before dying. I haven't figured out how to write a test for this, though. Any ideas?
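A minimal sketch of the race described above, written against hypothetical stand-ins rather than follow's real internals: `makeFakeRequest`, `simulate`, and the `feed` object here are made up for illustration, and "destroying" the request is modelled as simply suppressing its callback. It only shows why a late response can set the feed up a second time after a restart, and how destroying the timed-out request avoids that.

```javascript
// Hypothetical stand-in for an HTTP request: the callback runs when
// respond() is called, unless destroy() was called first.
function makeFakeRequest(onResponse) {
  var destroyed = false;
  return {
    destroy: function () { destroyed = true; },
    respond: function () { if (!destroyed) onResponse(); }
  };
}

// Replays the sequence: confirm times out, the user restarts the feed,
// then the original slow response finally arrives.
// Returns how many times the (fake) feed was set up.
function simulate(destroyOnTimeout) {
  var feed = { dead: false, setups: 0 };

  function dbResponse() {
    if (feed.dead) return; // a dead feed short-circuits
    feed.setups += 1;      // happy path: start following changes
  }

  // 1. confirm issues the request (a local variable, as in the issue).
  var slowReq = makeFakeRequest(dbResponse);

  // 2. The confirm timeout fires before the database answers.
  if (destroyOnTimeout) slowReq.destroy(); // the proposed fix
  feed.dead = true;                        // die

  // 3. The error handler restarts the feed; the second confirm succeeds.
  feed.dead = false;
  var retryReq = makeFakeRequest(dbResponse);
  retryReq.respond();

  // 4. The first response arrives late.
  slowReq.respond();

  return feed.setups;
}

console.log('setups without fix:', simulate(false));
console.log('setups with fix:', simulate(true));
```

Without the destroy call the late response finds the feed alive again and sets it up a second time; with it, only the restart's own confirm takes effect.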

@jcrugzz
Member

jcrugzz commented Oct 23, 2014

@mjq do you have any sample code that reproduces this? That's the best place to start for a test.

@mjq
Author

mjq commented Oct 23, 2014

Sorry, sure. Simplified, it's:

var follow = require('follow');
var db = '...';

var feed = new follow.Feed({db: db, include_docs: true});

feed.on('change', function(change) {
  console.log('got change %d', change.seq);
});

feed.on('error', function(err) {
  console.log('got error %s, restarting in 5s', err.message);
  setTimeout(function() {
    console.log('restarting');
    feed.restart();
  }, 5000);
});

feed.start();

Normally, the logs would look like

got change 5
got change 6
got change 7

But if the first attempt to reach the database times out and the database responds shortly after, you'll see

got error "Timeout confirming database: <db name>", restarting in 5s
restarting
got change 5
got change 5
got change 6
got change 6
got change 7
got change 7

@jcrugzz
Member

jcrugzz commented Oct 24, 2014

@mjq this is fascinating, I've never seen this happen. destroy_req should be called by the die function, but it seems like there is a race condition leaving two requests? I'll have to dig deeper on this when I have a minute.

@mjq
Author

mjq commented Oct 24, 2014

@jcrugzz die destroys self.pending.request, but the request in confirm is a local variable, so if it isn't destroyed in confirm, nothing will (or so it seems to me).

A simpler bug to test, repro and fix may just be:

  • request in confirm takes longer than the timeout, but
  • db_response is called anyway (even though the timeout killed the feed).

Since db_response only applies to the success case, that alone is weird/wrong behaviour, and fixing just that (e.g. by destroying the request in the timeout fn) should prevent the double-listener stuff.
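The narrower invariant (once the timeout has fired, the success callback must not run) can also be enforced with a first-one-wins guard. This is a different technique from destroying the request, shown only as a sketch: `firstWins` and the handler names are hypothetical and not part of follow.

```javascript
// Wraps a response handler and a timeout handler so that only the
// first one to fire has any effect; the other becomes a no-op.
function firstWins(onResponse, onTimeout) {
  var settled = false;
  return {
    response: function () {
      if (settled) return false;
      settled = true;
      onResponse();
      return true;
    },
    timeout: function () {
      if (settled) return false;
      settled = true;
      onTimeout();
      return true;
    }
  };
}

var log = [];
var handlers = firstWins(
  function () { log.push('db_response'); },
  function () { log.push('timeout'); }
);

handlers.timeout();  // the confirm timeout fires first
handlers.response(); // the late response is ignored

console.log(log); // [ 'timeout' ]
```

Destroying the request additionally releases the underlying socket, so a guard like this would complement that fix rather than replace it.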

re: race conditions: We've got a single process simultaneously following an ever-changing set of a few thousand databases (with all those databases on the same CouchDB box). So, when requests to that box start stalling... well, if there's a race condition to be found, we'll find it, heh.

I'm giving this patch a trial by fire right now, but I don't know how long it will take for us to trigger the bug again.

@jcrugzz
Member

jcrugzz commented Oct 26, 2014

@mjq gotcha, this is before it is piped into the changes-stream. Let me know if you can reproduce that, but that looks like a valid fix. Super edge case, but I can see the potential for it happening.

@arikon

arikon commented Jun 3, 2015

@mjq @jcrugzz Are you going to fix this?

@carrotalan

+1 - This is still an issue
