This repository has been archived by the owner on Oct 22, 2021. It is now read-only.

Cluster dies on mem3_rep_manager #106

Open
opie4624 opened this issue May 23, 2012 · 1 comment

@opie4624

Yesterday a single node was unable to start due to errors in mem3_rep_manager. After a few hours, all 6 nodes were unable to start.

Right now, each node's startup log looks like this:

[Tue, 22 May 2012 19:04:07 GMT] [info] [<0.87.0>] [--------] Apache CouchDB has started on http://undefined:5986/
[Tue, 22 May 2012 19:04:08 GMT] [error] [emulator] [--------] Error in process <0.171.0> on node 'bigcouch@couchdb1' with exit value: {{badmatch,nil},[{fabric_view,remove_down_shards,2},{rexi_utils,process_mailbox,6},{fabric_view_changes,receive_results,5},{fabric_view_changes,send_changes,6},{fabric_view_changes,go,5}]}


[Tue, 22 May 2012 19:04:08 GMT] [error] [<0.164.0>] [--------] ** Generic server mem3_rep_manager terminating 
** Last message in was {'EXIT',<0.171.0>,
                               {{badmatch,nil},
                                [{fabric_view,remove_down_shards,2},
                                 {rexi_utils,process_mailbox,6},
                                 {fabric_view_changes,receive_results,5},
                                 {fabric_view_changes,send_changes,6},
                                 {fabric_view_changes,go,5}]}}
** When Server state == {state,<0.165.0>,10,nil,[<0.171.0>]}
** Reason for termination == 
** {unexpected_msg,{'EXIT',<0.171.0>,
                           {{badmatch,nil},
                            [{fabric_view,remove_down_shards,2},
                             {rexi_utils,process_mailbox,6},
                             {fabric_view_changes,receive_results,5},
                             {fabric_view_changes,send_changes,6},
                             {fabric_view_changes,go,5}]}}}
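A minimal sketch of the termination pattern in the log above (this is illustrative only, not BigCouch's actual mem3_rep_manager code): a gen_server that traps exits receives {'EXIT', Pid, Reason} messages when a linked worker crashes, and if no handle_info clause matches them, a catch-all clause stops the server with {unexpected_msg, ...}. The {badmatch,nil} itself means some expression in fabric_view:remove_down_shards/2 returned the atom nil where a pattern such as {ok, Value} was expected.

```erlang
%% Hypothetical sketch, NOT the real mem3_rep_manager module.
-module(rep_manager_sketch).
-behaviour(gen_server).
-export([start_link/0, init/1, handle_call/3, handle_cast/2,
         handle_info/2, terminate/2, code_change/3]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

init([]) ->
    %% Trapping exits converts crashes of linked workers into
    %% {'EXIT', Pid, Reason} messages delivered to handle_info/2.
    process_flag(trap_exit, true),
    {ok, []}.

handle_call(_Msg, _From, State) -> {reply, ok, State}.
handle_cast(_Msg, State) -> {noreply, State}.

%% No clause matches {'EXIT', ...}, so a worker that dies with
%% {badmatch, nil} falls through to this catch-all, and the server
%% stops with {unexpected_msg, {'EXIT', ...}} -- the same reason
%% shape shown in the log above.
handle_info(Msg, State) ->
    {stop, {unexpected_msg, Msg}, State}.

terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

For the badmatch itself, the crashing worker would hit something like `{ok, Shards} = Lookup` where `Lookup` evaluated to `nil`, which raises {badmatch, nil} and kills the process.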

The last HTTP request was retrieving a view; then a stack trace just like the one above appeared and the node went down. Since the load balancer redirected each request to a still-working node, this eventually took down the entire cluster.

@opie4624
Author

Here's a crash dump and the console output from trying to start up one of the nodes. http://ge.tt/3EPR6AI
