Congestion Avoidance #306

Open
prasadtalasila opened this issue Sep 26, 2018 · 5 comments
@prasadtalasila

Description

The number of pending requests grows rapidly towards the assignment submission deadline. This is due to two reasons:

  • The number of unique submissions is large.
  • The number of repeated submissions is large.

Steps to Reproduce

  1. Make many evaluation requests for a single user simultaneously, OR
  2. Make many evaluation requests for different users simultaneously; THEN
  3. Wait for the system to respond.

Expected behavior:
The system quickly and transparently rejects the excess load.

Actual behavior:
The system slows down significantly or crashes.

Reproduces how often:
Every time, near the submission deadline.

Additional Information

Silently failing the requests is a quick fix, but not a good solution.

@prasadtalasila

prasadtalasila commented Sep 26, 2018

The problem is on line 96 of the load balancer: it accepts all incoming requests irrespective of the existing load.

We can add a restriction so that a request is accepted only when it satisfies the following conditions.

  • no outstanding requests from the same user
  • no more than 20 outstanding requests at any time. We can change the hard limit based on the local configuration.

Hint: We need to return an overload message to the user.
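
A minimal Express-style sketch of such a rejection (the route name, status code, limit, and message text here are assumptions for illustration, not the project's actual handler):

    // hypothetical rejection path: reply instead of failing silently
    const express = require('express');
    const app = express();
    app.use(express.json());

    const job_queue = [];

    app.post('/evaluate', (req, res) => {
        if (job_queue.length >= 20) {
            // tell the user the system is overloaded
            return res.status(503).json({ status: 'rejected', reason: 'server overloaded, please retry later' });
        }
        job_queue.push(req.body);
        return res.json({ status: 'queued' });
    });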

We can implement these checks with the following built-in methods of the ES6 Array object.

  • no outstanding requests from the same user
    const pending_ids = job_queue.map(request => request.id);
    // get the user id of the incoming request and store it in id
    const id = req.body.id;
    // accept only when the user has no outstanding request
    if (pending_ids.indexOf(id) === -1) {
        job_queue.push(req.body);
    }
  • no more than 20 outstanding requests at any time. We can change the hard limit based on the local configuration.
    job_queue.length < 20

We need to refactor lines 145-174 into a separate module named congestion_avoid and import it as a function to be used near line 96, as sketched below.
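
A minimal sketch of such a module (the function name and hard limit are assumptions based on the discussion above):

    // congestion_avoid.js -- hypothetical sketch; the actual refactor
    // would move lines 145-174 of the load balancer into this module
    const HARD_LIMIT = 20; // replace with a value from the local configuration

    // returns true if the incoming request should be accepted
    function shouldAccept(job_queue, request) {
        // reject when the hard limit of outstanding requests is reached
        if (job_queue.length >= HARD_LIMIT) {
            return false;
        }
        // reject when the same user already has an outstanding request
        const pending_ids = job_queue.map(r => r.id);
        return pending_ids.indexOf(request.id) === -1;
    }

    module.exports = { shouldAccept };

Near line 96, the load balancer would then call shouldAccept(job_queue, req.body) and push the request only when it returns true.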

@prasadtalasila

@AnkshitJain This seems simpler than having a separate microservice for congestion management. What do you think?

@prasadtalasila

The above approach works only if there are no evaluation failures in the execution nodes. There will be cases where an execution node fails with an exception due to a corner case and abandons the evaluation. In such cases, the evaluation remains pending in the load balancer indefinitely.

We had a similar problem in the past when users submitted an evaluation from the webpage, refreshed the page, and submitted a fresh request. The problem also arises when a user submits evaluation requests from multiple browser tabs.

A better solution is to use node-cache, which offers configurable timeouts for pending entries in a map. The salient features of the module are:

  • configure entry timeout and cleanup periodicity
    const NodeCache = require( "node-cache" );
    const myCache = new NodeCache( { stdTTL: 100, checkperiod: 120 } );
  • add an evaluation (set a key)
    myCache.set( key, val, [ ttl ], [callback] )
  • retrieve a pending evaluation (get a key)
    myCache.get( key, [callback] )
  • delete a pending evaluation (delete a key)
    myCache.del( key, [callback] )
  • get number of pending evaluations (get cache key size)
    myCache.getStats().keys

The module also provides callbacks that fire on specific cache events. A simple check would be as follows.

if (myCache.get(id) !== undefined) {
    // send a JSON response indicating a pending evaluation
} else if (myCache.getStats().keys > 20) {
    // send a JSON response indicating that the system is overloaded
} else {
    // accept the evaluation
    myCache.set(req.body.id, req.body);
}

The requirement is that we change the queue from an array to a cache. A good way to achieve this is to import the pending_queue data structure from a new queue module, as sketched below.
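
A sketch of such a queue module (the module name, TTL, and hard limit are assumptions for illustration):

    // queue.js -- hypothetical sketch of the proposed queue module
    const NodeCache = require('node-cache');

    // entries expire after stdTTL seconds, so evaluations abandoned by a
    // failed execution node are cleaned up automatically
    const pending_queue = new NodeCache({ stdTTL: 100, checkperiod: 120 });
    const HARD_LIMIT = 20;

    // returns true if the request was queued, false if it was rejected
    function enqueue(request) {
        if (pending_queue.get(request.id) !== undefined) {
            return false; // this user already has a pending evaluation
        }
        if (pending_queue.getStats().keys >= HARD_LIMIT) {
            return false; // too many outstanding evaluations
        }
        pending_queue.set(request.id, request);
        return true;
    }

    // remove an entry once its evaluation completes (or fails)
    function dequeue(id) {
        pending_queue.del(id);
    }

    module.exports = { enqueue, dequeue };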

@prasadtalasila

prasadtalasila commented Dec 31, 2018

Check node-rate-limiter, express-rate-limit, or bottleneck.
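
As an illustration, express-rate-limit can throttle the evaluation endpoint directly (the route, window, and limit values here are arbitrary and would need tuning for deadline-time load):

    // hypothetical sketch; excess requests automatically receive a
    // 429 Too Many Requests response
    const express = require('express');
    const rateLimit = require('express-rate-limit');

    const app = express();

    const evaluationLimiter = rateLimit({
        windowMs: 60 * 1000, // 1-minute window
        max: 5               // at most 5 requests per window per client
    });

    app.post('/evaluate', evaluationLimiter, (req, res) => {
        // hand the request over to the evaluation queue here
        res.json({ status: 'queued' });
    });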

@prasadtalasila

We can use request-ip to identify the client IP from which the evaluation requests are received; a sketch follows.
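
A sketch combining request-ip with the rate limiter above, so that the limit is keyed on the resolved client IP (the keyGenerator wiring is an assumption for illustration):

    const requestIp = require('request-ip');
    const rateLimit = require('express-rate-limit');

    const ipLimiter = rateLimit({
        windowMs: 60 * 1000,
        max: 5,
        // request-ip inspects headers such as X-Forwarded-For, which
        // helps when the service runs behind a reverse proxy
        keyGenerator: (req) => requestIp.getClientIp(req)
    });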
