Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

7.x: new router and middleware system #1561

Merged
merged 4 commits into from Feb 23, 2018
Merged

7.x: new router and middleware system #1561

merged 4 commits into from Feb 23, 2018

Conversation

hekike
Copy link
Member

@hekike hekike commented Nov 9, 2017

Replace the router and middleware logic with a more performant solution.
Goal: zero or almost zero API/functionality change with significant performance improvement.

Pt 1. (#1554) was nice for dumb servers and synthetic tests; this should be more useful for real-world applications, especially ones with many routes.

Changes

  • Replaces the router with the Radix Tree based find-my-way
  • New middleware system called chain.js, similar to connect
  • Move versioning and contentType routing into a conditionalHandler plugin, see:
server.use(restify.plugins.conditionalHandler({
   contentType: 'application/json',
   version: '1.0.0'
   handler: function (req, res, next) {
       next();
   })
});

server.get('/hello/:name', restify.plugins.conditionalHandler([
  {
     version: '1.0.0',
     handler: function(req, res, next) { res.send('1.x') }
  },
  {
     version: ['1.5.0', '2.0.0'],
     handler: function(req, res, next) { res.send('1.5.x, 2.x') }
  },
  {
     version: '3.0.0',
     contentType: ['text/html', 'text/html']
     handler: function(req, res, next) { res.send('3.x, text') }
  },
  {
     version: '3.0.0',
     contentType: 'application/json'
     handler: function(req, res, next) { res.send('3.x, json') }
  }
]);

Hints for review:

  • response.js: 100% new, do not review it by diff
  • server.js: the complete request handling logic is new, I recommend to review it as a new file

TODO

Breaking Changes

  • Nonstrict-routing is not available
  • Limited RegExp usage in router path see https://github.com/delvedor/find-my-way
  • Remove already deprecated next.ifError
  • disable DTrace probes by default ( I know it's painful, but most of the people don't know/use it, it causes ~10% throughput improvement in all benchmark tests), createServer({ dtrace: true })
  • Router versioning and contentType routing moved to conditioalHandler plugin, see above
  • once is not applied automatically for multiple next() calls, use the new onceNext: true server option, strictNext is the same

Results

Test machine: Mid 2014 MBP, 2.6 i5 16GB RAM
(~30% improvement compared to 6.3.2 (#1554), ~50% compared to 6.3.1)

Result: dumb server

Note: Most of the performance improvements in these benchmarks come from the new middleware logic and other optimizations, the router performance is not tested here as it uses LRU cache.

---- response-json ----
✔ Results saved for head/response-json
✔ Results saved for stable/response-json
response-json throughput:
{
    "significant": "***",
    "equal": false,
    "wins": "head",
    "diff": "30.21%"
}

---- response-text ----
✔ Results saved for head/response-text
✔ Results saved for stable/response-text
response-text throughput:
{
    "significant": "***",
    "equal": false,
    "wins": "head",
    "diff": "34.18%"
}

Result: middleware

---- middleware ----
✔ Results saved for head/middleware
✔ Results saved for stable/middleware
middleware throughput:
{
    "significant": "***",
    "equal": false,
    "wins": "head",
    "diff": "44.68%"
}

Result: heavy router

Note: Disabling (mocking out) cache in a stable version: it's not fair as it aims to the worst case, cache hit ratio is 0%. However, it's still better than the worst case scenario as it least doesn't require extra time to maintain the LRU cache.
There is no other way to simulate 100+ different endpoint calls with the current benchmark suite.

---- router-heavy ----
✔ Results saved for head/router-heavy
✔ Results saved for stable/router-heavy
router-heavy throughput:
{
    "significant": "***",
    "equal": false,
    "wins": "head",
    "diff": "3086.8%"
}

@hekike hekike changed the title Performance improvements pt2: new router and middleware system [WIP] Performance improvements pt2: new router and middleware system Nov 9, 2017
@hekike
Copy link
Member Author

hekike commented Nov 9, 2017

@DonutEspresso @retrohacker @yunong what do you think, should I continue? Even if I'm going to end up with necessary API change?

I mean it's huge and risky but those numbers look great :)

Copy link
Member

@retrohacker retrohacker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love the overall approach! Surprisingly reduces complexity in the process!

lib/middle.js Outdated
var proto = {};

/* istanbul ignore next */
var defer =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we checking for the existence of setImmediate?

Copy link
Member Author

@hekike hekike Nov 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sry, I didn't make it clear that this is connect, I copy-pasted it, because I needed to extend it with the next(false) -> abort logic. :( Later we can try to open a PR or replace next(false) with something else.

@hekike
Copy link
Member Author

hekike commented Nov 10, 2017

@retrohacker I added two more benchmarks: middleware and router-heavy
You can find the results in the description.

@hekike
Copy link
Member Author

hekike commented Nov 10, 2017

Question: should next(false) only prevent to run the handlers in the same category or all of the handlers, which one:

  1. next(false) in pre, stops the next pre handlers but runs all the use and router handlers
  2. next(false) in pre, stops the next pre and use handlers but runs router
  3. next(false) in pre, stops all the upcoming handlers

@hekike
Copy link
Member Author

hekike commented Nov 11, 2017

Question: I don't find where the createServer({ dtrace }) options it handled, does it still exist? I mean the option itself?

@hekike hekike force-pushed the feat/new-router branch 5 times, most recently from bd2c358 to 4b2c8fb Compare November 15, 2017 15:50
@hekike hekike changed the title [WIP] Performance improvements pt2: new router and middleware system 7.x: new router and middleware system Nov 15, 2017
@DonutEspresso
Copy link
Member

Whoaaaaaaa this is awesome 🎉 🎉 🎉

Question: should next(false) only prevent to run the handlers in the same category or all of the handlers

The intention of false was to originally stop all subsequent middlewares in the event you had flushed a response already and considered this request complete. So I think 3 should be the desired behavior (if it isn't already).

@hekike
Copy link
Member Author

hekike commented Nov 16, 2017

@DonutEspresso @yunong @retrohacker I have an issue with the 'close' and 'aborted' test cases:
https://github.com/restify/node-restify/blob/master/test/server.test.js#L2648

No matter how I end/destroy/close the req/res/socket I always have both the server.on('aborted') and server.on('close') events, I cannot abort without close event, and I cannot close without aborted event. Is it possible to trigger close event without aborted? Does it make sense to separate the two connection states in restify? If I understand correctly we will always have close https://nodejs.org/api/http.html#http_event_aborted

@yunong
Copy link
Member

yunong commented Nov 17, 2017

@hekike I think it may be the case that Node core actually fires both events? We could check the source code to be sure, but if that's the case, then it's okay that both events fire.

@DonutEspresso
Copy link
Member

Hmm great question - I was under impression a close wouldn't cause an abort but I can see both events being fired now. Short answer: I don't know! May have to dig into core to see if there is anything worth differentiating there.

@hekike
Copy link
Member Author

hekike commented Nov 18, 2017

close and aborted: Thanks for the help! I've checked out Node core and seems to be that emit('aborted') is always immediately followed by an emit('close') in node core. (I'm not 100% sure) I also opened a question here: nodejs/help#978

See: _http_server.js#L417 and http2/compat.js#L14

in 7.x I unified server close and aborted events. restify throws only RequestCloseError, see: 0da8219

Copy link
Member

@DonutEspresso DonutEspresso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ridiculously awesome @hekike!! 💯 🍾 🎉 Thank you for all the hard work. I left some questions and minor comments. I am in favor of nearly all the proposed changes in this PR. The only one I have some concerns with are:

  • Not super comfortable dropping the free from regexp routes. This may not be palatable to some of the power users. As it's not supported directly via find-my-way, this would be a pretty big undertaking if we wanted to continue supporting. Would love to have more discussions around this one.

More general feedback around the proposed implementation:

  • Unless I'm missing something, I think the order of request processing being proposed in this PR (pre -> use -> route vs pre -> route -> use) is different than the current behavior. If we keep the existing ordering I think we can keep bothserver.param and current usage of req.params.
  • There's lots of async orchestration in the processing of a request (pre/route/use/handler/chain) - would love to see if we could use something like vasync to help facilitate that. Generally, an error anywhere in the processing jumps us directly to the end, so would like to see if this would help with that (try vasync.waterfall or vasync.pipeline). Since vasync also keeps returns internal async state as a variable to us, we could store it on the request, which could be used in core dumps. That combined with req._currentHandler would make debugging stuck requests a lot easier!
  • With conditionalHandler, how will we handle multiple matches/order of precedence?

* @public
* @class Chain
* @param {Object} [options] - options
* @param {Boolean} [options.onceNext=false] - Prevents calling next multiple
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a use case where we would want this to be false? Or is this primarily for backwards compatibility? Actually, now that I think about it, I would be in favor of moving to strictNext as the default behavior.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's for backward compatibility. Making strictNext to the default causes performance degradation.

}

// continue
next(error, req, res);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is something for prettify in another PR - but I'd like to add some rules around always returning on known cb handles (callback, next, cb, done, etc.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 on leaving style changes to their own PR and fixing there if they pass lint. Suspect this isn't the only place in the code base we do this.

lib/chain.js Outdated
*/
Chain.prototype.getHandlers = function getHandlers() {
return this._stack.map(function map(stackItem) {
return stackItem.handle;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like an item is just { handle: <fn> }. Could we add some documentation maybe in the ctor section describing structure of the items in the stack?

lib/chain.js Outdated
* @param {Function|Chain} handle - handler or Chain instance
* @returns {undefined} no return value
*/
Chain.prototype.use = function use(handle) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're probably trying to reflect well known nomenclature, but is this closer to doing something like "add a function to this chain"? If so, what do you think about just calling it chain.add()/append()?

lib/chain.js Outdated
* @param {Function} done - final handler
* @returns {undefined} no return value
*/
Chain.prototype.handle = function handle(req, res, done) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on calling it run() or execute()? I found handle.handle() a little confusing :)

*/
Server.prototype._routeAndRun = function _routeAndRun(req, res) {
Server.prototype._onRequest = function _onRequest(req, res) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So clean, so simple 💯

lib/server.js Outdated

/**
* Wrapper method for emitting the after event. this is needed in scenarios
* where the async formatter has failed, and the ot assume that the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment looks a little outdated since we don't have async formatters anymore - don't want to bikeshed on this PR but could make a note to come back and clean up comments in another PR.

lib/server.js Outdated
@@ -1528,7 +1283,7 @@ function errEvtNameFromError(err) {
// remap the name for router errors
return 'VersionNotAllowed';
} else {
return err.name.replace(/Error$/, '');
return err.name ? err.name.replace(/Error$/, '') : 'Error';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under what scenarios does err.name not exist?

lib/server.js Outdated
// 4.2 _afterRoute (runs after last handler called next)
// 4.3 _afterRoute (when, next('string') called)
// 4.3 _routeErrorResponse (when error happaned in router)
// 5. _finishReqResCycle (on response "finish" and "error" events)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This waterfall is great! The additional granularity is nice. Is this order different than the current behavior? I thought the order is as follows:

  1. pre
  2. router.find
  3. use (use only applies to routes that were successfully routed)
  4. done/done(err)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, I was dumb enough to work based on my express and koa experience instead of reading the docs carefully :/

Allows you to add in handlers that run for all routes. Note that handlers added via use() will run only after the router has found a matching route. If no match is found, these handlers will never run. - restify docs

(In express/koa everything is a middleware, the router as well, so they don't make difference between pre/after route, it depends on the order of the middleware "registration")

Btw maybe it's misleading for others as well coming from the express world, shouldn't we keep use as an alias and introduce a new term that reflects it's behavior better? Like afterRoute() or useRoute(), these are not very good names, but something like these :)

: require('restify');

var server = restify.createServer();
var path = '/whiskeys/scotch/islay/lagavulin/16-years/50';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍺

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥃

@hekike hekike changed the base branch from master to next February 1, 2018 20:53

## Breaking Changes

### Server throws `RequestCloseError` instead of `RequestAbortedError`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the server actually "throws" right? Do we mean that the server just logs a RequestCloseError? We may also want to update RequestCloseError's default status code to 444.

DTrace probes comes with some performance impact that's fine for the sake of
observability but you may don't use it at all.

### Change in calling `next` multiple times
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is performance the motivation behind this new default behavior?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also have a preference for this to remain unchanged. Throwing by default helps us catch a whole class of bugs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, performance is the motivation behind.
@retrohacker do you mean at Netflix? We can re-enable it with the option for NQ services.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hekike I just mean in general. We are a framework for building production ready REST services, I'm not sure this is in line with our goals as a project.

### Router versioning and content type

`accept-version` and `accept` based conditional routing moved to the
`conditioalHandler` plugin, see docs or example:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/conditioalHandler/conditionalHandler

@@ -3,4 +3,4 @@
var errors = require('restify-errors');

errors.makeConstructor('RequestCloseError');
errors.makeConstructor('RequestAbortedError');
errors.makeConstructor('RouteMissingError');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this error used?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error existed before but was created locally.
When a code calls next('route-name') but the route doesn't exist.
It's a programmatic error.

@@ -65,8 +65,7 @@ function inflightRequestThrottle(opts) {
},
'maximum inflight requests exceeded, rejecting request'
);
res.send(plugin._err);
return next(false);
return next(plugin._err);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't comment on the exact line, but I think this file here needs to create the error inline to get a proper stack trace (rather than creating it ahead of time).

argumentsToChain(handlers).forEach(function forEach(h) {
self.before.push(h);
argumentsToChain(handlers).forEach(function forEach(handler) {
handler._name = handler.name || 'pre-' + self.preChain.count();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this already happening in router?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Router, pre, and use are registered separately with a different prefix.

lib/server.js Outdated
// 3.2. _afterRoute (runs after route handlers are finised,
// triggers use)
// 4.1 _runUse
// 4.2. _afterUse
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we chatted about this before, are we changing the pre -> use -> route order? Looks like this is breaking from 6.x.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's the same, the doc was wrong, fixed.

lib/server.js Outdated

req.timers = [];
// Preserve handler err for finish event
res.err = res.err || err;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't see where this is initially set - what's the use case for storing it on the response directly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used when the request succeeds but the response is an error:
https://github.com/restify/node-restify/blob/feat/new-router/lib/server.js#L1093

This is how it's ensured that error is always emitted in after.
It's not an elegant solution decorating the res object, but in the earlier version it was way to complex, I like that now it's based on the response events.

res.removeListener('finish', onFinish);
self._finishReqResCycle(req, res, err);
}
res.once('finish', onFinish);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a breaking change from 6.x? IIRC in 6.x we consider a request finished once it has reached the end of the handler chain. Using this pattern, it won't fire until the request is completely done.

I have to think about whether this is better/worse/different, but definitely has an impact on how we might view metrics. cc @yunong

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know that we fire it before the response is finished. What's the reason behind it?

lib/server.js Outdated

// after event has signature of function(req, res, route, err) {...}
if (!res.finished && !req.closed()) {
res.once('finish', function resFinished() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since finishReqResCycle() doesn't get triggered until finish event fires, is this scenario possible anymore?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

@DonutEspresso
Copy link
Member

DonutEspresso commented Feb 5, 2018

Great work! Love that we added benchmarks - thanks for digging in and getting those perf numbers. 💯 Left some comments - mostly minor, the two big things for me are:

1) 6.x order is pre->use->route, order here looks like pre->route->use. Did we agree to a breaking change on that? EDIT: I can't read, this is fine now. Just needs update on the JSDoc.
2) In 6.x a request is considered "complete" (i.e., causing audit log and metrics plugin to fire) when reaching the end of the next chain. "Completion" is synonymous with the 'after' event, where audit log and metrics are triggered. With this PR, "completion" gets delayed until the request object's finish event is fired. Not sure if this is good/bad yet but it is definitely different and may have an impact on how we interpret metrics. Thoughts from the rest of the team? cc @yunong @retrohacker

@hekike
Copy link
Member Author

hekike commented Feb 9, 2018

We discussed the request "complete" topic in person with @DonutEspresso and we decided that the new behavior is acceptable and even better in some use cases like load shedding calculations where you really want to know when the request doesn't use your resources anymore.

To have the correct timing about handler categories like pre, use and router execution durations a new timing object will be introduced in the next version of restify, related issue: #1603

cc @yunong @retrohacker

@hekike
Copy link
Member Author

hekike commented Feb 13, 2018

@DonutEspresso I added the new request timings and explained the change in the migration guide in a separate commit.

@hekike hekike force-pushed the feat/new-router branch 2 times, most recently from f9f91e8 to 891e737 Compare February 13, 2018 16:27
@hekike
Copy link
Member Author

hekike commented Feb 15, 2018

We discussed in person that we need two latencies one when the request is flushed and one when all handlers are finished. The after event and inflight request counter will wait for both request flushing and handler finish.

@@ -39,7 +39,7 @@ function createMetrics(opts, callback) {
// REST verb
method: req.method,
// overall request latency
latency: hrTimeDurationInMs(req._time, process.hrtime()),
latency: hrTimeDurationInMs(req._time, req._timeFlushed),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great!

I was thinking there might be one more timer would be useful - one that is started _onRequest and finishes in _finishReqResCycle (when both res._flushed and res._handlersFinished are true). This would probably be the closest "end to end" time view of a request we see in our system. Maybe something like totalLatency?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minus this last timer I think we are good to go!

@hekike
Copy link
Member Author

hekike commented Feb 23, 2018

@DonutEspresso totalLatency added! 1acc1be

@hekike hekike force-pushed the feat/new-router branch 2 times, most recently from 844ccbe to c43d236 Compare February 23, 2018 18:51
@hekike hekike merged commit 785d7a6 into next Feb 23, 2018
@hekike hekike deleted the feat/new-router branch February 23, 2018 19:20
hekike added a commit that referenced this pull request Mar 13, 2018
BREAKING CHANGE:
- Server returns `RequestCloseError` instead of `RequestAbortedError`
- Non-strict routing is gone
- Different `RegExp` usage in router path and wildcards
- Remove already deprecated `next.ifError`
- Disable DTrace probes by default
- Change in calling `next` multiple times
- Router versioning and content type as a separate plugin: `conditionalHandler`
- After event fires when both request is flushed and the last handler is finished
- Metrics plugin latency logic changes and new latencies were added

For more info see the `/guides/6to7guide.md`.
hekike added a commit that referenced this pull request Mar 20, 2018
BREAKING CHANGE:
- Server returns `RequestCloseError` instead of `RequestAbortedError`
- Non-strict routing is gone
- Different `RegExp` usage in router path and wildcards
- Remove already deprecated `next.ifError`
- Disable DTrace probes by default
- Change in calling `next` multiple times
- Router versioning and content type as a separate plugin: `conditionalHandler`
- After event fires when both request is flushed and the last handler is finished
- Metrics plugin latency logic changes and new latencies were added

For more info see the `/guides/6to7guide.md`.
hekike added a commit to sfaizanh/node-restify that referenced this pull request Mar 27, 2018
BREAKING CHANGE:
- Server returns `RequestCloseError` instead of `RequestAbortedError`
- Non-strict routing is gone
- Different `RegExp` usage in router path and wildcards
- Remove already deprecated `next.ifError`
- Disable DTrace probes by default
- Change in calling `next` multiple times
- Router versioning and content type as a separate plugin: `conditionalHandler`
- After event fires when both request is flushed and the last handler is finished
- Metrics plugin latency logic changes and new latencies were added

For more info see the `/guides/6to7guide.md`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants