Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a serialized variant of the Trie-router #402

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

Joannis
Copy link
Contributor

@Joannis Joannis commented Mar 17, 2024

Since this trie is contiguous memory, routing performance should improve. This does not implement recursive wildcards yet, and I found that those tests are missing too.

In a quick test setup, the gap between no-router and having the old trie router was ~3-4k req/sec on a baseline (with router) of 98k. That gap shrunk by roughly 1k req/sec, leading to an estimated improvement of 25% in a small routing table (PerformanceBenchmarks).

I expect, and seemed to find a more significant performance gap when as the amount of path components increased. The expectation is based on the fact that more trie nodes are created when there are more path components. By keeping the routing table contiguous, it's expected that these scenarios are more performant.

APIs tend to have more path components to organise themselves, with many backends opting for /api/v1-type of prefixes for versioning. /api/v1/users/:id/profile is not a very long route.

TODO

  • Proper Benchmarks
  • Add a Recursive Wildcard test
  • Add a recursive wildcard implementation

@Joannis
Copy link
Contributor Author

Joannis commented Mar 20, 2024

Benchmark Results

BinaryTrieRouter
╒═══════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                    │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═══════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Malloc (total) *          │      24 │      24 │      24 │      24 │      24 │      24 │      24 │     164 │
├───────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Throughput (# / s) (K)    │     167 │     166 │     165 │     164 │     162 │     158 │     157 │     164 │
├───────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (ns) *   │    6001 │    6038 │    6058 │    6091 │    6169 │    6332 │    6353 │     164 │
╘═══════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

BinaryTrieRouterParameters
╒═══════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                    │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═══════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Malloc (total) *          │      28 │      28 │      28 │      28 │      28 │      28 │      28 │     145 │
├───────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Throughput (# / s) (K)    │     148 │     148 │     147 │     145 │     142 │     138 │     138 │     145 │
├───────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (ns) *   │    6768 │    6795 │    6820 │    6906 │    7033 │    7201 │    7210 │     145 │
╘═══════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

TrieRouter
╒═══════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                    │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═══════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Malloc (total) *          │      64 │      64 │      64 │      64 │      64 │      64 │      64 │     110 │
├───────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Throughput (# / s) (K)    │     112 │     111 │     110 │     109 │     107 │     105 │     105 │     110 │
├───────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (ns) *   │    8966 │    9003 │    9069 │    9191 │    9347 │    9454 │    9514 │     110 │
╘═══════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

TrieRouterParameters
╒═══════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                    │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═══════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Malloc (total) *          │      78 │      78 │      78 │      78 │      78 │      78 │      78 │      90 │
├───────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Throughput (# / s) (K)    │      91 │      90 │      90 │      90 │      89 │      84 │      84 │      90 │
├───────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (μs) *   │      11 │      11 │      11 │      11 │      11 │      12 │      12 │      90 │
╘═══════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

@adam-fowler
Copy link
Member

adam-fowler commented Mar 20, 2024

Can you post a benchmark comparison, with the original

@Joannis
Copy link
Contributor Author

Joannis commented Mar 22, 2024

@adam-fowler the comparison is between TrieRouter / TrieRouterParameters and BinaryTrieRouter / BinaryTrieRouterParameters

@Joannis
Copy link
Contributor Author

Joannis commented Mar 22, 2024

I've reduces the bloat from my benchmark comment

@adam-fowler
Copy link
Member

This is failing CI

@Joannis Joannis marked this pull request as ready for review March 22, 2024 20:33
Copy link
Member

@adam-fowler adam-fowler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are a series of initial comments. These should make the code more readable and more performant


// Serialize the path constant
try trie.writeLengthPrefixed(as: Integer.self) { buffer in
buffer.writeSubstring(path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given you only ever write Strings with a prefixed length why don't you add a ByteBuffer. writeLengthPrefixedString()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In actual fact you could fix the issue where you are throwing an error from init and provide more detailing error information.

func writeLengthPrefixedString(_ string: SubString) {
    do {
        try trie.writeLengthPrefixed(as: Integer.self) { buffer in
            buffer.writeSubstring(path)
        }
    } catch {
        preconditionFailure("Failed to write \"\(string)\" into BinaryTrie")
    }
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also have you checked performance if you use a UInt16. 64 bit integer is quite a bit of overkill here

let _token: Integer = trie.readInteger(),
let token = TokenKind(rawValue: _token),
let nextSiblingNodeIndex: UInt32 = trie.readInteger()
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a ByteBuffer.readNode() that returns a BinaryTrieNode type to improve readability

// Serialize the node's component
switch node.key {
case .path(let path):
trie.writeInteger(TokenKind.path.rawValue)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a ByteBuffer.writeNode() to match the ByteBuffer.readNode() mentioned above

private func descendPath(
in trie: inout ByteBuffer,
index: UInt16,
parameters: inout Parameters,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given you return the parameters, why is this marked inout?

return token
}

mutating func readBinaryTrieNode() -> BinaryTrieNode? {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function doesn't compile and it doesn't seem to be used

return nil
}

component = components[components.startIndex]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we guarantee at this point that the array is non-empty?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haha. just saw the check above, perhaps you could change this to

guard let component = components.first else { return nil }

import NIOCore

public struct BinaryRouterResponder<Context: BaseRequestContext>: HTTPResponder {
let trie: BinaryTrie<EndpointResponders<Context>>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an odd place to add this type. Might be worthwhile adding its own file. The original RouterResponder has its own file.

if self.options.contains(.autoGenerateHeadEndpoints) {
self.trie.forEach { node in
node.value?.autoGenerateHeadEndpoint()
}
}
return RouterResponder(
return try! .init(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!? We should remove this.


// Serialize the path constant
try trie.writeLengthPrefixed(as: Integer.self) { buffer in
buffer.writeSubstring(path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In actual fact you could fix the issue where you are throwing an error from init and provide more detailing error information.

func writeLengthPrefixedString(_ string: SubString) {
    do {
        try trie.writeLengthPrefixed(as: Integer.self) { buffer in
            buffer.writeSubstring(path)
        }
    } catch {
        preconditionFailure("Failed to write \"\(string)\" into BinaryTrie")
    }
}


// Serialize the path constant
try trie.writeLengthPrefixed(as: Integer.self) { buffer in
buffer.writeSubstring(path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also have you checked performance if you use a UInt16. 64 bit integer is quite a bit of overkill here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants