Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HttpRequest that survives past async functions #1036

Open
uNetworkingAB opened this issue Mar 15, 2024 · 20 comments
Open

HttpRequest that survives past async functions #1036

uNetworkingAB opened this issue Mar 15, 2024 · 20 comments

Comments

@uNetworkingAB
Copy link
Contributor

There are ways to easily support more abstract interfaces such as the fetch one, without any significant perf. loss in fast path.

Request is stack allocated and is only valid for the "sync" callback. Adding a call, Request.keep() that would dynamically allocate it and make it "keepable" could be automatically done for callbacks that do not return a Response.

This means, in the fast path, there is no overhead, while in the slow async path, you can still easily "keep" the Request for as long as you need as a slow path.

The benefit here is obvious: wrappers that implement more abstract interfaces do not need to:

  • Loop over all headers
  • Create JavaScript string for all headers
  • Hand over this object regardless if it will be kept or not

This is really bad for fast path performance

Internally, all that is neeed is adding 2 functions: getMemory, setMemory so that keep() calls malloc, getMemory, memcpy, setMemory on the Request and drop calls free

@uNetworkingAB
Copy link
Contributor Author

Internally it should be

uWS::HttpRequest* = uWS::HttpRequest*->makeDynamicClone();
uWS::HttpRequest*->freeDynamicClone();

it just makes 1 block with the headers and all

@uasan
Copy link

uasan commented Mar 15, 2024

Those. if we call request.keep(), we can access the request object, in the following microtasks?

@uNetworkingAB
Copy link
Contributor Author

A wrapper is:

function myFetch(request) {
return new Response("hello");
}

app.get('*', (res, req) => {
if myFetch(req) returned Response then
res.end(the response)
else if returned void or unresolved promise
req.keep(); // moves uWS.HttpRequest from global to local, swaps the underlying ptr to makeDynamicCopy(), on GC calls freeDynamicCopy()
endif
});

@uNetworkingAB
Copy link
Contributor Author

Yes if you call keep() it would move from being a stack allocated thing to being dynamic and available until it gets GC:d. So it's a very simple function and wrappers would not need to copy out all headers one by done but can just keep the Request after calling keep() on it

@uNetworkingAB
Copy link
Contributor Author

But it's slow as heck, so the key here is to only do it if the first invocation of the CB does not return a Response or calls end. So the whole idea is: only use it for slow path

@uasan
Copy link

uasan commented Mar 15, 2024

I'm understood, thank you.

But the problem with all abstractions is that they usually do not know whether the request handler will be executed in the fast way or in the slow way, so the abstraction always chooses the slow way to be universal, so they will always call the keep method.

By what percentage does the response slow down if you always call the request method keep?

@uNetworkingAB
Copy link
Contributor Author

whether the request handler will be executed in the fast way or in the slow way

You don't need to know if it will be. You need to know if it was. Finding out if the request was ended is easy. That's the whole point of doing this after as a late stage slow path

@uNetworkingAB
Copy link
Contributor Author

The library could even do this automatically, but it would add a cost to all async handlers. But those are kind of costly either way

@uNetworkingAB
Copy link
Contributor Author

Making it automatic looks easiest. Then it would just work. And wrappers can remove all their header copying crap and always use the given Request object

@uNetworkingAB uNetworkingAB changed the title Request.keep, drop for abstract interfaces HttpRequest that survives past async functions Mar 15, 2024
@uasan
Copy link

uasan commented Mar 15, 2024

Here's a more realistic example of abstraction:

async function myFetch(request) {
  return await sql;
}

app.get('*', async (response, request) => {
  try {
    const result = await myFetch(request);
    response.end(result);
  } catch (error) {
    response.end(error);
  }
});

Comment from your example:

else if returned void or unresolved promise

There are misconceptions in this statement, the fact is that it is impossible to find out the state of a promise, whether it has resolved or not, such a promise interface, we can only wait for the result, so this is always the slow way and always calls the keep method

@uNetworkingAB
Copy link
Contributor Author

There is no misconception. The library knows perfectly well whether the Response has ended or not. Like said, async/await functions will always take the slow path for a number of reasons (in Node.js, the microtask queue is not drained immediately after a call into JS which is terrible for perf. but I don't care it is a Node.js problem and doesn't affect non-async functions)

@uNetworkingAB
Copy link
Contributor Author

Non-async functions like those in HelloWorld.js will end the request by calling Response.end() immediately and in this case there is no overhead since it never clones the request object

@uasan
Copy link

uasan commented Mar 15, 2024

The library knows perfectly well whether the Response has ended or not.

No, the point of abstractions is that the application code of the route handlers does not know anything about the implementation of the server, i.e. an asynchronous function that an abstraction calls never calls a response end, because this is the exclusive privilege of the abstraction, those. The response state after calling the handler will always be unfinished.

P.S.
If the application functions themselves will call the response end then there is no point in abstraction )

@uNetworkingAB
Copy link
Contributor Author

"the library" = uWS. It knows whether you responded or not

@uasan
Copy link

uasan commented Mar 15, 2024

Then the uWS should automatically call keep method for all asynchronous handlers, i.e. all asynchronous requests will be on the slow path, this will reduce the speed?

@porsager
Copy link

I think what you mean @uasan is that you'd like an option to "not" copy the request if you know you will not use anything from it in some of your async handlers?

@uasan
Copy link

uasan commented Mar 15, 2024

Yes, it’s better that by default uWS doesn’t do something that can be expensive, I agree if copying is done only explicitly, through calling the keep method

@uNetworkingAB
Copy link
Contributor Author

it could be negligible in comparison with async flow either way. needs benchmarking

@kartikk221
Copy link

This is a great proposal and would significantly help performance for wrappers / libraries like hyper-express where currently, the headers and other request properties are always consumed from HttpRequest and cached locally. With the ability to instruct HttpRequest to live past the first sync execution, we could convert the properties into getters which then lazy initialize the neccessary data and consume it as needed.

As for the best way of implementing it, automatic is ideal but having a method like request.keep() along with a request.destroy() would be fine as well since it is very easy to know If a request will be handled asynchronously.

@kartikk221
Copy link

You don't need to know if it will be. You need to know if it was. Finding out if the request was ended is easy. That's the whole point of doing this after as a late stage slow path

I am asssuming to automate this behavior, you will be checking to see whether a response was sent after the route handler returns? I use a similar trick here to determine whether I need to cork before sending a response with a boolean flag:
https://github.com/kartikk221/hyper-express/blob/f712634ed5648b9c91d34beb0a845ab95ff3c6a8/src/components/Server.js#L526-L530

One more thing that I thought of is wouldn't someone calling uWS.HttpResponse.onAborted(() => {}) be a strong indicator that they will handle the request in an asynchronous manner (slow path) hence keeping uWS.HttpRequest around in memory would be a safe bet?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants