New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Chttpd request processing does not reset pdict predictably #4909
Comments
reusing an erlang process between requests is surprising to me but I can't find docs that reassure me that this was ever a promise from mochiweb. however, if true, it does point out our folly in using the process dictionary so freely given the blanket advice not to do so. I'd rather see us fix the real problem(s) instead of this PR. Or, perhaps, a patch for mochiweb to ensure a new process at the start of each request, even for keepalive connections. |
chatted with davisp and he confirms the general problem too. 🤷 he did have a better suggestion for fix. 1) record the pdict keys at the start of our handle_request 2) delete any keys not in that set at the end of handle_request (in an |
noting that neither davisp or I think that a mochiweb erlang process is reused across sockets though. If that is happening we have a big problem, as that would be state/data passing between different clients. |
Yeah I was surprised as well about the Hmmmm... what do you consider "better" about that suggestion? It avoids the licensing and bleed over of including the names of the Mochiweb pdict values, but I personally prefer an explicit whitelist of the fields we will allow, rather than a dynamic approach that assumes the found live pdict value is allowed. We already hardcode I'm not really attached to any particular approach, but I don't really like the dynamic nature of the alternative suggestion. |
my point is that whatever is in the pdict before handle_request is called is, by definition, mochiweb's entries. removing anything other than those at the end should avoid any "leak" now and in the future. your static approach invites problems down the line if mochiweb adds to its set and we forget to update our hard-coded list. The true fix is to remove all reliance on the process dict in our chttpd handling. have you a full list of potential issues caused by this process reuse across requests? |
stashing away the mochiweb pdict entries and clearing everything except those at the end makes sense. we may also have some stray messages in the mailbox and could be worth checking if we return early to the client while there are still workers finishing any doc updates or something like that |
I think this makes sense, but how do you assert those pdict entries are for mochiweb without having a list of all mochiweb pdict entries or doing something we should avoid like |
We need to potentially extract the usage delta from the incoming RPC message. Rather than pattern match on all possible message formats that could potentially include usage deltas, we instead utilize rexi_util:extract_delta which matches against tuples ending in `{delta, Delta}`, and separates that out from the underlying message. The subtlety here is that receiving the message to extract the delta changes the behavior as this was previously doing a selective receive keyed off of the Ref, and then ignoring any other messages that arrived. I don't know if the selective receive was intended, but I don't think it's appropriate to leave unexpected messages floating around, especially given things like issue #4909. Instead of utilizing a selective receive, this switches to extracting the message and delta like we need to do, and then in the event it finds unexpected messages they're logged and skipped. This selective receive was masking the lack of unlink on the linked rexi_mon pid in fix #4906. I've also noticed some rpc responses arriving late as well, but I haven't tracked that down, so let's log when it does happen.
We need to potentially extract the usage delta from the incoming RPC message. Rather than pattern match on all possible message formats that could potentially include usage deltas, we instead utilize rexi_util:extract_delta which matches against tuples ending in `{delta, Delta}`, and separates that out from the underlying message. The subtlety here is that receiving the message to extract the delta changes the behavior as this was previously doing a selective receive keyed off of the Ref, and then ignoring any other messages that arrived. I don't know if the selective receive was intended, but I don't think it's appropriate to leave unexpected messages floating around, especially given things like issue #4909. Instead of utilizing a selective receive, this switches to extracting the message and delta like we need to do, and then in the event it finds unexpected messages they're logged and skipped. This selective receive was masking the lack of unlink on the linked rexi_mon pid in fix #4906. I've also noticed some rpc responses arriving late as well, but I haven't tracked that down, so let's log when it does happen.
Description
The Mochiweb http server uses a shared and re-used connection pool with
chttpd:handle_request
as the entry point for the Mochiweb request processing handoff.The bug is that we don't start with a fresh process dictionary between requests, as the
chttpd
processes are reused so pdict values that are not expressly cleared will persist. Even worse, pdict values that are only set by specific code paths will persist until the next time an incoming request makes its way to this particularchttpd
worker in the pool, at which point the old values will be updated.An analogy would be if we used paper receipt to log each request, but instead of starting with a fresh form every time, we re-use the old form but only erase the old values when we need to modify them, so any other form entries would persist from whatever prior request filled them.
We also don't set the
nonce
value untilcouchdb/src/chttpd/src/chttpd.erl
Line 317 in 4e0b66a
nonce
value will re-use the previously generatednonce
value and return that as the "unique" value of the request, and will continue to return the existing value until a request is processed that reaches that line of code to generate a newnonce
.I propose we clear the pdict between requests. I don't think it's safe to do
erlang:erase()
as that'll also clear out theMochiweb
pdict values from https://github.com/apache/couchdb-mochiweb/blob/main/src/mochiweb_request.erl#L64-L78 so instead I propose a filtered clearing of the pdict values, like so:Where the skipped values are the pdict names from the linked Mochiweb code above. I'm not entirely sure about the legality nuances from copying in the Mochiweb names, so before I PR'ed that I figured it would be to hear back from @janl / @rnewson / etc on preferred approach here.
Alternatively, we could explicitly delete the values set but I really dislike how that results in any values falling through persisting for an unknown amount of time until it's cleared out by a similar request. So I proposed the "clear everything but Mochiweb entries" approach. Let me know what you think.
The text was updated successfully, but these errors were encountered: