Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: improve parsing content-range header #2779

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Uzlopak
Copy link
Contributor

@Uzlopak Uzlopak commented Feb 19, 2024

  • range header parser was not handling the size * variant.
  • maybe improved slightly performance
  • removed unnecessary assertions, like start is always a finite number
  • renamed parseRangeHeader to parseContentRangeHeader

Regarding unsatisfied ranges:
The line const { start, size, end = size } = contentRange was only making sense, when the size was set but end was not set. This was not possible in the previous implementation. So I guess, that line was expected to handle unsatisfied ranges. So for cases like "bytes */1000". But I am unsure as there is no test regarding this case. Also we check if the statusCode is 206, while unsatisfied ranges are sent with status code 416.

So i think we have to discuss about what we have to do with unsatisfied ranges.

benchmarks

before:

aras@aras-Lenovo-Legion-5-17ARH05H:~/workspace/undici$ node ./benchmarks/parseRangeHeader.mjs 
cpu: AMD Ryzen 7 4800H with Radeon Graphics
runtime: node v21.6.2 (x64-linux)

benchmark                             time (avg)             (min … max)       p75       p99      p999
------------------------------------------------------------------------ -----------------------------
• parseRangeHeader
------------------------------------------------------------------------ -----------------------------
parseRangeHeader undefined         1'243 ps/iter       (478 ps … 150 ns)  1'364 ps  1'433 ps  5'388 ps
parseRangeHeader empty             1'240 ps/iter       (477 ps … 250 ns)  1'364 ps  1'603 ps  5'354 ps
parseRangeHeader bytes 0-400/400     172 ns/iter       (163 ns … 355 ns)    168 ns    292 ns    350 ns
parseRangeHeader bytes 0-400/*    35'965 ps/iter    (32'158 ps … 208 ns) 35'329 ps 46'651 ps    166 ns

(last case is invalid in main and returns null, thats why it is in picoseconds...)

after:

aras@aras-Lenovo-Legion-5-17ARH05H:~/workspace/undici$ node ./benchmarks/parseContentRangeHeader.mjs 
cpu: AMD Ryzen 7 4800H with Radeon Graphics
runtime: node v21.6.2 (x64-linux)

benchmark                                    time (avg)             (min … max)       p75       p99      p999
------------------------------------------------------------------------------- -----------------------------
• parseContentRangeHeader
------------------------------------------------------------------------------- -----------------------------
parseContentRangeHeader undefined         1'293 ps/iter       (478 ps … 310 ns)  1'364 ps  2'796 ps 11'186 ps
parseContentRangeHeader empty             1'285 ps/iter       (648 ps … 108 ns)  1'398 ps  1'637 ps  5'388 ps
parseContentRangeHeader bytes 0-400/400     152 ns/iter       (144 ns … 402 ns)    150 ns    278 ns    363 ns
parseContentRangeHeader bytes 0-400/*       119 ns/iter       (111 ns … 286 ns)    117 ns    234 ns    251 ns

summary for parseContentRangeHeader
  parseContentRangeHeader empty
   1.01x faster than parseContentRangeHeader undefined
   92.53x faster than parseContentRangeHeader bytes 0-400/*
   118.71x faster than parseContentRangeHeader bytes 0-400/400

@codecov-commenter
Copy link

codecov-commenter commented Feb 19, 2024

Codecov Report

Attention: 4 lines in your changes are missing coverage. Please review.

Comparison is base (e39a632) 85.54% compared to head (df8193d) 85.34%.
Report is 317 commits behind head on main.

Files Patch % Lines
lib/handler/RetryHandler.js 57.14% 3 Missing ⚠️
lib/core/util.js 83.33% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2779      +/-   ##
==========================================
- Coverage   85.54%   85.34%   -0.21%     
==========================================
  Files          76       85       +9     
  Lines        6858     7602     +744     
==========================================
+ Hits         5867     6488     +621     
- Misses        991     1114     +123     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@ronag ronag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the retry handler changes are an improvement.

start: +m[1],
end: (m[2] && +m[2]) ?? null,
size: (m[3] && +m[3]) ?? null
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make this less compact and more readable? When how does this return null?

@Uzlopak
Copy link
Contributor Author

Uzlopak commented Feb 19, 2024

I will first implement some unit tests, as some of the expectations to the code seem to be wrong.

Copy link
Member

@KhafraDev KhafraDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about using the range header parsing from fetch?

@KhafraDev
Copy link
Member

sorry, didn't get a notification for your comment in the issue?

@Uzlopak
Copy link
Contributor Author

Uzlopak commented Feb 19, 2024

@KhafraDev

Hmm, yeah looks right. Probably a magnitude slower. But I have stomacheache about the unsatisfied range case.

* @see https://www.rfc-editor.org/rfc/rfc9110#field.content-range
*/
function parseContentRangeHeader (range) {
if (range == null || range === '') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a header value can't be empty and I don't think range can be null here, only undefined

@@ -224,7 +230,7 @@ class RetryHandler {
const { start, size, end = size } = contentRange
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KhafraDev
Here you see start, size and then end = size, which makes only sense in the unsatisfied case. But original parseRangeHeader can NOT return unsatisfied cases (e.g. bytes */2222). So this makes no sense.

originally we only have start and end.

@metcoder95
Copy link
Member

metcoder95 commented Feb 20, 2024

Sadly I don't see much room for improving the current range parsing implementation. The only improvement is to replace it with the fetch one and adjust accordingly, so we have a single implementation shared across and we can optimize that single implementation and have benefits at different ends.

Regarding unsatisfied ranges, I believe that 416 status code should result in an Aborted request and give back control to the implementation; I am not so sure what else we can do in those scenarios beyond stopping the request and not retrying more.

@Uzlopak
Copy link
Contributor Author

Uzlopak commented Feb 20, 2024

If we abort on 416 i can implement it properly. Also it results in a faster parsing.

@metcoder95
Copy link
Member

Sure, that SGTM 👍

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@Uzlopak Uzlopak marked this pull request as draft February 24, 2024 20:29
@Uzlopak
Copy link
Contributor Author

Uzlopak commented Feb 24, 2024

LOL, I dont want that this gets merged accidently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants