fix: improve parsing content-range header #2779

Uzlopak · 2024-02-19T00:37:32Z

range header parser was not handling the size * variant.
maybe improved slightly performance
removed unnecessary assertions, like start is always a finite number
renamed parseRangeHeader to parseContentRangeHeader

Regarding unsatisfied ranges:
The line const { start, size, end = size } = contentRange was only making sense, when the size was set but end was not set. This was not possible in the previous implementation. So I guess, that line was expected to handle unsatisfied ranges. So for cases like "bytes */1000". But I am unsure as there is no test regarding this case. Also we check if the statusCode is 206, while unsatisfied ranges are sent with status code 416.

So i think we have to discuss about what we have to do with unsatisfied ranges.

benchmarks

before:

aras@aras-Lenovo-Legion-5-17ARH05H:~/workspace/undici$ node ./benchmarks/parseRangeHeader.mjs 
cpu: AMD Ryzen 7 4800H with Radeon Graphics
runtime: node v21.6.2 (x64-linux)

benchmark                             time (avg)             (min … max)       p75       p99      p999
------------------------------------------------------------------------ -----------------------------
• parseRangeHeader
------------------------------------------------------------------------ -----------------------------
parseRangeHeader undefined         1'243 ps/iter       (478 ps … 150 ns)  1'364 ps  1'433 ps  5'388 ps
parseRangeHeader empty             1'240 ps/iter       (477 ps … 250 ns)  1'364 ps  1'603 ps  5'354 ps
parseRangeHeader bytes 0-400/400     172 ns/iter       (163 ns … 355 ns)    168 ns    292 ns    350 ns
parseRangeHeader bytes 0-400/*    35'965 ps/iter    (32'158 ps … 208 ns) 35'329 ps 46'651 ps    166 ns

(last case is invalid in main and returns null, thats why it is in picoseconds...)

after:

aras@aras-Lenovo-Legion-5-17ARH05H:~/workspace/undici$ node ./benchmarks/parseContentRangeHeader.mjs 
cpu: AMD Ryzen 7 4800H with Radeon Graphics
runtime: node v21.6.2 (x64-linux)

benchmark                                    time (avg)             (min … max)       p75       p99      p999
------------------------------------------------------------------------------- -----------------------------
• parseContentRangeHeader
------------------------------------------------------------------------------- -----------------------------
parseContentRangeHeader undefined         1'293 ps/iter       (478 ps … 310 ns)  1'364 ps  2'796 ps 11'186 ps
parseContentRangeHeader empty             1'285 ps/iter       (648 ps … 108 ns)  1'398 ps  1'637 ps  5'388 ps
parseContentRangeHeader bytes 0-400/400     152 ns/iter       (144 ns … 402 ns)    150 ns    278 ns    363 ns
parseContentRangeHeader bytes 0-400/*       119 ns/iter       (111 ns … 286 ns)    117 ns    234 ns    251 ns

summary for parseContentRangeHeader
  parseContentRangeHeader empty
   1.01x faster than parseContentRangeHeader undefined
   92.53x faster than parseContentRangeHeader bytes 0-400/*
   118.71x faster than parseContentRangeHeader bytes 0-400/400

codecov-commenter · 2024-02-19T00:42:52Z

Codecov Report

Attention: 4 lines in your changes are missing coverage. Please review.

Comparison is base (e39a632) 85.54% compared to head (df8193d) 85.34%.
Report is 317 commits behind head on main.

Files	Patch %	Lines
lib/handler/RetryHandler.js	57.14%	3 Missing ⚠️
lib/core/util.js	83.33%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2779      +/-   ##
==========================================
- Coverage   85.54%   85.34%   -0.21%     
==========================================
  Files          76       85       +9     
  Lines        6858     7602     +744     
==========================================
+ Hits         5867     6488     +621     
- Misses        991     1114     +123

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

ronag

I don't think the retry handler changes are an improvement.

ronag · 2024-02-19T06:40:54Z

lib/core/util.js

+    start: +m[1],
+    end: (m[2] && +m[2]) ?? null,
+    size: (m[3] && +m[3]) ?? null
+  })


Can you make this less compact and more readable? When how does this return null?

Uzlopak · 2024-02-19T16:10:37Z

I will first implement some unit tests, as some of the expectations to the code seem to be wrong.

KhafraDev

what about using the range header parsing from fetch?

KhafraDev · 2024-02-19T22:46:38Z

sorry, didn't get a notification for your comment in the issue?

Uzlopak · 2024-02-19T22:51:22Z

@KhafraDev

Hmm, yeah looks right. Probably a magnitude slower. But I have stomacheache about the unsatisfied range case.

KhafraDev · 2024-02-19T22:54:54Z

lib/core/util.js

+ * @see https://www.rfc-editor.org/rfc/rfc9110#field.content-range
+ */
+function parseContentRangeHeader (range) {
+  if (range == null || range === '') {


a header value can't be empty and I don't think range can be null here, only undefined

Uzlopak · 2024-02-19T23:03:20Z

lib/handler/RetryHandler.js

@@ -224,7 +230,7 @@ class RetryHandler {
      const { start, size, end = size } = contentRange


@KhafraDev
Here you see start, size and then end = size, which makes only sense in the unsatisfied case. But original parseRangeHeader can NOT return unsatisfied cases (e.g. bytes */2222). So this makes no sense.

originally we only have start and end.

metcoder95 · 2024-02-20T09:08:41Z

Sadly I don't see much room for improving the current range parsing implementation. The only improvement is to replace it with the fetch one and adjust accordingly, so we have a single implementation shared across and we can optimize that single implementation and have benefits at different ends.

Regarding unsatisfied ranges, I believe that 416 status code should result in an Aborted request and give back control to the implementation; I am not so sure what else we can do in those scenarios beyond stopping the request and not retrying more.

Uzlopak · 2024-02-20T09:10:01Z

If we abort on 416 i can implement it properly. Also it results in a faster parsing.

metcoder95 · 2024-02-20T09:10:34Z

Sure, that SGTM 👍

mcollina

lgtm

Uzlopak · 2024-02-24T20:29:50Z

LOL, I dont want that this gets merged accidently.

imrpove parseContentRangeHeader

df8193d

ronag requested changes Feb 19, 2024

View reviewed changes

Uzlopak mentioned this pull request Feb 19, 2024

use fetch utility to parse range header instead of custom one? #2484

Closed

KhafraDev reviewed Feb 19, 2024

View reviewed changes

Uzlopak commented Feb 19, 2024

View reviewed changes

mcollina approved these changes Feb 24, 2024

View reviewed changes

ronag approved these changes Feb 24, 2024

View reviewed changes

Uzlopak marked this pull request as draft February 24, 2024 20:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: improve parsing content-range header #2779

fix: improve parsing content-range header #2779

Uzlopak commented Feb 19, 2024

codecov-commenter commented Feb 19, 2024 •

edited

ronag left a comment

ronag Feb 19, 2024

Uzlopak commented Feb 19, 2024

KhafraDev left a comment

KhafraDev commented Feb 19, 2024

Uzlopak commented Feb 19, 2024

KhafraDev Feb 19, 2024

Uzlopak Feb 19, 2024

metcoder95 commented Feb 20, 2024 •

edited

Uzlopak commented Feb 20, 2024

metcoder95 commented Feb 20, 2024

mcollina left a comment

Uzlopak commented Feb 24, 2024

		@@ -224,7 +230,7 @@ class RetryHandler {
		const { start, size, end = size } = contentRange

fix: improve parsing content-range header #2779

Are you sure you want to change the base?

fix: improve parsing content-range header #2779

Conversation

Uzlopak commented Feb 19, 2024

codecov-commenter commented Feb 19, 2024 • edited

Codecov Report

ronag left a comment

Choose a reason for hiding this comment

ronag Feb 19, 2024

Choose a reason for hiding this comment

Uzlopak commented Feb 19, 2024

KhafraDev left a comment

Choose a reason for hiding this comment

KhafraDev commented Feb 19, 2024

Uzlopak commented Feb 19, 2024

KhafraDev Feb 19, 2024

Choose a reason for hiding this comment

Uzlopak Feb 19, 2024

Choose a reason for hiding this comment

metcoder95 commented Feb 20, 2024 • edited

Uzlopak commented Feb 20, 2024

metcoder95 commented Feb 20, 2024

mcollina left a comment

Choose a reason for hiding this comment

Uzlopak commented Feb 24, 2024

codecov-commenter commented Feb 19, 2024 •

edited

metcoder95 commented Feb 20, 2024 •

edited