Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Question] : Performance of combine. #534

Open
LeakStephane opened this issue Apr 30, 2024 · 6 comments
Open

[Question] : Performance of combine. #534

LeakStephane opened this issue Apr 30, 2024 · 6 comments

Comments

@LeakStephane
Copy link

LeakStephane commented Apr 30, 2024

Hey, first let me thank your library which is overall wonderful.

Issue :

I run into trouble regarding the Result.combine method.
I'm basically trying to combine a relatively "huge" array around 300_000 elements of Result<T, E>.
Sadly doing so takes around 1 minutes on my computer.

Question :

Is combine implementation ready to handle that size of array ?

P.S : For now i will just not use neverthrow on array that are too big.

@supermacro
Copy link
Owner

supermacro commented May 2, 2024

Oh wow, that's an interesting use case.

image

So the implementation of Result.combine is relatively straightforward.

There are two issues I see immediately (with respect to large result lists):

  • O(n) runtime due to usage of resultList.reduce
    • this is def not the main problem though, as looping 300,000 elements on my machine on a for loop took about 4 seconds
  • appendValueToEndOfList is creating a new array on each iteration 😨

The latter issue is almost certainly the problem.

That being said, as I've mentioned here, I don't have time to contribute to the codebase.

That being said though I would be happy to accept a fix for this - and I'll devote time to reviewing the PR.

@supermacro
Copy link
Owner

Also, ...list is expanding the previous list (in addition to the problem of creating a brand new array). This is also very likely another cause of the performance slowdown.

@supermacro
Copy link
Owner

On another note; given such a large list, would you not want some sort of function that partitions your results into Ok values and Err values?

e.g.

const [okList, errList] = Result.partition(veryLargeList)

@LeakStephane
Copy link
Author

LeakStephane commented May 4, 2024

Hey, thank for you detailed answer !


That being said, as I've mentioned #531, I don't have time to contribute to the codebase.

Yeah, i saw the message. I'm glad that your are transparent on the time you can dedicate to neverthrow. 👍

Also good luck in your new startup adventure, i know from experience that it's time consuming 😉


That being said though I would be happy to accept a fix for this - and I'll devote time to reviewing the PR.

I will definitively have a look, but no promises 😃


On another note; given such a large list, would you not want some sort of function that partitions your results into Ok values and Err values?

Yeah, could be nice, should be useful in my case.

(This is a personal opinion)

That being said, in general i'm not in favor of expanding the interface, it creates confusion and, it's harder to learn over time.

I need to check but it might be doable to do some kind of partition without relaying on a method implemented by neverthrow.


Overall nice pinpointing of the issue 👍

@flowreaction
Copy link

flowreaction commented May 23, 2024

Another issue with the current implementation, that isn't directly connected to this issue though, is that even when isErr() returns true, the reduce will keep iterating over the entire input list. So not really short circuiting but only returning err(result.error) n times before it finally finishes all elements.

I played around with the implementation and this here is magnitudes faster on my machine.

this implementation doesn't create new arrays in every iteration but mutates one array a bunch of times.
I am not too familiar with this package yet so i am not sure if I am misusing something here though.

export const combineResultListFast = <T, E>(
  resultList: readonly Result<T, E>[],
): Result<readonly T[], E> => {
  let acc = ok([]) as Result<T[], E>

  for (const result of resultList) {
    if (result.isErr()) {
      acc = err(result.error)
      break // short circuiting here
    } else {
     acc.map((list) => list.push(result.value))
    }
  }
  return acc
}

i benchmarked it with a couple of different input sizes and this is the result of the microbenchmark on my machine:

┌─────────┬─────────────┬─────────────────────┬─────────────────────┬────────────────┬─────────────┐
│ (index) │ numElements │ timeOld             │ timeNew             │ areResultsSame │ improvement │
├─────────┼─────────────┼─────────────────────┼─────────────────────┼────────────────┼─────────────┤
│ 0       │ 100         │ 0.08329199999997172 │ 0.09158300000001418 │ true           │ '0x'        │
│ 1       │ 1000        │ 2.159749999999974   │ 0.132000000000005   │ true           │ '16x'       │
│ 2       │ 10000       │ 53.110708999999986  │ 0.5692910000000211  │ true           │ '93x'       │
│ 3       │ 100000      │ 9115.303124999999   │ 2.2292080000006536  │ true           │ '4089x'     │
│ 4       │ 500000      │ 310282.093833       │ 11.678125000034925  │ true           │ '26569x'    │
└─────────┴─────────────┴─────────────────────┴─────────────────────┴────────────────┴─────────────┘

You can check out the code as well as the benchmark code here:
function implementation
bechmark

@flowreaction
Copy link

Addionally, it might also make sense to refactor the combineResultListWithAllErrors() function to not use reduce + spread as reduce and spread in combination is always slow for large data sets in comparison with regular loops and array mutation. See this jsperf benchmark: https://jsperf.app/noyulo

export const combineResultListWithAllErrors = <T, E>(
  resultList: readonly Result<T, E>[],
): Result<readonly T[], E[]> => {
  let acc = ok([]) as Result<T[], E[]>

  for (const result of resultList) {
    if (result.isErr() && acc.isErr()) {
      acc.error.push(result.error)
    } else if (result.isErr() && acc.isOk()) {
      acc = err([result.error])
    } else if (result.isOk() && acc.isErr()) {
      // do nothing
    } else if (result.isOk() && acc.isOk()) {
      acc.value.push(result.value)
    }
  }
  return acc
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants