Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More comprehensive benchmarks #128

Open
Morwenn opened this issue Mar 2, 2018 · 1 comment
Open

More comprehensive benchmarks #128

Morwenn opened this issue Mar 2, 2018 · 1 comment

Comments

@Morwenn
Copy link
Owner

Morwenn commented Mar 2, 2018

The benchmarks page on the wiki is seldom updated, and only compares the sorting of integer and floating point values. We need better and more comprehensive benchmarks, with the abillity to generate distributions for more kind of types (pairs, arrays, strings, etc...) so that we can better compare the different algorithms and where they are better than others.

A first easier step would be to group sorting algorithms by families to make some benchmarks more relevant since the different families of algorithms generally don't have the same goals.

Some other interesting results could be highlighted: for example, is it stably sorting by applying stable_adapter to pdq_sorter faster in many cases than using an inherently stable sorting algorithm? When is indirect sorting faster? When is using a Schwartz transform faster? I think that some pieces of the library can answer many use cases, but we need many more benchmarks to highlight which tool might be the best for a specific scenario. With a comprehensive benchmark suite, the dedicated wiki page could even be turned in a whole section with several sub-articles.

Another problem: I remember an oldish benchmark that showed how sorting strings with the same algorithm could have greatly different results depending on which compiler was used, which might make the benchmarks more complicated once again...

Ideally we would need a tool where we can select the different parameters before running the benchmarks:

  • Algorithm
  • Adapter if any
  • Type of data
  • Distribution of data
  • Compiler
  • Compiler options

Unfortunately that would require a full tool, and probably a dedicated website too. While this may be of value, I most probably don't have the time nor the skills required to do that. The benchmarks on C++ Hut are a good example of what I would like to achieve.

Morwenn added a commit that referenced this issue Sep 19, 2020
Still far from addressing #128, but still a nice improvement to the
status quo.

[ci skip]
@Morwenn
Copy link
Owner Author

Morwenn commented Apr 7, 2021

We are still far from comprehensive benchmarks, but there have been some improvements recently:

  • A new benchmark for stable sorts that don't allocate heap memory was added to the wiki as part of issue Improve the state of mergesorts #175.
  • A new dist::to_long_string projection was added to the benchmark suite by 6b5862c. It makes it possible to reuse the existing distributions to generate collections of long strings with a long common prefix. This is a good start to show expensive comparisons.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant