Skip to content

1.9.0 — Windbelt

Compare
Choose a tag to compare
@Morwenn Morwenn released this 30 Jan 13:57
ada4e8b

DOI

I kind of managed to turn the current pandemic, lockdowns and even general stay-at-home endeavours without an explicit lockdown into opportunities to work on several side projects — this is already the third cpp-sort release whose name is directly inspired by those side projects, which somehow means a lot. A windbelt is a small device used to harvest wind energy and turn it into electricity, generally a rather cheap DIY project since the required components can often be found around the house. I just finished mine, and even though I still need to test it it's as good as anything for a release name.

This release continues to pave the road to the future breaking 2.0.0 version of the library, even though it always seems to draw further away — there will definitely be more 1.x.y versions. As a result this release brings a few deprecations, polishing of existing features, and initial support for some C++20 features. One of the big items was also the move of the whole continuous integration from Travis CI to GitHub Actions.

Deprecations

All of the following CMake configuration variables and options are deprecated:

  • BUILD_EXAMPLES
  • ENABLE_COVERAGE
  • SANITIZE
  • USE_VALGRIND

Equivalent options prefixed with CPPSORT_ have been added to replace them. For backwards compatibility reasons, the new options default to the values of the equivalent deprecated variables.

New features

Some limited C++20 support was added to the library:

  • When available, std::identity can be used wherever utility::identity can be used, including in the places where the latter is used as a "joker". In the future 2.0.0 version of the library, std::identity will entirely replace utility::identity (issue #130).
  • When available, std::ranges::less and std::ranges::greater can be used wherever std::less<> and std::greater<> can be used, including in the places where the latter is used as a "joker". In the future 2.0.0 version of the library, std::less<> and std::greater<> will remain the main vocabulary types because they are less constrained that their new counterparts (issue #130).

Other new features:

  • stable_adapter and its specializations might define a member type named type: this type should be an alias to the "least nested" sorter with the same behaviour as the stable_adapter specialization which is guaranteed to always be stable - it can be the stable_adapter specialization itself.
  • A new stable_t alias template is available when including <cpp-sort/adapters/stable_adapter.h>. This alias takes a sorter template parameter and aliases a possibly adapted sorter which is always stable. It is meant to replace stable_adapter when the goal is to take any sorter and to get a stable sorter (stable_adapter remains the customization point). One of the goals is to reduce template nesting when possible, with the hope that it leads to better diagnostics and smaller template stack traces. For a given sorter Sorter, stable_t<Sorter> will follow these rules:
    • If cppsort::is_always_stable_v<Sorter> is true, then stable_t aliases Sorter.
    • Otherwise if stable_adapter<Sorter>::type exists, then stable_t aliases it.
    • Otherwise stable_t aliases stable_adapter<Sorter>.
  • A new comparator projection_compare can be used to embed both a comparison and a projection into a single comparison object, allowing to easily use projections with algorithms that don't support them explicitly:
    // Sort a family from older to younger member
    std::vector<Person> family = { /* ... */ };
    std::sort(family.begin(), family.end(), cppsort::make_projection_compare(std::greater<>{}, &Person::age));
  • A new CPPSORT_ENABLE_AUDITS macro can be defined to enable library audits. These audits are assertions too expensive to enable with CPPSORT_ENABLE_ASSERTS: some of them might even change the complexity of the algorithms performing the checks.

Bug fixes

  • Fix ska_sorter for signed 128-bit integers when both positive and negative values are present in the collection to sort.
  • The random-access version of quick_merge_sorter was O(n²) because it relied on libc++'s implementation of std::nth_element which is itself O(n²). It was replaced with miniselect's implementation of Andrei Alexandrescu's QuickselectAdaptive algorithm, ensuring that quick_merge_sort is O(n log n) as documented (issue #179).
    Comparisons performed against a quicksort killer pattern, showing the O(n²) behaviour of the old random-access version of quick_merge_sort
  • Fix a potential move-after-free bug in pdq_sorter, affecting all components that rely on it.

Improvements

  • The bidirectional overload of verge_sorter was significantly reworked in order to make the optimized path faster, while keeping its previous speed for the path where the vergesort layer doesn't find big enough runs. The improvements are especially noticeable for the Ascending sawtooth and Descending sawtooth in the patterns benchmark. Here is a brief summary of the main improvements:
    • It now uses the same k-way merge algorithm as the random-access overload. Among other things this means that the bidirectional overload should now have the complexity described in the documentation. I am still unsure whether it was the case before or not.
    • It is now smarter about which element should belong to which run, and as a result generates fewer 1-element runs, which leads to fewer merges overall.
    • It now spends less time computing std::distance prior to merges.
      Old vs. new unstable verge_sorter benchmark over std::list
  • stable_adapter now has dedicated specializations for verge_sorter and verge_adapter instead of relying on make_stable. The stable algorithm is a bit different from the unstable one: it detects strictly descending runs in the collection to sort instead of non-ascending ones, and wraps the fallback sorter in stable_t. This addresses issue Morwenn/vergesort#11 in the standalone project, and gives a partial answer for issue Morwenn/vergesort#7.
    Old vs. new stable verge_sorter and verge_sorter benchmarks over std::deque
    Old vs. new stable verge_sorter benchmark over std::list
    The benchmarks above clearly show the difference for the fast vergesort path for random-access iterators. The significant decrease in speed in the bidirectional benchmarks is due to the fact that stable_adapter<verge_sorter> used make_stable in 1.8.1, which actually creates a buffer of N elements and performs a random-access sort there. As a result the new version uses less memory but is slower. It is still possible to get the 1.8.1 behaviour using make_stable<verge_sorter>.
  • Nested stable_adapter now unwraps. It is a scenario that could happened from time to time, and should produce clearer error messages thanks to the reduced nesting.
  • utility::is_probably_branchless_comparisons and utility::is_probably_branchless_projection now have specializations for more comparisons and projections (#177), both for internal and standard library types. These changes mean that pdq_sorter uses its branchless partitioning algorithm in more scenari (which tends to be faster on average, but slower for some patterns). Since many components rely on pdqsort, the changes should be noticeable throughout the library. The changes affect the following features:
    • The type returned by std::mem_fn in libc++ and libstdc++ is now considered likely to be branchless when it wraps a pointer to data member. utility::as_function uses std::mem_fn to make pointer to members callable, so this change might have repercussions in most of the library.
    • The C++20 function object std::identity is always considered branchless.
    • The C++20 function objects std::ranges::less and std::ranges::greater follow the same rules as std::less<> and std::greater<> when determining whether they are considered branchless.
    • ska_sorter and schwartz_adapter have been enhanced to more often pass comparison of projection functions considered likely branchless to their underlying algorithms.
    • The comparison adapter used by counting_adapter is now considered likely branchless when the adapted comparison is likely branchless itself, and when the type used for counting is a built-in arithmetic type.
      Old vs. new counting_adapter(pdq_sorter) benchmark over std::vector

Tooling

Benchmarks:

  • Fix a MatplotlibDeprecationWarning in the errorbar-plot benchmark.

Documentation:

  • Change the documentation here and there to make it more forward-compatible with the changes coming in 2.0.0.
  • Change HTTP links to HTTPS when possible.
  • Fix some incorrect uses of cppsort::sort that still used the pre-1.0.0 parameters order.
  • More small fixes to improve the overall quality of the documentation.

Test suite:

  • Some files testing cppsort::sort and cppsort::stable_sort were in the sources but never run as part of the test suite. They are now correctly run.
  • Tests sorters with a new median_of_3_killer distribution, which implements a pattern known to be adverse to common quicksort implementations, making them go quadratic (#178). This distribution was implemented after the paper A Killer Adversary for Quicksort by M. D. McIlroy.
  • Improve stable_adapter tests thanks to a new descending_plateau distribution.
  • Bump the downloaded Catch2 version to v2.13.4.
  • General maintenance and cleanup of the existing tests.

Continuous integration:

  • The continuous integration was fully moved from Travis CI to GitHub Actions (#176), with a number of changes in configurations coverage, see below.
  • The oldest tested Clang release is now Clang 6 instead of Clang 3.8. The latter probably still works but there are no automated tests for it anymore.
  • There are now tests for g++ on MacOS.
  • There are now tests with AddressSanitizer and UndefinedBehaviorSanitizer on MacOS.
  • There are no tests with Valgrind's memcheck tool on MacOS anymore.
  • The overall number of compiler versions tested should be greater, because the different OS have different minimal compiler versions available.

Miscellaneous:

  • Some CMake options were deprecated and replacements introduced. See the Deprecations section of these notes.
  • The project's .gitignore now only ignores the files that can be generated with the library tooling.
  • Tweaks to the tool used to debug sorters to make it more usable.

Known bugs

I didn't manage to fix every bug I could find since the previous release, so you might want to check the list of known bugs.