Skip to content

1.11.0 — Heatwave

Compare
Choose a tag to compare
@Morwenn Morwenn released this 24 Jul 19:23
4dcd62b

DOI

I am writing those release notes in the middle of a heatwave — 32°C, you might laugh but I live in a region where that's unusually high enough for AC and fans to be uncommon around here —, which means that my brain is too confused to think of a better name or of anything else for what it's worth.

The biggest features in this release are the addition of several tools to create and manipulate sorting networks, including two new fixed-size sorters implementing specific sorting network algorithms. The rest is mostly the usual smaller improvements, bug fixes and all those things that are needed to improve the overall quality of the project.

Deprecations

block_sort is now deprecated and will fire a deprecation warning if used. It will be removed in cpp-sort 2.0.0. wiki_sort was added to the library to replace it.

Block sort is not a single sorting algorithm, but a whole family of sorting algorithms that shift blocks to sort a collection. Those algorithms are often in-place stable comparison sorts. cpp-sort provides two such algorithms: WikiSort and grail_sort. However WikiSort used the general name block_sort, which seemed more and more unclear and unfair as years went on, especially when considering that new block sorts have been implemented in the recent years, and more are being worked on.

Deprecation warnings can be disabled by defining the preprocessor macro CPPSORT_DISABLE_DEPRECATION_WARNINGS.

New features

This release adds several new features related to sorting networks. The goals are to start completing the library's collection of fixed-size sorters based on sorting networks, and to provide new user-facing tools that make manipulating and creating sorting networks easier.

New header <cpp-sort/utility/sorting_network.h>

This new utility header will contain all generic tools related to sorting networks specifically. It contains the following vocabulary type:

template<typename IndexType>
struct index_pair
{
    IndexType first, second;
};

Sorting networks work by performing a fixed sequence of compare-exchange operations on pairs of elements in the collection to sort. index_pair represents the pair of indices used to access the elements on which a compare-exchange operation will be performed. There is an implicit assumption that first < second, but it is never checked by the library. Currently the vocabulary type to represent a sequence of compare-exchange operations is std::array<index_pair<IndexType>, N>.

This new header also contains the function templates swap_index_pairs and swap_index_pairs_force_unroll, which accept an iterator to the beginning of a random-access collection and an std::array<index_pair<IndexType>, N>. Both functions use the index pairs to perform compare-exchange operations on the collection - effectively applying operations like a comparator network. The former simply loops over the index pairs while the second is a best effort function that tries its best to unroll the loop.

sorting_network_sorter::index_pairs()

All sorting_network_sorter specializations got a new index_pairs() static member function with the following signature:

template<typename DifferenceType=std::ptrdiff_t>
static constexpr auto index_pairs()
    -> std::array<index_pair<DifferenceType>, /* Number of CEs in the network */>;

It returns the sequence of compare-exchange operations used by a network to sort its inputs. It builds upon the features introduced in the previous section.

merge_exchange_network_sorter and odd_even_merge_network_sorter

Those new fixed-size sorters are sorting networks that implement two variations of Batcher odd-even mergesort, an algorithm that halves the collection to sort, sorts both halves recursively then merges them. The former halves the input by considering that the first half is on the even lanes of the network and the second half on the odd lanes. The latter consider that the first half is on the first N/2 lanes of the network, and the second half is on the last N/2 lanes. Both are pretty similar and interchangeable in most scenarios.

merge_exchange_network_sorter over 8 inputs odd_even_merge_network_sorter over 8 inputs

Both provide a static index_pairs() method similar to the one described in the previous section.

Bug fixes

  • If container_aware_adapter<insertion_sorter> worked at all when sorting an std::forward_list, it was purely accidental prior to this release: that's how bugged it was. Now it should be fixed, and way faster than it used to be when it accidentally worked.
  • Fix a potential bug in cartesian_tree_sorter: when an exception was thrown during the construction of the Cartesian tree, the algorithm would try to destroy elements that had not been constructed yet.
  • Fix potential use-after-move issues in stable_adapter.

Improvements

  • The conversion to function pointer operators of sorter_facade were improved to allow sorters to be converted to function pointers returning void regardless of the type they're supposed to return. In such a case, any value returned by the sorter is ignored/discarded. This allows to pass some sorters to functions expecting a void-returning function, or to more easily store sorters into arrays as function pointers no matter their return type:
    using sort_t = void(*)(std::vector<int>&);
    sort_t available_sorters[] = {
        cppsort::merge_sort,
        cppsort::quick_sort,
        cppsort::counting_adapter(cppsort::poplar_sort)
    }
    Those conversions still don't work with MSVC (see issue #185).
  • container_aware_adapter<insertion_sorter> is now around 50% faster in benchmarks when sorting an std::list.
    Benchmark the new container_aware_adapter<insertion_sorter> over std::list<double> against the old version
  • probe::enc is now up to 40% faster for small types when the comparison and projection are considered likely branchless. This improvement is thanks to a variation on binary search proposed by @scandum called monobound binary search.
    Benchmark the new probe::enc over std::vector<double> against the old version
  • cartesian_tree_sort now also works with forward and bidirectional iterators.
  • Fix a warning about [[nodiscard]] when using mel_sort.
  • Fix a warning about a unused typedef which fired when compiling container_aware_adapter<mel_sorter> with Clang.
  • Reduce a bit the memory use of mel_sort (up to sizeof(T) * n/2 less memory is allocated).

Tooling

Benchmarks:

  • Add dist::as_long_string function object to turn a long long int into a std::string of 50 characters whose last characters are the digits of the long long int; this allows to use all the existing distributions to create strings. The big sequence of equal characters at the beginning of the strings make the comparisons expensive, which is a benchmarking category that was often overlooked up to now.
  • The visualization part of the patterns benchmark now accepts an --errorbars option to display the standard deviation.
  • Add "lower is better" indications on the relevant axis of the benchmark plots to make them more easily understandable (the graphs in these release notes predate that change, sorry about that).

Documentation:

  • Give a formula to compute the number of compare-exchanges performed by a hypothetical merging network based on repeated half-cleaner networks.

Test suite:

  • Bump downloaded Catch2 version to v2.13.6.

Miscellaneous:

  • Conan integration: fix test when cross-building in test_package.

Known bugs

I didn't manage to fix every bug I could find since the previous release, so you might want to check the list of known bugs.