Skip to content

Writing a bubble_sorter

Morwenn edited this page Nov 29, 2015 · 29 revisions

If you have read the general tutorial about writing sorters, you might be interested in a full concrete example. In this tutorial, we will see step by step how to implement a simple bubble sort and how to write a bubble_sorter to wrap it. Step by step.

The bubble sort algorithm

The bubble sort is a simple sorting algorithm that goes through a collection, comparing adjacent elements and switching them if they are not in order. It does so repeatedly until the collection is sorted. There are some very specific cases where it might be the ideal algorithm, but it is useless most of the time. Anyway, here a basic implementation, taking a pair of iterators like many standard library algorithms:

template<typename RandomAccessIterator>
auto bubble_sort(RandomAccessIterator first, RandomAccessIterator last)
    -> void
{
    while (first != last--)
    {
        for (auto it = first ; it != last ; ++it)
        {
            if (*(it + 1) < *it)
            {
                std::iter_swap(it, it + 1);
            }
        }
    }
}

This version works with random-access iterators only. That said, making it work with bidirectional iterators too is only a matter of changing the it + 1 into an std::next(it):

template<typename BidirectionalIterator>
auto bubble_sort(BidirectionalIterator first, BidirectionalIterator last)
    -> void
{
    while (first != last--)
    {
        for (auto it = first ; it != last ; ++it)
        {
            auto next = std::next(it);
            if (*next < *it)
            {
                std::iter_swap(it, next);
            }
        }
    }
}

Some versions of bubble_sort track whether swaps were actually performed during the traversal of the array and return early if no swap was made (it means that the array was already sorted). For the sake of simplicity, we won't add this check to our algorithm.

A simple bubble_sorter

Now that we have a working bubble_sort algorithm, we will wrap it into a sorter so that it can benefit from the many tools available in cpp-sort when needed. Here is a very basic bubble_sorter implementation:

struct bubble_sorter
{
    template<typename BidirectionalIterator>
    auto operator()(BidirectionalIterator first, BidirectionalIterator last) const
        -> void
    {
        bubble_sort(first, last);
    }
};

Unfortunately, cpp-sort requires sorters to implement range-based algorithms too in order to satisfy the Sorter concept. Implementing by hand the whole set of features is boring and error-prone. Fortunately, cpp-sort provides the utility class template sorter_facade to automagically generate the missing features when an iterator-based operator() exists:

struct bubble_sorter_impl
{
    template<typename BidirectionalIterator>
    auto operator()(BidirectionalIterator first, BidirectionalIterator last) const
        -> void
    {
        bubble_sort(first, last);
    }
};

usinng bubble_sorter = cppsort::sorter_facade<bubble_sorter_impl>;

Now our bubble_sorter satisfies the library's requirements and implements all the additional features without too much additional work. However, we might also want it to play nice with hybrid_adapter, the main building block which allows to aggregate different sorters together. In order to do so, we need to specialize the class sorter_traits for our bubble_sorter, where we explicitly document the sorter's stability and the category of iterators it works with:

namespace cppsort
{
    template<>
    struct sorter_traits<::bubble_sorter>
    {
        using iterator_category = std::bidirectional_iterator_tag;
        using is_stable = std::true_type;
    };
}

Handling custom comparison functions

Our bubble_sort algorithm currently uses operator< to compare the elements to be sorted. To make it more generic, we would like it to work with any suitable comparison function instead, just like std::sort. Doing so is rather easy:

template<
    typename BidirectionalIterator,
    typename StrictWeakOrdering
>
auto bubble_sort(BidirectionalIterator first, BidirectionalIterator last,
                 StrictWeakOrdering compare)
    -> void
{
    while (first != last--)
    {
        for (auto it = first ; it != last ; ++it)
        {
            auto next = std::next(it);
            if (compare(*next, *it))
            {
                std::iter_swap(it, next);
            }
        }
    }
}

cpp-sort is designed to handle both comparison and non-comparison sorters. There isn't much to change to make bubble_sorter use the new addition to its underlying sorting algorithm:

struct bubble_sorter_impl
{
    template<
        typename BidirectionalIterator,
        typename StrictWeakOrdering = std::less<>
    >
    auto operator()(BidirectionalIterator first, BidirectionalIterator last,
                    StrictWeakOrdering compare={}) const
        -> void
    {
        bubble_sort(first, last, compare);
    }
};

With this addition, a bubble_sorter instance can be called with a custom comparison function or without one, defaulting to std::less<>. Note that sorter_facade generates the appropriate overloads so that the sorter can still be called with either a pair of iterators, a pair of iterators and a comparison function, a collection, or a collection and a comparison function. It also ensures that an instance of bubble_sorter can be converted to a function pointer corresponding to any of the overloads of operator().

Using bubble_sorter with forward iterators

In its current state, bubble_sort isn't usable with forward iterators because of the last-- instruction, which requires bidirectional iterators. If we drop this backwards iteration, we have to reintroduce the check to see whether the collection is sorted, which means that the algorithm will generally perform many useless comparisons all over the place. Fortunately, we can use another technique: we know that when the bubble sort traverses the collection for the nth time, it is supposed to perform exactly n comparisons. While decrementing a forward iterator isn't possible, we can still compute the size of the collection to sort, then decrement it and perform the correct number of comparisons:

template<
    typename ForwardIterator,
    typename StrictWeakOrdering
>
auto bubble_sort(ForwardIterator first, ForwardIterator last,
                 StrictWeakOrdering compare)
    -> void
{
    auto size = std::distance(first, last);
    if (size < 2) return;

    while (--size)
    {
        ForwardIterator current = first;
        ForwardIterator next = std::next(current);
        for (std::size_t i = 0 ; i < size ; ++i)
        {
            if (compare(*next, *current))
            {
                std::iter_swap(current, next);
            }
            ++next;
            ++current;
        }
    }
}

The only change to make at the sorter level is to change its declared iterator category in sorter_traits (and possibly the names of the template parameters in the sorter so that they don't lie about the iterator category):

namespace cppsort
{
    template<>
    struct sorter_traits<::bubble_sorter>
    {
        using iterator_category = std::forward_iterator_tag;
        using is_stable = std::true_type;
    };
}

Handling projection parameters

Projections are functions that can be used to "view" the values to sort differently during the comparison. Most of the comparison sorters in cpp-sort take an optional projection parameter. The bubble sort being a comparison sorter, it may be interesting to have it handle projections too. In order to do that, we will have to alter both the sorting algorithm and the sorter. Fortunately, the modifications are pretty straigthforward: in the algorithm, we only have to add another parameter and use it on the values that are being compared:

template<
    typename ForwardIterator,
    typename StrictWeakOrdering,
    typename Projection
>
auto bubble_sort(ForwardIterator first, ForwardIterator last,
                 StrictWeakOrdering compare, Projection projection)
    -> void
{
    auto size = std::distance(first, last);
    if (size < 2) return;

    auto&& proj = cppsort::utility::as_function(projection);

    while (--size)
    {
        ForwardIterator current = first;
        ForwardIterator next = std::next(current);
        for (std::size_t i = 0 ; i < size ; ++i)
        {
            if (compare(proj(*next), proj(*current)))
            {
                std::iter_swap(current, next);
            }
            ++next;
            ++current;
        }
    }
}

Note the use of utility::as_function used to transform the projection parameter. While using the raw projection would have been enough in most scenarios, this line makes it possible to pass pointers to member data instead of functions so that the algorithm can sort the collection on a specific field; this is a rather powerful mechanism. Now, to the sorter:

struct bubble_sorter_impl
{
    template<
        typename ForwardIterator,
        typename StrictWeakOrdering = std::less<>,
        typename Projection = cppsort::utility::identity,
        typename = std::enable_if_t<cppsort::is_projection_iterator<
            Projection, ForwardIterator, StrictWeakOrdering
        >>
    >
    auto operator()(ForwardIterator first, ForwardIterator last,
                    StrictWeakOrdering compare={}, Projection projection={}) const
        -> void
    {
        bubble_sort(first, last, compare, projection);
    }
};

We can see several improvements compared to the previous version: first of all, we added an optional projection parameter which defauts to utility::identity. This is a function object that returns a value as is so that the default behaviour of the algorithm is to run as if projections didn't exist. It is very likely to be optimized aways by the compiler anyway.

The second modification is one I wish we code do without (yet another thing that concepts would make more expressive): is_projection_iterator is a trait checks whether a projection function can be used on a dereferenced iterator. It also optionally checks that a given comparison function can be called with the result of two such projections. This trait exists to ensure that cppsort::sort won't call the functor when these conditions are not satisfied, which may be crucial when aggregating sorters with hybrid_adapter.

Final optimizations

Our current version of bubble_sort has to compute the size of the collection to sort prior to the actual sort. This is not optimal since some of the containers in the standard library know their size and can provide it in O(1). cpp-sort makes it possible to easily use this information: the function utility::size takes a container and returns the result of the member function size if the container has one, and std::distance(std::begin(container), std::end(container)) instead if it doesn't have such a member function. This small tool allows us to rewrite bubble_sort and bubble_sorter so that they can take advantage of this information when available:

template<
    typename ForwardIterator,
    typename StrictWeakOrdering,
    typename Projection
>
auto bubble_sort(ForwardIterator first, StrictWeakOrdering compare,
                 Projection projection, std::size_t size)
    -> void
{
    if (size < 2) return;

    auto&& proj = cppsort::utility::as_function(projection);

    while (--size)
    {
        ForwardIterator current = first;
        ForwardIterator next = std::next(current);
        for (std::size_t i = 0 ; i < size ; ++i)
        {
            if (compare(proj(*next), proj(*current)))
            {
                std::iter_swap(current, next);
            }
            ++next;
            ++current;
        }
    }
}

struct bubble_sorter_impl
{
    // Pair of iterators overload
    template<
        typename ForwardIterator,
        typename StrictWeakOrdering = std::less<>,
        typename Projection = cppsort::utility::identity,
        typename = std::enable_if_t<cppsort::is_projection_iterator<
            Projection, ForwardIterator, StrictWeakOrdering
        >>
    >
    auto operator()(ForwardIterator first, ForwardIterator last,
                    StrictWeakOrdering compare={}, Projection projection={}) const
        -> void
    {
        bubble_sort(first,
                    compare,
                    projection,
                    std::distance(first, last));
    }

    // Iterable overload
    template<
        typename ForwardIterable,
        typename StrictWeakOrdering = std::less<>,
        typename Projection = cppsort::utility::identity,
        typename = std::enable_if_t<cppsort::is_projection<
            Projection, ForwardIterable, StrictWeakOrdering
        >>
    >
    auto operator()(ForwardIterable& iterable, StrictWeakOrdering compare={},
                    Projection projection={}) const
        -> void
    {
        bubble_sort(std::begin(iterable),
                    compare,
                    projection,
                    cppsort::utility::size(iterable));
    }
};

Note that sorter_facade will make sure to call the new operator() overload for ranges instead of dispatching the call to the overload that takes a pair of iterators when given a full collection, so everything should work smoothly enough.

Better error messages

We now have a versatile bubble_sorter, able to handle many scenarios, and also optimized as much as a bubble sort can be. It works really well... until it doesn't. cpp-sort has one major drawback (some would say more than one, but...): when not used correctly, the error messages are close to unreadable; forget one const and embrace the hundreds of lines of cryptic error messages, and I really mean it! It sometimes took me far longer than it should have to find bugs. The sorter works properly, but we can still improve the way it fails...

Starting easy: we can use strong typedefs to hide some irrelevant template parameters and shorten some error messages a bit. In our case, we can make bubble_sorter inherit from cppsort::sorter_facade<bubble_sorter_impl> instead of defining it as a type alias. It doesn't improve error messages that much, but at least the error messages use the name bubble_sorter until they have to display the full name.

struct bubble_sorter:
    cppsort::sorter_facade<bubble_sorter_impl>
{};

One small change that can greatly improve error messages is the addition of a static assertion in the operator() to assert that the iterator category of the passed collection is compatible with that of the sorter. It doesn't have that much of an impact with bubble_sorter since we designed it to work with forward iterators and I hardly see why would anyone pass simple input iterators to it. But it's good practice anyway if you design sorters that only work with more restricted iterator categories. For example, passing an std::list to heap_sorter used to spawn more than 70 lines of cryptic error messages with g++ 5.2 (basically, it failed when it encoutered an operation in the heapsort implementation not supported by bidirectional iterators); with such a static assertion, it came down to 4 lines: the 4th line was the static assertion message, and the one above the faulty call site, which is quite a big improvement.

static_assert(
    std::is_base_of<
        std::forward_iterator_tag,
        typename std::iterator_traits<ForwardIterator>::iterator_category
    >::value,
    "bubble_sorter requires at least forward iterators"
);

Concepts would probably improve some error messages too, but we will have wait for at least C++17. In the current state of the library, many error messages remain pretty noisy and tough to understand, and we can't realistically put static assertions all over the place because too many things rely on SFINAE. That is why even these small improvements to error messages matter; if one can't understand why something fails, they are less likely to fix the error.

And that's it: we have covered pretty much every interesting aspect of writing a simple comparison sorter and we have seen how to implement some small optimizations and care about error messages as well as how to enhance it with projections. I hope you enjoyed the tutorial, even if bubble sort is not the most interesting sorting algorithm around. You can find the full implementation in the examples folder :)