Fix complexity of algorithm using detail::partition
I was under the impression that std::partition (and our stolen
implementation thereof) ran in O(n log n) time on forward iterators, but
reading about Lotumo partitioning led me to double-check and realize
that it actually runs in O(n) time. This changes the time complexity of
quick_sort and quick_merge_sort for forward iterators.
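
The standard's guarantee can be checked directly. The sketch below uses `std::partition` as a stand-in for `detail::partition` (assumed here to follow the same contract): since C++11 it accepts forward iterators, and the standard mandates exactly one predicate application per element, hence O(n) time with at most n swaps on forward iterators.

```cpp
#include <algorithm>
#include <forward_list>

// Partition a forward-only range while counting predicate applications.
// The count comes out as exactly one call per element, i.e. O(n).
template<typename FwdIt, typename Pred>
long partition_counting_calls(FwdIt first, FwdIt last, Pred pred)
{
    long calls = 0;
    std::partition(first, last, [&](const auto& value) {
        ++calls;           // count every application of the predicate
        return pred(value);
    });
    return calls;
}
```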
Morwenn committed Dec 16, 2023
1 parent 4a6e390 commit 88576c0
Showing 2 changed files with 10 additions and 14 deletions.
16 changes: 7 additions & 9 deletions docs/Original-research.md
@@ -4,15 +4,13 @@ You can find some experiments and interesting pieces of code [in my Gist][morwen

### Of iterator categories & algorithm complexities

One of the main observations that naturally occurred while I was putting this library together concerns the best complexity tradeoffs between time and memory depending on the iterator categories of the different sorting algorithms (only taking comparison sorts into account):
* Algorithms that work on random-access iterators can run in O(n log n) time with O(1) extra memory, and can even be stable with such guarantees (block sort being the best example).
* Unstable algorithms that work on bidirectional iterators can run in O(n log n) time with O(1) extra memory: QuickMergesort [can be implemented][quick-merge-sort] with a bottom-up mergesort and a raw median-of-medians algorithm (instead of the introselect mutual recursion).
* Stable algorithms that work on bidirectional iterators can run in O(n log n) time with O(n) extra memory (mergesort), or in O(n log² n) time with O(1) extra memory (mergesort with in-place merge).
* Stable algorithms that work on forward iterators can get down to the same time and memory complexities as the ones working on bidirectional iterators: mergesort works just as well.
* Unstable algorithms that work on forward iterators can run in O(n log² n) time and O(1) space, QuickMergesort being once again the prime example of such an algorithm.
* Taking advantage of the list data structure allows for sorting algorithms running in O(n log n) time with O(1) extra memory, be it for stable sorting (mergesort) or unstable sorting (melsort), but those techniques can't be generically retrofitted to work with bidirectional iterators.

Now, those observations/claims are there to be challenged: if you know of any stable comparison sorting algorithm that runs on bidirectional iterators in O(n log n) with O(1) extra memory, don't hesitate to be the ones challenging those claims :)
One of the main observations that naturally occurred while I was putting this library together concerns the best time & space complexity tradeoffs depending on the iterator categories of the different sorting algorithms (only taking comparison sorts into account):
* Algorithms that work on random-access iterators can run in O(n log n) time and O(1) space, and can even be stable with such guarantees (block sorts being the best examples).
* Unstable algorithms that work on forward or bidirectional iterators can run in O(n log n) time and O(1) space: QuickMergesort [can be implemented][quick-merge-sort] with a bottom-up mergesort and a raw median-of-medians algorithm (instead of the introselect mutual recursion), leading to such complexity.
* Stable algorithms that work on forward or bidirectional iterators can run in O(n log n) time and O(n) space (mergesort), or in O(n log² n) time and O(1) space (ex: mergesort with in-place merge).
* Taking advantage of the list data structure allows for comparison sorts running in O(n log n) time and O(1) space, be it for stable sorting (mergesort) or unstable sorting (melsort), but those techniques can't be generically retrofitted to work with bidirectional iterators.

Now, those observations/claims are there to be challenged: if you know a stable comparison sort that runs on bidirectional iterators in O(n log n) time and O(1) space, don't hesitate to be the challenger :)
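
The "mergesort with in-place merge" combination mentioned above can be sketched with standard components: a bottom-up mergesort over bidirectional iterators that merges adjacent runs with `std::inplace_merge`. Note the hedge: `std::inplace_merge` only guarantees O(n log n) comparisons when no extra buffer is available, and real standard library implementations may allocate one, so the O(1)-space claim assumes that allocation is avoided.

```cpp
#include <algorithm>
#include <functional>
#include <iterator>
#include <list>

// Bottom-up mergesort: double the run width each pass and merge adjacent
// sorted runs in place. Without a buffer each merge is O(n log n), for
// O(n log^2 n) total time; with the buffer std::inplace_merge may allocate,
// each merge drops to O(n) and the whole sort to O(n log n) time, O(n) space.
template<typename BidirIt, typename Compare = std::less<>>
void bottom_up_merge_sort(BidirIt first, BidirIt last, Compare comp = {})
{
    auto size = std::distance(first, last);
    for (decltype(size) width = 1; width < size; width *= 2) {
        auto left = first;
        auto remaining = size;
        while (remaining > width) {
            auto mid = std::next(left, width);
            auto step = std::min(remaining, 2 * width);
            auto right = std::next(mid, step - width);
            std::inplace_merge(left, mid, right, comp);
            left = right;
            remaining -= step;
        }
        // a final run shorter than width is already sorted and merges next pass
    }
}
```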

### Vergesort

8 changes: 3 additions & 5 deletions docs/Sorters.md
@@ -210,12 +210,11 @@ Implements a flavour of [QuickMergesort][quick-mergesort].
| Best | Average | Worst | Memory | Stable | Iterators |
| ----------- | ----------- | ----------- | ----------- | ----------- | ------------- |
| n | n log n | n log n | log n | No | Random-access |
| n | n log n | n log n | log² n | No | Bidirectional |
| n | n log² n | n log² n | log² n | No | Forward |
| n | n log n | n log n | log² n | No | Forward |

QuickMergesort is an algorithm that performs a quicksort-like partition and tries to run a mergesort on the bigger partition, using the smaller one as a swap buffer for the merge operation when possible. The flavour of QuickMergesort used by `quick_merge_sorter` uses a [selection algorithm][selection-algorithm] to split the collection into partitions containing 2/3 and 1/3 of the elements respectively. This makes it possible to run an internal mergesort on the bigger partition (2/3 of the elements), using the other partition (1/3 of the elements) as a swap buffer.
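
The buffered merge at the heart of this scheme can be sketched as follows (illustrative code, not the library's actual implementation): the run adjacent to the buffer is swapped into it, then merged back against the other run, so the buffer's own elements are displaced around but never lost.

```cpp
#include <algorithm>
#include <functional>

// Merge two adjacent sorted runs [first, mid) and [mid, last) using an
// external buffer of at least (mid - first) elements. Every data movement
// is a swap, so the buffer's original contents survive the merge — the core
// trick that lets QuickMergesort use 1/3 of the collection as scratch space.
template<typename It, typename BufIt, typename Compare = std::less<>>
void buffered_merge(It first, It mid, It last, BufIt buffer, Compare comp = {})
{
    // move the left run into the buffer, displacing the buffer's contents
    BufIt buf_end = std::swap_ranges(first, mid, buffer);
    BufIt left = buffer;
    It right = mid;
    It out = first;
    while (left != buf_end && right != last) {
        if (comp(*right, *left)) {
            std::iter_swap(out++, right++);
        } else {
            std::iter_swap(out++, left++);
        }
    }
    while (left != buf_end) {
        std::iter_swap(out++, left++);
    }
    // any remaining tail of [right, last) is already in its final place
}
```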

The change in time complexity for forward iterators is due to the partitioning algorithm being O(n log n) instead of O(n). The space complexity is dominated by the stack recursion in the selection algorithms:
The space complexity is dominated by the stack recursion in the selection algorithms:
* log n for the random-access version, which uses Andrei Alexandrescu's [*AdaptiveQuickselect*][adaptive-quickselect].
* log² n for the forward and bidirectional versions, which use the mutually recursive [introselect][introselect] algorithm.

@@ -231,8 +230,7 @@ Implements a [quicksort][quicksort].

| Best | Average | Worst | Memory | Stable | Iterators |
| ----------- | ----------- | ----------- | ----------- | ----------- | ------------- |
| n | n log n | n log n | log² n | No | Bidirectional |
| n | n log² n | n log² n | log² n | No | Forward |
| n | n log n | n log n | log² n | No | Forward |

Despite the name, this sorter actually implements some flavour of introsort: if quicksort performs more than 2*log(n) steps, it falls back to a [median-of-medians][median-of-medians] pivot selection instead of the usual median-of-9 one. Since the median-of-medians selection is mutually recursive with an introselect algorithm, the sorter uses log² n stack memory.
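
That fallback logic can be sketched as follows. This is illustrative only: `std::nth_element` stands in for the library's median-of-medians selection, the cheap pivot is simply the middle element rather than a median-of-9, and the partition is a plain three-way split rather than cpp-sort's `detail::partition`.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Depth-limited quicksort: each recursion path gets a budget of cheap pivot
// guesses; once the budget is spent, an exact median pivot (via
// std::nth_element, standing in for median-of-medians) forces balanced splits.
void quick_sort_sketch(std::vector<int>& v, int lo, int hi, int budget)
{
    if (hi - lo <= 1) return;
    int pivot;
    if (budget > 0) {
        pivot = v[lo + (hi - lo) / 2];  // cheap guess (median-of-9 in the real sorter)
    } else {
        // too many steps: compute an exact median so the split is balanced
        auto mid = v.begin() + lo + (hi - lo) / 2;
        std::nth_element(v.begin() + lo, mid, v.begin() + hi);
        pivot = *mid;
    }
    auto first = v.begin() + lo, last = v.begin() + hi;
    // three-way split: [< pivot][== pivot][> pivot]; the middle band is
    // never empty (the pivot value is in the range), guaranteeing progress
    auto cut1 = std::partition(first, last, [&](int x) { return x < pivot; });
    auto cut2 = std::partition(cut1, last, [&](int x) { return !(pivot < x); });
    quick_sort_sketch(v, lo, int(cut1 - v.begin()), budget - 1);
    quick_sort_sketch(v, int(cut2 - v.begin()), hi, budget - 1);
}

// entry point: allow 2*log2(n) cheap steps, as described above
void quick_sort_sketch(std::vector<int>& v)
{
    int n = int(v.size());
    quick_sort_sketch(v, 0, n, n > 1 ? 2 * int(std::log2(n)) : 0);
}
```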

