We have several adaptors and algorithms which take a comparator, and require that the comparator produces a strict weak order over the sequence elements:
find_min/find_max/find_minmax
min/max/minmax
sort
set_union/set_difference/set_symmetric_difference/set_intersection
cmp::min/cmp::max (for a sequence of two elements)
The comparator for all of these functions follows the classic STL model of requiring a binary predicate which returns true if the first argument is "less than" the second, defaulting to std::ranges::less. We use the standard library std::strict_weak_order concept, in most cases via our own strict_weak_order_for<Seq1, Seq2> concept which checks std::strict_weak_order for all combinations of the sequences' element_t, value_t& and common_element_t (the Flux equivalent of Ranges' std::indirect_strict_weak_order).
The outlier here is flux::compare(seq1, seq2, cmp) which requires the comparator to return a value of one of the three C++20 comparison categories (std::partial_ordering, std::weak_ordering or std::strong_ordering) and performs a lexicographical comparison of the sequence elements returning that category -- that is, the same as std::lexicographical_compare_three_way.
First of all, this is inconsistent. We shouldn't have one algorithm with a completely different interface to all the others.
A simple solution would be to change our compare to be the equivalent of std::lexicographical_compare -- that is, require just a "less than" predicate like everything else.
A potentially better solution would be to consistently use three way comparison everywhere -- specifically, requiring a comparator for all of the above algorithms (except compare) to return at least std::weak_ordering. After all, we're a C++20 library, we can do things the modern way!
As with everything though, there are pros and cons to doing this:
Pros:
Correctness. At the moment, strict weak ordering is basically just a semantic requirement. Although a user could theoretically supply a custom comparator that returns std::weak_ordering but "lies" about it, this seems much less likely than accidentally passing an incorrect predicate.
Proper handling of NaNs: Right now, flux::sort(vector<float>) compiles but does the wrong thing if the data contains NaN values. With the proposed change (assuming std::compare_three_way as the default comparator), this will fail to compile. This forces the user to think about how they want to handle NaNs: either by passing std::strong_order to use the IEEE total ordering, or by using a custom comparator that ignores NaN values.
Cons:
By default, only types with a spaceship operator could be used with the Flux algorithms. Of course, defaulting the operator is very easy in C++20, but there's probably a lot of code out there that only defines the classic relational operators.
Potentially, worse codegen. These algorithms only really care about less-than, but custom comparators (or op<=> implementations) might need to do two compares rather than one. It's probably worth benchmarking this. In particular, using std::strong_order to compare floats generates a lot more code than std::less.
Ergonomics: right now it's easy to sort in descending order by saying sort(vec, std::greater{}). We'd probably want to provide a comparator that inverts ordering::greater and ordering::less to allow the same thing without the user needing to write a lambda to do it themselves.
Unfamiliarity: the STL has used less-than predicate comparators for 30 years, and now suddenly we're asking people to do something different. Likewise: "why can't I use flux::sort on my vector of doubles, when std::sort works fine?"
Overall, I think this is worth investigating -- at least, benchmarking what happens if we use our current sort with a comparator defined as (roughly):