Add new Algorithms using explicit batch type #496

michaelbacci · 2021-06-23T22:57:08Z

New algorithms are added:

count
count_if
transform_batch
reduce_batch

Tests are updated.
README.md updated.

Using the added algorithms, I've created a nanmean_fast function. In the benchmark below it is faster than the other implementations:

2021-06-24T00:59:51+02:00
Running ./bench_math
Run on (16 X 2300 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 256 KiB (x8)
  L3 Unified 16384 KiB (x1)
Load Average: 2.46, 2.32, 2.13
------------------------------------------------------------------------------
Benchmark                                    Time             CPU   Iterations
------------------------------------------------------------------------------
BM_nanmean/250                             487 ns          486 ns      1313789
BM_nanmean/1024                           2036 ns         2034 ns       348920
BM_nanmean_with_xtensor/250               5533 ns         5529 ns       116447
BM_nanmean_with_xtensor/1024             20085 ns        20063 ns        35643
BM_nanmean_xt/250                         1461 ns         1460 ns       478433
BM_nanmean_xt/1024                        4973 ns         4969 ns       138529
BM_nanmean_fast/250                        273 ns          272 ns      2553337
BM_nanmean_fast/1024                      1014 ns         1014 ns       674205

BM_nanmean is written in a classic C style.
BM_nanmean_with_xtensor is essentially xt::mean(xt::filter(e, !xt::isnan(e))), where e is a tensor.
BM_nanmean_xt calls xt::nanmean.
BM_nanmean_fast uses the added reduce_batch and count_if algorithms.
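
For readers unfamiliar with what a NaN-ignoring mean computes, here is a minimal scalar sketch. It uses only the standard library, not xsimd, and it is not the benchmarked code from this PR; nanmean_loop and nanmean_two_pass are hypothetical names chosen for illustration. The second function only mirrors the count-then-reduce structure described for BM_nanmean_fast, without any SIMD batches.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <numeric>
#include <vector>

// Hypothetical name: a single plain loop, in the spirit of a "classic C style" baseline.
inline double nanmean_loop(const double* data, std::size_t n)
{
    assert(data != nullptr || n == 0);
    double sum = 0.0;
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
    {
        if (!std::isnan(data[i]))
        {
            sum += data[i];
            ++count;
        }
    }
    // With no valid element, return NaN instead of dividing by zero.
    return count > 0 ? sum / static_cast<double>(count)
                     : std::numeric_limits<double>::quiet_NaN();
}

// Hypothetical name: one pass counts the non-NaN values, a second pass
// reduces them (scalar std::count_if / std::accumulate, no batches).
inline double nanmean_two_pass(const std::vector<double>& v)
{
    const auto count = std::count_if(v.begin(), v.end(),
                                     [](double x) { return !std::isnan(x); });
    const double sum = std::accumulate(v.begin(), v.end(), 0.0,
                                       [](double acc, double x)
                                       { return std::isnan(x) ? acc : acc + x; });
    return count > 0 ? sum / static_cast<double>(count)
                     : std::numeric_limits<double>::quiet_NaN();
}
```

Both functions skip NaN entries and return NaN when every entry is NaN or the input is empty.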

JohanMabille

Thanks for your contribution! The implementation is really neat. I have some questions regarding the names of the functions as you can see below.

Regarding the failure in the tests, I think you have to include xsimd_fallback.hpp in the cpp file, so that the compiler can find the default implementation when a type is not supported by the current instruction set.

JohanMabille · 2021-06-24T15:55:30Z

include/xsimd/stl/algorithms.hpp

-    template <class I1, class I2, class O1, class UF>
-    void transform(I1 first, I2 last, O1 out_first, UF&& f)
+    template <class I1, class I2, class O1, class UF, class UFB>
+    void transform_batch(I1 first, I2 last, O1 out_first, UF&& f, UFB&& fb)


Why not keep transform as the function name?

JohanMabille · 2021-06-30T10:04:40Z

test/test_algorithms.cpp

@@ -216,6 +275,102 @@ TEST_F(xsimd_reduce, using_custom_binary_function)
    }
 }

+TEST(algorithms, reduce_batch)
+{
+    const double nan = std::numeric_limits<double>::quiet_NaN();


For ARM, vectorization for double is available only on 64-bit architectures. Therefore, this test should be guarded with something like:
#if XSIMD_ARM_INSTR_SET >= XSIMD_ARM8_64_NEON_VERSION || XSIMD_X86_INSTR_SET >= XSIMD_X86_SSE2_VERSION

JohanMabille · 2021-06-30T10:06:29Z

include/xsimd/stl/algorithms.hpp

+    using enable_if_increment = typename std::enable_if<has_increment<T>::value>::type;
+
+    template <class T>
+    using enable_if_not_increment = typename std::enable_if<!has_increment<T>::value>::type;


I think it would be better to move these metafunctions into a detail namespace; they're not supposed to be part of the API.
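
A minimal sketch of that layout, under stated assumptions: the definition of has_increment is not shown in this thread, so the one below (true when ++x compiles for an lvalue x) is an assumption, and api_sketch, square and advance_once are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace api_sketch // hypothetical namespace standing in for the library's
{
    namespace detail
    {
        // Assumed definition: primary template says "no increment".
        template <class T, class = void>
        struct has_increment : std::false_type {};

        // Chosen only when ++x is well-formed for an lvalue x of type T.
        template <class T>
        struct has_increment<T, decltype(void(++std::declval<T&>()))> : std::true_type {};

        template <class T>
        using enable_if_increment = typename std::enable_if<has_increment<T>::value>::type;

        template <class T>
        using enable_if_not_increment = typename std::enable_if<!has_increment<T>::value>::type;
    }

    // A functor type: callable, but not incrementable.
    struct square
    {
        double operator()(double x) const { return x * x; }
    };

    // Public entry point: uses the helpers but does not expose them by name.
    template <class I, typename = detail::enable_if_increment<I>>
    I advance_once(I it)
    {
        return ++it;
    }
}
```

Callers only see advance_once; the traits remain reachable as api_sketch::detail::... but are clearly marked as implementation details.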

JohanMabille · 2021-06-30T10:45:15Z

include/xsimd/stl/algorithms.hpp

+              typename = enable_if_increment<I2>,
+              typename = enable_if_increment<O1>,
+              typename = enable_if_not_increment<UF>,
+              typename = enable_if_not_increment<UFB>>


It would be more readable to gather these conditions so that you can use a single enable_if condition. That could be something like:

template <class... Args>
struct have_increment : all_true<has_increment<Args>::value...> {};

template <class... Args>
struct not_have_increment : all_true<!has_increment<Args>::value...> {};

template <class I1, class I2, class O1, class UF, class UFB>
using enable_binary_algorithm_t = typename std::enable_if<have_increment<I1, I2, O1>::value && not_have_increment<UF, UFB>::value, int>::type;

Besides, default template arguments are not part of a function template's signature, so the compiler cannot use them to distinguish overloads. It's better to use the enable_if as a non-type template parameter that evaluates to an int when the condition holds:

template <class I1, class I2, class O1, class UF, class UFB, enable_binary_algorithm_t<I1, I2, O1, UF, UFB> = 0> ...
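
To make the suggestion concrete, here is a self-contained sketch in which two five-argument overloads are told apart by an int-valued enable_if parameter. It is an illustration under assumptions, not the code merged in this PR: has_increment and all_true are not defined in the thread, so the definitions below are assumed, and sfinae_sketch, enable_batch_transform_t, enable_two_range_transform_t and selected_overload are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

namespace sfinae_sketch // hypothetical namespace, not xsimd's
{
    namespace detail
    {
        // Assumed definition: true when ++x compiles for an lvalue x of type T.
        template <class T, class = void>
        struct has_increment : std::false_type {};

        template <class T>
        struct has_increment<T, decltype(void(++std::declval<T&>()))> : std::true_type {};

        // Assumed C++11 definition of all_true: the two packs are the same
        // type only when every value in Bs is true.
        template <bool...>
        struct bool_pack {};

        template <bool... Bs>
        struct all_true : std::is_same<bool_pack<true, Bs...>, bool_pack<Bs..., true>> {};

        template <class... Args>
        struct have_increment : all_true<has_increment<Args>::value...> {};

        template <class... Args>
        struct not_have_increment : all_true<!has_increment<Args>::value...> {};

        // Three iterators followed by two functors.
        template <class I1, class I2, class O1, class UF, class UFB>
        using enable_batch_transform_t = typename std::enable_if<
            have_increment<I1, I2, O1>::value && not_have_increment<UF, UFB>::value, int>::type;

        // Four iterators followed by one functor.
        template <class I1, class I2, class I3, class O1, class UF>
        using enable_two_range_transform_t = typename std::enable_if<
            have_increment<I1, I2, I3, O1>::value && not_have_increment<UF>::value, int>::type;
    }

    // Selected for (first, last, out, scalar functor, batch functor).
    template <class I1, class I2, class O1, class UF, class UFB,
              detail::enable_batch_transform_t<I1, I2, O1, UF, UFB> = 0>
    int selected_overload(I1, I2, O1, UF&&, UFB&&)
    {
        return 1;
    }

    // Selected for (first1, last1, first2, out, functor).
    template <class I1, class I2, class I3, class O1, class UF,
              detail::enable_two_range_transform_t<I1, I2, I3, O1, UF> = 0>
    int selected_overload(I1, I2, I3, O1, UF&&)
    {
        return 2;
    }
}
```

Each overload takes five arguments, and the position of the first non-incrementable argument decides which one survives substitution. The bodies only return a tag so the dispatch can be observed; a real algorithm would do the work there.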

michaelbacci force-pushed the new_algorithms branch 3 times, most recently from a2cc837 to 2be89a0 · June 24, 2021 09:44

JohanMabille requested changes Jun 24, 2021

michaelbacci force-pushed the new_algorithms branch 3 times, most recently from 5459d0a to 983a7c4 · June 30, 2021 08:57

JohanMabille reviewed Jun 30, 2021

michaelbacci force-pushed the new_algorithms branch from 983a7c4 to 0f40c17 · July 3, 2021 23:03

Add new Algorithms using explicit batch type

8a81e7c

michaelbacci force-pushed the new_algorithms branch from 0f40c17 to 8a81e7c · July 3, 2021 23:45

JohanMabille force-pushed the master branch 3 times, most recently from 6c6dc1f to 52984ef · October 14, 2021 12:16
