Bench/bowen/desul raja atomics #1624
base: develop
Conversation
dad5812 to 7d0da79
7e171df to c4a9582
```cpp
template<int BLOCK_SZ>
struct ExecPolicyGPU {
  using policy = RAJA::cuda_exec<BLOCK_SZ>;
};
```
This is a synchronous policy.
```cpp
void TimeAtomicOp(int num_iterations = 2, int array_size = 100) {
  RAJA::Timer timer;

  for (int i = 0; i < num_iterations; ++i) {
    // ...
```
It might be good to not time the first run of the kernel. Then time multiple iterations of the loop together (running asynchronously for GPUs) instead of timing each iteration individually.
```cpp
template<int BLOCK_SZ>
struct ExecPolicyGPU {
  using policy = RAJA::cuda_exec<BLOCK_SZ>;
};
```
```diff
- using policy = RAJA::cuda_exec<BLOCK_SZ>;
+ using policy = RAJA::cuda_exec<BLOCK_SZ, true /*asynchronous*/>;
```
By default, RAJA exec policies are synchronous. To make one asynchronous, add a boolean template parameter set to `true`.
```cpp
#include "desul/atomics.hpp"
#include "RAJA/util/Timer.hpp"

#define N 1000000000
```
I think `N` should be a command-line argument selected at run time.
```cpp
// GPU benchmarks
std::cout << "Executing CUDA benchmarks" << std::endl;
std::cout << INDENT << "Executing atomic add benchmarks" << std::endl;
TimeAtomicOp<ExecPolicyGPU<64>::policy, int, AtomicAdd<int, typename GPUAtomic::policy>, true>(4);
```
It may be cleaner and easier to work with to set a constexpr thread block size variable to a default value, such as 256, at the top of the file. I think we agreed in a recent group meeting that thread block size doesn't have a significant performance impact for the simple kernels in this file. @MrBurmark what do you think?
That sounds good to me, then we could vary it fairly easily if we wanted to.
And we could follow the pattern in RAJA Perf if we wanted to try block size sweeps.
I added a `for_each_type` function into RAJA for just these kinds of use cases: https://github.com/LLNL/RAJA/blob/develop/include/RAJA/util/for_each.hpp#L88
Comparison of RAJA's implementation of atomic operations with desul