mre/cudampi


CUDAMPI

CUDAMPI is a large hybrid CPU/GPU sorting network built with CUDA and MPI. It uses a standard Quicksort on CPUs and a custom Bitonic Sort on GPUs. These two algorithms were the fastest in a number of prior benchmarks.
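For readers unfamiliar with Bitonic Sort, the following is a minimal sequential sketch of the textbook network in plain C. It is not the custom GPU kernel used in this repository; the function name `bitonic_sort` and the power-of-two length restriction are assumptions made for illustration. The innermost loop has no dependencies between iterations, which is why the algorithm maps well onto GPU threads.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative only: sorts a[0..n) ascending with a bitonic network.
 * Assumption: n is a power of two (the textbook formulation). */
void bitonic_sort(int *a, size_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);

    /* k: length of the bitonic sequences being merged in this stage. */
    for (size_t k = 2; k <= n; k <<= 1) {
        /* j: distance between the two elements of each comparator. */
        for (size_t j = k >> 1; j > 0; j >>= 1) {
            /* Every i is independent; on a GPU each i would be one thread. */
            for (size_t i = 0; i < n; i++) {
                size_t partner = i ^ j;
                if (partner <= i)
                    continue; /* handle each pair once */

                /* Blocks of size k alternate between ascending and descending. */
                int ascending = (i & k) == 0;
                int out_of_order = ascending ? (a[i] > a[partner])
                                             : (a[i] < a[partner]);
                if (out_of_order) {
                    int tmp = a[i];
                    a[i] = a[partner];
                    a[partner] = tmp;
                }
            }
        }
    }
}
```

Calling `bitonic_sort(data, 8)` on an 8-element array leaves it in ascending order. Inputs whose length is not a power of two would need padding, which this sketch does not attempt.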

To presort the data, we execute the first step of a bucket sort. Each bucket holds only numbers in a given range, and every number is placed into its corresponding bucket. This distribution step can be done in parallel. Afterwards, each bucket can be sorted independently on either a CPU or a GPU.
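The distribution step can be sketched as follows. This is a single-process illustration under assumptions of my own: unsigned 32-bit keys, a known value range `[lo, hi]`, and equal-width buckets. The names `bucket_index` and `partition_into_buckets` are hypothetical and do not come from the repository.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Map a value in [lo, hi] to one of nbuckets equal-width ranges. */
size_t bucket_index(uint32_t v, uint32_t lo, uint32_t hi, size_t nbuckets)
{
    assert(lo <= v && v <= hi && nbuckets > 0);
    uint64_t span = (uint64_t)hi - lo + 1;
    return (size_t)(((uint64_t)(v - lo) * nbuckets) / span);
}

/* Copy in[0..n) into out[0..n), grouped by bucket.
 * offsets must have nbuckets + 1 entries; on return bucket b occupies
 * out[offsets[b] .. offsets[b + 1]). Returns 0 on success, -1 on
 * allocation failure. */
int partition_into_buckets(const uint32_t *in, size_t n,
                           uint32_t lo, uint32_t hi, size_t nbuckets,
                           uint32_t *out, size_t *offsets)
{
    size_t *cursor = calloc(nbuckets, sizeof *cursor);
    if (cursor == NULL)
        return -1;

    /* Pass 1: count how many values fall into each bucket. */
    for (size_t b = 0; b <= nbuckets; b++)
        offsets[b] = 0;
    for (size_t i = 0; i < n; i++)
        offsets[bucket_index(in[i], lo, hi, nbuckets) + 1]++;

    /* Prefix sum turns the counts into start offsets. */
    for (size_t b = 0; b < nbuckets; b++)
        offsets[b + 1] += offsets[b];

    /* Pass 2: scatter each value into its bucket's slice. */
    for (size_t i = 0; i < n; i++) {
        size_t b = bucket_index(in[i], lo, hi, nbuckets);
        out[offsets[b] + cursor[b]++] = in[i];
    }

    free(cursor);
    return 0;
}
```

Because every value in bucket `b` is smaller than every value in bucket `b + 1`, sorting the slices independently and reading them in bucket order yields a fully sorted sequence. Both passes iterate over independent elements, so they can be split across processes, with per-process counts combined before the scatter.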

The sorting network coordinates its processes through the filesystem, so no explicit locks are required. MPI-IO is used to write to the filesystem in parallel. The result is a set of sorted files.
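The README does not spell out how the filesystem replaces locks. One common lock-free pattern, shown here purely as an assumption and not as the project's actual scheme, is to let a worker claim a bucket file by renaming it: POSIX `rename` is atomic, so when several workers race for the same file exactly one succeeds. The function name `try_claim` and the file naming are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Try to claim a pending bucket file by renaming it to a worker-specific
 * name. Assumption: claimed_path is unique per worker.
 * Returns 1 if this worker now owns the file, 0 if another worker
 * claimed it first (the source no longer exists), -1 on other errors. */
int try_claim(const char *todo_path, const char *claimed_path)
{
    assert(todo_path != NULL && claimed_path != NULL);

    if (rename(todo_path, claimed_path) == 0)
        return 1;

    return errno == ENOENT ? 0 : -1;
}
```

A worker would loop over the pending bucket files, call `try_claim` on each, and sort only those for which it returns 1. Whether rename stays atomic depends on the filesystem; this sketch assumes a POSIX-compliant one.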