Parallel Toolbox

An arbitrary assortment of distributed algorithms implemented on MPI. Focus is on ranking, sorting and basic distributed datastructures. Documentation is provided for each algorithm outlining its use case and trade-offs.

At the moment, tuning parameters are chosen arbitrarily and without much benchmarking, but I plan to remedy that in the future. Further, those parameters will get exposed, so applications demanding specific tuning of algorithms can easily achieve it.

About Safety

This library does not use unsafe itself, but (naturally) makes heavy use of rsmpi which is a thin binding and therefore abundantly unsafe. The binding lib makes no attempt at sanitizing input, and it is therefore extremely easy to provoke a segfault by passing wrong buffer sizes to MPI or by confusing data types.

Do not use this library in security-critical applications.

Algorithms

Inefficient Ranking and Sorting

A few inefficient routines for ranking and sorting are provided, which are reused in more efficient algorithms as the base case for recursions. The implementations (short of matrix_rank for now) are generic; the ranking variants return usize, because that is the indexing type which makes sense for ranks.

inefficient_sort sends all data to one processor, which will sort and redistribute it. It is the most inefficient and slowest algorithm, but works in all cases
inefficient_rank sends all data to one processor, which will calculate ranks and return them to the original processors. Otherwise, it is the same design as inefficient_sort
matrix_rank requires a square number of processors, but is theoretically more efficient than alternatives

Sample Sort

As an efficient all-purpose sorting algorithm the toolbox provides sample_sort. It expects roughly equally sized slices of data on all clients, and will sort the data according to the MPI processor ranks. It guarantees constant amount of recursions with high probability, and data distribution will be no worse than $(1 + \epsilon)\frac{n}{p}$ where $n$ is the total data amount, $p$ is the number of processors and $\epsilon$ is a tuning parameter.

Selection

The crate provides a single-threaded (select_k) and a parallel (parallel_select_k) selection algorithm that select the $k$ smallest entries from an unsorted slice. Both algorithms are randomized Las Vegas algorithms with Chernoff bounds proving their constant recursion depth. Both work with varying input slice lengths.

Name		Name	Last commit message	Last commit date
Latest commit History 55 Commits
examples		examples
src		src
.gitignore		.gitignore
Cargo.toml		Cargo.toml
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

examples

examples

src

src

.gitignore

.gitignore

Cargo.toml

Cargo.toml

README.md

README.md

Repository files navigation

Parallel Toolbox

About Safety

Algorithms

Inefficient Ranking and Sorting

Sample Sort

Selection

About

Releases

Packages

Languages

Cydhra/parallel_toolbox

Folders and files

Latest commit

History

Repository files navigation

Parallel Toolbox

About Safety

Algorithms

Inefficient Ranking and Sorting

Sample Sort

Selection

About

Topics

Resources

Stars

Watchers

Forks

Languages