Simple Decomp

This library decomposes two- and three-dimensional structured meshes between multiple processes in a pencil-like manner:

The main objective is to perform so-called parallel matrix transpose (rotating pencils):

which are useful for performing operations where a process needs to contain all data in one direction, e.g. multi-dimensional Fast Fourier Transforms or Thomas algorithm.

Features

C99-compatible.
Domain decompositions and pencil rotations for arbitrary data types using a single function.
Memory efficient: no internal buffers to store intermediate data.
Matrix transposition (row-major and column-major orders are interchanged) while rotating the pencils to improve the cache efficiency of the successive operations (e.g. Fast Fourier Transforms, Thomas algorithm).
Transparent (i.e. hackable) and well-documented implementation.
Well-tested for various data types, domain sizes and number of processes.

Dependency

Quick start

Checkout this project

git clone https://github.com/NaokiHori/SimpleDecomp
cd SimpleDecomp

and execute

make all
mpirun -2 --oversubscribe ./a.out

which runs a small program to get started, which is elaborated below.

To understand the concept easily, I consider a two-dimensional array which is split into two sub-arrays (i.e. two processes) and is contained by each process:

+-------------+
| 16 17 18 19 |
| 12 13 14 15 |
| 08 09 10 11 |
+-------------+
| 04 05 06 07 |
| 00 01 02 03 |
+-------------+

where the lower (from 00 to 07) and the upper (from 08 to 19) parts are stored by the rank 0 and 1, respectively. Notice that the matrix is stored in the row-major order, i.e.

00 01 02 03 04 05 06 07

in the buffer of the rank 0, and

08 09 10 11 12 13 14 15 16 17 18 19

in the buffer of the rank 1.

They are called x pencils in this project.

After the pencils are rotated, I obtain

+-------+-------+
| 16 17 | 18 19 |
| 12 13 | 14 15 |
| 08 09 | 10 11 |
| 04 05 | 06 07 |
| 00 01 | 02 03 |
+-------+-------+

where the left and the right parts are stored by the rank 0 and 1, respectively. Notice that the matrix is now stored in the column-major order, i.e.:

00 04 08 12 16 01 05 09 13 17

in the buffer of the rank 0, while

02 06 10 14 18 03 07 11 15 19

in the buffer of the rank 1.

They are named as y pencils in this project.

Check src/main.c to see how the matrix is initialised and the library APIs are called.

Usage

Please refer to the documentation for practical usages and all available APIs.

Application

I use this library to solve the incompressible Navier-Stokes equations efficiently on clusters.

Finite-difference method:
Spectral method:

Reinventing the wheel?

Although the concept is similar to 2decomp-fft, the pencil rotations are largely tailored to my purposes and the implementation is simplified.

I also recognise many other nice libraries which aim to do the same thing (in Juila or in Fortran, I am pretty sure there are more).

Acknowledgement

This library is inspired by 2decomp-fft.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.github		.github
docs/source		docs/source
include		include
src		src
test		test
LICENSE		LICENSE
Makefile		Makefile
README.rst		README.rst
install.sh		install.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.github

.github

docs/source

docs/source

include

include

src

src

test

test

LICENSE

LICENSE

Makefile

Makefile

README.rst

README.rst

install.sh

install.sh

Repository files navigation

Simple Decomp

Features

Dependency

Quick start

Usage

Application

Reinventing the wheel?

Acknowledgement

About

Languages

License

NaokiHori/SimpleDecomp

Folders and files

Latest commit

History

Repository files navigation

Simple Decomp

Features

Dependency

Quick start

Usage

Application

Reinventing the wheel?

Acknowledgement

About

Topics

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Languages