
mpi-2ddc

This program performs a two-dimensional domain decomposition of the equation y = Ax, using MPI to communicate and reduce the matrix and vector computations. By breaking the serial problem into smaller sub-domains, each processor or node does less work, which speeds up the overall computation.

4-Core Standalone Computer Speedup with 4 MPI Processes

$$\text{Speedup} = \frac{\text{Serial Time}}{\text{Parallel Time}} = \frac{3.30635\,\text{s}}{0.792755\,\text{s}} \approx 4.17$$

Details

Serial Equation: A(M,N) * x(N) = y(M) (where M = Global Matrix Rows, N = Global Matrix Cols)

Becomes

MPI 2DDC Equation: A(m,n) * x(n) = y(m) (where m = local rows and n = local cols owned by each process in the logical MPI grid)

In the MPI equation, (m, n) are local to each MPI process, representing its 2D sub-domain.
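To make the mapping concrete, the sketch below (illustrative only, not the repository's actual code) shows how each rank could derive its grid coordinates and local sizes m = M/P and n = N/Q, assuming a row-major rank-to-grid layout and dimensions that divide evenly:

```cpp
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    const int M = 16384, N = 16384;   // global matrix dimensions (defaults below)
    const int P = 2, Q = 2;           // logical process grid: P rows x Q cols

    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Row-major mapping of ranks onto the P x Q grid (an assumption here).
    const int grid_row = rank / Q;
    const int grid_col = rank % Q;

    // Local sub-domain owned by this rank, assuming P divides M and Q divides N.
    const int m = M / P;              // local rows
    const int n = N / Q;              // local cols

    std::printf("rank %d -> grid (%d,%d), local block A(%d x %d)\n",
                rank, grid_row, grid_col, m, n);

    MPI_Finalize();
    return 0;
}
```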

Currently, the default sizes are:

M = 16384

N = 16384

P = 2

Q = 2

This means the number of processes passed to mpirun -np should be 4 (P × Q) for now.

Compile: make

Run: mpirun -np 4 ./runmpi

M and N have only been tested with square matrices, so it's best to keep these two parameters equal.

Important

If you change the sizes in the source code, make sure that P*Q equals the number of processes passed to the -np flag.

  E.g., mpirun -np 4 ./runmpi 1024 1024 2 2 
  or,   mpirun -np 8 ./runmpi 1024 1024 4 2 
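As a guard against a mismatch, a startup check along these lines can abort the run with a clear message. This is a minimal sketch with hypothetical names; the repository may or may not perform such a check:

```cpp
#include <mpi.h>
#include <cstdio>
#include <cstdlib>

// Illustrative guard (names are hypothetical): abort early if the
// P x Q process grid does not match the number of launched processes.
void check_grid(int P, int Q) {
    int world_size = 0, rank = 0;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (P * Q != world_size) {
        if (rank == 0)
            std::fprintf(stderr,
                         "Error: P*Q = %d, but %d MPI processes were launched\n",
                         P * Q, world_size);
        MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);
    }
}
```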

The default way to run the code is:

  mpirun -np 4 ./runmpi 
  ./runserial 

If you have 4 processor cores available, you can just run

  mpirun ./runmpi 

Note: The two terminal commands above are just defaults for running mpi-2ddc-main.cpp. To compile and run the visual 5x5 demo, use the following commands:

  • mpic++ -Wall mpi-2ddc-5x5demo.cpp -o rundemo
  • mpirun -np 4 ./rundemo

Source Code Files Info

serial.cpp: a serial baseline for performance evaluation.
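For reference, the serial baseline boils down to a plain dense matrix-vector product like the sketch below (illustrative only; the actual contents of serial.cpp may differ):

```cpp
#include <cstddef>
#include <vector>

// Illustrative dense y = A * x (row-major A), the operation that the
// MPI version decomposes across the P x Q process grid.
std::vector<double> matvec(const std::vector<double>& A,   // M x N, row-major
                           const std::vector<double>& x,   // length N
                           int M, int N) {
    std::vector<double> y(M, 0.0);
    for (int i = 0; i < M; ++i)
        for (int j = 0; j < N; ++j)
            y[i] += A[static_cast<std::size_t>(i) * N + j] * x[j];
    return y;
}
```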

mpi-2ddc-main.cpp: the main file, built with the make command and run either with mpirun or by submitting test-job.sh to a cluster scheduler.

mpi-2ddc-5x5demo.cpp: provides a visual demonstration of the domain decomposition.

Ideal use case for this file:

  • Comment out line 105, then uncomment the two lines of code marked DEMO1 (lines 87-88). Recompile and run to view how A(5,5) gets mapped to A(m,n), where each MPI process has its own local values for m and n.

  • Add the comments back to those lines and uncomment lines 95-96 to view the dimensions of Vector(x) and which nodes/MPI processes hold the respective values.

  • Lastly, uncomment line 105 to view how the final solution is communicated and stored (one common pattern for this step is sketched below).
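For context on that final step, a common pattern in a 2D-decomposed y = Ax is to sum the partial results held by ranks in the same process-grid row with a row-wise reduction. The sketch below illustrates that pattern with hypothetical names; it is not necessarily what line 105 in the demo does:

```cpp
#include <mpi.h>
#include <vector>

// Illustrative final step for a 2D-decomposed y = A*x: ranks in the same
// process-grid row hold partial sums of the same y entries, so a row-wise
// reduction combines them. One common pattern; not necessarily the demo's code.
std::vector<double> reduce_row_partials(const std::vector<double>& y_partial,
                                        int grid_row, int grid_col) {
    MPI_Comm row_comm;
    // One communicator per process-grid row; ranks ordered by grid column.
    MPI_Comm_split(MPI_COMM_WORLD, grid_row, grid_col, &row_comm);

    std::vector<double> y_local(y_partial.size(), 0.0);
    MPI_Reduce(y_partial.data(), y_local.data(),
               static_cast<int>(y_partial.size()),
               MPI_DOUBLE, MPI_SUM, /*root=*/0, row_comm);

    MPI_Comm_free(&row_comm);
    return y_local;   // meaningful only on grid column 0 of each row
}
```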
