GitHub - Wenox/fast-fw: Optimized implementation of Floyd Warshall algorithm using modern AVX2.

Fast Floyd Warshall

An optimized implementation of the Floyd-Warshall all-pairs shortest-path algorithm using the AVX2 extended instruction set.

SIMD vector instructions update several entries of the distance matrix per instruction, which parallelizes the algorithm efficiently at the CPU level.
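To illustrate the idea, here is a minimal C++ sketch that vectorizes the innermost loop with AVX2 intrinsics. It is not the repository's assembly code. The function names, the `INF` sentinel, and the row-major matrix layout are all assumptions made for this example, and the sketch assumes GCC or Clang on x86-64, non-negative weights, and a zero diagonal.

```cpp
#include <immintrin.h>
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical "no edge" marker, chosen so that INF + INF still fits in int32_t.
constexpr int32_t INF = 1 << 29;

// Scalar baseline. dist is a row-major n x n matrix of shortest-known distances.
void floyd_warshall_scalar(std::vector<int32_t>& dist, int n) {
    for (int k = 0; k < n; ++k)
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                dist[i * n + j] = std::min(dist[i * n + j],
                                           dist[i * n + k] + dist[k * n + j]);
}

// AVX2 variant: relaxes eight 32-bit distances per iteration of the j loop.
// dist[i][k] is hoisted out of the loop, which is safe when the diagonal is
// zero and there are no negative cycles.
__attribute__((target("avx2")))
void floyd_warshall_avx2(std::vector<int32_t>& dist, int n) {
    for (int k = 0; k < n; ++k) {
        const int32_t* row_k = &dist[static_cast<std::size_t>(k) * n];
        for (int i = 0; i < n; ++i) {
            int32_t* row_i = &dist[static_cast<std::size_t>(i) * n];
            const int32_t d_ik = row_i[k];
            const __m256i vd_ik = _mm256_set1_epi32(d_ik);  // broadcast dist[i][k]
            int j = 0;
            for (; j + 8 <= n; j += 8) {
                __m256i d_kj = _mm256_loadu_si256(
                    reinterpret_cast<const __m256i*>(row_k + j));
                __m256i d_ij = _mm256_loadu_si256(
                    reinterpret_cast<const __m256i*>(row_i + j));
                __m256i via_k = _mm256_add_epi32(vd_ik, d_kj);
                _mm256_storeu_si256(reinterpret_cast<__m256i*>(row_i + j),
                                    _mm256_min_epi32(d_ij, via_k));
            }
            for (; j < n; ++j)  // scalar tail when n is not a multiple of 8
                row_i[j] = std::min(row_i[j], d_ik + row_k[j]);
        }
    }
}

// Picks the AVX2 path only when the running CPU supports it.
void floyd_warshall(std::vector<int32_t>& dist, int n) {
    assert(dist.size() == static_cast<std::size_t>(n) * n);
    if (__builtin_cpu_supports("avx2"))
        floyd_warshall_avx2(dist, n);
    else
        floyd_warshall_scalar(dist, n);
}
```

Calling `floyd_warshall(dist, n)` on a matrix whose missing edges are set to `INF` overwrites it in place with the shortest-path distances. The runtime check lets the same binary run on CPUs without AVX2.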

The speedup grows further when graph weights are limited to 8 or 16 bits, because a single 256-bit AVX2 register then holds 32 or 16 weights at once.
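The 16-bit case can be sketched the same way. The example below is an illustration under stated assumptions, not the repository's code. It stores distances as `uint16_t`, uses the hypothetical sentinel `INF16 = 0xFFFF` for "no path", and relies on the saturating add `_mm256_adds_epu16` so that adding to the sentinel never wraps around. All real path lengths must stay below 65535. The function names are invented for this sketch, which assumes GCC or Clang on x86-64.

```cpp
#include <immintrin.h>
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical "no path" marker; saturating adds keep it at 0xFFFF.
constexpr uint16_t INF16 = 0xFFFF;

// Scalar relaxation of row_i[from..n) through vertex k, with saturation.
static void relax_row_scalar(uint16_t* row_i, const uint16_t* row_k,
                             uint16_t d_ik, int from, int n) {
    for (int j = from; j < n; ++j) {
        uint32_t via_k = std::min<uint32_t>(uint32_t(d_ik) + row_k[j], INF16);
        row_i[j] = std::min<uint16_t>(row_i[j], static_cast<uint16_t>(via_k));
    }
}

// AVX2 relaxation: sixteen 16-bit distances per iteration.
// Returns the number of leading entries that were processed.
__attribute__((target("avx2")))
static int relax_row_avx2(uint16_t* row_i, const uint16_t* row_k,
                          uint16_t d_ik, int n) {
    const __m256i vd_ik = _mm256_set1_epi16(static_cast<short>(d_ik));
    int j = 0;
    for (; j + 16 <= n; j += 16) {
        __m256i d_kj = _mm256_loadu_si256(
            reinterpret_cast<const __m256i*>(row_k + j));
        __m256i d_ij = _mm256_loadu_si256(
            reinterpret_cast<const __m256i*>(row_i + j));
        __m256i via_k = _mm256_adds_epu16(vd_ik, d_kj);  // unsigned saturating add
        _mm256_storeu_si256(reinterpret_cast<__m256i*>(row_i + j),
                            _mm256_min_epu16(d_ij, via_k));
    }
    return j;
}

// In-place Floyd-Warshall on a row-major n x n matrix of 16-bit distances.
void floyd_warshall_u16(std::vector<uint16_t>& dist, int n) {
    assert(dist.size() == static_cast<std::size_t>(n) * n);
    const bool has_avx2 = __builtin_cpu_supports("avx2");
    for (int k = 0; k < n; ++k) {
        const uint16_t* row_k = &dist[static_cast<std::size_t>(k) * n];
        for (int i = 0; i < n; ++i) {
            uint16_t* row_i = &dist[static_cast<std::size_t>(i) * n];
            const uint16_t d_ik = row_i[k];
            if (d_ik == INF16) continue;  // no path i -> k, nothing to relax
            int done = has_avx2 ? relax_row_avx2(row_i, row_k, d_ik, n) : 0;
            relax_row_scalar(row_i, row_k, d_ik, done, n);  // tail or fallback
        }
    }
}
```

Compared with 32-bit lanes, each vector instruction here covers twice as many matrix entries. The cost is the reduced value range, so this variant only fits graphs whose shortest paths are known to stay within 16 bits.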

Results

The assembly implementation clearly outperforms the C++ implementation, even when the C++ code is compiled with the -O3 flag.

The comparison was run on large graphs, both sparse and dense.

Comparison including C++

Comparison excluding C++

November, 2020.

Folders and files (4 commits):

application
asm
cpp
docs
.gitignore
LICENSE
README.md
