User guide

Build guide

The Mahalanobis-average hierarchical clustering project uses CMake as its build system.

To build the executable, run the CMake configure and build commands in a build directory. Afterwards, the para subdirectory of the build directory contains the gmhclust executable.

The only dependency is the CUDA compiler (nvcc). The executable should be portable to all platforms supporting nvcc; it was successfully tested on Ubuntu 18.04 and Windows 10.

See the following steps:

cd gmhc
mkdir build && cd build
cmake ..
cmake --build .
ls para/gmhclust

Running the program

The gmhclust executable takes three command-line parameters:

1. Dataset file path – A mandatory path to a dataset file. The file is binary and is structured as follows:
   A. 4B unsigned integer D – point dimension
   B. 4B unsigned integer N – number of points
   C. N·D 4B floats – N single-precision D-dimensional points stored one after another
2. Mahalanobis threshold – A mandatory positive number specifying the Mahalanobis threshold.
3. Apriori assignments file path – An optional path to an apriori assignments file: a text file with space-separated 4B unsigned integers (assignment numbers). The file contains as many integers as there are points in the dataset, and the i-th integer is the assignment number of the i-th point in the dataset file. If the i-th and j-th assignment numbers are equal, the i-th and j-th points belong to the same apriori cluster.
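
The two input files can be produced with a short script. The following Python sketch is not part of the project; the function names write_dataset and write_assignments are hypothetical. It assumes little-endian byte order (the native order on the x86 platforms the executable was tested on), which the format description above does not state explicitly.

```python
import struct


def write_dataset(path, points):
    """Write points (a non-empty list of equal-length float sequences)
    as a binary dataset file: D, N, then N*D single-precision floats."""
    n = len(points)
    d = len(points[0])
    if any(len(p) != d for p in points):
        raise ValueError("all points must have the same dimension")
    with open(path, "wb") as f:
        # Header: dimension D first, then point count N, 4 bytes each.
        # Assumption: little-endian byte order ("<").
        f.write(struct.pack("<II", d, n))
        # Points are stored one after another, D floats per point.
        for p in points:
            f.write(struct.pack(f"<{d}f", *p))


def write_assignments(path, assignments):
    """Write one space-separated assignment number per dataset point."""
    if any(a < 0 or a > 0xFFFFFFFF for a in assignments):
        raise ValueError("assignment numbers must fit in 4B unsigned integers")
    with open(path, "w") as f:
        f.write(" ".join(str(a) for a in assignments))


if __name__ == "__main__":
    # Four 2-dimensional points; points 0 and 2 share an apriori cluster.
    write_dataset("data", [(0.0, 0.0), (5.0, 5.0), (0.1, 0.2), (9.0, 1.0)])
    write_assignments("asgns", [0, 1, 0, 2])
```

The resulting dataset file is 8 + 4·N·D bytes long, which gives a quick sanity check before passing it to gmhclust.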

The following command runs gmhclust on the dataset file data with the threshold 100 and the apriori assignments file asgns:
./gmhclust data 100 asgns

Output

The executable writes the clustering process to the standard output in a text format. Each line contains the IDs of the two merged clusters followed by their merge distance.
IDs are assigned as follows:

Initial dataset points are assigned the nonnegative integers in [0, n-1].
Merged clusters are assigned the next unused ID, starting at n. Since n points allow at most n-1 merges, these IDs lie in [n, 2n-2].

An example output for 4 points in a dataset would look like this:

0 2 0.65
1 4 1.2
3 5 0.1
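
The ID scheme above is enough to reconstruct which points each cluster contains. The following Python sketch is not part of the project; parse_merges and cluster_members are hypothetical names. It assumes every output line holds exactly three whitespace-separated fields and that the k-th line (counting from 0) creates the cluster with ID n + k.

```python
def parse_merges(text):
    """Parse gmhclust output into (id_a, id_b, distance) tuples."""
    merges = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        a, b, dist = line.split()
        merges.append((int(a), int(b), float(dist)))
    return merges


def cluster_members(merges, n):
    """Map every cluster ID to the list of original point IDs it contains."""
    # Initial points 0..n-1 are singleton clusters.
    members = {i: [i] for i in range(n)}
    for k, (a, b, _) in enumerate(merges):
        # The k-th merge receives the next unused ID, n + k.
        members[n + k] = members[a] + members[b]
    return members


if __name__ == "__main__":
    output = "0 2 0.65\n1 4 1.2\n3 5 0.1\n"
    merges = parse_merges(output)
    for cluster_id, points in sorted(cluster_members(merges, 4).items()):
        print(cluster_id, sorted(points))
```

For the example output, cluster 4 is the merge of points 0 and 2, cluster 5 adds point 1, and cluster 6 contains all four points.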

R package build guide

To build the package, run the CMake configure and build commands in a build directory. Specifically, build the gmhc_package target.

See the following steps (the last step requires root rights):

cd gmhc
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
cmake --build . --target gmhc_package

Building this target installs the gmhc package into the default R package directory. In an R session, it can then be used as follows:

library('gmhc')
?gmhclust
