Skip to content

aabbas90/cluster-fug

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

32 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ClusterFuG: Clustering Fully connected Graphs by Multicut (arxiv)

Solve multicut problem on complete graph alleviating need for graph structure design. Similarity between a pair of nodes is computed by inner product of input node features.

Requirements

We use GCC 10. Other combinations might also work but not tested. CMake is required for compilation.

Installation

C++ solver:

git clone git@github.com:aabbas90/cluster-fug.git
cd cluster-fug
git submodule update --init --recursive
mkdir build
cd build
cmake ..
make -j 4

Python bindings:

We also provide python bindings using pybind. Simply run the following command:

python -m pip install git+https://github.com/aabbas90/cluster-fug.git

Usage

C++ solver:

We require multicut instance stored in a (.txt) file in the following format:

FACTORIZED COMPLETE MULTICUT
a
n d
feature_node_1_dim_1 ... feature_node_1_dim_d
...
feature_node_n_dim_1 ... feature_node_n_dim_d

which corresponds to a graph with n nodes, and d-dimensional node features. The value of a controls affinity strength where increasing the value helps in creating more clusters and vice versa. Afterwards run (for our DAppLAEC algorithm):

./src/dense_multicut_text_input <PATH_TO_PROBLEM_INSTANCE_TXT> HNSW laec_bf_later

Other algorithms mentioned in the paper can be run as:

./src/dense_multicut_text_input <PATH_TO_PROBLEM_INSTANCE_TXT> <INDEX_TYPE> <CONTRACTION_TYPE>

where INDEX_TYPE and CONTRACTION_TYPE can be choosen as:

Algorithm INDEX_TYPE CONTRACTION_TYPE
GAEC brute_force gaec
DGAEC faiss_brute_force dense_gaec
DGAECInc faiss_brute_force dense_gaec_inc
DLAEC faiss_brute_force dense_laec
DAppLAEC HNSW dense_laec_bf_later

For example to run GAEC: bash ./src/dense_multicut_text_input <PATH_TO_PROBLEM_INSTANCE_TXT> brute_force gaec For more information run:

./src/dense_multicut_text_input --help
	```

### Python solver:
An example to compute multicut on a set of features:
```python
import dense_multicut_py
import numpy as np

num_nodes = 100
dim = 16
affinity_strength = 5.0
k = 5
k_cap = k
# DAppLAEC:
INDEX_TYPE, CONTRACTION_TYPE = "HNSW", "dense_laec_bf_later"
features = np.random.rand(num_nodes, dim).astype(np.float32)
node_labels = dense_multicut_py.dense_multicut(features.flatten(), num_nodes, dim, affinity_strength, k, INDEX_TYPE, CONTRACTION_TYPE, k_cap)

Benchmark instances:

Instances used in the paper can be obtained from structured-prediction-prob-archive.

References

If you use this work please cite as

@article{abbas2023clusterfug,
  title={ClusterFuG: Clustering Fully connected Graphs by Multicut},
  author={Abbas, Ahmed and Swoboda, Paul},
  journal={arXiv preprint arXiv:2301.12159},
  year={2023}
}

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C++ 93.5%
  • Python 3.6%
  • CMake 2.9%