# Convex Clustering

## Objective

$$\min_{X \in \mathbb{R}^{d \times n}} \; \frac{1}{2}\sum_{i=1}^{n}\lVert x_i - a_i\rVert_2^2 \;+\; \gamma \sum_{i < j} \phi_\delta\big(\lVert x_i - x_j\rVert_2\big),$$

where $\phi_\delta$ is the Huber penalty with threshold $\delta$ (a smooth surrogate for the norm, so that the gradient and Hessian exist everywhere), $X \in \mathbb{R}^{d \times n}$ and $A \in \mathbb{R}^{d \times n}$ are the parameter and data matrices, respectively, $d$ is the number of features, and $n$ is the number of points we want to cluster. Each column $a_i$ of $A$ is a data point, and the corresponding column $x_i$ of $X$ is that point's "centroid". After solving this problem and obtaining the optimal solution $X^*$, two data points $i$ and $j$ are clustered into the same group if

$$\lVert x_i^* - x_j^*\rVert_2 \le \epsilon,$$

i.e., if their centroids are close enough to each other, for some predefined hyperparameter $\epsilon > 0$.
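As a concrete sketch, the objective can be evaluated in NumPy roughly as follows. This is an illustration only, not the contents of `huber_obj.py`; the specific Huber form (quadratic for $r \le \delta$, linear beyond) and the function names are assumptions:

```python
import numpy as np

def huber(r, delta):
    """Assumed Huber penalty: quadratic near zero, linear in the tail."""
    return np.where(r <= delta, 0.5 * r**2 / delta, r - 0.5 * delta)

def objective(X, A, gamma, delta):
    """Convex-clustering objective: fit term plus Huber-smoothed fusion penalty.

    X, A are (d, n) arrays whose columns are centroids and data points.
    """
    fit = 0.5 * np.sum((X - A) ** 2)
    n = X.shape[1]
    penalty = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            penalty += huber(np.linalg.norm(X[:, i] - X[:, j]), delta)
    return fit + gamma * penalty
```

The double loop over pairs is $O(n^2)$ and is written for clarity; a vectorized version would precompute all pairwise differences at once.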

## Optimization Algorithms

We provide implementations of the following two optimization algorithms for convex clustering:

  • Accelerated Gradient Method (AGM)
  • Newton-CG
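For orientation, a minimal AGM sketch with Nesterov-style momentum might look like the following. This is a generic illustration under assumed names (`agm`, a caller-supplied `grad` and step size), not the repo's implementation:

```python
import numpy as np

def agm(grad, X0, step, iters=200):
    """Accelerated gradient method (Nesterov momentum) on a smooth objective.

    grad: callable returning the gradient at a point.
    X0:   starting iterate (NumPy array).
    step: fixed step size, e.g. 1/L for an L-Lipschitz gradient.
    """
    X = X0.copy()
    Y = X0.copy()          # extrapolated point
    t = 1.0                # momentum parameter
    for _ in range(iters):
        X_new = Y - step * grad(Y)                      # gradient step at Y
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t**2))  # momentum update
        Y = X_new + ((t - 1.0) / t_new) * (X_new - X)    # extrapolation
        X, t = X_new, t_new
    return X
```

For example, minimizing the pure fit term $\frac{1}{2}\lVert X - A\rVert_F^2$ (gradient $X - A$) drives the iterates to $A$.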

## Derivation of Gradient and Hessian

AGM requires the gradient of the objective function, while Newton-CG additionally requires the Hessian. The detailed mathematical derivation is lengthy and is therefore included in grad_hess_derivation.pdf; the NumPy implementation is in huber_obj.py.
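Under the Huber form assumed earlier (quadratic for $r \le \delta$, linear beyond), the gradient takes roughly the following shape. This is a hedged sketch of the standard result, not the contents of huber_obj.py or grad_hess_derivation.pdf:

```python
import numpy as np

def grad_objective(X, A, gamma, delta):
    """Gradient of 0.5||X - A||_F^2 plus a Huber-smoothed pairwise penalty.

    For u = x_i - x_j with r = ||u||, the penalty's gradient w.r.t. x_i is
    u/delta in the quadratic region (r <= delta) and u/r in the linear region.
    """
    G = X - A  # gradient of the fit term
    n = X.shape[1]
    for i in range(n):
        for j in range(i + 1, n):
            u = X[:, i] - X[:, j]
            r = np.linalg.norm(u)
            if r <= delta:
                g = u / delta   # quadratic region: phi'(r) = r / delta
            else:
                g = u / r       # linear region: phi'(r) = 1
            G[:, i] += gamma * g
            G[:, j] -= gamma * g
    return G
```

A finite-difference check against the objective is a quick way to validate a derivation like this before plugging it into AGM or Newton-CG.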

## Other Functionalities

We also provide a cluster function in utils.py that builds a graph over the centroids and uses DFS to extract its connected components as clusters.
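The graph-plus-DFS idea can be sketched as follows: connect two points whenever their centroids are within $\epsilon$, then label connected components with an iterative depth-first search. This is an illustrative sketch of the approach, with assumed names, not the actual utils.py:

```python
import numpy as np

def cluster(X, eps):
    """Assign a cluster label to each column of X (the optimal centroids).

    Builds an adjacency list (i ~ j iff ||x_i - x_j|| <= eps) and labels
    connected components via iterative DFS.
    """
    n = X.shape[1]
    adj = [[j for j in range(n)
            if j != i and np.linalg.norm(X[:, i] - X[:, j]) <= eps]
           for i in range(n)]
    labels = [-1] * n
    label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        stack = [start]          # iterative DFS over one component
        while stack:
            v = stack.pop()
            if labels[v] != -1:
                continue
            labels[v] = label
            stack.extend(adj[v])
        label += 1
    return labels
```

Note that connectivity is transitive here: two centroids can land in the same cluster through a chain of close neighbors even if they are more than $\epsilon$ apart themselves.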