Skip to content

jerrywn121/convex_clustering

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

30 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Convex Clustering

Objective

where

X and A are the parameter and data matrix, respectively. d is the number of features and n is the number of points we want to cluster. Each column of A is a data point and each column of X is the "centroid" of each data point. After solving this equation and get the optimal solution for X, different data points can be clustered in the same group if

i.e., different points' centroids are close enough to each other, for some predefined hyperparameter .

Optimization Algorithms

We provide implementation of following two optimization algorithm for convex clustering

  • Accelrated Gradient Method (AGM)
  • Newton-CG

Derivation of Gradient and Hessian

The AGM requires the calculation of the gradient of the objective function, while Newton-CG requires Hessian. The detailed mathematical derivation is lengthy and thus only included in grad_hess_derivation.pdf. And the Numpy implementation is included in huber_obj.py.

Other Functionalities

We also write a cluster function in the utils.py that uses graph and DFS to find clusters

Releases

No releases published

Packages

No packages published

Languages