Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve class design for AgglomerativeClustering and FeatureAgglomeration (was pooling_func in AgglomerativeClustering doesn't work) #9846

Closed
yinruiqing opened this issue Sep 28, 2017 · 3 comments · Fixed by #9875
Labels
Easy Well-defined and straightforward way to resolve help wanted

Comments

@yinruiqing
Copy link

Description

pooling_func in AgglomerativeClustering doesn't work.

Steps/Code to Reproduce

from sklearn import metrics
from sklearn.datasets.samples_generator import make_blobs
from sklearn.cluster import AgglomerativeClustering


centers = [[1, 1], [-1, -1], [1, -1]]
X, labels_true = make_blobs(n_samples=300, centers=centers, cluster_std=0.5,
                            random_state=0)

model = AgglomerativeClustering(linkage='complete',
                                connectivity=None,
                                affinity = 'cosine',
                                pooling_func = "test_error",
                                n_clusters=3)
model.fit(X)

Expected Results

Raise error because the pooling_func is not callable. It's a string.

Actual Results

No warning, no error

Versions

Linux-4.4.0-64-generic-x86_64-with-debian-stretch-sid
Python 3.5.3 | packaged by conda-forge | (default, Feb 9 2017, 14:37:12)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-15)]
NumPy 1.13.1
SciPy 0.19.1
Scikit-Learn 0.19.0

@lesteve
Copy link
Member

lesteve commented Sep 28, 2017

It seems to me like this is a class design issue: FeatureAgglomeration derives from AgglomerativeClustering but pooling_func is only used in AgglomerativeClustering.transform.

Feel free to open a PR. I haven't looked in detail but a possible solution would be to have FeatureAgglomeration and AgglomerativeClustering derive from a common base class and have pooling_func only in FeatureAgglomeration.

Having said that there may be reasons why it was done like this although I can't think of any.

@jnothman
Copy link
Member

jnothman commented Sep 28, 2017 via email

@lesteve lesteve changed the title pooling_func in AgglomerativeClustering doesn't work Better class design for AgglomerativeClustering and FeatureAgglomeration (was pooling_func in AgglomerativeClustering doesn't work) Oct 3, 2017
@lesteve lesteve changed the title Better class design for AgglomerativeClustering and FeatureAgglomeration (was pooling_func in AgglomerativeClustering doesn't work) Improve class design for AgglomerativeClustering and FeatureAgglomeration (was pooling_func in AgglomerativeClustering doesn't work) Oct 3, 2017
@lesteve lesteve added Need Contributor Easy Well-defined and straightforward way to resolve labels Oct 3, 2017
@thechargedneutron
Copy link
Contributor

I want to take up this issue if not currently being worked upon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Easy Well-defined and straightforward way to resolve help wanted
Projects
None yet
4 participants