Feature Request: Apriori algorithm (working code) #2872

Closed
dfrusdn opened this issue Feb 19, 2014 · 7 comments

Comments

@dfrusdn

dfrusdn commented Feb 19, 2014

I would like to contribute my Apriori algorithm found here:

http://codereview.stackexchange.com/questions/38101/optimizing-apriori-algorithm-python-pandas

This algorithm is useful for pattern mining.

It does not have all the needed features, but would be a good point to start.

@GaelVaroquaux
Member

Hi

Thanks a lot for your enthusiasm. What's the relevant publication? Algorithms included in scikit-learn must be literature standards and come from publications with a high citation count.

Cheers

Gaël


@jnothman
Member

Apriori has no shortage of pedigree in data mining (https://en.wikipedia.org/wiki/Apriori_algorithm), but whether it's relevant to scikit-learn is another matter.


@ccsv

ccsv commented Feb 19, 2014

@jnothman
I think this belongs in scikit-learn: it is a precursor to the CBA algorithm
http://cgi.csc.liv.ac.uk/~frans/KDD/Software/CBA/cba.html

scikit-learn's front page states that it provides "Simple and efficient tools for data mining and data analysis"

Association rule learning would be a good addition to the data mining component
http://en.wikipedia.org/wiki/Association_rule_learning

The problems with that implementation are the for loops, which will be slow on large data sets, and the lack of a confidence calculation for each pattern.
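
To make the confidence point concrete, here is a minimal pure-Python sketch of level-wise Apriori with a per-rule confidence calculation. It is not the code from the linked Code Review post: the names `apriori` and `rule_confidence` and the toy transactions are hypothetical, and the support counting is still a plain loop over transactions, so it shows the logic rather than a fast implementation.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets.

    Hypothetical helper: `transactions` is a list of sets and
    `min_support` is a fraction in (0, 1].
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in `itemset`.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = dict(current)
    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-itemset candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def rule_confidence(frequent, antecedent, consequent):
    """Confidence of antecedent -> consequent: supp(A and C) / supp(A)."""
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    return frequent[both] / frequent[antecedent]


# Toy data, made up for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = apriori(transactions, min_support=0.5)
conf_bread_milk = rule_confidence(frequent, {"bread"}, {"milk"})
conf_butter_bread = rule_confidence(frequent, {"butter"}, {"bread"})
```

On this toy data, `{"milk", "butter"}` falls below the support threshold, so the three-item candidate is pruned without being counted.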

@GaelVaroquaux
Member

OK, I wasn't completely sure that we were talking about this one.

Gael


@larsmans
Member

A pandas dependency is not acceptable; I for one cannot read the code you posted. What will be the interface for this algorithm? I'm not convinced that it can fit the existing API unless it's presented as a kernel approximation algorithm.

@jnothman
Member

I think it only fits in the context of something like CBA, which classifies documents by learning association rules that map feature groups to target labels. In that context, the learnt rule-set constitutes the model... but not all rules found by general Apriori are utilised.
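
A rough sketch of how such a rule-set could sit behind a `fit`/`predict` interface, under stated assumptions: `RuleListClassifier` and its parameters are hypothetical, nothing here is part of scikit-learn, and this is not the actual CBA algorithm. It enumerates feature groups up to `max_len` by brute force instead of Apriori-style candidate generation and skips CBA's coverage-based pruning. It only illustrates that the model is a ranked list of rules ending in a class label.

```python
from collections import Counter
from itertools import combinations


class RuleListClassifier:
    """Toy CBA-style classifier (hypothetical, for illustration only)."""

    def __init__(self, min_support=0.3, min_confidence=0.6, max_len=2):
        self.min_support = min_support
        self.min_confidence = min_confidence
        self.max_len = max_len

    def fit(self, X, y):
        # X: list of sets of active features; y: list of labels.
        n = len(X)
        items = sorted({f for row in X for f in row})
        rules = []
        for size in range(1, self.max_len + 1):
            for group in combinations(items, size):
                group = frozenset(group)
                # Label counts among the rows that contain the whole group.
                labels = Counter(
                    lab for row, lab in zip(X, y) if group <= set(row)
                )
                covered = sum(labels.values())
                if covered == 0:
                    continue
                label, hits = labels.most_common(1)[0]
                support = hits / n
                confidence = hits / covered
                if (support >= self.min_support
                        and confidence >= self.min_confidence):
                    rules.append((confidence, support, group, label))
        # Rank by confidence, then support, then shorter rules first.
        rules.sort(key=lambda r: (-r[0], -r[1], len(r[2]), sorted(r[2])))
        self.rules_ = rules
        self.default_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        out = []
        for row in X:
            row = set(row)
            for _, _, group, label in self.rules_:
                if group <= row:  # first matching rule wins
                    out.append(label)
                    break
            else:
                out.append(self.default_)  # no rule matched
        return out


# Toy data, made up for illustration.
X = [
    {"cheap", "pills"}, {"cheap", "offer"}, {"meeting", "notes"},
    {"meeting", "offer"}, {"cheap", "pills", "offer"}, {"notes"},
    {"notes", "meeting"},
]
y = ["spam", "spam", "ham", "ham", "spam", "ham", "ham"]
clf = RuleListClassifier(min_support=0.25).fit(X, y)
predictions = clf.predict(
    [{"cheap", "offer"}, {"meeting", "offer"}, {"offer"}, {"unknown"}]
)
```

Only rules whose consequent is a class label are kept, which is the sense in which a CBA-style model uses a subset of what general association rule mining would produce.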

@larsmans
Member

#2662 is a generalization of this feature request, closing.

5 participants