RuleFitClassifier not working with simple example using iris data #131

Open
gialmisi opened this issue Aug 25, 2022 · 1 comment

@gialmisi
Contributor

The following code snippet results in an error:

from sklearn.datasets import load_iris
from imodels import RuleFitClassifier

iris = load_iris()
X, Y = iris.data, iris.target
rulefit = RuleFitClassifier()
rulefit.fit(X, Y)
print(rulefit)

The error reads:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/tmp/ipykernel_208411/3401153452.py in <cell line: 9>()
      7 rulefit = RuleFitClassifier()
      8 rulefit.fit(X, Y)
----> 9 print(rulefit)

~/.cache/pypoetry/virtualenvs/xlemoo-6BFI3yUJ-py3.8/lib/python3.8/site-packages/imodels/rule_set/rule_fit.py in __str__(self)
    247         s += '> \tPredictions are made by summing the coefficients of each rule\n'
    248         s += '> ------------------------------\n'
--> 249         return s + self.visualize().to_string(index=False) + '\n'
    250 
    251     def _extract_rules(self, X, y) -> List[Rule]:

~/.cache/pypoetry/virtualenvs/xlemoo-6BFI3yUJ-py3.8/lib/python3.8/site-packages/imodels/rule_set/rule_fit.py in visualize(self, decimals)
    237 
    238     def visualize(self, decimals=2):
--> 239         rules = self._get_rules()
    240         rules = rules[rules.coef != 0].sort_values("support", ascending=False)
    241         pd.set_option('display.max_colwidth', None)

~/.cache/pypoetry/virtualenvs/xlemoo-6BFI3yUJ-py3.8/lib/python3.8/site-packages/imodels/rule_set/rule_fit.py in _get_rules(self, exclude_zero_coef, subregion)
    208         for i in range(0, n_features):
    209             if self.lin_standardise:
--> 210                 coef = self.coef[i] * self.friedscale.scale_multipliers[i]
    211             else:
    212                 coef = self.coef[i]

IndexError: index 4 is out of bounds for axis 0 with size 4

I tried to look into this issue myself, but I am not familiar enough with the method to make any definitive claims. However, this line of code looks suspicious: why not use the actual number of features stored in self.n_features? It could be the source of the indexing error.

@vruusmann

vruusmann commented Aug 3, 2023

Looks like imodels classifiers only work with binary classification problems.

The iris dataset deals with a multi-class classification problem. The code snippet can be fixed by transforming the label from multi-class to binary:

from sklearn.datasets import load_iris
from imodels import RuleFitClassifier

import numpy

iris = load_iris()
X, y = iris.data, iris.target

# Key change: binarize the target so that 1 means "virginica" (class 2) and 0 means any other species
y = numpy.where(y == 2, 1, 0)

rulefit = RuleFitClassifier()
rulefit.fit(X, y)
print(rulefit)
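
If all three species need to be covered, the same binarization trick can be repeated once per class (a one-vs-rest scheme). The following is only a minimal sketch of that idea, not something imodels provides or the maintainers recommend: `one_vs_rest_targets` and `fit_one_vs_rest` are hypothetical helper names, and the only assumption about the estimator is that it exposes `fit(X, y)` as in the snippets above.

```python
import numpy as np


def one_vs_rest_targets(y):
    """Map each class label in y to a binary 0/1 target (1 = that class)."""
    y = np.asarray(y)
    # .tolist() turns NumPy scalars into plain Python values for use as dict keys
    return {label: (y == label).astype(int) for label in np.unique(y).tolist()}


def fit_one_vs_rest(make_estimator, X, y):
    """Fit one binary model per class.

    make_estimator is a hypothetical zero-argument factory (for example the
    RuleFitClassifier class itself); each call must return a fresh,
    unfitted object with a fit(X, y) method.
    """
    models = {}
    for label, y_binary in one_vs_rest_targets(y).items():
        model = make_estimator()
        model.fit(X, y_binary)
        models[label] = model
    return models
```

With the iris data from this thread, calling `fit_one_vs_rest(RuleFitClassifier, X, iris.target)` should give one fitted binary model per species, each of which can then be printed individually. I have not verified this against every imodels version, so treat it as a starting point.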
