RuleFitClassifier not working with simple example using iris data #131

gialmisi · 2022-08-25T09:48:09Z

The following code snippet results in an error:

from sklearn.datasets import load_iris
from imodels import RuleFitClassifier

iris = load_iris()
X, Y = iris.data, iris.target
rulefit = RuleFitClassifier()
rulefit.fit(X, Y)
print(rulefit)

The error reads:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/tmp/ipykernel_208411/3401153452.py in <cell line: 9>()
      7 rulefit = RuleFitClassifier()
      8 rulefit.fit(X, Y)
----> 9 print(rulefit)

~/.cache/pypoetry/virtualenvs/xlemoo-6BFI3yUJ-py3.8/lib/python3.8/site-packages/imodels/rule_set/rule_fit.py in __str__(self)
    247         s += '> \tPredictions are made by summing the coefficients of each rule\n'
    248         s += '> ------------------------------\n'
--> 249         return s + self.visualize().to_string(index=False) + '\n'
    250 
    251     def _extract_rules(self, X, y) -> List[Rule]:

~/.cache/pypoetry/virtualenvs/xlemoo-6BFI3yUJ-py3.8/lib/python3.8/site-packages/imodels/rule_set/rule_fit.py in visualize(self, decimals)
    237 
    238     def visualize(self, decimals=2):
--> 239         rules = self._get_rules()
    240         rules = rules[rules.coef != 0].sort_values("support", ascending=False)
    241         pd.set_option('display.max_colwidth', None)

~/.cache/pypoetry/virtualenvs/xlemoo-6BFI3yUJ-py3.8/lib/python3.8/site-packages/imodels/rule_set/rule_fit.py in _get_rules(self, exclude_zero_coef, subregion)
    208         for i in range(0, n_features):
    209             if self.lin_standardise:
--> 210                 coef = self.coef[i] * self.friedscale.scale_multipliers[i]
    211             else:
    212                 coef = self.coef[i]

IndexError: index 4 is out of bounds for axis 0 with size 4

I tried to look into this issue myself, but I am not familiar enough with the method to make any definitive claims. However, this line of code seems fishy. Why not just use the actual number of features stored in self.n_features? That could be the source of the indexing error.


vruusmann · 2023-08-03T13:39:28Z

It looks like imodels classifiers only work with binary classification problems.

The iris dataset is a multi-class classification problem (three species). The code snippet can be fixed by transforming the label from multi-class to binary:

from sklearn.datasets import load_iris
from imodels import RuleFitClassifier

import numpy

iris = load_iris()
X, y = iris.data, iris.target

# THIS! Predict if the iris species is "virginica" or not
y = numpy.where(y == 2, 1, 0)
#print(y)

rulefit = RuleFitClassifier()
rulefit.fit(X, y)
print(rulefit)
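
If all three species are of interest, the same trick can be repeated once per class (one-vs-rest). The following is only a minimal sketch of the label transformation, not something imodels provides: the helper name one_vs_rest_targets is hypothetical, and a small hard-coded array stands in for iris.target so the snippet runs with numpy alone.

```python
import numpy as np


def one_vs_rest_targets(y):
    """Hypothetical helper: map each distinct label in y to a binary 0/1 array.

    Assumes integer class labels, as in iris.target.
    """
    y = np.asarray(y)
    # For every class c, mark samples of that class as 1 and all others as 0,
    # exactly like numpy.where(y == 2, 1, 0) does for "virginica" above.
    return {int(c): np.where(y == c, 1, 0) for c in np.unique(y)}


# Small stand-in for iris.target (three classes: 0, 1, 2)
y = np.array([0, 0, 1, 1, 2, 2])

targets = one_vs_rest_targets(y)
for label, y_binary in targets.items():
    print(label, y_binary)
```

Each binary target could then be passed to its own RuleFitClassifier().fit(X, y_binary), giving one "this species or not" model per class. Whether that is an acceptable substitute for true multi-class support depends on the use case.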
