
Support for multi-class classification #431

Open
Mahima-ai opened this issue May 12, 2021 · 4 comments

@Mahima-ai

Hi,

I have been working on a multi-class classification problem. I used the following metrics with the pos_class parameter, setting it to each class one by one, but the values are the same for every class (see the sketch at the end of this comment).

  1. AbsCV
  2. Accuracy
  3. Average Odds Difference
  4. Balanced Classification Rate
  5. CV
  6. F1-score
  7. NMI
  8. RenyiCorrelation
  9. SklearnMetric (for few metrics)
  10. Theil Index
  11. Yanovich

On further investigation, I figured out that the Theil index is calculated for the class with the most instances, while the other metrics are calculated for Y=1.

Please let me know whether the above-mentioned metrics support multi-class classification. If they do, how can I use them, and if not, are you planning to add support in the next release?
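
For reference, a minimal sketch of the kind of loop used to apply the metrics per class. The data is purely illustrative, and it assumes the EthicML metric classes accept pos_class (as used above) and expose a score(prediction, actual) method; the exact class names (em.Accuracy, em.F1) may differ slightly.

import pandas as pd
import ethicml as em

# Toy three-class data standing in for the real experiment (illustrative only).
actual = em.DataTuple(
    x=pd.DataFrame({"feat": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]}),
    s=pd.DataFrame({"s": [0, 1, 0, 1, 0, 1]}),
    y=pd.DataFrame({"y": [0, 1, 2, 0, 1, 2]}),
)
prediction = em.Prediction(hard=pd.Series([0, 1, 1, 0, 2, 2]))

# Score a couple of the metrics once per candidate positive class; the problem
# is that the reported values do not change with pos_class.
for pos_class in (0, 1, 2):
    for metric in (em.Accuracy(pos_class=pos_class), em.F1(pos_class=pos_class)):
        print(pos_class, type(metric).__name__, metric.score(prediction, actual))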

@Mahima-ai
Author

Also, computing the basic metrics (FPR, TPR, FNR, TNR, etc.) results in an error because the labels are inferred implicitly from y_true (the actual labels). The labels should be requested explicitly from the user, since a subset can leave out a class label, which leads to a scikit-learn error.

# From ethicml/metrics (imports abbreviated here for context).
import numpy as np
from sklearn.metrics import confusion_matrix as conf_mtx


def confusion_matrix(
    prediction: Prediction, actual: DataTuple, pos_cls: int
) -> Tuple[int, int, int, int]:
    """Apply sci-kit learn's confusion matrix."""
    actual_y: np.ndarray = actual.y.to_numpy(dtype=np.int32)
    labels: np.ndarray = np.unique(actual_y)  # labels inferred from y_true only
    if labels.size == 1:
        labels = np.array([0, 1], dtype=np.int32)
    conf_matr: np.ndarray = conf_mtx(y_true=actual_y, y_pred=prediction.hard, labels=labels)
    # ... (remainder of the function omitted)
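
To illustrate that point with plain scikit-learn (the data below is made up): when the label set is inferred from y_true alone, any class that appears only in the predictions is silently dropped from the matrix, and a pos_class that is missing from the inferred labels presumably cannot be looked up afterwards, which is what surfaces as the error.

import numpy as np
from sklearn.metrics import confusion_matrix

# A per-group subset of a three-class problem where class 2 happens to be
# absent from the true labels.
y_true = np.array([0, 1, 0, 1])
y_pred = np.array([0, 1, 2, 1])  # the model still predicts class 2

# Labels inferred from y_true only: the prediction of class 2 is silently
# discarded and a 2x2 matrix comes back.
print(confusion_matrix(y_true, y_pred, labels=np.unique(y_true)))

# Labels supplied explicitly: every class keeps a row and column, whether or
# not it appears in this particular subset (3x3 matrix).
print(confusion_matrix(y_true, y_pred, labels=[0, 1, 2]))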

@olliethomas
Member

Hi @Mahima-ai - thanks for raising an issue. These metrics should definitely get fixed if they're not right! Sorry if this has caused you problems; we just don't have any datasets in EthicML where we do multi-class classification, so we haven't really thought about it very much. That said, it is something I want us to do right - FairML work in general is far too focussed on binary classification problems. Do you have a dataset that we should add? (I understand that this is a long shot - open datasets are hard to come by.)

You make a good point about the confusion-matrix-based metrics. One option is for the user to specify the label values, but as this is a property of the dataset, it would be nice if this information could be taken from there directly.

If you'd like to open a PR, you're more than welcome to; alternatively, I will try to address this. Looking through the tests, we don't have enough that use the non_binary_toy dataset.

Sorry again for causing you problems, and thanks, by the way, for using EthicML.

@Mahima-ai
Author

Mahima-ai commented Jan 12, 2022

Hi @olliethomas,

I looked into this further and found that the metric_per_sensitive_attribute method creates a subset of the dataset for each sensitive-attribute value. This subset has fewer rows than the full dataset, so the number of unique labels in it can be reduced, which results in the LabelOutOfBounds error or a ValueError. The method definition is at https://github.com/predictive-analytics-lab/EthicML/blob/main/ethicml/metrics/per_sensitive_attribute.py#L23

def metric_per_sensitive_attribute(
    prediction: Prediction, actual: DataTuple, metric: Metric, use_sens_name: bool = True
) -> Dict[str, float]:
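
A rough sketch of why those per-group subsets can lose classes (the data here is made up; the real method splits the DataTuple on its sensitive column):

import pandas as pd

# Toy frame: three target classes, but class 2 only occurs in the group s == 0.
df = pd.DataFrame({
    "s": [0, 0, 0, 1, 1, 1],
    "y": [0, 1, 2, 0, 1, 1],
})

for s_value, group in df.groupby("s"):
    # The subset for s == 1 only contains labels {0, 1}, so any metric that
    # infers its label set from this subset never sees class 2.
    print(s_value, sorted(group["y"].unique()))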

I am attaching a notebook (in the zip folder) on iris dataset to reproduce the same scenario for your perusal.
Multiclass Classification Support.zip

@olliethomas
Member

Thanks for the notebook - I've tweaked it slightly and moved it to Colab. It's available here

The error that you are seeing is due to the possible target values being inferred from the dataset.

There are actually two solutions to this.

The first I've added in #489 - the option to define the labels for a metric, so em.TPR(pos_class=1) becomes em.TPR(pos_class=1, labels=[0,1,2]). This new parameter has a default value of None, and when None is passed, the existing behaviour of inferring the labels remains.
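
A hedged usage sketch of that option on toy data (it assumes the metrics' score method and that metric_per_sensitive_attribute is exposed at the package level; in the notebook the DataTuple comes from iris instead):

import pandas as pd
import ethicml as em

# Toy multi-class data; x, s and y are illustrative.
true_data = em.DataTuple(
    x=pd.DataFrame({"x": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]}),
    s=pd.DataFrame({"s": [0, 0, 0, 1, 1, 1]}),
    y=pd.DataFrame({"y": [0, 1, 2, 0, 1, 2]}),
)
preds = em.Prediction(hard=pd.Series([0, 1, 2, 0, 1, 1]))

# With #489 the full label set is passed explicitly, so it never has to be
# inferred from whatever subset of the data a metric happens to see.
tpr = em.TPR(pos_class=1, labels=[0, 1, 2])
print(tpr.score(preds, true_data))
print(em.metric_per_sensitive_attribute(preds, true_data, tpr))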

The second solution is probably closer to what most users want to do. The problem is the s definition in this block:

true_data=em.DataTuple(x=pd.DataFrame(data['sepal length (cm)']),
                       s=pd.DataFrame(data['sepal length (cm)']),
                       y=pd.DataFrame(data['target']))

What is happening here is that the protected attribute, s, is being set to a float column with, in this case, 35 unique s-values, while the number of unique y-values is 3.
Given that the dataset has only 150 samples, the chance of every target class being present in every unique s-group is low.

In the case of the notebook, if you set the sensitive attribute to some binary value, then everything works without having to define the labels:

true_data=em.DataTuple(x=pd.DataFrame(data['sepal length (cm)']),
                       s=pd.DataFrame(np.random.randint(0,2,size=(len(data), 1)), columns=["s"]),
                       y=pd.DataFrame(data['target']))

Hopefully this solves the issue for now - we'll make a new release soon. If I've misunderstood, or this doesn't solve the problem, please let me know. Thanks again.

@tmke8 added this to the EthicML 2.0 milestone on Mar 15, 2022