
plot_roc_curves(scores, y_true, sensitive_features) #758

Open
michaelamoako opened this issue Apr 28, 2021 · 40 comments · May be fixed by #869
@michaelamoako
Contributor

Is your feature request related to a problem? Please describe.

In producing Fairness Assessment results, seeing metrics given a fixed threshold is useful. In addition to that, it helps to also see an ROC curve (TPR vs. FPR) across different thresholds.

Describe the solution you'd like

A function like "plot_roc_curves(scores, y_true, sensitive_features)" that takes sensitive_features (a list of sensitive features) as a parameter and plots ROC curves across subgroups.

Describe alternatives you've considered, if relevant

sklearn's plot_roc_curve, though this does not support plotting by sensitive group

Additional context

When creating a Fairness Assessment using Fairlearn, ROC curve plots are missing.

@adrinjalali
Member

I agree this is pretty useful, and for whoever would like to take it up, here's what I've used in my scripts. Feel free to take and generalize it.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc
from sklearn.metrics import RocCurveDisplay

def plot_auc(df, name, ax=None):
  y_pred = np.asarray(df.y_pred).astype(float)
  y_true = np.asarray(df.FraudLabel == "true").astype(int)

  fpr, tpr, _ = roc_curve(y_true, y_pred, pos_label=1)
  roc_auc = auc(fpr, tpr)
  display = RocCurveDisplay(fpr=fpr, tpr=tpr, roc_auc=roc_auc, estimator_name=name)
  display.plot(ax=ax)
  return display

plt.figure(figsize=(8, 6))
ax = plt.gca()
overall = plot_auc(df, "overall", ax)
women = plot_auc(df_women, "women", ax)
men = plot_auc(df_men, "men", ax)
nans = plot_auc(df_nan, "unspecified", ax)

And the corresponding one for precision recall curves:

import numpy as np
from sklearn.metrics import (precision_recall_curve,
                             PrecisionRecallDisplay,
                             average_precision_score)

def plot_precision_recall(df, name, ax=None):
  y_pred = np.asarray(df.y_pred).astype(float)
  y_true = np.asarray(df.FraudLabel == "true").astype(int)

  precision, recall, _ = precision_recall_curve(y_true, y_pred)
  average_precision = average_precision_score(y_true, y_pred)
  disp = PrecisionRecallDisplay(
    precision=precision, recall=recall,
    average_precision=average_precision,
    estimator_name=name
  )
  disp.plot(ax=ax)
  return disp

plt.figure(figsize=(8, 6))
ax = plt.gca()
overall = plot_precision_recall(df, "overall", ax=ax)
women = plot_precision_recall(df_women, "women", ax)
men = plot_precision_recall(df_men, "men", ax)
nans = plot_precision_recall(df_nan, "unspecified", ax)

@hildeweerts
Contributor

Yes, we should definitely include this!

Perhaps adding a separate roc_curve_grouped(y_true, y_pred, sensitive_features) that returns the fpr, fnr, and thresholds for each group would be nice as well. That makes it easier to plot specific thresholds for each group on top of the curves, which can be quite insightful imo.
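
For illustration, a minimal sketch of what such a helper could look like, built directly on sklearn.metrics.roc_curve (which returns fpr, tpr, thresholds). The name roc_curve_grouped, the single sensitive-feature column, and the dict return format are assumptions, not an existing API:

import numpy as np
from sklearn.metrics import roc_curve

def roc_curve_grouped(y_true, y_pred, sensitive_features):
    """Return {group: (fpr, tpr, thresholds)} for each sensitive group."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    groups = np.asarray(sensitive_features)
    return {
        group: roc_curve(y_true[groups == group], y_pred[groups == group])
        for group in np.unique(groups)
    }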

@michaelamoako
Contributor Author

Thanks! There is just one piece I am missing

Given I have a dataframe (df) with labeled sensitive features, I want to create dataframes that are a subset of this, using sensitive features. Something like:

create_sensitive_dfs(df, sensitive_features)
Input: dataframe, list of sensitive features
Output: Smaller dataframes (i.e: df_women, df_men, df_nans)

Example: sensitive_features = [age, race]
Output: df_kid_SA, df_adult_MENA, etc..

This would then allow me to use your script above

@michaelamoako
Contributor Author

michaelamoako commented Apr 29, 2021

Resolved: combining all of the above, @kstohr here's my workaround solution using the Cartesian product to create the smaller data frames - though how it would be embedded in the MetricFrame, or where this same product happens, is something I could not find.

Where sensitive_features is a list of the sensitive feature column names (defined elsewhere):

from itertools import product

def sensitive_pdfs(pdf, sensitive_features): 
    distinct_features_vals = []
    df_dict = {}
    for feature in sensitive_features:
        distinct_vals = set(pdf[feature].to_list())
        distinct_features_vals.append(distinct_vals)
    distinct_features_combos = list(product(*distinct_features_vals))
    for features in distinct_features_combos:
        query = " & ".join([f"{sensitive_features[i]} == '{features[i]}'" for i in range(len(sensitive_features))])
        filtered_df = pdf.query(query)
        key = tuple(sorted(features))
        df_dict[key] = filtered_df
    return df_dict


def plot_auc(df, y_pred, y_true, name, ax=None):
  y_pred_r = np.asarray(df[y_pred]).astype(float)
  y_true_r = np.asarray(df[y_true]).astype(int)

  fpr, tpr, _ = roc_curve(y_true_r, y_pred_r, pos_label=1)
  roc_auc = auc(fpr, tpr)
  display = RocCurveDisplay(fpr=fpr, tpr=tpr, roc_auc=roc_auc, estimator_name=name)
  display.plot(ax=ax)
  return display


sensitive_pdfs_dict = sensitive_pdfs(pdf, sensitive_features)
plt.figure(figsize=(8, 6))
ax = plt.gca()
for grouping in sensitive_pdfs_dict: 
    group_plot = plot_auc(sensitive_pdfs_dict[grouping], "matching_score", "golden_label", grouping, ax)

@lurosenb

Hi all!

Here is some code that splits pdfs based on a list of columns (each column must contain only the two values listed for it)

Example use:
sensitive_pdfs(df, [('color', ['blue','red']), ('city', ['ny','nj'])], 'pdf')

Example in colab notebook:
Notebook

def sensitive_pdfs(pdf, sensitive_features, title): 
  # Feature we are splitting on
  sensitive_feature, values = sensitive_features[0]

  # Safety check: the column must only contain the two listed values
  if pdf[sensitive_feature].isin(values).all():
    # Remove feature we are about to split by
    sensitive_features.pop(0)
    
    # For title tracking
    v0, v1 = values

    # Apply condition to split pdf 
    tmp1 = pdf[pdf[sensitive_feature] == v0]
    tmp2 = pdf[pdf[sensitive_feature] == v1]

    # Update titles for the resulting pdfs
    title1 = title + '_' + str(v0)
    title2 = title + '_' + str(v1)

    # Base case check - no more features to split by
    if len(sensitive_features) == 0:
      return [(title1, tmp1), (title2, tmp2)]
    else:
      # Recurse, continuing to split by sensitive features
      return sensitive_pdfs(tmp1, sensitive_features.copy(), title1) + \
             sensitive_pdfs(tmp2, sensitive_features.copy(), title2)
  else:
    raise ValueError(f"Column '{sensitive_feature}' contains values outside {values}")

@adrinjalali
Member

Reopening since we need a nice method in fairlearn to do this.

@adrinjalali adrinjalali reopened this Apr 30, 2021
@MiroDudik
Member

I agree that this is a super useful addition. I also like @hildeweerts suggestion to consider extending both sklearn's roc_curve and plot_roc_curve.

Until we find somebody willing to implement this, can we tease out the API we would like to see? We would use the same pattern for other curves that we discussed and that come up in the fairness literature, like calibration_curve and cdf_curve.

For plotting, my preference is to have API of the following form:

  • plot_roc_curves(*, y_true, y_score, sensitive_features, **kwargs) -> matplotlib.axes.Axes

For just getting fpr, tpr, thresholds (but no plotting), we could have

  • roc_curves(*, y_true, y_score, sensitive_features, **kwargs)
    • I am less sure about the return format here. But one idea would be to return three dictionaries fpr, tpr, thresholds that would be indexed by the sensitive feature values (or a tuple of sensitive feature values if we have multiple sensitive features).
    • Another idea would be to do something similar to threshold optimizer, where we return tpr and threshold (a generalized notion of threshold) as a function of fpr, where the range of fpr values would be the same for all sensitive features. This I think is actually a much more useful output format for any further processing.
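
To make the second idea a bit more concrete, here is a rough sketch (the function name and return format are placeholders, not a proposed final API) that evaluates each group's tpr on a shared fpr grid via interpolation:

import numpy as np
from sklearn.metrics import roc_curve

def roc_curves_on_common_grid(y_true, y_score, sensitive_features, n_points=101):
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    groups = np.asarray(sensitive_features)
    fpr_grid = np.linspace(0.0, 1.0, n_points)
    tpr_by_group = {}
    for group in np.unique(groups):
        mask = groups == group
        fpr, tpr, _ = roc_curve(y_true[mask], y_score[mask])
        # fpr from roc_curve is non-decreasing, so interpolating onto the shared grid is safe
        tpr_by_group[group] = np.interp(fpr_grid, fpr, tpr)
    return fpr_grid, tpr_by_group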

@hildeweerts
Contributor

I agree, let's get the conversation started (which may be relevant to the other plotting functionality as well? @romanlutz )

  • plot_roc_curve
    • If we want to be consistent with sklearn we should use something like: plot_roc_curve(Estimator, X, y, sensitive_features, *, sample_weight=None, drop_intermediate=True, response_method='auto', name=None, ax=None, pos_label=None, **kwargs) returning a RocCurveDisplay (or similar).
    • I would also be okay with using y_score directly in the API, as it allows for a bit more flexibility wrt the estimator, although that is at the risk of people accidentally using y_pred.
  • roc_curve(y_true, y_score, sensitive_features, *, pos_label=None, sample_weight=None, drop_intermediate=True)
    • I don't see a need for **kwargs here
    • I think returning a dictionary makes sense! Could you elaborate a bit on the other idea? I don't fully understand yet what that would look like.

Is there a specific reason you want to put * at the beginning? I'm always a bit wary about that because it's too easy to mess up the order of y_true and y_score or the difference between y_score and y_pred.

@MiroDudik
Member

Re. compatibility with sklearn, I noticed that they use estimator, X, y signature, but it seems very limiting (or forcing some workarounds with pass-through estimators). So, in this case, I was proposing to intentionally deviate from the sklearn pattern (that's why I would also use a different name, i.e., plot_roc_curves). I would make all of the arguments keyword-only to prevent any confusion with the sklearn format.

Re. * at the beginning: I mostly want to ensure keyword-only arguments, because some common libraries have the opposite convention from sklearn around the order of y_true vs y_pred. I wouldn't mind allowing both y_pred and y_score (as interchangeable), so the signature could be something like:

  • roc_curve(*, y_true, y_score=None, y_pred=None, sensitive_features, ... )
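
For illustration only, a sketch of how such a keyword-only signature with interchangeable y_score / y_pred might validate its inputs (the names and behaviour here are assumptions, not a settled API):

import numpy as np
from sklearn.metrics import roc_curve

def roc_curves(*, y_true, y_score=None, y_pred=None, sensitive_features):
    # Accept either y_score or y_pred, but not both and not neither
    if (y_score is None) == (y_pred is None):
        raise ValueError("Provide exactly one of y_score or y_pred.")
    scores = y_score if y_score is not None else y_pred
    y_true, scores = np.asarray(y_true), np.asarray(scores)
    groups = np.asarray(sensitive_features)
    return {g: roc_curve(y_true[groups == g], scores[groups == g])
            for g in np.unique(groups)}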

@kstohr

kstohr commented May 17, 2021

@romanlutz Ok, I'll pick this up. Not sure how long it will take me, but will dive in and work through it.

@romanlutz
Member

Sounds great! Let us know if you have questions! We're happy to help! I'll also keep checking Discord during the sprint, of course.

@michaelamoako
Contributor Author

Quick note: there are contexts in which there are no negative samples in the ground truth ("No negative samples in y_true"), in which case an ROC curve cannot be created

@kstohr

kstohr commented May 17, 2021

@michaelamoako ok, so you want me to detect those cases and handle them with a try/except? Any other cases that would arise that we should handle?

@kstohr

kstohr commented May 17, 2021

@romanlutz @michaelamoako New to this repo.. Couple quick q's before I build out this class/function.

  • I see you have a dir metrics and a dir postprocessing which has some plotting functionality in it already. Where do you want this feature to live? metrics or postprocessing ?

  • Also this function already exists: _calculate_tradeoff_points which --at first glance-- does appear to compute ROC points. Do we need to just extend this to process the data grouped by the sensitive features?

  • If we're plotting the ROC curve for each subgroup, that's just a line plot with traces. It sounds like Fairlearn is moving away from having a dashboard and instead enabling plotting as a set of utilities. Is that right? In which case do we want to build a base _line_plot and _grouped_line_plot class in Matplotlib or maybe plot.ly (interactive) which would be re-usable? Then inherit that class into other metrics classes that might benefit from plotting?

  • Related: I see this plotting functionality here: https://github.com/fairlearn/fairlearn/blob/main/fairlearn/postprocessing/_plotting.py ... do we want to add to or extend this module? i.e. build out plotting utilities in the same module?

@romanlutz
Member

@kstohr The ROC curves would be more general than the postprocessing functionality in that they could be generated for all models (not just the ones from postprocessing). I would suggest creating a new file for it under the metrics folder.

The postprocessing method has a lot of custom code that we can switch out at a later point. For now, I would completely ignore it since it has somewhat more comprehensive requirements. Same for the plotting capabilities of that module, they're very custom to the mitigation technique used there. Let's ignore it for now.

You're right about the dashboard moving away (in fact #766 deletes it), so this should just be its own standalone utility. We're all in favor of reusing things that already exist. Perhaps some of the code snippets from this thread are useful to have ROC plots with multiple lines (one per group)? I think matplotlib supports this already by just passing in the same axes to subsequent plotting commands.

@kstohr

kstohr commented May 18, 2021

@romanlutz Ok great. So, I'll build a module under metrics: metrics/_roc.py. If we want to refactor later to make things more re-usable, we can.

Yep. Saw the code snippets and incorporating as needed.

In terms of plotting, you can use plt.subplots() to add traces to the figure, but sklearn already has a class RocCurveDisplay and it looks like you can pass an existing ax to the function, which should enable us to add traces to the figure for each of the sensitive values.

@romanlutz
Member

Yes! I was hoping we could reuse what sklearn already has. In many ways, a lot of what Fairlearn does in assessment builds on sklearn by disaggregating metrics so this is yet another way we're doing so. Thanks @kstohr !

@michaelamoako
Contributor Author

Another very closely related but also valuable curve: Threshold vs. [Metric]. For example, to plot Threshold vs. [Recall]:

Assume y_true/y_pred (scores) and a group label name are defined elsewhere:

  _, recall, thresholds = precision_recall_curve(y_true, y_pred)
  plt.plot(thresholds, recall[:-1], label=name)
  plt.xlabel("Threshold")
  plt.ylabel("Recall")

roc_curve similarly returns thresholds, so there is potential to plot those on the x-axis too.
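
For illustration, a minimal sketch along those lines (assuming the same y_true and score-valued y_pred as above), plotting TPR and FPR against the thresholds returned by roc_curve:

  import matplotlib.pyplot as plt
  from sklearn.metrics import roc_curve

  fpr, tpr, thresholds = roc_curve(y_true, y_pred)
  # roc_curve prepends a sentinel threshold above all scores; drop it before plotting
  plt.plot(thresholds[1:], tpr[1:], label="TPR")
  plt.plot(thresholds[1:], fpr[1:], label="FPR")
  plt.xlabel("Threshold")
  plt.ylabel("Rate")
  plt.legend()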

@kstohr

kstohr commented May 18, 2021

@michaelamoako Yep. Will keep precision_recall_curve in mind. Assuming you'd also do that for each sensitive value (i.e. by group.)

@romanlutz
Member

@kstohr feel free to stick with one thing per PR. Smaller PRs are easier to review and typically get merged significantly faster 😄 @michaelamoako 's suggestion can be another PR unless it's just a few extra lines. Keep in mind that adding ROC curves should include at least a small unit test and a short user guide section, so this is already quite a bit.

@michaelamoako
Contributor Author

Agreed with @romanlutz - and yes (per group)

@kstohr

kstohr commented May 18, 2021

@romanlutz @michaelamoako Roger that. I think Michael was just pointing out that the code should ideally be easy to adapt for this other purpose. At least that's how I understood it.

riedgar-ms pushed a commit that referenced this issue May 19, 2021
…766)

With a slight delay (originally targeted for April) I'm finally removing the `FairlearnDashboard` since a newer version already exists in `raiwidgets`. The documentation is updated to instead use the plots @MiroDudik created with a single line directly from the `MetricFrame`. In the future we want to add more kinds of plots, as already mentioned in #758, #666, and #668. Specifically, the model comparison plots do not have a replacement yet.

Note that the "example" added to the `examples` directory is not shown under "Example notebooks" on the webpage, which is intentional since it's technically not a notebook.

This also makes #561 mostly redundant, which I'll close shortly. 
#667 is also directly addressed with this PR as the examples illustrate.

Signed-off-by: Roman Lutz <rolutz@microsoft.com>
@kstohr

kstohr commented May 21, 2021

@michaelamoako @romanlutz Yokee, the ref code snippets are up and running. Quick and dirty plot:

[Screenshot: quick and dirty plot]

I am thinking of testing out using the existing MetricFrame to do the data splitting. The benefit of using MetricFrame is that this would be a standard way of handling splitting the data on sensitive values. You get the 'overall' curve as well as the curves for the sensitive values, and any methods we add to the MetricFrame class would be inherited by the ROC Curve class. The downside is that the _roc_utils.py would have a dependency on MetricFrame. Thoughts before I head down this path?

@riedgar-ms
Member

I don't think that a dependency on MetricFrame would be a bad thing. Or do you mean to inherit from it?

@kstohr

kstohr commented May 21, 2021 via email

@michaelamoako
Contributor Author

@kstohr I just tagged you in a comment above related to the Cartesian product - though where this process happens in the MetricFrame is not clear to me

@MiroDudik
Member

MiroDudik commented May 21, 2021

@kstohr : my suggestions:

  • you could just start with plot_roc_curves
  • it would be good to figure out the invocation format on this issue (either at the same time as PR or before PR)
  • I wouldn't worry about integrating things internally with MetricFrame yet... I expect that we'll need to re-implement and re-factor MetricFrame because it's 10x slower than an equivalent pandas code (I'll create an issue on that hopefully next week), but if it simplifies your life (rather than makes it more complicated), go for it
  • One idea re. MetricFrame would be to just use the following "metric" (maybe this is what you had in mind?):
def metric(y_true, y_pred):
    return (y_true, y_pred)
mf = MetricFrame(metric, y_true, y_pred, sensitive_features=sensitive_features)
mf.by_group  # this will contain the split of the data y_true and y_pred
             # and consider all combinations if multiple sensitive features are provided

@kstohr

kstohr commented May 21, 2021

@MiroDudik Yep that's exactly what I had in mind, but I have to pass a third parameter...looks like that's permitted in the docs and it will split on it and extract the groups. Not sure if it will handle the cartesian split. Still getting up to speed on this codebase.

@MiroDudik
Member

Cartesian split is already handled I believe, you just provide two columns of sensitive features and it'll consider all combinations. See here.

MetricFrame also supports additional column parameters, e.g.:

def metric(y_true, y_pred, y_pred_proba):
    return (y_true, y_pred, y_pred_proba)
mf = MetricFrame(metric, y_true, y_pred, sensitive_features=sensitive_features, 
                 sample_params = {'y_pred_proba': y_pred_proba})
mf.by_group  # this will contain the split of the data y_true, y_pred, and y_pred_proba
             # and consider all combinations if multiple sensitive features are provided
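
For illustration, one way the resulting mf.by_group Series could then be consumed to draw one ROC trace per subgroup (a rough sketch assuming the mf defined just above; not an existing Fairlearn helper):

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc, RocCurveDisplay

fig, ax = plt.subplots(figsize=(8, 6))
for group, (grp_y_true, grp_y_pred, grp_y_proba) in mf.by_group.items():
    # Each entry holds the tuple returned by the metric: (y_true, y_pred, y_pred_proba)
    fpr, tpr, _ = roc_curve(grp_y_true, grp_y_proba)
    RocCurveDisplay(fpr=fpr, tpr=tpr, roc_auc=auc(fpr, tpr),
                    estimator_name=str(group)).plot(ax=ax)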

@michaelamoako
Contributor Author

This would be tied to the NaN issue in MetricFrame, right @MiroDudik?

@MiroDudik
Member

@michaelamoako : I think you're thinking about #800? We don't have to worry about that here, because we're not using mf.difference() or other aggregates (so the above code shouldn't raise any errors).

@kstohr

kstohr commented May 21, 2021 via email

@kstohr

kstohr commented Jun 3, 2021

y'all this is coming along:

I think the best course of action is to build on MetricFrame as it will create a common language across the Fairlearn API. So you can split by group and use the series index to plot a subset of the sensitive feature subgroups (remember it is the Cartesian product, so selecting a smaller set to plot will be important).

TODO:

  • Determine how best to instantiate plot figure, axis (right now there's a bit of redundant code; might instantiate with the class to have a default figure, ax)
  • Separate functions to add "Overall" and "Baseline" traces to enable users to toggle these traces on/off
  • logging, error handling
  • Add additional AUC score and related metric functionality by sensitive feature group... if we're generating the AUC score anyway, why not return it
  • unit tests

Overall thoughts on this basic approach?

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc, roc_auc_score, RocCurveDisplay
from fairlearn.metrics import MetricFrame


class roc_auc: 
    """
    Provides utilities for generating auc scores, roc curves
    and plotting roc curves grouped by sensitive features. 
    
    Parameters
    ----------
    y_true : List, pandas.Series, numpy.ndarray, pandas.DataFrame
        The ground-truth labels for classification.
        
    y_score : List, pandas.Series, numpy.ndarray, pandas.DataFrame
        The predicted scores for the positive class (e.g. results of clf.predict_proba()).
        
    sensitive_features : List, pandas.Series, dict of 1d arrays, numpy.ndarray, pandas.DataFrame
        The sensitive features which should be used to create the subgroups.
        At least one sensitive feature must be provided.
        All names (whether on pandas objects or dictionary keys) must be strings.
        We also forbid DataFrames with column names of ``None``.
        For cases where no names are provided we generate names ``sensitive_feature_[n]``.
    """
    def __init__(self,
                 y_true, 
                 y_score, 
                 sensitive_features): 
        """
        Initiate class with required parameters to generate metric. 
        TODO: validate input
        """
        self.y_true = y_true
        self.y_score = y_score 
        self.sensitive_features = sensitive_features
        self.ns_probs = [0 for n in range(len(self.y_true))]
        
    @staticmethod
    def splitter(y_true, y_pred): 
        """
        Placeholder function to enable splitting of dataframes using 
        existing MetricFrame class. 
        """
        return (y_true, y_pred)
    
    @staticmethod
    def plot_auc(y_true, y_score, name, ax=None, pos_label=1, **kwargs):
        """
        Plot auc curves. 
        """
        
        # Establish plot figure if not already generated
        if ax is None: 
            plt.figure(figsize=(8, 6))
            ax = plt.gca()
            
        fpr, tpr, _ = roc_curve(y_true, y_score, pos_label=pos_label, **kwargs)
        roc_auc = auc(fpr, tpr)
        display = RocCurveDisplay(fpr=fpr, tpr=tpr, roc_auc=roc_auc, estimator_name=name)
        display.plot(ax=ax)
        return display
    
    def split_by_group(self): 
        """
        Splits data by sensitive feature subgroups. 
        See: Fairlearn.MetricFrame for more detail. 
        
        Note: MetricFrame requires y_pred (clf.predict). However, ROC curves and AUC scores 
        are generated using y_score (clf.predict_proba). 
        Method substitutes y_score (type:float) for y_pred (type:int) to conform to MetricFrame 
        required params. 
        (MetricFrame supports regression and therefore allows values of type float to be 
        passed as y_pred.) 
        Admittedly this is a little weird. Alternately, you could pass `sample_params` to MetricFrame, 
        but not sure that's any cleaner. 
        """
        mf = MetricFrame(
            metric = self.splitter, 
            y_true = self.y_true, 
            y_pred = self.y_score, 
            sensitive_features = self.sensitive_features,
            #sample_params = {'y_score': y_score}
                        )
        self.sensitive_series = mf.by_group
        return self.sensitive_series
    
    def plot_roc_groups(self, 
                        sensitive_index=None, 
                        ax=None): 
        
        
        # Establish plot figure if not already generated
        if ax is None: 
            plt.figure(figsize=(8, 6))
            ax = plt.gca()
        else: 
            self.ax = ax
        
        # Establish which combinations of sensitive features to plot
        # (explicit None check: truth-testing a pandas Index would raise)
        if sensitive_index is None: 
            sensitive_index = self.sensitive_series.index

        # Plot baseline - 'no skill'
        # i.e. performance of classifier is equivalent to random selection
        ns_auc = roc_auc_score(self.y_true, self.ns_probs)
        ns_fpr, ns_tpr, _ = roc_curve(self.y_true, self.ns_probs)
        ax.plot(ns_fpr, ns_tpr, linestyle='--', label='Baseline (AUC = 0.50)')

        # Plot overall model performance
        overall_auc = roc_auc_score(self.y_true, self.y_score)
        overall_fpr, overall_tpr, _ = roc_curve(self.y_true, self.y_score)
        ax.plot(overall_fpr, overall_tpr, label=f'Overall (AUC = {round(overall_auc, 2)})')
        
        # Plot ROC Curves by group
        for name in sensitive_index: 
            grp_y_true, grp_y_score = self.sensitive_series[name]
            group_plot = self.plot_auc(
                y_true=grp_y_true, 
                y_score=grp_y_score, 
                name=name, ax=ax)
        return ax

@riedgar-ms
Member

Tagging @alexquach who has started work on #235 . Both of these are generically looking at 'plots for MetricFrame' so we should try to have a common API pattern (even if the details will be different)

@kstohr

kstohr commented Jun 4, 2021 via email

@riedgar-ms
Member

@alexquach only just started this week, so there's still some playing around with 'how to plot this' before we get too far into the API design. I do like your approach of having a separate class, since there's no need to deal with MetricFrame internals. For the error bar case, it's more likely to be 'bring the MetricFrame precomputed and tell us the mapping of metrics to error bars' though.

We can meet to talk, if you think that would be helpful (and we can figure out timezones).

@kstohr

kstohr commented Jun 4, 2021 via email

@romanlutz
Member

Styling shouldn't be done in this code I think. That's why people can provide their own ax. That's something @adrinjalali pointed out in the past with other plots, too.

Although, now I'm not sure you actually meant "styling" (as in https://matplotlib.org/stable/gallery/style_sheets/style_sheets_reference.html) or just axis labels, legend, etc. Regardless, it shouldn't be preconfigured and people should be allowed to configure it however they like. I don't think that will affect the implementation in any way, other than the fact that we shouldn't set any options internally (e.g., figure size shouldn't be set in our code).

@kstohr

kstohr commented Jun 4, 2021

@romanlutz Here's how scikit learn handles it.

RocCurveDisplay

What they have done is offered the full convenience of a labeled plot. However, those who want to provide their own fig, ax, or overwrite the defaults can. They use the default figure size, font, etc. Notice how they return the figure and the axis as 'self' which enables the user to configure the plot further if they wish.

I am following the above approach. In fact, in the case of Roc Curves, I am actually using RocCurveDisplay.

Because plotting is specific to the metric you are plotting, I think providing labeled plotting functions that add convenience to the relevant modules makes more sense than building a generic shared plotting utility. Basically adapting common metrics to return data by subgroup and the associated plots to handle plotting sensitive features by subgroup.
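
For example, a small illustration of that pattern (assuming fpr, tpr, and roc_auc have already been computed with sklearn); the plot call stores the figure and axes on the display object so users can restyle afterwards:

from sklearn.metrics import RocCurveDisplay

display = RocCurveDisplay(fpr=fpr, tpr=tpr, roc_auc=roc_auc, estimator_name="group")
display.plot()
# display.ax_ and display.figure_ are set by plot(), so further configuration is possible
display.ax_.set_title("ROC curves by sensitive group")
display.figure_.set_size_inches(8, 6)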

@kstohr

kstohr commented Jun 18, 2021

Hi all. I pushed a draft PR for this last night. Take a look at the basic approach and give me feedback when you have a moment.

https://github.com/fairlearn/fairlearn/pull/869/files

It's not passing CI/CD checks yet; it still needs:

  • more functional testing;
  • unit tests;
  • improved example documentation;
  • etc.

@romanlutz romanlutz linked a pull request Feb 4, 2023 that will close this issue