
WIP: threshold optimizer with relaxed fairness constraint fulfillment #1248

Draft
wants to merge 13 commits into base: main

Conversation

@AndreFCruz (Contributor) commented on Jun 28, 2023

Description

This PR corresponds to Issue #1246 (discussion is ongoing over there).

General summary: implementing ThresholdOptimizer with relaxed fairness constraint fulfillment.

Opened the PR as a draft to ease code discussion on the ongoing implementation.

Tests

  • no new tests required
  • new tests added
  • existing tests adjusted
  • new tests needed (TODO)

Documentation

  • no documentation changes needed
  • user guide added or updated
  • API docs added or updated
  • example notebook added or updated

TODO

  • convert asserts to proper error messages (see the sketch after this list)
  • harmonize API with sklearn classifiers and ThresholdOptimizer constructor
  • compatibility with standard inputs for X, Y, S (pd.DataFrame, pd.Series, np.ndarray, etc.)
  • compatibility with non-callable predictors
    • May get fixed if we can switch to using the InterpolatedThresholder instead of our custom RandomizedClassifier implementations.
  • decide on API for RelaxedThresholdOptimizer (same or different class as ThresholdOptimizer)
  • add tests
  • add code documentation (classes and types are already documented)
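
For the first item, a minimal sketch of the intended assert-to-exception conversion (parameter name and message are illustrative):

# Before: a bare assert raises an uninformative AssertionError
# (and is stripped entirely when Python runs with -O).
assert 0 <= tolerance <= 1

# After: an explicit check with a descriptive error message.
if not 0 <= tolerance <= 1:
    raise ValueError(f"tolerance must be in [0, 1]; got {tolerance}.")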

Consolidating code with existing code-base

The _randomized_classifiers.py module provides two main functionalities:

  1. RandomizedClassifier: Constructing a randomized classifier at a given ROC point. This implies triangulating the target ROC point as a linear combination of realized (deterministic) ROC points in the realized ROC curve.
    • That is, to achieve a point in the interior of the ROC curve, we may need to use up to three realized points (each realized point uses a specific deterministic threshold). The final classifier is a randomized classifier (a classifier with a randomized threshold); see the sketch after this list.
  2. EnsembleGroupwiseClassifiers: Bringing all group-specific classifiers together under a single classifier object.
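
To make the triangulation concrete, here is an illustrative sketch (simplified to the two-point case; names are not the PR's actual RandomizedClassifier API) of randomizing between two deterministic thresholds to hit a target ROC point on the segment between their realized points:

import numpy as np

def randomized_predict(y_scores, thr_a, thr_b, p_a, seed=None):
    # Apply threshold thr_a with probability p_a, otherwise thr_b.
    # The resulting (fpr, tpr) is the convex combination
    # p_a * (fpr_a, tpr_a) + (1 - p_a) * (fpr_b, tpr_b).
    rng = np.random.default_rng(seed)
    use_a = rng.random(len(y_scores)) < p_a
    return np.where(use_a, y_scores >= thr_a, y_scores >= thr_b).astype(int)

# To hit a target TPR on the segment between the two realized points:
# p_a = (target_tpr - tpr_b) / (tpr_a - tpr_b)

Reaching a point in the interior extends the same idea to a convex combination of three realized points.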

Part (or all?) of this functionality seems to be covered by the InterpolatedThresholder class.

Are there any examples of using an InterpolatedThresholder object to triangulate an ROC point?

What other functionality do you reckon could be duplicated?

@romanlutz @MiroDudik

@AndreFCruz marked this pull request as draft on June 28, 2023 15:16
@romanlutz (Member) left a comment:

I do realize this is still WIP but I thought I'd share some early thoughts 🙂

We'll also need to think about compatibility with the existing plotting code:

def plot_threshold_optimizer(threshold_optimizer, ax=None, show_plot=True):

Are there any reasonable plots you generate for the relaxed version?

@@ -18,7 +18,7 @@ class ThresholdOperation:
     """

     def __init__(self, operator, threshold):
-        if operator not in [">", "<"]:
+        if operator not in [">", "<"]:  # NOTE for PR: sklearn uses >= for ROC threshold; see: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html
Member:

Interesting! The way we use it, it shouldn't really be crucial if it's >= or >. We can probably change it. The thresholds are always chosen between the input scores, e.g., between 0.5 and 0.6 we choose 0.55 so they're meant to be strictly on one side or the other. I would have to make sure that it doesn't change anything if we want to make a change here, though. Does this relate to your PR in some way?

Contributor Author:

I think the thresholds in sklearn are the predicted scores for one or more instances (see code here), so using > instead of >= would mean flipping the predictions for some portion of samples.

If it's possible to change it here that would be great!

If not, is there any fairlearn code to build the (fpr, tpr, threshold) ROC triplets? I could switch to using that.
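
A small sanity check of the >= semantics (toy scores and labels):

import numpy as np
from sklearn.metrics import roc_curve

y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.4, 0.8])

fpr, tpr, thresholds = roc_curve(y_true, y_score)
for t, f, tp in zip(thresholds, fpr, tpr):
    n_geq = int((y_score >= t).sum())  # reproduces sklearn's (fpr, tpr) point
    n_gt = int((y_score > t).sum())    # flips samples whose score equals t
    print(f"threshold={t}: fpr={f}, tpr={tp}, #pos(>=)={n_geq}, #pos(>)={n_gt}")

At threshold 0.4, >= predicts three positives while > predicts only one: exactly the flipped portion of samples mentioned above.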


Parameters
----------
estimator : object
Member:

Are you assuming that it's pre-fit (trained) or not? estimator kind of implies that it's not, while predictor (in sklearn) means it is pre-fit. We have some logic and an argument in ThresholdOptimizer to check for this:

prefit : bool, default=False

Obviously, when adding this functionality to the TO class you can take advantage of that and make sure you pass only a trained model in. In that case, you may want to call this predictor (?) but perhaps I'm splitting hairs now.
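
For reference, a rough sketch of that prefit pattern (illustrative helper, not the exact ThresholdOptimizer internals):

from sklearn.base import clone
from sklearn.exceptions import NotFittedError
from sklearn.utils.validation import check_is_fitted

def _resolve_estimator(estimator, X, y, prefit=False):
    if prefit:
        # Caller promises the model is already trained; verify and use as-is.
        try:
            check_is_fitted(estimator)
        except NotFittedError:
            raise ValueError("prefit=True requires an already-fitted estimator")
        return estimator
    # Otherwise fit a fresh clone, leaving the user's object untouched.
    return clone(estimator).fit(X, y)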


# Compute group-wise ROC curves
if y_scores is None:
    y_scores = self.estimator(X)
Member:

We're doing this a bit differently in TO and I would argue it's preferable:

scores = _get_soft_predictions(self.estimator_, X, self._predict_method)
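
Roughly, that helper dispatches on predict_method and normalizes the output to a 1-D score array; a hedged sketch (not the exact fairlearn source):

def _get_soft_predictions(estimator, X, predict_method):
    if predict_method == "auto":
        # Prefer probabilistic output when the estimator provides it.
        for method in ("predict_proba", "decision_function", "predict"):
            if hasattr(estimator, method):
                predict_method = method
                break
    output = getattr(estimator, predict_method)(X)
    # predict_proba returns shape (n_samples, n_classes); keep the
    # positive-class column.
    return output[:, 1] if predict_method == "predict_proba" else output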

Contributor Author:

Ah right, that was still using the previous callable API.

Contributor Author:

By the way, we can of course omit y_scores here; it is only included because computing predictions is the slowest part of the whole fit method, and oftentimes users already have the predictions computed.

An example is when you want to map the whole fairness-accuracy Pareto frontier: you'd call RelaxedThresholdOptimizer with a range of tolerance values (e.g., np.arange(0, 1, 1e-2)), and each call would unnecessarily re-compute predictions. This is in fact the only part of the fit method that scales with the size of the dataset; everything else is generally fast even on large datasets.

Member:

I get your point about performance, but I'm not entirely sure why a user would precompute the scores just to pass them in manually rather than having the class compute them. Or do you mean that they need the scores for other purposes anyway, so computing them twice feels like a waste?

Contributor Author:

Hmm, I don't have a strong opinion on how exactly it is implemented, but I think it would be useful to somehow cache these predictions so we don't have to re-compute them for different levels of tolerance. Given that different values of tolerance will be handled by different class objects, I don't see how the cache could live at the class level (it must be cached outside). An example follows:

When mapping the fairness-accuracy Pareto frontier, the user needs to create a different RelaxedThresholdOptimizer for each tolerance value, e.g.:

import numpy as np

# Assumed to exist already: a fitted `unmitigated_predictor`, the data splits
# (X_train, Y_train, A_train_np, X_test, A_test_np), and the scalar
# `unmitigated_equalized_odds_diff`.

# Compute predictions for models with varying levels of tolerance
def compute_test_predictions_with_relaxed_constraints(tolerance: float, y_scores=None) -> np.ndarray:
    # Instantiate
    clf = _RelaxedThresholdOptimizer(
        predictor=lambda *args, **kwargs: unmitigated_predictor.predict(*args, **kwargs),
        predict_method="__call__",
        constraint="equalized_odds",
        tolerance=tolerance,
    )

    # NOTE: in practice you would use exactly one of the two options below.
    # 1st Option: will call `predictor.predict(X_train)`
    clf.fit(X_train, Y_train, sensitive_features=A_train_np)

    # 2nd Option: will *not* call `predictor.predict(X_train)`
    clf.fit(X_train, Y_train, sensitive_features=A_train_np, y_scores=y_scores)

    return clf.predict(X_test, sensitive_features=A_test_np)

# [For 2nd Option] Pre-compute predictions once, outside the helper
Y_train_scores = unmitigated_predictor.predict(X_train)

# Compute predictions at different levels of tolerance
all_model_predictions = {
    f"train tolerance={tol:.2f}": compute_test_predictions_with_relaxed_constraints(tol, y_scores=Y_train_scores)
    for tol in np.arange(0, unmitigated_equalized_odds_diff, 1e-2)
}

Given that the underlying predictor is the same for all of these objects, clf.fit(...) will call predictor.predict(...) repeatedly (and redundantly) for each tolerance value. The 1st Option calls the predictor's predict_method n times (once per call of the helper function), while the 2nd Option only calls it once (outside the function).

PS: ignore the predict_method="__call__" mess for now 😅

@@ -0,0 +1,461 @@
"""Helper functions to construct and use randomized classifiers.

TODO: this module will probably be substituted by the InterpolatedThresholder
Member:

If we need to make adjustments to InterpolatedThresholder, let's do it (unless it's too complicated).

@romanlutz added the enhancement (New feature or request) and API (Anything which touches on the API) labels on Jun 28, 2023
@AndreFCruz (Contributor Author):

Thanks for the feedback!
I addressed most comments, and will be going through the remaining points throughout the week.

@AndreFCruz force-pushed the andrefcruz/relaxed-postprocessing branch 3 times, most recently from 6498c81 to 07ffdb8 on July 4, 2023 17:00
@AndreFCruz force-pushed the andrefcruz/relaxed-postprocessing branch from dbde72a to b6ada3e on July 6, 2023 12:52
@romanlutz (Member):

> Are there any reasonable plots you generate for the relaxed version?

Answering my own question here: I saw the relaxation plot (below on the right side) in the paper.
[image: relaxation plot from the paper, right-hand panel]

Do you have plans to add that plot? I think it's super insightful, especially if you're familiar with TO. We already have plotting code for TO, so perhaps it wouldn't be too hard to extend it. No pressure, though 🙂 We just need an informative message if someone tries to pass the relaxed TO to the plotting function in case we don't add the plotting code just yet.
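
For the informative message, something like this guard would do (a sketch; _RelaxedThresholdOptimizer is the working name used elsewhere in this PR):

def plot_threshold_optimizer(threshold_optimizer, ax=None, show_plot=True):
    # Fail early with an informative message until relaxed plots exist.
    if isinstance(threshold_optimizer, _RelaxedThresholdOptimizer):
        raise NotImplementedError(
            "Plotting is not yet supported for relaxed-constraint "
            "threshold optimizers; pass a ThresholdOptimizer instead."
        )
    ...  # existing plotting logic unchanged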

@AndreFCruz (Contributor Author):

Regarding the plot, I can add that exact plotting function from Figure 9 as implemented here.

For now, I wanted to get the core working and ready to merge; the plotting is somewhat independent of the rest. I'll add the check for when this class is passed to the current plotting function.

@romanlutz (Member):

> Regarding the plot, I can add that exact plotting function from Figure 9 as implemented here.
>
> For now, I wanted to get the core working and ready to merge; the plotting is somewhat independent of the rest. I'll add the check for when this class is passed to the current plotting function.

I completely agree.

Labels: API (Anything which touches on the API), enhancement (New feature or request)

Successfully merging this pull request may close these issues:

ENH relaxed fairness constraint fulfillment via postprocessing (currently only supports strict fulfillment)