New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CDF_DemographicParity for regression #563
base: main
Are you sure you want to change the base?
Conversation
2814b61
to
da395f0
Compare
@Acornagain can you provide more detail in the description about what "other" relevant components you're adding? Similarly, the description of your changes should go into the CHANGES.md file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some first comments.
Would be awesome if documentation could be added that
- describes what it's for
- explains some of the calculations
- shows how to use it (user guide for the website doc, just like all other moments)
We also definitely need tests before this can go in.
import numpy as np | ||
from itertools import repeat | ||
|
||
class CDF_DemographicParity(Moment): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure how I feel about the underscore in the class name
If nobody else cares I'm willing to let it go, though :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Steven told me that I should use this name, so I guess it is fine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not resolve it unless @MiroDudik and @riedgar-ms also like that name.
import numpy as np | ||
from itertools import repeat | ||
|
||
class CDF_DemographicParity(Moment): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this inherit from LossMoment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose not to make it inherited from LossMoment because the comments about LossMoment say that "Moment that can be expressed as weighted loss". It seems that CDF_DemographicParity does not fit into this.
Signed-off-by: Acorn <wanglj17@mails.tsinghua.edu.cn>
Signed-off-by: Acorn <wanglj17@mails.tsinghua.edu.cn>
Signed-off-by: Acorn <wanglj17@mails.tsinghua.edu.cn>
Signed-off-by: Acorn <wanglj17@mails.tsinghua.edu.cn>
Signed-off-by: Acorn <wanglj17@mails.tsinghua.edu.cn>
Signed-off-by: Acorn <wanglj17@mails.tsinghua.edu.cn>
…_moment.py Signed-off-by: Acorn <wanglj17@mails.tsinghua.edu.cn>
Signed-off-by: Acorn <wanglj17@mails.tsinghua.edu.cn>
…Parity and WeightedErrorRate Signed-off-by: Acorn <wanglj17@mails.tsinghua.edu.cn>
Signed-off-by: Acorn <wanglj17@mails.tsinghua.edu.cn>
Signed-off-by: Acorn <wanglj17@mails.tsinghua.edu.cn>
Signed-off-by: Acorn <wanglj17@mails.tsinghua.edu.cn>
Signed-off-by: Acorn <wanglj17@mails.tsinghua.edu.cn>
Signed-off-by: Acorn <wanglj17@mails.tsinghua.edu.cn>
0e304d8
to
acbdedc
Compare
|
||
'''for what we need here is augmented data. Hence to avoid unnecessary calculation, we use | ||
augmented data having been calculated in regression_moment here and directly return in | ||
the function load_data to suit the inferface of _lagrangian.''' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: inferface
Also please don't use '''
for comments. #
is fine, and we can do multiple lines in a row with #
as well.
This class doesn't seem related to regression_moment, so I'm a bit confused about what this comment is trying to tell me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is related to the regression_moment, for the init function for CDF_DemographicParity contains code self.objective = WeightedErrorRate()
. The _lagrangian passes the original x and y to the WeightedErrorRate, but what it really needs is the augmented version. I consider it is a little bit awkward if I recalculate the augmented version in the WeightedErrorRate, which means that this class should store a copy of the grids and loss.
…ss.py into loss.py
@Acornagain : ping. Sorry for dropping the ball on this. I can set aside some time in January to wrap this up. Are you still interested? |
The constraints' difference bound for constraints that are expressed | ||
as differences, also referred to as :math:`\\epsilon` in documentation. | ||
Default None | ||
grids: list of floats |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation here doesn't really tell me what grids are or what they're used for. Can we write a sentence or two about this? You can put it in the description of the constructor further up if it's too much in this arg description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added as an explanation to the parameter grids, "As explained in section 4.1 of 'Agarwal et al. (2019) https://arxiv.org/abs/1905.12843v1' continuous predictions are discretized to this set of grids." Does this sound more clear?
@MiroDudik are we planning to try to get this in at some point, or should we close the PR? |
Add CDF_DemographicParity and other relevant components to the fairlearn library.