Weird behavior of Smaller and Larger Models for same Text #42

Open
LaxmanSinghTomar opened this issue Feb 4, 2022 · 1 comment

LaxmanSinghTomar commented Feb 4, 2022

Hey! Thanks for this easy-to-get-started package. I was testing both the original and unbiased models on the following sentences:

doc_1 = "I don't know why people don't support Muslims and call them terrorists often. They are not."
doc_2 = "There is nothing wrong being in a lesbian. Everyone has feelings."

These are the toxicity scores the models produced:

[Screenshot: model_testing]

The original model, which is supposed to be biased, predicts doc_1 to be non-toxic, as it should, while the smaller unbiased model predicts it to be toxic.

Likewise, for doc_2, the prediction should ideally be non-toxic, and the original model (both smaller and larger), being biased, should predict it to be toxic. This is what it does:

[Screenshot: model_testing_2]

The smaller original model predicts toxic while the larger one does not. Can you explain what might cause the smaller and larger models to behave differently on the same text, for both the original and unbiased models?
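
For anyone reproducing this comparison, here is a minimal sketch of a side-by-side check. The `compare_models` helper, the 0.5 threshold, and the stub scorers are all hypothetical. Each scorer stands in for whatever call returns a toxicity score from a given model, and the numbers in the stubs are made up.

```python
from typing import Callable, Dict, List

# Hypothetical type: a function mapping a text to a toxicity score in [0, 1].
Scorer = Callable[[str], float]


def compare_models(
    scorers: Dict[str, Scorer], texts: List[str], threshold: float = 0.5
) -> List[Dict[str, object]]:
    """Score every text with every model and flag texts the models disagree on."""
    rows = []
    for text in texts:
        scores = {name: fn(text) for name, fn in scorers.items()}
        # The set holds True/False labels; two distinct labels means disagreement.
        labels = {score >= threshold for score in scores.values()}
        rows.append({"text": text, "scores": scores, "disagree": len(labels) > 1})
    return rows


if __name__ == "__main__":
    # Stub scorers with made-up numbers, only to show the output shape.
    stubs = {
        "small": lambda text: 0.7,
        "large": lambda text: 0.1,
    }
    for row in compare_models(stubs, ["example sentence"]):
        print(row)
```

Swapping the stubs for real model calls keeps the rest of the helper unchanged, and the `disagree` flag makes it easy to pick out texts like doc_1 and doc_2.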

LaxmanSinghTomar changed the title from "Weird behavior of Smaller and Larger Models for both Original and Unbiased Models" to "Weird behavior of Smaller and Larger Models for same Text" on Feb 4, 2022

laurahanu (Collaborator) commented

Hello, sorry for the late reply and thank you for this observation!

It is hard to draw meaningful conclusions from a few examples, but I would imagine the difference between the smaller and larger models is due to the smaller models' reduced capacity to learn more difficult examples, such as sentence negation.
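
One rough way to probe the negation point over more than a couple of sentences is sketched below. It assumes a hypothetical `negation_gap` helper and a stub scorer with made-up scores, and is not part of the package. The idea is that a model which ignores negation should show a gap near zero between a statement and its negated counterpart.

```python
from statistics import mean
from typing import Callable, List, Tuple

# Hypothetical type: a function mapping a text to a toxicity score in [0, 1].
Scorer = Callable[[str], float]


def negation_gap(scorer: Scorer, pairs: List[Tuple[str, str]]) -> float:
    """Mean drop in score from each statement to its negated counterpart."""
    return mean(scorer(plain) - scorer(negated) for plain, negated in pairs)


if __name__ == "__main__":
    # Stub that scores on a keyword alone and therefore ignores negation.
    def keyword_only(text: str) -> float:
        return 0.9 if "awful" in text else 0.1

    pairs = [("this is awful", "this is not awful")]
    print(negation_gap(keyword_only, pairs))
```

Running the same pairs through a smaller and a larger model and comparing the gaps would give a slightly firmer basis than individual examples, though the pairs themselves would need to be chosen with care.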
