Is your feature request related to a problem? Please describe.
Currently only token classification and text classification are supported for NLP. However, there are important cases for text regression, for example:
CTR prediction for advertisements
sentiment magnitude prediction, e.g. GCP sentiment analysis predicts continuous values instead of classes
ordinal regression for texts, e.g. predicting number of stars from 1 to 5 based on review text
Describe the solution you'd like
Support for text regression, similar to tabular regression, but for NLP models, e.g. checking regression error distribution or train-test degradation for regression metrics.
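As a rough illustration of what such checks would compute, here is a minimal sketch in plain NumPy. It does not use any deepchecks API; the function names (`rmse`, `error_distribution_summary`, `train_test_degradation`) and the star-rating data are hypothetical and only meant to show the idea of summarizing the prediction error distribution and comparing a regression metric between train and test.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between labels and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def error_distribution_summary(y_true, y_pred):
    """Summary statistics of the signed errors (prediction minus label)."""
    errors = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    q1, median, q3 = np.percentile(errors, [25, 50, 75])
    return {
        "mean": float(errors.mean()),
        "std": float(errors.std()),
        "q1": float(q1),
        "median": float(median),
        "q3": float(q3),
    }


def train_test_degradation(train_error, test_error):
    """Relative increase of an error metric from train to test."""
    if train_error == 0:
        return float("inf") if test_error > 0 else 0.0
    return (test_error - train_error) / train_error


# Hypothetical data: star ratings (1 to 5) predicted from review text.
train_true = [5, 1, 4, 2]
train_pred = [4.5, 1.5, 4.0, 2.0]
test_true = [3, 5, 1, 4]
test_pred = [4.0, 4.0, 2.0, 4.0]

train_rmse = rmse(train_true, train_pred)
test_rmse = rmse(test_true, test_pred)
degradation = train_test_degradation(train_rmse, test_rmse)
summary = error_distribution_summary(test_true, test_pred)

print(f"train RMSE: {train_rmse:.3f}, test RMSE: {test_rmse:.3f}")
print(f"relative degradation: {degradation:.1%}")
print(f"test error distribution: {summary}")
```

Nothing in this computation depends on the input being tabular, since it only needs labels and predictions. That is why the existing regression metrics should carry over to text data with little change.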
noamzbr added the nlp and ds labels and removed the needs triage label on Jul 19, 2023
Specifically, the tabular regression checks that don't make sense for NLP are weak segments performance (since segments are not well defined for NLP), boosting overfit (since NLP models typically do not use boosting), and model inference time (which is naturally long for NLP).
From [NLP text classification quickstart]:
text property outliers
unknown tokens
under annotated property segments
under annotated metadata segments
text duplicates
special characters
Image regression could also be added in a very similar way (but that is outside the scope of this issue).