-
Notifications
You must be signed in to change notification settings - Fork 117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix segmented custom performance/output metrics #1518
Conversation
self._tag_custom_perf_metrics(file) | ||
self._tag_custom_output_metrics(file) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self._tag_custom_perf_metrics(file) | |
self._tag_custom_output_metrics(file) |
I think the tagging happens recursively, so these aren't needed
classifier="output", data_type=data_type, discreteness=discreteness # type: ignore | ||
) | ||
self._set_column_schema(column_name, column_schema=column_schema) | ||
def _tag_custom_output_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView, ResultSet]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def _tag_custom_output_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView, ResultSet]): | |
def _tag_custom_output_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView]): |
I don't think we need to handle ResultSet
here
if column_name.startswith(perf_col): | ||
metric = KNOWN_CUSTOM_PERFORMANCE_METRICS[perf_col] | ||
self.tag_custom_performance_column(column_name, default_metric=metric) | ||
def _tag_custom_perf_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView, ResultSet]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def _tag_custom_perf_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView, ResultSet]): | |
def _tag_custom_perf_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView]): |
@@ -132,6 +132,26 @@ def _check_whylabs_condition_count_uncompound() -> bool: | |||
return True | |||
|
|||
|
|||
def _get_column_names(x: Union[DatasetProfileView, SegmentedDatasetProfileView, ResultSet]) -> Set[str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this needs to handle ResultSet
self._set_column_schema(column_name, column_schema=column_schema) | ||
def _tag_custom_output_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView, ResultSet]): | ||
column_names = _get_column_names(view) | ||
for column_name in column_names: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the only reason we do this is so that we get the correct value for k, otherwise the column names are known.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Fixed better in 1.4.0 |
Description
Tag custom performance/output metrics for segmented profiles.
Changes
Fix tagging logic to handle segmented cases
I have reviewed the Guidelines for Contributing and the Code of Conduct.