Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix segmented custom performance/output metrics #1518

Closed
wants to merge 1 commit into from

Conversation

richard-rogers
Copy link
Contributor

Description

Tag custom performance/output metrics for segmented profiles.

Changes

Comment on lines +777 to +778
self._tag_custom_perf_metrics(file)
self._tag_custom_output_metrics(file)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self._tag_custom_perf_metrics(file)
self._tag_custom_output_metrics(file)

I think the tagging happens recursively, so these aren't needed

classifier="output", data_type=data_type, discreteness=discreteness # type: ignore
)
self._set_column_schema(column_name, column_schema=column_schema)
def _tag_custom_output_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView, ResultSet]):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def _tag_custom_output_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView, ResultSet]):
def _tag_custom_output_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView]):

I don't think we need to handle ResultSet here

if column_name.startswith(perf_col):
metric = KNOWN_CUSTOM_PERFORMANCE_METRICS[perf_col]
self.tag_custom_performance_column(column_name, default_metric=metric)
def _tag_custom_perf_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView, ResultSet]):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def _tag_custom_perf_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView, ResultSet]):
def _tag_custom_perf_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView]):

@@ -132,6 +132,26 @@ def _check_whylabs_condition_count_uncompound() -> bool:
return True


def _get_column_names(x: Union[DatasetProfileView, SegmentedDatasetProfileView, ResultSet]) -> Set[str]:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this needs to handle ResultSet

self._set_column_schema(column_name, column_schema=column_schema)
def _tag_custom_output_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView, ResultSet]):
column_names = _get_column_names(view)
for column_name in column_names:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the only reason we do this is so that we get the correct value for k, otherwise the column names are known.

Copy link
Contributor

@jamie256 jamie256 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@richard-rogers
Copy link
Contributor Author

Fixed better in 1.4.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants