Fix segmented custom performance/output metrics #1518

richard-rogers · 2024-05-10T23:35:03Z

Description

Tag custom performance/output metrics for segmented profiles.

Changes

Fix tagging logic to handle segmented cases
I have reviewed the Guidelines for Contributing and the Code of Conduct.

richard-rogers · 2024-05-10T23:38:19Z

python/whylogs/api/writer/whylabs.py

+            self._tag_custom_perf_metrics(file)
+            self._tag_custom_output_metrics(file)


Suggested change

-            self._tag_custom_perf_metrics(file)
-            self._tag_custom_output_metrics(file)

I think the tagging happens recursively, so these aren't needed

richard-rogers · 2024-05-10T23:40:05Z

python/whylogs/api/writer/whylabs.py

-                            classifier="output", data_type=data_type, discreteness=discreteness  # type: ignore
-                        )
-                        self._set_column_schema(column_name, column_schema=column_schema)
+    def _tag_custom_output_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView, ResultSet]):


Suggested change

-    def _tag_custom_output_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView, ResultSet]):
+    def _tag_custom_output_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView]):

I don't think we need to handle ResultSet here

richard-rogers · 2024-05-10T23:40:22Z

python/whylogs/api/writer/whylabs.py

-                    if column_name.startswith(perf_col):
-                        metric = KNOWN_CUSTOM_PERFORMANCE_METRICS[perf_col]
-                        self.tag_custom_performance_column(column_name, default_metric=metric)
+    def _tag_custom_perf_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView, ResultSet]):


Suggested change

-    def _tag_custom_perf_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView, ResultSet]):
+    def _tag_custom_perf_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView]):

richard-rogers · 2024-05-10T23:41:02Z

python/whylogs/api/writer/whylabs.py

@@ -132,6 +132,26 @@ def _check_whylabs_condition_count_uncompound() -> bool:
    return True


+def _get_column_names(x: Union[DatasetProfileView, SegmentedDatasetProfileView, ResultSet]) -> Set[str]:


I don't think this needs to handle ResultSet
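
For readers without the full diff, here is a minimal sketch of what a helper like `_get_column_names` might look like once the `ResultSet` case is dropped. `PlainView`, `SegmentedView`, and their fields are hypothetical stand-ins, not the real whylogs classes, and the body of the actual helper is not shown in this PR excerpt.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Union


@dataclass
class PlainView:
    """Hypothetical stand-in for an unsegmented profile view."""

    columns: Dict[str, object] = field(default_factory=dict)


@dataclass
class SegmentedView:
    """Hypothetical stand-in for a segmented view wrapping a plain view."""

    segment_key: str
    profile_view: PlainView


def get_column_names(view: Union[PlainView, SegmentedView]) -> Set[str]:
    # Unwrap the segmented case first so both inputs share one code path.
    if isinstance(view, SegmentedView):
        view = view.profile_view
    # Iterating a dict yields its keys, i.e. the column names.
    return set(view.columns)
```

With only two input types, a single `isinstance` check is enough; a third branch for result sets would add a case the callers apparently never exercise.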

jamie256 · 2024-05-11T00:35:42Z

python/whylogs/api/writer/whylabs.py

-                        self._set_column_schema(column_name, column_schema=column_schema)
+    def _tag_custom_output_metrics(self, view: Union[DatasetProfileView, SegmentedDatasetProfileView, ResultSet]):
+        column_names = _get_column_names(view)
+        for column_name in column_names:


I guess the only reason we do this is to get the correct value for k; otherwise the column names are already known.
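
A rough sketch of the prefix matching this comment refers to, modeled on the `column_name.startswith(perf_col)` check in the removed lines. The prefix table and the column names below are assumptions for illustration only; the real contents of `KNOWN_CUSTOM_PERFORMANCE_METRICS` are not shown in this PR excerpt.

```python
from typing import Dict, Iterable

# Hypothetical prefix -> default metric table (illustrative values only).
KNOWN_PREFIXES: Dict[str, str] = {
    "accuracy_k_": "mean",
    "top_rank": "mean",
}


def match_custom_metrics(
    column_names: Iterable[str],
    known: Dict[str, str] = KNOWN_PREFIXES,
) -> Dict[str, str]:
    """Map each column whose name starts with a known prefix to its metric."""
    matches: Dict[str, str] = {}
    for name in sorted(column_names):
        for prefix, metric in known.items():
            if name.startswith(prefix):
                # The suffix after the prefix (e.g. the k in "accuracy_k_3")
                # is only known once the actual column names are scanned.
                matches[name] = metric
                break
    return matches
```

Because the suffix varies per profile, the column names have to be read from the view rather than derived from the table alone, which matches the reasoning in the comment above.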

jamie256

LGTM!

richard-rogers · 2024-05-14T02:54:31Z

Fixed better in 1.4.0

Fix segmented custom performance/output metrics

b775548

richard-rogers requested review from FelipeAdachi and jamie256 May 10, 2024 23:35

jamie256 reviewed May 11, 2024

jamie256 approved these changes May 11, 2024

richard-rogers closed this May 14, 2024
