AssetCheckSummaryRecord #21940

dpeng817 · 2024-05-17T16:48:38Z

Adds an "AssetCheckState" API; which functions similarly to an AssetRecord. The core conceit is to be able to query just based on asset check key a set of information about the "state" of the asset check (most recent run, for now), without having any references to the event log.

For now, it's actually powered by the event log, however.

How I tested this

Added a new semi-end-to-end test to our event log storage tests. This required me setting up an instance hooked up to an event log storage; which I think is nice to have anyway.
There's one failing test here that's confusing me; test_get_event_records_sqlite. I changed it to use the instance passed in by fixture, which should in theory not change the results? But it seems like it has. Probably missing something here.

There will be a follow up cloud pr.

dpeng817 · 2024-05-17T16:48:55Z

Freshness sensor fix instance method + in-progress checks handling #22013
AssetCheckSummaryRecord #21940 👈
master

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @dpeng817 and the rest of your teammates on Graphite

prha

Some more cosmetic nits... I think the storage fixtures are confusing.

Bigger issue is to get a better handle on the API signature. I think @schrockn voiced some reservations about the "state" terminology. We can do the conservative thing and maybe do get_latest_asset_check_info or something?

python_modules/dagster/dagster/_core/instance/__init__.py

prha · 2024-05-20T21:39:24Z

python_modules/dagster/dagster/_core/instance/__init__.py

@@ -164,6 +164,7 @@
    from dagster._core.storage.daemon_cursor import DaemonCursorStorage
    from dagster._core.storage.event_log import EventLogStorage
    from dagster._core.storage.event_log.base import (
+        AssetCheckState,


will defer to @schrockn on AssetCheckState vs LatestAssetCheckInfo.

prha · 2024-05-20T21:41:27Z

python_modules/dagster/dagster_tests/storage_tests/utils/event_log_storage.py

@@ -452,6 +454,13 @@ def event_log_storage(self, request):
    def instance(self, request) -> Optional[DagsterInstance]:
        return None

+    @pytest.fixture
+    def end_instance(self, instance):


this is confusing to me... Why are they different?

See the internal PR; but the essential thing is that in the graphql event log storage tests, the "instance" fixture is expected to be the host instance, rather than the user code instance. So we need a separate instance for the "user code instance" for the cases where they are different.

I think the reason this is confusing is that we are presumably testing the event log storage implementation.

But we don't make use of the fixtured storage at all in the test. We're effectively testing the implementation of DagsterInstance rather than any particular implementation of the EventLogStorage.

Instead, I would expect to see the following:

def test_get_asset_check_state(self, storage: EventLogStorage): ... events = _synthesize_events(...) for event in events: storage.store_event(event) latest_check_info = storage.get_latest_check_info(...) # assert things about the latest check info here

This way, we don't have to touch the instance at all, don't have to be aware that there is a storage class on the instance or anything.

The nice thing about this is that we're really just testing the relationship between store_event and get_asset_check_state / get_latest_check_info / whatever we decide to call this.

schrockn · 2024-05-21T14:36:55Z

Idea: We could call these "summary records". It is a rollup of information about a particular key at a given point in time.

One use case I am imagining is generating a log of these over time, so you could understand why the system made decisions when it did.

"State record" could also work. But in general I like the notion of also thinking of these as a varietal of record.

addressed

prha

Okay if the internal/oss test suites pass.

Adds an "AssetCheckState" API; which functions similarly to an AssetRecord. The core conceit is to be able to query just based on asset check key a set of information about the "state" of the asset check (most recent run, for now), without having any references to the event log. For now, it's actually powered by the event log, however. How I tested this - Added a new semi-end-to-end test to our event log storage tests. This required me setting up an instance hooked up to an event log storage; which I think is nice to have anyway. - There's one failing test here that's confusing me; test_get_event_records_sqlite. I changed it to use the instance passed in by fixture, which should in theory not change the results? But it seems like it has. Probably missing something here. There will be a follow up cloud pr.

dpeng817 force-pushed the dpeng817/asset_state_api branch from 9ee530c to 5347d1a Compare May 17, 2024 22:04

dpeng817 requested review from prha and sryza and removed request for prha May 17, 2024 22:06

dpeng817 changed the title ~~Asset state instance API~~ AssetCheckState May 17, 2024

dpeng817 force-pushed the dpeng817/asset_state_api branch from 5347d1a to 5fef025 Compare May 18, 2024 00:13

dpeng817 requested a review from johannkm May 20, 2024 21:38

prha previously requested changes May 20, 2024

View reviewed changes

dpeng817 changed the title ~~AssetCheckState~~ AssetCheckSummaryRecord May 21, 2024

dpeng817 force-pushed the dpeng817/asset_state_api branch from 5fef025 to ba974e4 Compare May 21, 2024 21:16

Asset state instance API

86ac348

dpeng817 force-pushed the dpeng817/asset_state_api branch from ba974e4 to 86ac348 Compare May 21, 2024 22:41

dpeng817 mentioned this pull request May 21, 2024

Freshness sensor fix instance method + in-progress checks handling #22013

Merged

prha self-requested a review May 21, 2024 23:39

prha approved these changes May 21, 2024

View reviewed changes

dpeng817 merged commit 50a8675 into master May 22, 2024
1 check failed

dpeng817 deleted the dpeng817/asset_state_api branch May 22, 2024 15:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AssetCheckSummaryRecord #21940

AssetCheckSummaryRecord #21940

dpeng817 commented May 17, 2024 •

edited

dpeng817 commented May 17, 2024 •

edited

prha left a comment

prha May 20, 2024

prha May 20, 2024

dpeng817 May 20, 2024

prha May 20, 2024

prha May 20, 2024

schrockn commented May 21, 2024

prha left a comment

AssetCheckSummaryRecord #21940

AssetCheckSummaryRecord #21940

Conversation

dpeng817 commented May 17, 2024 • edited

dpeng817 commented May 17, 2024 • edited

prha left a comment

Choose a reason for hiding this comment

prha May 20, 2024

Choose a reason for hiding this comment

prha May 20, 2024

Choose a reason for hiding this comment

dpeng817 May 20, 2024

Choose a reason for hiding this comment

prha May 20, 2024

Choose a reason for hiding this comment

prha May 20, 2024

Choose a reason for hiding this comment

schrockn commented May 21, 2024

prha left a comment

Choose a reason for hiding this comment

dpeng817 commented May 17, 2024 •

edited

dpeng817 commented May 17, 2024 •

edited