feat(release-health): Sessions should not be default #67353

lynnagara · 2024-03-20T17:21:32Z

Sessions no longer exists; the metrics pipeline should be used for release health. Flip the default, as all environments have been cut over.

evanpurkhiser · 2024-03-20T17:31:55Z

Can you add a scope to the PR title?

lynnagara · 2024-03-20T17:40:28Z

> Can you add a scope to the PR title?

done

jjbayer

Thanks!

jjbayer · 2024-03-20T17:55:24Z

tests/snuba/rules/conditions/test_event_frequency.py

@@ -21,7 +21,7 @@
 from sentry.testutils.skips import requires_snuba
 from sentry.utils.samples import load_data

-pytestmark = [requires_snuba]
+pytestmark = [pytest.mark.sentry_metrics, requires_snuba]


What does this do?

any attempt to access metrics in tests seems to fail without it

codecov · 2024-03-20T18:30:20Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.40%. Comparing base (7c89c55) to head (16142cf).
Report is 23 commits behind head on master.

❗ Current head 16142cf differs from pull request most recent head 55401ea. Consider uploading reports for the commit 55401ea to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #67353      +/-   ##
==========================================
- Coverage   79.43%   79.40%   -0.04%     
==========================================
  Files        6372     6373       +1     
  Lines      282294   282347      +53     
  Branches    48667    48679      +12     
==========================================
- Hits       224247   224202      -45     
- Misses      57676    57774      +98     
  Partials      371      371

Files	Coverage Δ
src/sentry/conf/server.py	`89.63% <100.00%> (ø)`
src/sentry/release_health/metrics.py	`95.55% <100.00%> (ø)`
src/sentry/testutils/pytest/metrics.py	`92.40% <ø> (-2.60%)`	⬇️

... and 23 files with indirect coverage changes

markstory · 2024-03-20T18:51:45Z

src/sentry/conf/server.py

-SENTRY_RELEASE_HEALTH = "sentry.release_health.sessions.SessionsReleaseHealthBackend"
+SENTRY_RELEASE_HEALTH = "sentry.release_health.metrics.MetricsReleaseHealthBackend"
 SENTRY_RELEASE_HEALTH_OPTIONS: dict[str, Any] = {}

 # Release Monitor
-SENTRY_RELEASE_MONITOR = (
-    "sentry.release_health.release_monitor.sessions.SessionReleaseMonitorBackend"
-)
+SENTRY_RELEASE_MONITOR = "sentry.release_health.release_monitor.metrics.MetricReleaseMonitorBackend"


Will this cause data loss in self-hosted? I don't remember if self-hosted ever had release health but if they did, they will lose access to that data now.

Yeah I'm not sure we ever put a proper migration in place for self hosted. We should make sure that self hosted keeps using the old backend until we do that.

No this won't, self-hosted has already cut over to use metrics release health.
https://github.com/getsentry/self-hosted/blob/b3d3ce06da1661eca62a5d1fd5112810d6bbd117/sentry/sentry.conf.example.py#L195

Did we actually backfill release health from sessions over to metrics? lgtm if so

We had a period where we dual-wrote to the sessions and metrics datasets. Then, after 3 months (~90 days), we cut over to reading from the metrics dataset.

How does that work in self-hosted? What if they come from a version that had sessions and go straight to the latest release, which uses metrics? How do you enforce a 90-day dual write there?

Unfortunately, we weren't able to provide a way to smoothly accommodate that scenario. We gave self-hosted users as many warnings as we could through documentation and release notes, months in advance of cutting over to metrics.

Ok, I don't think this is a good way to handle situations like this going forward. We could have provided a dual write backend that logs a date, dual writes, and starts reading from sessions going forward. Another option could have been to write a backfill that is only applied in self hosted. We really shouldn't be making self hosted a second class citizen like this.

Yeah, I agree with you there. There are definitely ways that this could've gone better but it also feels like a part of the broader issue that teams are shipping features without keeping self-hosted in mind.

cc @chadwhitacre
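
To make the dual-write idea from this thread concrete, here is a minimal sketch of a backend that records when dual writing began, writes to both stores, and only switches reads once a cutover window has elapsed. Everything in it is hypothetical: `InMemoryStore`, `DualWriteBackend`, and the 90-day default are illustrative stand-ins, not Sentry classes, and reading from the new store after the window is one interpretation of the suggestion.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for the sessions or metrics dataset."""

    rows: list = field(default_factory=list)

    def write(self, row):
        self.rows.append(row)

    def count(self):
        return len(self.rows)


@dataclass
class DualWriteBackend:
    """Hypothetical backend: write to both stores, switch reads after a window."""

    old: InMemoryStore
    new: InMemoryStore
    dual_write_started: datetime  # logged the first time dual writing is enabled
    window: timedelta = timedelta(days=90)  # assumed cutover window

    def record(self, row):
        # Every write goes to both stores while the old one is still in use.
        self.old.write(row)
        self.new.write(row)

    def count(self, now):
        # Read from the old store until the new one has a full window of data.
        cut_over = now - self.dual_write_started >= self.window
        return (self.new if cut_over else self.old).count()


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
old_store = InMemoryStore(rows=["s1", "s2"])  # history that only the old store has
backend = DualWriteBackend(old=old_store, new=InMemoryStore(), dual_write_started=start)
backend.record("s3")

print(backend.count(start + timedelta(days=1)))   # still served by the old store
print(backend.count(start + timedelta(days=90)))  # served by the new store
```

Because the start date is stored per installation, a self-hosted instance upgrading from any older version would begin its own window at upgrade time rather than relying on a global cutover date.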

hubertdeng123

seems fine to me for self-hosted, we're already using the new default settings
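
For context on why flipping a string in `server.py` changes the backend: settings such as `SENTRY_RELEASE_HEALTH` hold a dotted import path that is resolved to a class at runtime. The sketch below shows that general pattern only. `split_path` and `resolve_backend` are hypothetical helpers, not Sentry's loader, and a standard-library class stands in for the backend so the example runs without Sentry installed.

```python
from importlib import import_module

# Value taken from the diff above; a deployment can override it in its own config.
SENTRY_RELEASE_HEALTH = "sentry.release_health.metrics.MetricsReleaseHealthBackend"


def split_path(dotted_path):
    """Split 'package.module.ClassName' into (module path, class name)."""
    module_path, _, attr = dotted_path.rpartition(".")
    return module_path, attr


def resolve_backend(dotted_path):
    """Hypothetical loader: import the module and fetch the named class."""
    module_path, attr = split_path(dotted_path)
    return getattr(import_module(module_path), attr)


print(split_path(SENTRY_RELEASE_HEALTH))

# Stand-in path so the sketch is runnable without Sentry on the import path.
backend_cls = resolve_backend("collections.OrderedDict")
print(backend_cls.__name__)
```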

lynnagara · 2024-03-20T22:57:48Z

@jjbayer @iker-barriocanal Do you want me to remove or fix these tests? I'm not sure they are actually relevant anymore since this code is all deprecated, and should actually be removed (not just made non-default)

jjbayer · 2024-03-21T07:33:30Z

> @jjbayer @iker-barriocanal Do you want me to remove or fix these tests? I'm not sure they are actually relevant anymore since this code is all deprecated, and should actually be removed (not just made non-default)

@lynnagara what tests exactly? After we get rid of the SessionsReleaseHealthBackend, we can also remove the @parametrize_backend helper, and make all test classes that use it go to the metrics backend instead.

sentry/tests/snuba/sessions/test_sessions.py

Lines 21 to 41 in 6030fda

    
def parametrize_backend(cls):
    """
    hack to parametrize test-classes by backend. Ideally we'd move
    over to pytest-style tests so we can use `pytest.mark.parametrize`, but
    hopefully we won't have more than one backend in the future.
    """
    assert isinstance(cls.backend, SessionsReleaseHealthBackend)
    newcls = type(
        f"{cls.__name__}MetricsLayer",
        (BaseMetricsTestCase, cls),
        {
            "__doc__": f"Repeat tests from {cls} with metrics layer",
            "backend": MetricsReleaseHealthBackend(),
            "adjust_interval": True,  # HACK interval adjustment for new MetricsLayer implementation
        },
    )
    globals()[newcls.__name__] = newcls
    return cls

lynnagara · 2024-04-01T19:23:46Z

@jjbayer are you sure that parametrize_backend is actually parametrizing anything? From what I can tell it doesn't seem to actually run any of the release health tests. When I turn them on they all fail.

lynnagara · 2024-04-01T19:48:04Z

tests/sentry/api/endpoints/test_organization_releases.py

@@ -167,6 +168,7 @@ def test_release_list_order_by_sessions_empty(self):
            response, [release_5, release_4, release_3, release_2, release_1]
        )

+    @pytest.mark.xfail(reason="Does not work with the metrics release health backend")


@wedamija fyi - is this actually expected to work with the release health backend? if not, i can remove it

Do we know why it doesn't work? I think it's probably testing something useful here that should work across backends

Likely because store_session is writing to snuba's sessions backend. I'm not really sure how to change it though, and would need someone more familiar with that code to take it up.

same for the sessions api tests fyi

This isn't a great situation:

It seems like we don't really have much (any?) coverage on many parts of release health

Meanwhile we are running tons of CI that isn't relevant at all and is frankly a total waste of CI hours, since it tests the old backend, which no longer exists but which we never got around to cleaning out of our codebase

@wedamija how do we get this fixed then?
There are serious risks associated with having the default backend be something that just doesn't run in production. We have to remember to override it in ops, self hosted and everywhere we deploy sentry or stuff will be broken. In the meantime, are we just pretending we have existing coverage and are "disabling" something here when we don't have any real coverage anyway?

I'm going to see if i can switch out to the old backend and preserve the tests at least

Ok that doesn't work. @iambriccardo do you have any idea how to do what dan suggests about migrating store_session?

Hi! Let me take a look. A teammate of mine did the conversion, I know partially about the whole domain but let me see if I can be of any help.

@lynnagara I fixed the problem and pushed the changes directly here. I hope that is fine by you.

jjbayer · 2024-04-02T09:29:25Z

> @jjbayer are you sure that parametrize_backend is actually parametrizing anything? From what I can tell it doesn't seem to actually run any of the release health tests. When I turn them on they all fail.

Yeah, @parametrize_backend inserts a "metrics" version of each test into the global scope, so when you run test_sessions.py you'll see

tests/snuba/sessions/test_sessions.py::SnubaSessionsTestMetricsLayer::test_basic_release_model_adoptions PASSED
tests/snuba/sessions/test_sessions.py::SnubaSessionsTest::test_basic_release_model_adoptions PASSED
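
The mechanism described here can be reproduced in isolation. The following is a simplified sketch of the same decorator trick, using hypothetical stand-in classes (`FakeSessionsBackend`, `FakeMetricsBackend`, `ReleaseHealthTest`) instead of the real backends and test base classes: the decorator builds a subclass with `type()` and registers it in the module's globals, so anything that collects classes from the module sees both variants.

```python
class FakeSessionsBackend:
    """Hypothetical stand-in for the sessions backend."""

    name = "sessions"


class FakeMetricsBackend:
    """Hypothetical stand-in for the metrics backend."""

    name = "metrics"


def parametrize_backend(cls):
    """Create a '<Name>MetricsLayer' subclass and expose it at module level."""
    newcls = type(
        f"{cls.__name__}MetricsLayer",
        (cls,),
        {
            "__doc__": f"Repeat tests from {cls.__name__} with metrics layer",
            "backend": FakeMetricsBackend(),
        },
    )
    # Registering the class in globals() is what makes it discoverable.
    globals()[newcls.__name__] = newcls
    return cls  # the decorated class itself is returned unchanged


@parametrize_backend
class ReleaseHealthTest:
    backend = FakeSessionsBackend()

    def backend_name(self):
        return self.backend.name


print(ReleaseHealthTest().backend_name())
print(ReleaseHealthTestMetricsLayer().backend_name())
```

The generated class inherits every test method from the original, which is why each test appears twice in the output quoted above, once per backend.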

iambriccardo · 2024-04-02T13:45:06Z

src/sentry/testutils/pytest/metrics.py

@@ -85,7 +85,7 @@ def new_create_snql_in_snuba(subscription, snuba_query, snql_query, entity_subsc
            is_performance_metrics = False
            is_metrics = False
            if isinstance(query.match, Entity):
-                is_performance_metrics = query.match.name.startswith("generic_metrics")
+                is_performance_metrics = query.match.name.startswith("generic")


This is not the best fix, but it was the easiest within the scope of this PR. We might want to improve this in the future.

Out of curiosity, how does this change fix the tests?

The acceptance tests were hitting some endpoints that query generic metrics via the entity key generic_org_metrics_counters, which is still generic metrics but failed the generic_metrics prefix check. For this reason I relaxed the condition, but a better solution would be to have some sort of mapping, or to infer the use case id from the query and mark it as performance if it's not sessions.
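
A small sketch of why the original prefix check missed this entity, and what the mapping alternative could look like. Only `generic_org_metrics_counters` is named in this thread; the other entity names and the `ENTITY_FAMILY` table are assumptions made for illustration, not the actual Snuba entity list.

```python
def is_performance_old(entity_name):
    # Original check: misses generic entities whose name has an extra segment.
    return entity_name.startswith("generic_metrics")


def is_performance_relaxed(entity_name):
    # Relaxed check from this PR: any entity starting with "generic".
    return entity_name.startswith("generic")


# Hypothetical explicit mapping (entity name -> dataset family).
ENTITY_FAMILY = {
    "metrics_counters": "sessions",
    "generic_metrics_counters": "generic",
    "generic_org_metrics_counters": "generic",
}


def is_performance_mapped(entity_name):
    # Unknown entities raise KeyError instead of being silently misclassified.
    return ENTITY_FAMILY[entity_name] != "sessions"


for name in ENTITY_FAMILY:
    print(
        name,
        is_performance_old(name),
        is_performance_relaxed(name),
        is_performance_mapped(name),
    )
```

The mapping version trades brevity for explicitness: a new entity has to be added to the table before it can be queried in tests, rather than being classified by whatever its name happens to start with.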

lynnagara

Thanks @iambriccardo for the fix. Unfortunately, there are 4 more modules/tests marked xfail in this PR. Could we apply something similar there?

lynnagara · 2024-04-02T16:44:47Z

tests/snuba/api/endpoints/test_organization_sessions.py

@@ -86,6 +86,7 @@ def adjust_end(end: datetime.datetime, interval: int) -> datetime.datetime:
    return end


+@pytest.mark.xfail(reason="Does not work with the metrics release health backend")


@iambriccardo any thoughts on this one?

codecov · 2024-04-03T12:00:43Z

Bundle Report

Changes will increase total bundle size by 37.0kB ⬆️

Bundle name	Size	Change
sentry-webpack-bundle-array-push	26.11MB	37.0kB ⬆️

iambriccardo · 2024-04-03T12:53:55Z

tests/snuba/rules/conditions/test_event_frequency.py

@@ -311,7 +311,7 @@ def make_session(i):
                duration=None,
                errors=0,
                # The line below is crucial to spread sessions throughout the time period.
-                started=received - i,
+                started=received - i - 1,


We had to do this because the new metrics-based implementation treats the end of the interval as non-inclusive. The tests use interval: x, which goes back x seconds from the current time, so the data has to fall strictly before the end of that interval to be counted.
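
The off-by-one can be seen with plain integers. This is a minimal sketch that assumes a half-open window `[now - interval, now)` with an inclusive start; the helper names and the numbers are illustrative and not taken from the test suite.

```python
def in_window(timestamp, now, interval):
    """Half-open window: start included, end excluded (assumed semantics)."""
    return now - interval <= timestamp < now


def count_sessions(started_at, now, interval):
    return sum(in_window(ts, now, interval) for ts in started_at)


received = 1_000  # "current time" in seconds, arbitrary for the example
interval = 5      # look back 5 seconds

# One session per second, as in make_session(i).
before_fix = [received - i for i in range(interval)]      # 1000, 999, ..., 996
after_fix = [received - i - 1 for i in range(interval)]   # 999, 998, ..., 995

print(count_sessions(before_fix, received, interval))  # 4: the session at `received` is excluded
print(count_sessions(after_fix, received, interval))   # 5: every session is inside the window
```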

feat: Sessions should not be default

804e0e4

Sessions does not even exist anymore, the metrics pipeline should be used for release health. Flip the default.

lynnagara requested review from a team and jjbayer March 20, 2024 17:21

github-actions bot added the "Scope: Backend" label (automatically applied to PRs that change backend components) Mar 20, 2024

lynnagara requested review from a team and hubertdeng123 March 20, 2024 17:22

lynnagara changed the title from "feat: Sessions should not be default" to "feat(release-health): Sessions should not be default" Mar 20, 2024

requires metrics

bea49a7

jjbayer approved these changes Mar 20, 2024

markstory reviewed Mar 20, 2024

hubertdeng123 approved these changes Mar 20, 2024

fix some tests

ca558be

Merge branch 'master' into sessions-is-not-default

482595b

update tests

ef22bc4

lynnagara commented Apr 1, 2024

more test stuff

b88e787

lynnagara requested review from a team as code owners April 1, 2024 20:21

yet more tests

d6184ea

one more

c86e019

iambriccardo added 2 commits April 2, 2024 09:08

Fix offset missing

5dabab4

Remove xfail

1ef5c0d

iambriccardo self-requested a review April 2, 2024 07:14

iambriccardo added 3 commits April 2, 2024 14:49

Try fix

16142cf

Try fix

62855c0

Try fix

55401ea

iambriccardo reviewed Apr 2, 2024

lynnagara commented Apr 2, 2024

Fix

929c1fa

Fix

7315406

iambriccardo reviewed Apr 3, 2024

xfail test

42ac0cc

lynnagara merged commit c4349cf into master Apr 3, 2024
49 checks passed

lynnagara deleted the sessions-is-not-default branch April 3, 2024 21:55

shellmayr pushed a commit that referenced this pull request Apr 10, 2024

feat(release-health): Sessions should not be default (#67353)

b04bc42

Sessions does not even exist anymore, the metrics pipeline should be used for release health. Flip the default as all environments have been cut over.

github-actions bot locked and limited conversation to collaborators Apr 19, 2024
