Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ref: Rationalize Kafka topic config #65793

Merged
merged 48 commits into from Mar 4, 2024
Merged

ref: Rationalize Kafka topic config #65793

merged 48 commits into from Mar 4, 2024

Conversation

lynnagara
Copy link
Member

@lynnagara lynnagara commented Feb 26, 2024

This brings the concept of logical vs physical topic override to Sentry, which is fairly closely aligned with how topic management is done in Snuba.

Both the old way (KAFKA_TOPICS) and new way (KAFKA_TOPIC_OVERRIDES and KAFKA_TOPIC_TO_CLUSTER) will be supported until we have fully migrated over in production and all client code in this repository no longer reference the old variables.

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Feb 26, 2024
@lynnagara lynnagara requested a review from a team as a code owner February 27, 2024 01:24
@@ -1,7 +1,39 @@
from __future__ import annotations

from enum import Enum
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we call this kafka_topic_definition ?

from typing import TypedDict


class Topic(Enum):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice I like this.

Comment on lines 3461 to 3465
# Mapping of default Kafka topic name to custom names
KAFKA_TOPIC_OVERRIDES: Mapping[str, str] = {}

# Mapping of default Kafka topic name to broker config
KAFKA_TOPIC_TO_CLUSTER: Mapping[str, Mapping[str, TopicDefinition | None]] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In snuba these are called KAFKA_TOPIC_MAP and KAFKA_BROKER_CONFIG.
I think there would be value in consolidate the naming as well. It reduces the cognitive overhead when managing moving from one config (snuba) to the other (sentry).

Any chance we can name them in the same way? Even if this naming is more appropriate than the Snuba one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intentionally named it differently because it is not the same as Snuba at all. Here we map the topic to a cluster definition. Snuba does not go through any cluster definition and all the topic configs are mapped to the broker config. I think we might want to align them in the future but I'm not yet sure which way is better, and I want to avoid making this huge change in this PR right now.

src/sentry/conf/server.py Outdated Show resolved Hide resolved
src/sentry/conf/types/consumer_definition.py Outdated Show resolved Hide resolved
src/sentry/consumers/__init__.py Outdated Show resolved Hide resolved
src/sentry/utils/batching_kafka_consumer.py Outdated Show resolved Hide resolved
src/sentry/consumers/__init__.py Outdated Show resolved Hide resolved
Comment on lines 3474 to 3476
# When OUTCOMES_BILLING is None, it inherits from OUTCOMES and does not
# create a separate producer. Check ``track_outcome`` for details.
KAFKA_OUTCOMES_BILLING: None,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved this part into #65998 as it can be applied independently (and more safely) than the rest of this PR

Copy link
Contributor

@fpacifici fpacifici left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remembering how complicated this migration was in Snuba it will be critical to review the rollout plan. What is your strategy and which steps will you go through?
Also how are we going to deal with self hosted configuration ? As people may be using the old override system, we should make it clear it is deprecated and it is going to go away. In Snuba, if I remember correctly, we had loud warning log message claiming the config was deprecated.


# Mapping of default Kafka topic name to custom names
KAFKA_TOPIC_OVERRIDES: Mapping[str, str] = {
"generic-metrics-subscription-results": KAFKA_GENERIC_METRICS_SUBSCRIPTIONS_RESULTS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not following why we need to set up the override this way for one topic only. Could you please provide some more context ?

src/sentry/conf/server.py Show resolved Hide resolved
import click


class Topic(Enum):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be useful to add a docstring here explaining that these are the logical topics. This is because here is where people add topics, this is being used by many.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added docstring. There is also a test now that asserts that these same values are present in sentry-kafka-schemas

src/sentry/conf/types/kafka_definition.py Show resolved Hide resolved
Comment on lines +46 to +50
# XXX: Eventually only Topic will be accepted here.
# For backward compatibility with getsentry, we must also
# support the physical override topic name (str, Callable[str], str)
# while the migration is taking place
topic: Required[Topic | str | Callable[[], str]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So does this mean that this will be the logical name only ? If yes where does the override happens ?

Copy link
Member Author

@lynnagara lynnagara Mar 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

override happens in settings and nowhere else, it will be fully removed from consumer definitions, and as a follow up all code will stop referencing those override values directly.

@@ -237,8 +239,8 @@ def ingest_events_options() -> list[click.Option]:
},
},
"generic-metrics-subscription-results": {
"topic": settings.KAFKA_GENERIC_METRICS_SUBSCRIPTIONS_RESULTS,
"default_topic": "generic-metrics-subscription-results",
"topic": Topic.GENERIC_METRICS_SUBSCRIPTIONS_RESULTS,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any specific reason this is the only topic being set via the Topic enum ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll give the same answer as above - two reasons:

  • demonstration of how this works
  • it is the only topic that has schema validation turned on. In order to keep that happening, we need to know both the default and override topic there so it must use the new system.

real_topic = settings.KAFKA_TOPIC_OVERRIDES.get(default_topic, default_topic)
else:
# TODO: Deprecated, remove once this way is no longer used
if not isinstance(consumer_topic, str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition seems counterintuitive. Was that any reason not to phrase it positively as:

if isinstance(consumer_topic, str):
    real_topic = consumer_topic
else:
   ....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not write this code, I copied it from line 398 above, and don't want to refactor it unnecessarily given it will probably go away in a couple of days.

from sentry.consumers import KAFKA_CONSUMERS
from sentry.testutils.cases import TestCase


def test_topic_definition() -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also have a test to verify that all the topics in KAFKA_TOPIC_OVERRIDES are actually in the Topic enum.

Copy link
Member Author

@lynnagara lynnagara Mar 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is always empty/default in dev/ci so it won't do much. But I added it anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh right, I guess this is more for getsentry

@lynnagara
Copy link
Member Author

lynnagara commented Mar 1, 2024

Remembering how complicated this migration was in Snuba it will be critical to review the rollout plan. What is your strategy and which steps will you go through? Also how are we going to deal with self hosted configuration ? As people may be using the old override system, we should make it clear it is deprecated and it is going to go away. In Snuba, if I remember correctly, we had loud warning log message claiming the config was deprecated.

Believe it or not I actually don't think this is going to be a particularly complicated or long migration. There is nothing to do in self-hosted since we don't really provide/document any way to to override topic configs. Though technically possible, I highly doubt anyone has figured out how to set up overrides and synchronize them across our codebases.

The next step will be to populate all the clusters and overrides in getsentry. I will do it one environment at a time to be safe, probably starting with the single tenants..

Once all the configuration is populated via both the old-way and new-way in all deployments, then we can switch all the code paths to stop referencing the old variables, and after that delete them.

@fpacifici
Copy link
Contributor

There is nothing to do in self-hosted since we don't really provide/document any way to to override topic configs. Though technically possible, I highly doubt anyone has figured out how to set up overrides and synchronize them across our codebases.

We do provide a way to override sentry settings and it is in the public docs. That contains topics config. In Snuba we did take care of self hosted as we were used to receive github issues asking to allow the kafka settings customization (like authentication) which means people did customize the kafka config.

I'd really consider adding a simple log warning (after we migrated prod) to state the settings are deprecated. Then some time later remove them.

@lynnagara lynnagara merged commit b273b38 into master Mar 4, 2024
51 checks passed
@lynnagara lynnagara deleted the kafka-topics-cleanup branch March 4, 2024 17:39
Copy link

sentry-io bot commented Mar 4, 2024

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

  • ‼️ UnboundLocalError: cannot access local variable 'settings' where it is not associated with a value sentry.consumers in get_stream_processor View Issue

Did you find this useful? React with a 👍 or 👎

@wedamija wedamija added the Trigger: Revert add to a merged PR to revert it (skips CI) label Mar 4, 2024
@getsentry-bot
Copy link
Contributor

PR reverted: 369b089

getsentry-bot added a commit that referenced this pull request Mar 4, 2024
This reverts commit b273b38.

Co-authored-by: wedamija <6288560+wedamija@users.noreply.github.com>
lynnagara added a commit that referenced this pull request Mar 4, 2024
lynnagara added a commit that referenced this pull request Mar 4, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Mar 20, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Scope: Backend Automatically applied to PRs that change backend components Trigger: Revert add to a merged PR to revert it (skips CI)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants