Unify how a Subject is associated with a topic #720

Open
eliax1996 opened this issue Sep 20, 2023 · 0 comments

Issue Description

Current Status

Let's clarify the required definitions to understand the issue properly:

  1. Topic: This is a mechanism used by Kafka to group different messages.
  2. Schema: It's a description of the structure that a message must adhere to.
  3. Subject: An entity required by the Schema Registry to establish a relationship between a schema and a topic. It is used to check whether a specific message may be published to a topic.

The expected flow, given the entities introduced above, is as follows:

  1. We expect the user to create a schema.
  2. The user should register the schema for a specific topic under a designated Subject.
  3. The user should then produce a message to a specific topic using a certain Subject.
    • This implies verifying that the Subject is allowed/registered to publish to that topic.
    • It also involves verifying that the message structure complies with the schema associated with the Subject.
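
The flow above can be sketched with a toy in-memory registry. Everything here is hypothetical and only illustrates the two checks: `InMemoryRegistry`, `register`, and `can_publish` are made-up names, and a "schema" is reduced to a set of required field names rather than a real Avro/JSON/Protobuf schema.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryRegistry:
    """Toy stand-in for a schema registry (hypothetical, for illustration only)."""

    # subject -> required field names (a deliberately simplified "schema")
    schemas: dict[str, frozenset[str]] = field(default_factory=dict)
    # topic -> subjects allowed to publish to it
    topic_subjects: dict[str, set[str]] = field(default_factory=dict)

    def register(self, topic: str, subject: str, required_fields: set[str]) -> None:
        """Step 2 of the flow: register a schema for a topic under a subject."""
        self.schemas[subject] = frozenset(required_fields)
        self.topic_subjects.setdefault(topic, set()).add(subject)

    def can_publish(self, topic: str, subject: str, message: dict) -> bool:
        """Step 3 of the flow: run both checks before accepting a message."""
        # Check 1: the subject must be registered for the topic.
        if subject not in self.topic_subjects.get(topic, set()):
            return False
        # Check 2: the message must carry every field the schema requires.
        return self.schemas[subject].issubset(message.keys())


registry = InMemoryRegistry()
registry.register("orders", "orders-value", {"id", "amount"})
```

With this setup, a message is accepted only when both the topic/subject pairing and the message structure check out.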

To summarize the types of relationships we could have:

  • One topic could be associated with one or more subjects.
  • One subject could be associated with one or more topics.
  • One subject is always associated with one schema.

To calculate the subject deterministically, we need:

  1. The policy name (an enum representing how we calculate the subject from the provided inputs, listed here).
  2. The topic name.
  3. The record name (the namespace of the record).
  4. A prefix for the subject (used to differentiate between key and value schemas, particularly in the PROTOBUF case).
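
As a rough illustration, a deterministic computation from these four inputs could look like the sketch below. It is not this project's code: `SubjectPolicy` and `compute_subject` are hypothetical names, and the naming rules follow the convention commonly used with Confluent-style registries (`<topic>-key` / `<topic>-value`, `<record name>`, `<topic>-<record name>`). Note that this convention marks key/value with a suffix, whereas the list above speaks of a prefix, so treat the exact placement as an assumption.

```python
from enum import Enum
from typing import Optional


class SubjectPolicy(Enum):
    """Hypothetical enum mirroring the three strategies named in this issue."""

    TOPIC_NAME = "topic_name"
    RECORD_NAME = "record_name"
    TOPIC_RECORD_NAME = "topic_record_name"


def compute_subject(
    policy: SubjectPolicy,
    topic: str,
    record_name: Optional[str],
    role: str,
) -> str:
    """Derive a subject name; `role` is "key" or "value" (assumed marker)."""
    if role not in ("key", "value"):
        raise ValueError(f"role must be 'key' or 'value', got {role!r}")
    if policy is SubjectPolicy.TOPIC_NAME:
        # Depends only on the topic and the key/value role.
        return f"{topic}-{role}"
    if record_name is None:
        raise ValueError(f"{policy.value} requires a record name")
    if policy is SubjectPolicy.RECORD_NAME:
        # Depends only on the record name, so the schema itself is needed.
        return record_name
    # TOPIC_RECORD_NAME: depends on both the topic and the record name.
    return f"{topic}-{record_name}"
```

Only the first branch can be evaluated without looking at the schema, which is the property the rest of this issue relies on.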

Current Issue

Currently, we have only one policy to associate a Subject with a topic, known as topic_name. This strategy is used to calculate the subject for a given schema.

With this specific policy, the relationship between topic <-> schema <-> subject is bi-directional (i.e., these entities have a one-to-one relationship). Therefore, given the topic and the schema, we can automatically compute the associated subject.

We use this property to enable users to produce messages by providing only the schema ID as a parameter, instead of the entire schema. This allows us to retrieve the schema based on its ID, calculate the unique associated subject, and check if the subject is registered for the targeted schema.

This approach works well for all cases except for Protobuf. Currently, we query the database for the schema and check if it's associated with the targeted topic, without verifying whether it's registered as a key or a value schema (meaning a user can switch between key and value by simply using the schema body). This issue needs to be addressed.
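
One way to read the gap described above, as a minimal sketch: the lookup matches on the topic alone and ignores the key/value role. The data layout (`registrations` keyed by `(topic, role)`) and both function names are assumptions made for illustration, not the actual database schema or code.

```python
# (topic, role) -> schema IDs registered for that role; hypothetical layout.
Registrations = dict[tuple[str, str], set[int]]


def is_registered_for_topic(registrations: Registrations, topic: str, schema_id: int) -> bool:
    """Topic-only check: accepts the schema whichever role it was registered for."""
    return any(
        schema_id in ids
        for (registered_topic, _role), ids in registrations.items()
        if registered_topic == topic
    )


def is_registered_for_role(
    registrations: Registrations, topic: str, role: str, schema_id: int
) -> bool:
    """Stricter check: the schema must be registered for this topic AND this role."""
    return schema_id in registrations.get((topic, role), set())


registrations: Registrations = {
    ("orders", "key"): {1},
    ("orders", "value"): {2},
}
```

Here schema 1 (registered as a key schema) passes the topic-only check when used as a value, but is rejected once the role is part of the lookup.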

A longer-term design problem is that this property holds only when the strategy is topic_name. In the future, before producing a message from a schema ID alone, we need to check whether the subject can be computed directly or whether we also need the schema (as with the record_name or topic_record_name strategies).
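
A guard of this kind might be as small as the following sketch. `SubjectPolicy` and `subject_computable_without_schema` are hypothetical names; the only fact taken from this issue is that topic_name is the one strategy whose subject does not depend on the schema.

```python
from enum import Enum


class SubjectPolicy(Enum):
    """Hypothetical enum of the strategies named in this issue."""

    TOPIC_NAME = "topic_name"
    RECORD_NAME = "record_name"
    TOPIC_RECORD_NAME = "topic_record_name"


# Strategies whose subject depends only on the topic and the key/value role.
_TOPIC_ONLY_POLICIES = frozenset({SubjectPolicy.TOPIC_NAME})


def subject_computable_without_schema(policy: SubjectPolicy) -> bool:
    """Return True when a schema ID alone is enough to derive the subject."""
    return policy in _TOPIC_ONLY_POLICIES
```

A producer endpoint could call this first and fall back to loading the schema (to read its record name) when it returns False.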

A more general solution would be to formalize the Subject object and assign it an ID.
This way, even if the relationship between subject and schema is not unique, we can directly verify if the subject is allowed and if the message structure complies with the schema by retrieving the subject from the database and then obtaining the associated schema using the subject ID.
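
To make the proposal concrete, here is a minimal sketch of what a formalized Subject could look like. The field names, the `subjects` mapping standing in for the database, and `resolve_schema_id` are all assumptions for illustration; the issue does not prescribe a shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    """Hypothetical first-class Subject record with its own ID."""

    subject_id: int
    name: str
    schema_id: int  # one subject is always associated with one schema
    topics: frozenset[str]  # topics this subject may publish to


def resolve_schema_id(subjects: dict[int, Subject], subject_id: int, topic: str) -> int:
    """Look the subject up by ID, check the topic, and return its schema ID."""
    subject = subjects.get(subject_id)
    if subject is None:
        raise KeyError(f"unknown subject id {subject_id}")
    if topic not in subject.topics:
        raise PermissionError(f"subject {subject.name!r} is not registered for topic {topic!r}")
    return subject.schema_id


subjects = {7: Subject(7, "com.example.Order", 42, frozenset({"orders"}))}
```

Because the lookup starts from the subject ID, it works the same way for every strategy, including those where the subject cannot be derived from the topic alone.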
