[BUG] KafkaOpenMetadataTopicConnector requires both producer and consumer config #7409

Open
juergenhemelt opened this issue Feb 14, 2023 · 5 comments
Assignees
Labels
bug Something isn't working pinned Keep open (do not time out)

Comments

@juergenhemelt

Existing/related issue?

odpi/egeria-connector-integration-lineage-event-driven-sample#42

Current Behavior

Whenever a connector is configured and a configuration for the producer and/or consumer is not passed, default values will be used as described here: https://egeria-project.org/connectors/resource/kafka-open-metadata-topic-connector/?h=kafka+op#default-properties-for-the-producer-and-consumer

If the connector is used only for reading and not for writing, a producer is unnecessary. But unless you configure both the consumer and the producer, the connector will not start. It checks the availability of the Kafka brokers for both the consumer and the producer, and if the producer is not configured it takes localhost:9092 as the default. Unless you have a local Kafka cluster, the startup will fail.

Expected Behavior

There should be a configuration option that tells the connector whether it is used for reading, writing or both. Depending on this option, the availability of the Kafka broker(s) should be checked only for the consumer config, only for the producer config, or for both.

Steps To Reproduce

I used the SampleLineageEventReceiverIntegrationConnector and configured only a consumer and not a producer for the embedded OpenMetadataTopicConnector.

Environment

- Egeria: 3.15
- OS: Linux
- Java: 11.x
- Browser (for UI issues):
- Additional connectors and integration:

Any Further Information?

Workaround is to configure both a consumer and a producer with the same Kafka properties.
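As a rough illustration of the workaround, the sketch below builds configuration properties in which the producer and the consumer share the same Kafka settings. The class name, the host name and the `producer`/`consumer` keys are assumptions made for this example and should be checked against the connector documentation linked above; only `bootstrap.servers` is a standard Kafka property.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; it is not part of Egeria. */
final class WorkaroundConfigSketch
{
    /**
     * Builds configuration properties that give the producer and the consumer
     * identical Kafka settings, so both broker checks point at a reachable cluster.
     * The "producer" and "consumer" keys are assumed names for this sketch.
     */
    static Map<String, Object> configurationProperties(String bootstrapServers)
    {
        Map<String, Object> kafkaProperties = new HashMap<>();
        kafkaProperties.put("bootstrap.servers", bootstrapServers);

        Map<String, Object> configurationProperties = new HashMap<>();
        // Separate copies so later changes to one side do not affect the other.
        configurationProperties.put("producer", new HashMap<>(kafkaProperties));
        configurationProperties.put("consumer", new HashMap<>(kafkaProperties));
        return configurationProperties;
    }

    public static void main(String[] args)
    {
        // "kafka.example.org:9092" is a placeholder broker address.
        System.out.println(configurationProperties("kafka.example.org:9092"));
    }
}
```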

@juergenhemelt juergenhemelt added bug Something isn't working triage New bug/issue which needs checking & assigning labels Feb 14, 2023
@planetf1 planetf1 added the message-bus message bus, async processessing, topics label Feb 21, 2023
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 20 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the no-issue-activity Issues automatically marked as stale because they have not had recent activity. label Apr 23, 2023
@planetf1 planetf1 removed the no-issue-activity Issues automatically marked as stale because they have not had recent activity. label Apr 24, 2023
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 20 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the no-issue-activity Issues automatically marked as stale because they have not had recent activity. label Jun 24, 2023
@mandy-chessell mandy-chessell self-assigned this Jun 25, 2023
@mandy-chessell mandy-chessell added pinned Keep open (do not time out) and removed no-issue-activity Issues automatically marked as stale because they have not had recent activity. message-bus message bus, async processessing, topics triage New bug/issue which needs checking & assigning labels Jun 25, 2023
@mandy-chessell
Contributor

I think this one should be pursued ...

@mandy-chessell
Contributor

The aim stated above is to minimize the administration effort. In addition, there is potentially a runtime performance improvement if we can avoid starting a producer or consumer unnecessarily.

Implementing this fix needs some thought on when the information is supplied that determines whether the Kafka topic connector should start either the producer or the consumer or both. The Kafka topic connector is just one implementation of the open metadata topic connector. Changes to the main runtime code need to work for other event bus technology too.

In general, we would expect configuration for topics to occur:

  • If the server is joining a cohort. There are three topics per cohort and the server is both sending and receiving events on each of these topics. Therefore both a producer and a consumer are always required.
  • If the server has access services enabled. Many of the access services support either an inTopic or an outTopic. The configuration stored in the access service configuration is used to configure both the client-side and the server-side of each topic. Therefore configuration for both the consumer and producer is required. However, the client-side and the server-side each use different instances of the open metadata topic connector and each instance will only need either a consumer or a producer because the events flow in one direction only for each topic.
  • If the server is using the Kafka Audit Log Destination Connector, audit log records will be sent outbound from the server. This connector only needs a producer.
  • If the server is running integration connectors that are using the open metadata topic connector to connect to topics from third party technology. In this case, only the integration connector developer knows whether they need the consumer, producer or both.
  • If the server is configured to send enterprise level cohort messages to a topic, the repository services only need a producer to send the information.

Most of the helper methods that configure topics in the configuration document use the event bus config to provide default values for the open metadata topic connector in use. Since this is used in multiple places (and not at runtime), it needs information for both the producer and the consumer.

The code that creates the open metadata topic connection always knows whether it is sending events, receiving events, or both. Therefore it should be possible to pass an architected configuration property that is part of the Open Metadata Topic Connector interface and that can be interpreted as appropriate by each implementation. This value would be set by the consuming code that knows which direction events are flowing. If the value is not set then it is assumed that events are flowing in both directions.

@mandy-chessell
Contributor

mandy-chessell commented Jun 26, 2023

Names for the new configuration property that any implementation of the OpenMetadataTopicConnector can choose to implement.

    public static final String  EVENT_DIRECTION_PROPERTY_NAME = "eventDirection";
    public static final String  EVENT_DIRECTION_INOUT   = "inOut";
    public static final String  EVENT_DIRECTION_OUT_ONLY = "outOnly";
    public static final String  EVENT_DIRECTION_IN_ONLY  = "inOnly";

These values are found in the OpenMetadataTopicProvider.
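A minimal sketch of how an implementation might interpret this property is shown below. The constants are copied from the comment above; the class and method names are hypothetical, and falling back to `inOut` for an unrecognised value is an assumption of this sketch (the thread only says that an unset value means both directions).

```java
import java.util.Map;

/** Hypothetical helper for illustration only; it is not the Egeria implementation. */
final class EventDirectionSketch
{
    // Copied from the constants quoted above.
    static final String EVENT_DIRECTION_PROPERTY_NAME = "eventDirection";
    static final String EVENT_DIRECTION_INOUT         = "inOut";
    static final String EVENT_DIRECTION_OUT_ONLY      = "outOnly";
    static final String EVENT_DIRECTION_IN_ONLY       = "inOnly";

    /** Returns the configured direction, or inOut when the property is missing. */
    static String resolveEventDirection(Map<String, ?> configurationProperties)
    {
        if (configurationProperties == null)
        {
            return EVENT_DIRECTION_INOUT;
        }

        Object value = configurationProperties.get(EVENT_DIRECTION_PROPERTY_NAME);

        if (EVENT_DIRECTION_IN_ONLY.equals(value) || EVENT_DIRECTION_OUT_ONLY.equals(value))
        {
            return (String) value;
        }

        // Unset means both directions; treating unknown values the same way is an assumption.
        return EVENT_DIRECTION_INOUT;
    }

    /** A producer is needed unless events only flow inbound. */
    static boolean needsProducer(String eventDirection)
    {
        return !EVENT_DIRECTION_IN_ONLY.equals(eventDirection);
    }

    /** A consumer is needed unless events only flow outbound. */
    static boolean needsConsumer(String eventDirection)
    {
        return !EVENT_DIRECTION_OUT_ONLY.equals(eventDirection);
    }

    public static void main(String[] args)
    {
        String direction = resolveEventDirection(Map.of(EVENT_DIRECTION_PROPERTY_NAME, EVENT_DIRECTION_IN_ONLY));
        System.out.println(direction + " producer=" + needsProducer(direction) + " consumer=" + needsConsumer(direction));
    }
}
```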

The KafkaOpenMetadataTopicProvider extends its recognized configuration properties:

        List<String>  recognizedPropertyNames = new ArrayList<>();

        recognizedPropertyNames.add(producerPropertyName);
        recognizedPropertyNames.add(consumerPropertyName);
        recognizedPropertyNames.add(serverIdPropertyName);
        recognizedPropertyNames.add(sleepTimeProperty);
        recognizedPropertyNames.add(OpenMetadataTopicProvider.EVENT_DIRECTION_PROPERTY_NAME);

        connectorType.setRecognizedConfigurationProperties(recognizedPropertyNames);

The Kafka implementation of the OpenMetadataTopicConnector considers the event direction when checking whether the brokers are up (that is, whether it looks in the producer properties, the consumer properties, or both for the broker addresses) and when starting the producer and consumer threads.
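The direction-aware broker check could look roughly like the sketch below. It is not the connector's actual code: the class and method names are hypothetical, and the sketch assumes the broker list is read from the standard Kafka `bootstrap.servers` property with `localhost:9092` as the fallback mentioned in the bug report.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration only; it is not the Egeria implementation. */
final class BrokerCheckSketch
{
    /**
     * Collects the broker addresses whose availability matters for the given
     * event direction ("inOnly", "outOnly", or anything else meaning both).
     */
    static Set<String> brokersToCheck(String         eventDirection,
                                      Map<String, ?> producerProperties,
                                      Map<String, ?> consumerProperties)
    {
        Set<String> brokers = new LinkedHashSet<>();

        if (!"inOnly".equals(eventDirection))
        {
            addBrokers(brokers, producerProperties);   // a producer will be started
        }
        if (!"outOnly".equals(eventDirection))
        {
            addBrokers(brokers, consumerProperties);   // a consumer will be started
        }
        return brokers;
    }

    /** Splits the comma-separated bootstrap.servers value, defaulting to localhost:9092. */
    private static void addBrokers(Set<String> brokers, Map<String, ?> kafkaProperties)
    {
        Object servers = (kafkaProperties == null) ? null : kafkaProperties.get("bootstrap.servers");
        String serverList = (servers == null) ? "localhost:9092" : servers.toString();

        for (String server : serverList.split(","))
        {
            String trimmed = server.trim();
            if (!trimmed.isEmpty())
            {
                brokers.add(trimmed);
            }
        }
    }

    public static void main(String[] args)
    {
        // Consumer-only connector: the unconfigured producer is ignored.
        System.out.println(brokersToCheck("inOnly", null, Map.of("bootstrap.servers", "kafka1:9092,kafka2:9092")));
    }
}
```

With `inOnly`, the unconfigured producer no longer contributes the `localhost:9092` default, which is the failure described in the original report.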

The configuration helper classes:

  • Set event direction to inOut for the cohort topics and the in-memory enterprise repository topic.
  • Set event direction to null for the access services (more later).
  • Set event direction to outOnly for the remote enterprise repository topic and the Kafka audit log destination.

The startup logic for the access services sets the event direction appropriately in the connection when it is about to create the connector (at which point it knows whether the topic is an inTopic or an outTopic and whether the connector is for the client side or the server side).
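One way to picture that decision is the sketch below. It assumes that an inTopic carries events into the server and an outTopic carries events out of it, so each side of a topic either only receives or only sends. The class, enum and method names are hypothetical and do not come from the Egeria code base.

```java
/** Hypothetical helper for illustration only; it is not the Egeria implementation. */
final class AccessServiceTopicDirectionSketch
{
    enum TopicKind { IN_TOPIC, OUT_TOPIC }
    enum Side      { CLIENT, SERVER }

    /**
     * Returns "inOnly" when this side of the topic only receives events and
     * "outOnly" when it only sends them.
     */
    static String eventDirectionFor(TopicKind topicKind, Side side)
    {
        // The server listens on its inTopic; clients listen on the server's outTopic.
        boolean receivesEvents = (topicKind == TopicKind.IN_TOPIC  && side == Side.SERVER)
                              || (topicKind == TopicKind.OUT_TOPIC && side == Side.CLIENT);

        return receivesEvents ? "inOnly" : "outOnly";
    }

    public static void main(String[] args)
    {
        for (TopicKind topicKind : TopicKind.values())
        {
            for (Side side : Side.values())
            {
                System.out.println(topicKind + " / " + side + " -> " + eventDirectionFor(topicKind, side));
            }
        }
    }
}
```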

This required a small change to the multi-tenant support because Data Engine OMAS is passing its InTopic connection in the parameter for the outTopic.
