refactor: delegate config validation to Kafka Connect #130

Merged: @dalelane merged 3 commits into ibm-messaging:main from config-refactor on Feb 7, 2024

Conversation

@dalelane (Contributor) commented on Feb 6, 2024

Description

The connector was doing a lot of its own processing of config options: type-checking, null-checking, casting, and so on.

This makes the connector-specific aspects of the code harder to follow, and it also makes it easier for us to miss validating some required config options (for example, we weren't checking that topic names were provided).

This commit moves all of this out of the connector and makes it the responsibility of the Kafka Connect framework. This removes a lot of unnecessary null-checking, type-checking, and casting from the connector code, as the source task and JMS reader class can assume that they will be given a validated config.

For example, repeated blocks of code like this:

boolean optionAsBool = false; // default value
String optionAsString = props.get(MQSourceConnector.SOME_OPTION_KEY);
if (optionAsString != null) {
    optionAsBool = Boolean.parseBoolean(optionAsString);
}

becomes:

boolean optionAsBool = options.getBoolean(MQSourceConnector.SOME_OPTION_KEY);
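
For Kafka Connect to perform that validation and conversion, each option has to be declared up front in the connector's ConfigDef with its type, default value, importance, and documentation. A minimal sketch of such a declaration (the key and documentation strings here are placeholders, not the connector's real values):

import org.apache.kafka.common.config.ConfigDef;

// Sketch only: the key and documentation strings are placeholders.
static final String SOME_OPTION_KEY = "some.option";
static final ConfigDef CONFIGDEF = new ConfigDef()
        .define(SOME_OPTION_KEY,
                ConfigDef.Type.BOOLEAN,
                false,                        // default value, replacing the hand-rolled default above
                ConfigDef.Importance.MEDIUM,
                "Documentation shown to users for this option");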

Note that I did have to leave the existing Map<String, String> parameter in the record builder, as that is a public interface that users have subclassed. This should ideally be a Kafka Connect config object rather than a raw map, but in the interest of not breaking existing subclasses, I've left it as a map. This does mean I'm converting the properties map to a config object twice, but as this only happens at startup I think the performance cost will be minimal (see the sketch below).
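
As a sketch of that double conversion (the variable names are illustrative; AbstractConfig and originalsStrings() are the real Kafka APIs used):

import org.apache.kafka.common.config.AbstractConfig;

// Validate and type-check the raw properties once against the connector's ConfigDef...
final AbstractConfig config = new AbstractConfig(MQSourceConnector.CONFIGDEF, props);

// ...then hand the record builder the same values back as a raw Map<String, String>,
// because configure(Map<String, String>) is public API that users subclass.
builder.configure(config.originalsStrings());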

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

None of these: no functional change should be observed.

How Has This Been Tested?

Existing tests verify that there are no regressions.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

Commit 1 of 3 (its message repeats the PR description above):

Signed-off-by: Dale Lane <dale.lane@uk.ibm.com>
 try {
     final Class<? extends RecordBuilder> c = Class.forName(builderClass).asSubclass(RecordBuilder.class);
     builder = c.newInstance();
-    builder.configure(props);
+    builder.configure(config.originalsStrings());
A contributor commented on this change:
Is there a reason to pass the strings rather than the config itself?

dalelane (Author) replied:

I put that in the commit message:

Note that I did have to leave the existing Map<String, String> parameter in the record builder as that is a public interface that users have subclassed. This should ideally be a Kafka Connect object rather than a raw map, but in the interest of not breaking existing subclasses, I've left it as a map. This does mean I'm converting the properties map to a config object twice, but as this is only at startup I think the performance cost will be minimal.

It is a bit clunky and frustrating: in an ideal world we wouldn't be passing untyped strings around, but short of a breaking change I felt we were stuck with it for now.

static {
    CONFIGDEF = new ConfigDef();

    CONFIGDEF.define(CONFIG_NAME_MQ_QUEUE_MANAGER,
A contributor commented on this change:
Is CONFIG_NAME useful as a prefix? Could we simply use the DOCUMENTATION and MESSAGE prefixes for the corresponding strings, and remove the CONFIG_NAME prefix altogether?

dalelane (Author) replied:

Same sort of thing here, I'm afraid: as these are public statics in a project that users are encouraged to develop against, I feel they can't be altered without a major semantic-version change.

I agree that they're not ideal as/where they are, though.
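
For context, a sketch of the pattern being discussed (the constant values and documentation text here are illustrative):

// Sketch only: illustrates why dropping the CONFIG_NAME prefix would be breaking.
// User subclasses may already reference these constants by name, e.g.
//     props.get(MQSourceConnector.CONFIG_NAME_MQ_QUEUE_MANAGER)
// so renaming them is a source-incompatible change for that code.
public static final String CONFIG_NAME_MQ_QUEUE_MANAGER = "mq.queue.manager";
public static final String CONFIG_DOCUMENTATION_MQ_QUEUE_MANAGER =
        "The name of the MQ queue manager."; // illustrative documentation text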

Commit 2 of 3:

I suspect this was accidentally added as part of resolving a merge conflict, but it shouldn't be here.

Signed-off-by: Dale Lane <dale.lane@uk.ibm.com>
Commit 3 of 3:

We use the latest-tag MQ image in our integration tests. In December, MQ's container image removed the no-auth SVRCONN channel, which breaks the connector tests that try to connect to MQ without credentials.

This commit adds a custom MQSC script that configures the queue manager to restore the previous behaviour; a sketch of how such a script can be applied follows this message.

The MQSourceTaskAuthIT tests still cover connecting to a queue manager with auth credentials, so we continue to test connecting both with and without credentials.

Signed-off-by: Dale Lane <dale.lane@uk.ibm.com>
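
For illustration, a sketch of how such a script might be wired into the tests, assuming the integration tests use Testcontainers and that the MQ image executes MQSC files placed under /etc/mqm at queue-manager startup (the image reference and file names here are illustrative):

import org.testcontainers.containers.GenericContainer;
import org.testcontainers.utility.DockerImageName;
import org.testcontainers.utility.MountableFile;

// Sketch: copy a custom MQSC script into the queue manager container so it
// runs at startup and restores the no-auth SVRCONN channel for the tests.
final GenericContainer<?> mqContainer =
        new GenericContainer<>(DockerImageName.parse("icr.io/ibm-messaging/mq:latest"))
                .withEnv("LICENSE", "accept")
                .withEnv("MQ_QMGR_NAME", "QM1")
                .withCopyFileToContainer(
                        MountableFile.forClasspathResource("no-auth-qmgr.mqsc"), // illustrative name
                        "/etc/mqm/99-no-auth.mqsc")
                .withExposedPorts(1414);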

@jhughes24816 (Contributor) left a review:

LGTM thanks!

@dalelane merged commit 6d2b939 into ibm-messaging:main on Feb 7, 2024
2 checks passed
@dalelane deleted the config-refactor branch on February 7, 2024 at 20:23