refactor: delegate config validation to Kafka Connect #130

Merged: @dalelane merged 3 commits into ibm-messaging:main from config-refactor on Feb 7, 2024

Conversation

@dalelane (Contributor) commented on Feb 6, 2024

Description

The connector was doing a lot of its own processing of config options: type-checking, null-checking, casting, and so on.

This makes the connector-specific aspects of the code harder to follow, and it also makes it easier for us to miss validating some required config options (for example, we weren't checking that topic names were provided).

This commit moves all of this out of the connector and makes it the responsibility of the Kafka Connect framework. This removes a lot of unnecessary null-checking, type-checking, and casting from the connector code, as the source task and JMS reader class can assume that they will be given a validated config.

For example, repeated blocks of code like this:

boolean optionAsBool = false; // default value
String optionAsString = props.get(MQSourceConnector.SOME_OPTION_KEY);
if (optionAsString != null) {
    optionAsBool = Boolean.parseBoolean(optionAsString);
}

becomes:

boolean optionAsBool = options.getBoolean(MQSourceConnector.SOME_OPTION_KEY);
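
For Kafka Connect to perform that validation and conversion, each option has to be declared up front in the connector's ConfigDef with its type, default value, importance, and documentation. A minimal sketch of such a declaration (the key and documentation strings here are placeholders, not the connector's real values):

import org.apache.kafka.common.config.ConfigDef;

// Sketch only: the key and documentation strings are placeholders.
static final String SOME_OPTION_KEY = "some.option";
static final ConfigDef CONFIGDEF = new ConfigDef()
        .define(SOME_OPTION_KEY,
                ConfigDef.Type.BOOLEAN,
                false,                        // default value, replacing the hand-rolled default above
                ConfigDef.Importance.MEDIUM,
                "Documentation shown to users for this option");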

Note that I did have to leave the existing Map<String, String> parameter in the record builder, as that is a public interface that users have subclassed. This should ideally be a Kafka Connect config object rather than a raw map, but in the interest of not breaking existing subclasses, I've left it as a map. This does mean I'm converting the properties map to a config object twice, but as this only happens at startup I think the performance cost will be minimal (see the sketch below).
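
As a sketch of that double conversion (the variable names are illustrative; AbstractConfig and originalsStrings() are the real Kafka APIs used):

import org.apache.kafka.common.config.AbstractConfig;

// Validate and type-check the raw properties once against the connector's ConfigDef...
final AbstractConfig config = new AbstractConfig(MQSourceConnector.CONFIGDEF, props);

// ...then hand the record builder the same values back as a raw Map<String, String>,
// because configure(Map<String, String>) is public API that users subclass.
builder.configure(config.originalsStrings());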

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

None of these: no functional change should be observed.

How Has This Been Tested?

Existing tests verify that there are no regressions.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

Commit 1 of 3 (its message repeats the PR description above):

Signed-off-by: Dale Lane <dale.lane@uk.ibm.com>
 try {
     final Class<? extends RecordBuilder> c = Class.forName(builderClass).asSubclass(RecordBuilder.class);
     builder = c.newInstance();
-    builder.configure(props);
+    builder.configure(config.originalsStrings());
A contributor commented on this change:
Is there a reason to pass the strings rather than the config itself?

dalelane (Author) replied:

I put that in the commit message:

Note that I did have to leave the existing Map<String, String> parameter in the record builder as that is a public interface that users have subclassed. This should ideally be a Kafka Connect object rather than a raw map, but in the interest of not breaking existing subclasses, I've left it as a map. This does mean I'm converting the properties map to a config object twice, but as this is only at startup I think the performance cost will be minimal.

It is a bit clunky and frustrating: in an ideal world we wouldn't be passing untyped strings around, but short of a breaking change I felt we were stuck with it for now.

static {
    CONFIGDEF = new ConfigDef();

    CONFIGDEF.define(CONFIG_NAME_MQ_QUEUE_MANAGER,
A contributor commented on this change:
Is CONFIG_NAME useful as a prefix? Could we simply use the DOCUMENTATION and MESSAGE prefixes for the corresponding strings, and remove the CONFIG_NAME prefix altogether?

dalelane (Author) replied:

Same sort of thing here, I'm afraid: as these are public statics in a project that users are encouraged to develop against, I feel they can't be altered without a major semantic-version change.

I agree that they're not ideal as/where they are, though.
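
For context, a sketch of the pattern being discussed (the constant values and documentation text here are illustrative):

// Sketch only: illustrates why dropping the CONFIG_NAME prefix would be breaking.
// User subclasses may already reference these constants by name, e.g.
//     props.get(MQSourceConnector.CONFIG_NAME_MQ_QUEUE_MANAGER)
// so renaming them is a source-incompatible change for that code.
public static final String CONFIG_NAME_MQ_QUEUE_MANAGER = "mq.queue.manager";
public static final String CONFIG_DOCUMENTATION_MQ_QUEUE_MANAGER =
        "The name of the MQ queue manager."; // illustrative documentation text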

Commit 2 of 3:

I suspect this was accidentally added as part of resolving a merge conflict, but it shouldn't be here.

Signed-off-by: Dale Lane <dale.lane@uk.ibm.com>
Commit 3 of 3:

We use the latest-tag MQ image in our integration tests. In December, MQ's container image removed the no-auth SVRCONN channel, which breaks the connector tests that try to connect to MQ without credentials.

This commit adds a custom MQSC script that configures the queue manager to restore the previous behaviour; a sketch of how such a script can be applied follows this message.

The MQSourceTaskAuthIT tests still cover connecting to a queue manager with auth credentials, so we continue to test connecting both with and without credentials.

Signed-off-by: Dale Lane <dale.lane@uk.ibm.com>
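
For illustration, a sketch of how such a script might be wired into the tests, assuming the integration tests use Testcontainers and that the MQ image executes MQSC files placed under /etc/mqm at queue-manager startup (the image reference and file names here are illustrative):

import org.testcontainers.containers.GenericContainer;
import org.testcontainers.utility.DockerImageName;
import org.testcontainers.utility.MountableFile;

// Sketch: copy a custom MQSC script into the queue manager container so it
// runs at startup and restores the no-auth SVRCONN channel for the tests.
final GenericContainer<?> mqContainer =
        new GenericContainer<>(DockerImageName.parse("icr.io/ibm-messaging/mq:latest"))
                .withEnv("LICENSE", "accept")
                .withEnv("MQ_QMGR_NAME", "QM1")
                .withCopyFileToContainer(
                        MountableFile.forClasspathResource("no-auth-qmgr.mqsc"), // illustrative name
                        "/etc/mqm/99-no-auth.mqsc")
                .withExposedPorts(1414);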

@jhughes24816 (Contributor) left a review:

LGTM thanks!

@dalelane merged commit 6d2b939 into ibm-messaging:main on Feb 7, 2024
2 checks passed
@dalelane deleted the config-refactor branch on February 7, 2024 at 20:23