-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DBZ-7536 Add support for default value for text_ and varchar_ in post… #5308
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@akanimesh7 This is good start, thanks! Could you please take a look at the comments I left?
@@ -253,7 +253,7 @@ public SchemaBuilder schemaBuilder(Column column) { | |||
case PgOid.INT4RANGE_ARRAY: | |||
case PgOid.NUM_RANGE_ARRAY: | |||
case PgOid.INT8RANGE_ARRAY: | |||
return SchemaBuilder.array(SchemaBuilder.OPTIONAL_STRING_SCHEMA); | |||
return SchemaBuilder.array(SchemaBuilder.STRING_SCHEMA); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we keep the type optional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's say the DDL operation was
ALTER TABLE CUSTOMERS ADD COLUMN col2 text[] DEFAULT '{a,b}'::text[] NOT NULL;
So how it works is when the avro schema is created from the connect schema, The default value {"a","b"} for col2 is proposed to the avro schema. It verifies this default value. Refer this. Now since we have kept the type optional so this verification fails since the string "a" is not a null, which is the first type in the array of types in this union.
Having said that, i also see that there's a fix for this which has been merged 5 months before, but there has been no release after that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@akanimesh7 OK, TBH I am really not fan of this change if it just by-pass a bug in Avro. To handle this properly we there should be an SMT developed that will flip the schema optionality if needed but I don't think this should go to the core.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jpechane Understood. In that case rather than doing it for all array types, can we just do it for text[] and varchar[] ? As soon as the newer version of avro is released, these two also can be corrected. Even earlier, these two types were not supported, hence there is no harm, rather this is just an addition.
On the other hand creation of SMT will need some more time and will also be temporary code that will come in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the problem is that you focus on it from Avro PoV only. But there are also converters for JSON and Protobuf. It does not feel correct to me to emit incorrect schema type for everybody to fix issue in Avro only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, makes sense. Let's halt this one then until a new version of avro comes out with the bug fix.
@@ -173,6 +176,8 @@ private static Map<String, DefaultValueMapper> createDefaultValueMappers(Timesta | |||
result.put("time", (c, v) -> timestampUtils.toLocalTime(extractDefault(v, "00:00"))); | |||
result.put("timestamp", (c, v) -> timestampUtils.toOffsetDateTime(extractDefault(v, "1970-01-01"))); | |||
result.put("timestamptz", (c, v) -> timestampUtils.toOffsetDateTime(extractDefault(v, "1970-01-01")).atZoneSameInstant(ZoneOffset.UTC)); | |||
result.put("_text", parseNullDefault((c, v) -> createStringArr(extractDefault(v, "{}")))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this issue for _text/_varchar
only or are all array datatypes affected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that i think of it, looks like all array datatypes will be affected. Fixed only for these since this was impacting us. Will need a thorough review of other array datatypes.
On the same note the default mapper did not have any array data types to begin with. These are the first few additions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@akanimesh7 Would it be possible to provide a sensible default - thus providing at least the empty List/Array for all array types and later on add support for the specific types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it's possible. I'll work on that change.
@@ -188,6 +193,25 @@ private static Map<String, DefaultValueMapper> createDefaultValueMappers(Timesta | |||
return result; | |||
} | |||
|
|||
private static List<String> createStringArr(String arrayString) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private static List<String> createStringArr(String arrayString) { | |
private static List<String> parseArrayString(String arrayString) { |
@jpechane Addressed comments. PTAL |
Solves https://issues.redhat.com/browse/DBZ-7536
Before
Newly added column definition
Representation of the field in the generated Avro Schema
After
Empty array default
Newly added column definition
Representation of the field in the generated Avro Schema
Non Empty array default
Newly added column definition
Representation of the field in the generated Avro Schema