DBZ-6993 Add support for JSON columns to Oracle connector #4973
base: main
Conversation
Welcome as a new contributor to Debezium, @shybovycha. Reviewers, please add missing author name(s) and alias name(s) to COPYRIGHT.txt and Aliases.txt, respectively.
Hi @shybovycha, thanks for your contribution. Please prefix the commit message(s) with the DBZ-xxx JIRA issue key.
Hi @shybovycha, I see changes made specifically for LogMiner; did you get an opportunity to check whether any changes were needed for the XStream or OpenLogReplicator adapters to support this data type? I ask because I've hit a roadblock locally with GoldenGate and XStream on Oracle 23 Free, so I was hoping that, if you had, you could share how you got XStream working.
Hey @Naros. Sorry, I did not test against XStream or OpenLogReplicator; in our setup we only test the LogMiner workflow, as that is the main use case.
Ok thanks for confirming @shybovycha, it looks like atm I'm blocked testing this with Oracle XStream (see DBZ-5655). I'm going to start checking Oracle 23 with OLR now. |
Overall the code LGTM; however, I did leave one comment below. I also don't see an integration test in the test suite to actually test the JSON serialization. Could you please add one?
I went ahead and started working on these changes, as I would like to include this as part of the 2.5.0.Beta1 release this week. I'm working on the tests now. However, where can I find the Oracle Instant Client 23? I don't see it publicly available on the Oracle Instant Client downloads page.
Wow, talk about buried in Oracle's website; it took forever to find the Instant Client, but I found it. Placing it here in case anyone else needs to fetch it: https://download.oracle.com/otn_software/linux/instantclient/23c/instantclient-basic-linux.x64-23.3.0.0.0.zip?source=:so:tw:or:awr:osec:,:so:tw:or:awr:osec:
debezium-connector-oracle/src/main/java/io/debezium/connector/oracle/OracleConnection.java
Hi @shybovycha, so testing with Oracle FREE, it reports that operations on JSON column types are unsupported and cannot be streamed by LogMiner. Did you happen to test this on another Oracle 23 platform that isn't publicly available, such as Exadata or Oracle Cloud?
@Naros can you provide more context around the errors you are seeing? If it is the errors from the build we are talking about
I will have a look today. Re. running Oracle (locally, at least) I ran it in Docker (from the
Hi @shybovycha, so the issue/error is that while I can compile and run the code against Oracle 23, when LogMiner generates the rows in the V$LOGMNR_CONTENTS table, it reports that the operation is unsupported. I looked at the LogMiner documentation but didn't see any reference saying whether JSON was or was not supported by LogMiner. The column is successfully captured during the snapshot phase without any issue; it's only during streaming that the data does not seem to be present in V$LOGMNR_CONTENTS. This PR does not include any IT for the column, so I was attempting to add one when I hit this issue. I would suggest looking at the
Force-pushed from aff6c93 to 0b9f963
@Naros I have been struggling to run the integration tests locally, for a variety of reasons. The README seems to be quite different from what the GitHub Workflow is doing: whilst the README suggests running
the GitHub Workflow runs
(which, by the way, requires quay.io credentials and permissions for the
The former fails to find the Oracle connector (regardless of providing a path to the JAR on the
The latter fails trying to run the MongoDB ReplicaSet tests (which I guess are not needed for the Oracle tests? Anyway, they fail on my M1 Apple Silicon):
Even after disabling the MongoDB ReplicaSet tests, replacing the Oracle Docker image with
Apparently, this is solely due to the CPU architecture 😢 Since I am unable to run the tests locally, I will have to rely on GitHub Actions instead. One other thing worth clarifying: I think JSON columns were introduced in Oracle 21, so I went ahead and added annotations disabling the IT for database versions lower than 21 (major).
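For reference, the version gate described above boils down to a simple major/minor comparison. The sketch below is illustrative only; the actual Debezium test suite expresses this with a JUnit annotation on the integration test, and the class and attribute names used there are Debezium-specific and not shown here.

```java
// Standalone illustration (assumed helper, not Debezium code) of the
// "skip when database version is lower than X" check mentioned above.
class VersionGate {

    /** Returns true when the database version is older than the required major.minor. */
    static boolean shouldSkip(int dbMajor, int dbMinor, int requiredMajor, int requiredMinor) {
        if (dbMajor != requiredMajor) {
            return dbMajor < requiredMajor;
        }
        return dbMinor < requiredMinor;
    }

    public static void main(String[] args) {
        // JSON columns require Oracle 21+, so an Oracle 19 database skips the IT:
        System.out.println(shouldSkip(19, 0, 21, 0)); // true
        System.out.println(shouldSkip(23, 3, 21, 0)); // false
    }
}
```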
@Naros could you please approve the pending workflows so that I can see if / how tests are failing?
/packit test --labels oracle
Force-pushed from 499a399 to 013bc18
Making some tests green: debezium/debezium-connector-vitess#179 and the latest commit to this branch. Somehow, tests irrelevant to this change are failing. The
Hi @shybovycha, I appreciate the proactive approach for the REST tests; however, this is due to other things, and we shouldn't be altering that test. Could you please remove that change from this PR entirely? Thanks.
@Naros I am not entirely sure what's going on on the CI, but now I have
The issue with Oracle 23 is that the Free edition is hugely CPU bound, so I suspect what has happened is that the infrastructure that supports Packit was overloaded, combined with the CPU limiting Oracle imposes on the container, and this is creating the problem. I'll restart the tests once more; let's see if running them later in the day helps.

/packit test --labels oracle-23
@Naros it does not seem like running later in the day does the trick 😞 Can we run the tests differently? Or rather, does
Hi @shybovycha, I'll try running locally, but otherwise that's the only alternative we have with Oracle 23. Are you not able to run these locally and validate at least the Oracle 23 JSON tests pass? I'm less concerned with the non-JSON tests. |
@Naros I've spent 2 full days trying to set up Oracle locally to work with the integration tests using the
Hi @shybovycha, you should not need the
docker pull container-registry.oracle.com/database/free:latest
Evening, @shybovycha. I ran the test suite against Oracle 23 on my laptop, and the only test failures were with the JSON integration tests. By all accounts, Oracle LogMiner does not support the JSON column type; LogMiner reported those operations all as OPERATION_TYPE 255, or Unsupported.

Looking more closely at the supported data types for LogMiner in Oracle 23, I also don't see any mention of JSON, see:

It's plausible that the current Oracle 23 Free Developer Version, which is from Sept 2023, does not have integrated support for JSON in LogMiner, but it's equally possible that Oracle is purposely only adding new data type support to GoldenGate, XStream, and Data Guard. Given the documentation, I'm inclined to think it's most likely the latter and not the former.

Attached are the failures related to the JSON tests:
In the test outputs I can see these entries in the logs:
I should note that this used the
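The OPERATION_TYPE 255 discussed above corresponds to the OPERATION_CODE column of V$LOGMNR_CONTENTS. As a small diagnostic aid, a lookup table like the sketch below makes those codes readable in logs; the subset of codes is taken from Oracle's V$LOGMNR_CONTENTS reference and is illustrative, not exhaustive.

```java
import java.util.Map;

// Assumed diagnostic helper (not Debezium code) for translating LogMiner
// OPERATION_CODE values from V$LOGMNR_CONTENTS into readable names.
class LogMinerOps {

    // Subset of operation codes per Oracle's V$LOGMNR_CONTENTS documentation.
    static final Map<Integer, String> OPERATIONS = Map.of(
            1, "INSERT",
            2, "DELETE",
            3, "UPDATE",
            5, "DDL",
            6, "START",
            7, "COMMIT",
            36, "ROLLBACK",
            255, "UNSUPPORTED");

    static String decode(int operationCode) {
        return OPERATIONS.getOrDefault(operationCode, "UNKNOWN(" + operationCode + ")");
    }

    public static void main(String[] args) {
        // A JSON-column change surfacing as code 255 means LogMiner could not
        // reconstruct the operation, which is the failure mode described above.
        System.out.println(decode(255)); // UNSUPPORTED
    }
}
```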
One thing to note, @shybovycha, is that we probably should explicitly add UPDATE
@Naros so what is the plan here? Should we spend more time and try to fix the tests for XStream mode only? N.b.: I was not able to run the integration tests in XStream mode locally 😞 The other option (not saying it is a good one, but it is an option 🤷) is to ignore or remove the integration test for the time being.
Hi @shybovycha, I'm not a fan of just disabling the tests and adding new code like this, because it creates a liability for the team; without being able to test this against Oracle 23, we have no fallback for helping users if they face problems or this introduces some regression. As for XStream: Oracle 23 changed how this is to be configured, and I opened a StackOverflow post four months ago without any help from the Oracle community on what I could have been doing wrong to get Oracle 23 XStream working. Perhaps you have some idea? If we can at least test this with XStream, then we can guard this as an XStream-only feature, but until we can verify that it works with at least one adapter, I'm not comfortable merging this into main.
One idea to consider, @shybovycha, could be to focus on the JSON feature sets of Oracle 19c and 21c rather than 23. This way we can at least verify the basics around JSON and have a solution that satisfies those two versions, and then build on top of that for Oracle 23. wdyt?
@Naros unfortunately we can only handle JSON properly as of Oracle 23, since before that there is no way to configure the JDBC driver on how to fetch the JSON data
Oracle 23 introduced a new column type, JSON. In order to properly support it, the Oracle JDBC driver requires the oracle.jdbc.jsonDefaultGetObjectType property to be set to one of the allowed values (Java classes). I followed the same pattern as MySqlConnection and added an extra driver-level connector configuration option to allow users to change the jsonDefaultGetObjectType value.

Once the property is set, JSON columns can use the Json schema to be parsed by Debezium, which requires the proper jdbcType value to be set and handled by the OracleValueConverters#schemaBuilder and OracleValueConverters#converter methods.

This proved to be a bit tricky, since by default Column#jdbcType and Column#nativeType are set to generic values: jdbcType = oracle.jdbc.OracleTypes.OTHER (1111) and nativeType = -1. Hence I had to set those based on the Column#typeName in OracleConnection#overrideColumn instead of resolveJdbcType or resolveNativeType, since that is the only method (among the three) which has access to the entire Column object (and thus to the typeName value).
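To illustrate the driver-level pass-through pattern described above: the sketch below shows how a "driver."-prefixed connector option could be stripped and handed to the JDBC driver as a raw connection property, so that oracle.jdbc.jsonDefaultGetObjectType reaches the driver. This is a simplified stand-in, not the actual Debezium plumbing; only the property name comes from the description above, and the prefix handling and class name here are assumptions.

```java
import java.util.Properties;

// Illustrative sketch (assumed helper, not Debezium code) of forwarding
// "driver."-prefixed connector options to the JDBC driver.
class DriverPropertyPassThrough {

    static final String DRIVER_PREFIX = "driver.";

    // Collect all "driver."-prefixed options, strip the prefix, and return
    // them as raw driver properties.
    static Properties extractDriverProperties(Properties connectorConfig) {
        Properties driverProps = new Properties();
        for (String name : connectorConfig.stringPropertyNames()) {
            if (name.startsWith(DRIVER_PREFIX)) {
                driverProps.setProperty(name.substring(DRIVER_PREFIX.length()),
                        connectorConfig.getProperty(name));
            }
        }
        return driverProps;
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        // Ask the driver to materialize JSON columns as java.lang.String:
        config.setProperty("driver.oracle.jdbc.jsonDefaultGetObjectType", "java.lang.String");
        Properties driverProps = extractDriverProperties(config);
        System.out.println(driverProps.getProperty("oracle.jdbc.jsonDefaultGetObjectType"));
    }
}
```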