
Customers, Products, and Suppliers do not flow to Google BigQuery #38

Open
chadmott opened this issue Aug 19, 2020 · 2 comments

@chadmott

At the end of the lab, I see transactional data in BigQuery, but not the customers, products, or suppliers data.

In my local Confluent Control Center, I see this data in the respective topics, and Control Center is properly displaying the values (so it is schema-aware).

In Confluent Cloud (which is where the connector is configured to pull from), I see the data, but the values are shown as the binary representation of the Avro-encoded payload. I suspect that, for whatever reason, the Confluent Cloud cluster is unable to deserialize the data?

The connector is running


Name                 : DC01_GCS_SINK
Class                : io.confluent.connect.gcs.GcsSinkConnector
Type                 : sink
State                : RUNNING
WorkerId             : kafka-connect-ccloud:18084

 Task ID | State   | Error Trace
---------------------------------
 0       | RUNNING |
---------------------------------

with no errors.

Could you comment on why I don't see any errors? How can I view messages that the connector "skipped" when running in KSQL mode?
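
For reference, and purely as a sketch rather than anything from the lab config: if there is a way to surface skipped records, I would expect it to look like the standard Kafka Connect error-handling properties (KIP-298) below; the dead-letter topic name is just a placeholder.

# Hypothetical additions to the DC01_GCS_SINK configuration; the property names are
# standard Connect error-handling settings (KIP-298), the topic name is made up.
errors.tolerance=all                                  # keep the task RUNNING on bad records instead of failing
errors.log.enable=true                                # write each failure to the Connect worker log
errors.log.include.messages=true                      # include the failing record in the log entry
errors.deadletterqueue.topic.name=dlq_dc01_gcs_sink   # sink connectors only: skipped records land in this topic
errors.deadletterqueue.context.headers.enable=true    # add headers explaining why each record was skipped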

@chadmott

@tmcgrath I suspect this is why you were getting the error in Data Studio: the queries are joining on IDs that (at least for me) do not exist.

@chadmott

Quick update: after adding the ID fields to the tables in BigQuery and restarting the connector, data is flowing in. It seems that, for whatever reason, the ID does not exist in the schemas:

{
  "connect.name": "io.confluent.ksql.avro_schemas.KsqlDataSourceSchema",
  "fields": [
    {
      "default": null,
      "name": "FIRST_NAME",
      "type": [
        "null",
        "string"
      ]
    },
    {
      "default": null,
      "name": "LAST_NAME",
      "type": [
        "null",
        "string"
      ]
    },
    {
      "default": null,
      "name": "EMAIL",
      "type": [
        "null",
        "string"
      ]
    },
    {
      "default": null,
      "name": "CITY",
      "type": [
        "null",
        "string"
      ]
    },
    {
      "default": null,
      "name": "COUNTRY",
      "type": [
        "null",
        "string"
      ]
    },
    {
      "default": null,
      "name": "SOURCEDC",
      "type": [
        "null",
        "string"
      ]
    }
  ],
  "name": "KsqlDataSourceSchema",
  "namespace": "io.confluent.ksql.avro_schemas",
  "type": "record"
}

If there is no ID field in this schema, it makes sense that it wouldn't show up in BigQuery, but why then did the data flow after I manually added the ID to BigQuery?
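
My best guess, purely as a sketch (the stream names below are placeholders, not the lab's actual statements): KSQL carries the ID as the record key, so unless the key is explicitly selected into the value, it never ends up in the Avro value schema that KSQL registers. Re-creating the stream with the key projected as a column should add it (on KSQL versions from around this time the key column is ROWKEY; newer ksqlDB versions use a named key column instead):

-- Hypothetical re-creation of the customers stream so the record key is also
-- projected into the value, making ID part of the registered Avro value schema.
CREATE STREAM CUSTOMERS_WITH_ID
  WITH (VALUE_FORMAT = 'AVRO') AS
  SELECT ROWKEY AS ID,
         FIRST_NAME,
         LAST_NAME,
         EMAIL,
         CITY,
         COUNTRY,
         SOURCEDC
  FROM CUSTOMERS;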
