In place of fieldDefinitions, support avro schema, which is a more comprehensive way to describe data #820

Open
knguyen1 opened this issue Apr 17, 2024 · 0 comments

knguyen1 commented Apr 17, 2024

Is your feature request related to a problem? Please describe.
Avro IDL is documented here: https://avro.apache.org/docs/1.10.2/idl.html#minutiae_annotations
You can use Java-style annotations to attach extra properties, such as a match type, to your fields.

This is an example of a schema written in Avro IDL:

@namespace("com.zingg.common.schema")
protocol Sample {
    record MySampleRecord {
        int @matchType("DONT_USE") id;
        string @aliases(["FirstName"]) @matchType("FUZZY") firstname;
        string @aliases(["LastName"]) @matchType("FUZZY") lastname;
    }
}

It compiles to the following JSON schema:

{
  "type" : "record",
  "name" : "MySampleRecord",
  "namespace" : "com.zingg.common.schema",
  "fields" : [ {
    "name" : "id",
    "type" : "int",
    "matchType" : "DONT_USE"
  }, {
    "name" : "firstname",
    "type" : "string",
    "aliases" : [ "FirstName" ],
    "matchType" : "FUZZY"
  }, {
    "name" : "lastname",
    "type" : "string",
    "aliases" : [ "LastName" ],
    "matchType" : "FUZZY"
  } ]
}
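As a rough illustration of why the custom `matchType` property is enough, the compiled schema above can be mapped to a list of field definitions with a few lines of Python. This is only a sketch: the output keys (`fieldName`, `fields`, `dataType`, `matchType`) are an assumption about the shape of a Zingg field definition, the `DONT_USE` default for fields without a `matchType` is a hypothetical choice, and only primitive Avro types (plain strings such as `"int"`) are handled.

```python
import json

# The compiled Avro schema from the example above.
AVRO_SCHEMA_JSON = """
{
  "type": "record",
  "name": "MySampleRecord",
  "namespace": "com.zingg.common.schema",
  "fields": [
    {"name": "id", "type": "int", "matchType": "DONT_USE"},
    {"name": "firstname", "type": "string", "aliases": ["FirstName"], "matchType": "FUZZY"},
    {"name": "lastname", "type": "string", "aliases": ["LastName"], "matchType": "FUZZY"}
  ]
}
"""


def to_field_definitions(avro_schema):
    """Map Avro record fields to field-definition dicts.

    The output keys are an assumed shape, not a confirmed Zingg contract.
    """
    definitions = []
    for field in avro_schema["fields"]:
        definitions.append({
            "fieldName": field["name"],
            "fields": field["name"],
            # Only primitive types are handled; unions or nested records
            # would need extra logic.
            "dataType": field["type"],
            # Hypothetical default when the annotation is missing.
            "matchType": field.get("matchType", "DONT_USE"),
        })
    return definitions


if __name__ == "__main__":
    schema = json.loads(AVRO_SCHEMA_JSON)
    print(json.dumps(to_field_definitions(schema), indent=2))
```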

Once deployed to a Confluent Schema Registry, the schema can be retrieved with a simple curl:

$ curl http://schema-registry/subjects/{SCHEMA_NAME}/versions/latest
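The same lookup could be done programmatically. The sketch below uses only the Python standard library and relies on the registry wrapping the schema as a JSON string in the `schema` field of its response. The helper names (`schema_url`, `parse_registry_response`, `fetch_schema`) are made up for illustration, and the subject is whatever name the schema was registered under.

```python
import json
import urllib.request
from urllib.parse import quote


def schema_url(registry_url, subject, version="latest"):
    """Build the REST URL for one version of a subject."""
    return "{}/subjects/{}/versions/{}".format(
        registry_url.rstrip("/"), quote(subject, safe=""), version
    )


def parse_registry_response(body):
    """Unwrap the registry response: the Avro schema is a JSON string
    stored under the "schema" key, so it has to be decoded twice."""
    envelope = json.loads(body)
    return json.loads(envelope["schema"])


def fetch_schema(registry_url, subject, version="latest"):
    """Fetch and decode a schema; needs a reachable schema registry."""
    with urllib.request.urlopen(schema_url(registry_url, subject, version)) as resp:
        return parse_registry_response(resp.read().decode("utf-8"))
```

For example, `fetch_schema("http://schema-registry", "MySampleRecord")` would return the record schema as a Python dict, assuming the registry is reachable at that address and the schema is registered under that subject.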

Describe the solution you'd like
In modern software design, data contracts and other descriptions of data are often stored in a schema registry. Instead of listing fieldDefinitions in the Zingg conf, let us reference a schema registry URL and a schema name. This keeps data descriptions centralized in one place, with no need to redefine them elsewhere.

{
  "fieldDefinitions": {
    "schemaRegistry": "http://schema-registry.my.domain",
    "schemaName": "MySampleRecord",
    "version": "latest"
  }
}
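One way the proposed conf could stay backward compatible is to accept both forms: a list is treated as inline field definitions, and an object is treated as a schema registry reference. The Python sketch below shows that dispatch only. It is not how Zingg currently behaves; `load_fields` is a hypothetical name, and the fetch function is passed in so the lookup mechanism stays pluggable.

```python
def load_fields(conf, fetch_schema):
    """Return the field list described by the conf.

    conf["fieldDefinitions"] is either a list (existing inline form) or a
    dict with "schemaRegistry", "schemaName" and an optional "version"
    (the form proposed in this issue).
    fetch_schema(registry_url, schema_name, version) must return the Avro
    record schema as a dict.
    """
    field_definitions = conf["fieldDefinitions"]
    if isinstance(field_definitions, list):
        # Inline definitions: nothing to look up.
        return field_definitions
    schema = fetch_schema(
        field_definitions["schemaRegistry"],
        field_definitions["schemaName"],
        field_definitions.get("version", "latest"),
    )
    # Each Avro field carries its name, type and custom properties
    # such as matchType.
    return schema["fields"]
```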

Describe alternatives you've considered
The alternatives are POJOs defined in Java, the Glue Schema Registry, etc., but an Avro schema is the cleanest solution.

Additional context

Since the schema is stored in the registry, there is no need to repeat this information in the conf.
Here is the Docker image for hosting your own schema registry: https://hub.docker.com/r/confluentinc/cp-schema-registry
