New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add protobuf codegen decoder #12980
base: master
Are you sure you want to change the base?
Add protobuf codegen decoder #12980
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #12980 +/- ##
============================================
+ Coverage 61.75% 62.14% +0.39%
+ Complexity 207 198 -9
============================================
Files 2436 2518 +82
Lines 133233 138194 +4961
Branches 20636 21371 +735
============================================
+ Hits 82274 85887 +3613
- Misses 44911 45892 +981
- Partials 6048 6415 +367
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
4615a6c
to
b64cfa1
Compare
b64cfa1
to
9ca47ac
Compare
@KKcorps Could you take a look. |
Can you link the overall design doc for this in the summary section? |
...not-input-format/pinot-protobuf/src/main/java/com/google/protobuf/ProtobufInternalUtils.java
Outdated
Show resolved
Hide resolved
// This is needed since the generated class name is not always the same as the proto file name. | ||
// The descriptor that we get from the jar drops the first prefix of the proto class name. | ||
// For example, insead of com.data.example.ExampleProto we get data.example.ExampleProto. | ||
// Copied from Flink codebase. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we import/use Flink method instead of copy? Does Flink allow such copy of their code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can import the flink code. We would have to add a dependency to flink's protobuf module just for these utils functions. They are just needed to get the full package name from a classLoader object. It's pretty straightforward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for copied code you can add comments like
/**
* Copied from the presto TimeZoneKey. It basically caches the Joda Chronologies corresponding to each of the
* timezones listed in the zone-index.properties
* The zone-index.properties is kept in sync with the presto zone index properties.
*/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chenboat There's other instances in Pinot where code was copied:
Lines 36 to 40 in 363a03e
* ================================================================================================================= | |
* THIS CLASS IS COPIED FROM Calcite's {@link ChainedSqlOperatorTable} and modified the function lookup to terminate | |
* early once found from ordered SqlOperatorTable list. This is to avoid some hard-coded casting assuming all Sql | |
* identifier looked-up are of the same SqlOperator type. | |
* ================================================================================================================= |
...src/main/java/org/apache/pinot/plugin/inputformat/protobuf/ProtoBufCodeGenMessgeDecoder.java
Outdated
Show resolved
Hide resolved
// This is needed since the generated class name is not always the same as the proto file name. | ||
// The descriptor that we get from the jar drops the first prefix of the proto class name. | ||
// For example, insead of com.data.example.ExampleProto we get data.example.ExampleProto. | ||
// Copied from Flink codebase. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for copied code you can add comments like
/**
* Copied from the presto TimeZoneKey. It basically caches the Joda Chronologies corresponding to each of the
* timezones listed in the zone-index.properties
* The zone-index.properties is kept in sync with the presto zone index properties.
*/
public static final String NULLABLE_DOUBLE_FIELD = "nullable_double_field"; | ||
public static final String NULLABLE_FLOAT_FIELD = "nullable_float_field"; | ||
public static final String NULLABLE_BOOL_FIELD = "nullable_bool_field"; | ||
public static final String NULLABLE_BYTES_FIELD = "nullable_bytes_field"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what are the supported data types if pinot doesn't support the type natively.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See: We support all the protobuf descriptor types at present in the version we are using. If in the future new types get added, the codegen will skip that filed and log an error.
https://github.com/apache/pinot/pull/12980/files#diff-84564c2ff3bb09e23abe31143340f4e5a2aea0613588e1131311cb3ea17d4a5eR159-R177
228811e
to
8cadb65
Compare
003bf3f
to
8bf1d6d
Compare
...rc/main/java/org/apache/pinot/plugin/inputformat/protobuf/ProtoBufCodeGenMessageDecoder.java
Outdated
Show resolved
Hide resolved
...pinot-protobuf/src/main/java/org/apache/pinot/plugin/inputformat/protobuf/ProtoBufUtils.java
Outdated
Show resolved
Hide resolved
Descriptors.FieldDescriptor keyFd = | ||
fd.getMessageType().findFieldByName("key"); | ||
Descriptors.FieldDescriptor valueFd = | ||
fd.getMessageType().findFieldByName("value"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
...pinot-protobuf/src/main/java/org/apache/pinot/plugin/inputformat/protobuf/ProtoBufUtils.java
Outdated
Show resolved
Hide resolved
} | ||
} | ||
|
||
private static void addFieldToPinotSchema(Schema pinotSchema, FieldSpec.DataType dataType, String name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this code re-used from somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add it to a common package and use it. I can do that in a follow up pr.
…upport for complex type schema
resolves #12679
Design doc: https://docs.google.com/document/d/186mVLISgYEmljla3-IHrFNJ2dGMJ21qJ-MWFtsx-ayI/
Test resources have the generated code we expect for the different test proto files.