Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding translated.from annotation when using SchemaTranslator #970

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

li-ukumar
Copy link
Member

@li-ukumar li-ukumar commented Jan 22, 2024

What

Adding an annotation in translated schemas, translated through SchemaTranslator

Why

This will help in figuring out whether a schema is translated or source of truth.
This annotation can help in determining different behavior for SoT and translated schemas

Test

Updated Unit tests

Comment on lines 175 to 177
if (!ConfigurableSchemaComparator.equals(avroSchemaFromEmbedded, avroSchemaFromJson,
new SchemaComparisonConfiguration(true, true, true, false, true, true,
Collections.singleton((TRANSLATED_FROM_SOURCE_OPTION))))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit hard to read. Why can you provide a constructor in ConfigurableSchemaComparator to allow configure what options to be excluded? Also why DATA_PROPERTY above is not handled simiarly?

Copy link
Member Author

@li-ukumar li-ukumar Jan 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check will never fail for DATA_PROPERTY because the code above this check adds it just before the check.

Object embededSchemaPropertyVal = avroSchemaFromJson.getObjectProp(DATA_PROPERTY);
            if (embededSchemaPropertyVal != null)
            {
              avroSchemaFromEmbedded.addProp(DATA_PROPERTY, embededSchemaPropertyVal);
            }

Updated the SchemaComparisonConfiguration call

if (dataSchema instanceof NamedDataSchema) {
NamedDataSchema namedDataSchema = (NamedDataSchema) dataSchema;
// Add annotation if not already present
if (!namedDataSchema.getProperties().containsKey(TRANSLATED_FROM_SOURCE_OPTION)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what should you do if existing translated_from prop is different from updated one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't update the source in that case. So, if schema A translates to B, and B is translated to C,
the translatedFrom option in C will be A

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My original question is talking about an error case, I am guessing that should never happen and we should throw error?
But your response provides me something different from my understanding, I thought that annotation is just for direct source, seems not?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to stop users from translating a translated schema again, as it is allowed right now in SchemaTranslator right now, and stopping it might fail users.

However, we can use this annotation to fail other flows like PDL -> Proto or Avro -> proto.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on this logic, translated.from annotation is used to indicate the root source, not directly translated source, is this clearly communicated?

}
dataSchema = namedDataSchema;
}
return dataSchema;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will have side effect of changing passed DataSchema?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there should be none, except one additional json property in the result avro schema, which is the requirement of this change. This new property will identify the avro schemas as a translated schema.

@li-ukumar li-ukumar changed the title Adding li.data.translated.from annotation when using SchemaTranslator Adding schema.translated.from.src annotation when using SchemaTranslator Jan 24, 2024
@@ -53,6 +58,7 @@ public class SchemaTranslator
private static final Logger log = LoggerFactory.getLogger(SchemaTranslator.class);

public static final String DATA_PROPERTY = "com.linkedin.data";
public static final String TRANSLATED_FROM_SOURCE_OPTION = "schema.translated.from.src";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will translated.from enough?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. any unique key will work.

I'll update to translated.from

if (dataSchema instanceof NamedDataSchema) {
NamedDataSchema namedDataSchema = (NamedDataSchema) dataSchema;
// Add annotation if not already present
if (!namedDataSchema.getProperties().containsKey(TRANSLATED_FROM_SOURCE_OPTION)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My original question is talking about an error case, I am guessing that should never happen and we should throw error?
But your response provides me something different from my understanding, I thought that annotation is just for direct source, seems not?

@li-ukumar li-ukumar changed the title Adding schema.translated.from.src annotation when using SchemaTranslator Adding translated.from annotation when using SchemaTranslator Jan 26, 2024
@mchen07
Copy link
Contributor

mchen07 commented Jan 26, 2024

Temporarily modified the tests to ignore then new option
Do you mean that this is a temporary solution or you have followup fix?

@li-ukumar
Copy link
Member Author

Temporarily modified the tests to ignore then new option
Do you mean that this is a temporary solution or you have followup fix?

It was the description for the initial PR. The unit tests are updated now.

Updated the Description .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants