-
Notifications
You must be signed in to change notification settings - Fork 24.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Store source for nested objects #108818
Store source for nested objects #108818
Conversation
Hi @kkrik-es, I've created a changelog YAML for you. |
…ct' into fix/synthetic-source/nested-object
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I left two comments.
@@ -287,6 +287,19 @@ static void parseObjectOrNested(DocumentParserContext context) throws IOExceptio | |||
} | |||
|
|||
if (context.parent().isNested()) { | |||
if (context.mappingLookup().isSourceSynthetic() && context.getClonedSource() == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question about the added code here. Is the reason that this is needed in case there is an object field that is mapped as nested field? The array case is then covered by parseNonDynamicArray(...)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, just in case there's an entry not using an array. Should be rare but it can happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case is the || mapper instanceof NestedObjectMapper
addition at line 612 required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is, that section covers the usual case where the nested object contains an array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a nested object has already been added to ignored source, then I think we don't need to do anything for that json object contains an array? Since it has already been added to ignored source as part of the object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we either get an array or an object, not an object wrapping an array. I verified this in the unit test, isn't that the valid syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see without || mapper instanceof NestedObjectMapper
we only store the first element of an array. So subsequent objects in an array would not get recorded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add this is as a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
boolean isRoot = context.parent() instanceof RootObjectMapper; | ||
context.addIgnoredField( | ||
new IgnoredSourceFieldMapper.NameValue( | ||
isRoot ? arrayFieldName : context.parent().name() + "." + arrayFieldName, | ||
isRoot ? 0 : context.parent().fullPath().length() + 1, | ||
XContentDataHelper.encodeXContentBuilder(tuple.v2()) | ||
) | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe turn this logic into a helper method? I think this appears now several times in this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, @lkts added one. Applied here.
context = tuple.v1(); | ||
token = context.parser().currentToken(); | ||
parser = context.parser(); | ||
} | ||
context = context.createNestedContext((NestedObjectMapper) context.parent()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the DocumentParserContext
constructor that is used eventually from createNestedContext(...)
method also copy clonedSource
field from provided context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, this needs to be propagated.
@@ -287,6 +287,19 @@ static void parseObjectOrNested(DocumentParserContext context) throws IOExceptio | |||
} | |||
|
|||
if (context.parent().isNested()) { | |||
if (context.mappingLookup().isSourceSynthetic() && context.getClonedSource() == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a nested object has already been added to ignored source, then I think we don't need to do anything for that json object contains an array? Since it has already been added to ignored source as part of the object?
|
||
boolean booleanValue = randomBoolean(); | ||
int intValue = randomInt(); | ||
var syntheticSource = syntheticSource(documentMapper, b -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also assert the Lucene document that get created here? I think it can be useful to assert the actual IndexableField
instances for _ignored_source
field. Maybe the syntheticSource(...)
method can optionally accept a Consumer<LuceneDocument>
parameter, that allows to the test to assert the field instances?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm not sure what would be the benefit, the contents are encoded. The decoded version should show up in the printed source that we already assert on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is true, even if for some reason we end up with duplicate ignored stored fields. (duplicate array elements would be printed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Covers nested objects with fields or arrays of objects. Adds support for tracking the source of arrays in disabled objects, too.
Related to #106825, #108417