Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add toXContent and fromXContent for DiscoveryNode and DiscoveryNodes #13576

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

shiv0408
Copy link
Member

@shiv0408 shiv0408 commented May 7, 2024

Description

We need to upload the DiscoveryNodes object to remote to enable the cluster state publication flow from remote. To do that we need to parse DiscoveryNodes object to and from XContent.

Related Issues

Resolves #13691
Meta issue #13683

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link
Contributor

github-actions bot commented May 7, 2024

❌ Gradle check result for 5870c05: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented May 7, 2024

❌ Gradle check result for 7d1a599: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented May 7, 2024

❕ Gradle check result for 93d13e2: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.smoketest.SmokeTestMultiNodeClientYamlTestSuiteIT.test {yaml=search.aggregation/230_composite/composite aggregation date_histogram profile shows filter rewrite info}

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@shiv0408
Copy link
Member Author

shiv0408 commented May 7, 2024

SmokeTestMultiNodeClientYamlTestSuiteIT test has been identified as flaky previously in #9332

Copy link
Member

@dbwiddis dbwiddis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments.

Also I've seen the method you call fromXContent called parse(). Since it's a static method and not enforced by the interface, in theory you can make it whatever you want... I see a lot of both throughout the codebase so either is probably fine, but I'd suggest making this consistent with other code around where it will be used... if that's fromXContent() that's fine, I haven't dug into the context.

// throw new IllegalArgumentException("expected object but got a " + token);
// }
XContentParser.Token token;
while ((token = parser.nextToken()) != XContentParser.Token.END_OBJECT) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to align with your implementation of ToXContentFragment (which doesn't require start/end object) as well as the code on lines 595-597 which implies the START_OBJECT is optional. You'd hit the end of the parser without getting the END_OBJECT. (Same comment on DiscoveryNodes class below.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there might be some misunderstanding here. I agree that DiscoveryNode object's XContent does not need to have a start and end object, but here we are checking for the end object which is added in line 581 above

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But for DiscoveryNodes, this comment does make sense. I will modify the class to implement ToXContentObject.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there might be some misunderstanding here. I agree that DiscoveryNode object's XContent does not need to have a start and end object

I'm not sure this is the desired state here. Why don't you want to extend ToXContentObject instead of ToXContentFragment?

but here we are checking for the end object which is added in line 581 above

You are also adding a start object. Which is fine as long as they're paired. But you can't guarantee (based on the interface type) that you're always receiving the same object you wrote. It's possible someone else may make a larger object and include this inline -- the purpose of ToXContentFragment.

You make the start object optional here:

        if (parser.currentToken() == XContentParser.Token.START_OBJECT) {
            parser.nextToken();
        }

So if someone else trusts your object inheritance and places the object inline without start and end objects then you'd never get to the end and would keep iterating parser.nextToken(). Eventually that will be null, and null won't equal end object, so inside the loop you'll throw an NPE on token.isValue().

So you need to either test for a null nextToken() in addition to the END_OBJECT as a loop stopping condition, or you need to implement/extend ToXContentObject which requires start and end objects. I suspect you probably want the latter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, DiscoveryNode is part of the DiscoveryNodes toXContent response and getting used as a fragment like you mentioned and I plan to use DiscoveryNodes toXContent in ClusterState going forward. That's the reason for keeping them Fragment. Anyway, I have pushed a commit modifying the toXContent method. If you still believe we can hit a NPE in the current method. Can you please provide a sample json, I would write a unit test based on that to ensure that NPE is never thrown from our function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are implementing ToXContentFragment but you really should implement ToXContentObject here.

public class DiscoveryNode implements Writeable, ToXContentFragment {

I am not changing the implementing ToXContentFragment interface in the DiscoveryNode. It was already an implementation of the this interface and changing it to a ToXContentObject might causing breaking changes in our multiple API responses. As this object is already getting used as a inline object in cluster state API, and other cat APIs.

No, it's being used as an object. Your example in a comment below shows how it's used inline, and it includes th e { (START_OBJECT) and } (END_OBJECT). It's a nested complete object, not a fragment.

Even a nested object is a fragment only as there is no start and end object around it. According to our documentation here, a fragment needs to have start and end object externally.
The example of you gave around fragment being used for creating array. My use case is also similar, where instead of create a field which has a START_ARRAY, our object has START_OBJECT.

The JSON you have provided is a invalid. And we don't expect to parse XContent not created from our method. Even if our object is getting used inline, like I am using in DiscoveryNodes class, the XContent is created from our toXContent method and object is parsed from our fromXContent method.

you must match a required end object to a required start object. You require one but make the other one optional

I am matching start and end object which are being added in our toXContent method. The line 590-592 you are referring, is to enable parsing of following object. This is a complete XContent object, but even if we don't receive the complete object (i.e. without start and end), we will be able to parse that inline object.

{
  "node_1" : {
    "name" : "name_1",
    "ephemeral_id" : "3Q3xRwYKScWqBgVCrWmNCQ",
    "transport_address" : "0.0.0.0:2",
    "attributes" : {
        "custom" : "PKU"
    }
  }
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not changing the implementing ToXContentFragment interface in the DiscoveryNode. It was already an implementation of the this interface and changing it to a ToXContentObject might causing breaking changes

Agreed, you shouldn't change this.

Even a nested object is a fragment only as there is no start and end object around it.

Agreed.... but then you check for the end object in the line prompting this comment chain:

 while ((token = parser.nextToken()) != XContentParser.Token.END_OBJECT) 

If there is no end object, this loop will never reach its stopping condition.

The JSON you have provided is a invalid.

I know. But the parsing error thrown would have been somewhere unrelated to the invalid JSON, in some follow-on code much more confusing to debug. I just threw up a quick example to show that in the case you do not require a start object, and thus shouldn't require an end object (because they should match) you end up with the code as it was written hitting some later end object. Or possibly getting a NPE if you're just parsing a small fragment with no start or end.

you must match a required end object to a required start object. You require one but make the other one optional

I am matching start and end object which are being added in our toXContent method.

When you create the JSON you can match them. If you are reading the JSON created elsewhere you do not control that. Someone else could be creating that JSON outside of core.

The code around the loop looks better because you're matching the start (lines 603-604) to the end (the stopping condition for the loop in 605).

But there is still an optional START_OBJECT present in lines 590-592 without a corresponding end object. I still think this is wrong.

It should be legal for anyone parsing this JSON to parse their own start/end outside of your parsing. SO taking your JSON above:

{
  "node_1" : {
    "name" : "name_1",
    "ephemeral_id" : "3Q3xRwYKScWqBgVCrWmNCQ",
    "transport_address" : "0.0.0.0:2",
    "attributes" : {
        "custom" : "PKU"
    }
  }
}

it should be legal to pass this (valid JSON) string as a fragment which completely encapsulates all the information in the object, lacking only its "field name" which is only relevant for DiscoveryNodes.

  {
    "name" : "name_1",
    "ephemeral_id" : "3Q3xRwYKScWqBgVCrWmNCQ",
    "transport_address" : "0.0.0.0:2",
    "attributes" : {
        "custom" : "PKU"
    }
  }

This fails, however, because your fragment behavior is not just the value, it's a key-value pair expected by DiscoveryNodes parsing and you require the format to be:

  "node_1" : {
    "name" : "name_1",
    "ephemeral_id" : "3Q3xRwYKScWqBgVCrWmNCQ",
    "transport_address" : "0.0.0.0:2",
    "attributes" : {
        "custom" : "PKU"
    }
  }

Essentially, the optional start object IS read when parsing a standalone object but is NOT read when parsing the specific key/value format you include in DiscoveryNodes which omits the end object.

So ultimately my chief complaint here is that having the "field name" as part of this object is wrong; I should be able to have a standalone object that works without needing DiscoveryNodes wrapped around it.

Copy link
Member

@dbwiddis dbwiddis Jun 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be more clear/succinct, your representation of DiscoveryNode in JSON is this (optionally wrapped in curly braces) key value pair.

{
  "node_1" : {
    "name" : "name_1",
    "ephemeral_id" : "3Q3xRwYKScWqBgVCrWmNCQ",
    "transport_address" : "0.0.0.0:2",
    "attributes" : {
        "custom" : "PKU"
    }
  }
}

or this because the object wrapper is optional, but requires the method caller to know that it's expecting a field name and object:

  "node_1" : {
    "name" : "name_1",
    "ephemeral_id" : "3Q3xRwYKScWqBgVCrWmNCQ",
    "transport_address" : "0.0.0.0:2",
    "attributes" : {
        "custom" : "PKU"
    }
  }

This is not the expectation I've seen anywhere for XContentFragments which are usually standalone values (which may or may not be objects).

I believe it should either be:

{
  "name" : "name_1",
  "id" : "node_1",
  "ephemeral_id" : "3Q3xRwYKScWqBgVCrWmNCQ",
  "transport_address" : "0.0.0.0:2",
  "attributes" : {
      "custom" : "PKU"
  }
}

or your original format with the object wrappers mandatory (even if you keep the fragment interface).

You're mixing-and-matching pairing of start object which relies heavily on implementation details of other classes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dbwiddis Thanks for being patient to write the detailed comment. Got your point on that we should be able to pass the inner nested object without the fieldname. But we need to create a parser which takes id as a parameter.

DiscoveryNode fromXContent(XContentParser parser, String nodeId)

I am assuming this will help address your concern.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, @shiv0408 thanks for the explanation. I see now that this format is expected. It's just very confusing for a maintainer. You've obviously spent more time digging into this and I agree that your implementation is the necessary way to do it.

So my only request now is that you write a javadoc for the fromXContent() method indicating the expected format, specifically either a key-object pair standalone, or a key-object pair wrapped inside start and end objects, so the next person to come along and look at this code won't be as confused as I was.

@lukas-vlcek
Copy link
Contributor

@shiv0408 Would you mind modifying issue description and adding a ticket number that requires this change please? I would be interested in learning more about why exactly this is needed (I assume this is relevant to remote storage?).

@shiv0408 shiv0408 force-pushed the xcontent-discovery-node branch 2 times, most recently from 9bc7716 to 68add26 Compare May 15, 2024 08:55
Copy link
Contributor

❌ Gradle check result for 9bc7716: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 68add26: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for f151e00: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Shivansh Arora <hishiv@amazon.com>
@github-actions github-actions bot added Cluster Manager enhancement Enhancement or improvement to existing feature or request labels May 15, 2024
@shiv0408
Copy link
Member Author

@lukas-vlcek apologies for late response, I have modified the description of the PR to include the issue for this. And yes, this effort is required for remote state publication.

@shiv0408
Copy link
Member Author

shiv0408 commented May 15, 2024

@dbwiddis Thanks for reviewing the changes. I have pushed a commit to address your concerns. Can you please take another look at the changes?

Also I've seen the method you call fromXContent called parse()

We are consistently using fromXContent method name for de-serializing the object for all the objects with are getting used in our remote cluster state flow.

Copy link
Contributor

❌ Gradle check result for 6a54fb1: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

- Add java docs for fromString method in TransportAddress
- Cleaned up toXContent methods in DiscoveryNode, DiscoveryNodes

Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Copy link
Contributor

❌ Gradle check result for 08a10cb: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

ephemeralId = parser.text();
break;
case KEY_TRANSPORT_ADDRESS:
transportAddress = TransportAddress.fromString(parser.text());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be able to handle ipv6 addresses?

Copy link
Member Author

@shiv0408 shiv0408 May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think we will be able to do that now. I will modify this and add a test to ensure that TransportAddress.fromString can parse both ipv4 and ipv6 addresses.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Cluster Manager enhancement Enhancement or improvement to existing feature or request skip-changelog
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

[Remote Cluster State] Add toXContent and fromXContent for DiscoveryNodes
4 participants