Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for specifying an array of metadata objects to use for the outgoing requests #234

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

Kami
Copy link

@Kami Kami commented Oct 20, 2020

This pull request updates the code so in addition to specifying a single metadata object which is used for all the outgoing gRPC calls / requests, user can also specify a list of metadata objects to use for the outgoing requests.

This allows people to use different metadata for different requests. It's similar to the change implemented in #87, but this one is for the actual request metadata and for the payload / body / data.

Background, Context, Rationale

#87 implemented support which allows users to specify different messages for unary calls. Those messages are then used in a round robin fashion for the outgoing requests (thanks to @ezsilmar for implementing that).

That functionality comes handy in many scenarios where sending the same data with every request will not results in a representative result (e.g. due to the server logic, caching or similar).

In our case, actual processing performed by the server also depends on the metadata field values.

This means if we want to get a representative result when performing simple gRPC server level benchmark using this tool, we need to send different (and specific) metadata for each outgoing request.

And in that specific case we can't utilize call template data functionality since that data is not available in the template context.

Proposed Implementation

This PR implements a change which allows user to either specify an object directly or an array of objects for the metadata argument.

If an array is specified, we will use different metadata values for different outgoing requests in a round-robin fashion.

This follows similar approach which is already used for the actual messages and was implemented in #86.

Proposed implementation in this PR is just a quick WIP version of this change.

If other people agree this is indeed a good feature / functionality to have, I will clean it up and finish it.

Open Questions

Making sure the change is fully backward compatible (aka that both approaches - single object and an array with objects is supported) is pretty straight forward when metadata is either specified directly as a command line argument or read from a file.

This becomes more problematic though when reading configuration from a JSON or TOML configuration file.

One approach I could think of is to "rewrite" config on load and ensure metadata field value is always an array. This approach is somewhat fragile not ideal so I wonder if there is a better way to handle that (suggestions and feedback is welcome).

Another approach is to have two Config struct definitions - one of the map and one of the array of maps and then internally after parsing we make sure we always convert that field to an array.

EDIT: I pushed a change so we use the same generic approach as we use for the Data argument - this way we can handle this change config wise in a fully backward compatible manner.

TODO

  • Finish the implementation
    • Make sure the change is fully backward compatible (ensure we support both notations - single object or an array for objects for metadata specified either via command line or config file)
    • Clean up the code
    • Tests
    • Update docs
    • Update examples
    • Update readme

which will be used for subsequent requests in a round-robin fashion (in
the same way we handle the actual payload / body if multiple protobuf
objects are specified).
for the metadata config option - either an object or an array of
objects.
@Kami
Copy link
Author

Kami commented Oct 21, 2020

Alright I pushed additional changes:

  • The change is now fully backward compatible not matter how you specify the data (command line argument with actual values, command line argument with file path, json / yaml / toml config file)
  • Added end to end tests
  • Updated readme, docs and examples

I believe functionality wise this should be more or less complete, I just need to clean up and refactor some duplicated code.

If I missed something or some place, please let me know.

@Kami Kami changed the title [RFC] [WIP] Support for specifying an array of metadata objects to use for the outgoing requests Support for specifying an array of metadata objects to use for the outgoing requests Oct 21, 2020
if err := json.Unmarshal([]byte(*md), &metadataArray); err != nil {
return fmt.Errorf("Error unmarshaling metadata '%v': %v", *md, err.Error())
}
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With respect to determining the underlying type we are trying to Unmarshal, instead of relying on strings.Contains() of the error message, I think we can take advantage of the actual UnmarshalTypeError. The approach may be a little bit more reliable. Example of what I mean:

md := `["foo", "bar"]`
	var metadataMap map[string]string
	if err := json.Unmarshal([]byte(md), &metadataMap); err != nil {
		if e, ok := err.(*json.UnmarshalTypeError); ok && e.Value == "array" {
			fmt.Println("trying to Unmarshal array into map", e)
		}
	} else {
		fmt.Println(metadataMap)
	}

// an array. If the input is an object, but not an array, it's converted to an array.
func (td *callTemplateData) executeMetadataArray(metadata string) ([]map[string]string, error) {
var mdArray []map[string]string
var metadataSanitized = strings.TrimSpace(metadata)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I am missing something, but can we ensure that the metadata string is sanitized / trimmed earlier, wherever we set it as w.config.metadata. That way we do not have to trim on every invocation of executeMetadataArray().

var metadataSanitized = strings.TrimSpace(metadata)

// If the input is an object and not an array, we ensure we always work with an array
if !strings.HasPrefix(metadataSanitized, "[") && !strings.HasSuffix(metadataSanitized, "]") {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably not a big deal, but should this go in the check if len(metadata) > 0 { check?

for k, v := range objData3 {
sk, isString := k.(string)
if !isString {
return errors.New("Data key must string")
Copy link
Owner

@bojand bojand Oct 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the errors here should reference "Metadata key..."?

var array []map[string]interface{}
for _, item := range arrData {
objData3, isObjData3 := item.(map[interface{}]interface{})
newItem := make(map[string]interface{})
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe valid metadata object has to be map[string]string so here we should be asserting that both key and value in the internal representation of the objects in array are strings and then converting those to that type?

@bojand
Copy link
Owner

bojand commented Oct 22, 2020

Hello, thank you for the PR and the detailed description! I understand the problem, and I think this is a sensible approach. I will probably need another read through, but I have left a few comments to double check on from my initial review. I'll probably re-read again when I get some more time. Once we get these addressed and when you feel you've cleaned up the code we can get this merged. Thanks again!

@Kami
Copy link
Author

Kami commented Jan 6, 2021

@bojand Sorry for the delay - I some how missed your review feedback.

I'll try to address this review feedback this weekend (I've been using those changes for a while now, but I do agree that it needs more clean up, etc. :)).


if len(metadata) > 0 {
input := []byte(metadata)
tpl, err := td.execute(metadata)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bojand I started looking and profiling this code again and I think it would be good to add another optimization here so we don't re-render / evaluate the template and unmarshall the string for every single request.

That's especially important when using large metadata JSON objects like in my case.

Basically, I'm looking at adding some new flag (e.g. --plaintext-metadata, open to better naming) and when this flag is used we don't evaluate metadata as a template and cache it on the worker object and re-use it for subsequent requests, similar as we do with w.cachedMessages.

This should substantially speed things up and reduce memory usage (I'm just testing this change to confirm that).

WDYT?

Copy link
Author

@Kami Kami Jan 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we go, here is a quick WIP version - d586c35.

I confirmed that using --plaintext-metadata flag is much more efficient when working with large metadata JSON objects / arrays (like in my case) and results in lower worker CPU usage.

Which is kinda expected, because trying to render the template + parsing JSON for every single request will never be efficient when working with large metadata objects.

When this flag is used, we don't templatize metadata JSON object for
every single request and json load it, but we only do that once and
re-use cached version for subsequent requests.

This results in much less overhead when metadata template functionality
is not needed and utilizing large metadata objects.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants