Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SERVER-81279 Create Genny Workloads for UpdateMany #1100

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

brettnawrocki
Copy link
Contributor

@brettnawrocki brettnawrocki commented Jan 5, 2024

Jira Ticket: SERVER-81279

Whats Changed:
Adds performance workloads for various cases involving updateMany in sharded clusters.

Patch testing results:
https://spruce.mongodb.com/version/6597130356234348f74ef21a/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC

@dalyd
Copy link
Collaborator

dalyd commented Jan 5, 2024

This passed through my inbox, but is not assigned to me. So two quick thoughts:
I think you probably want product-performance instead of performance-infrastructure on this. @mongodb/performance

Also, could you add Keywords along with description and owner?

@brettnawrocki brettnawrocki requested review from a team and removed request for a team January 5, 2024 23:08
@brettnawrocki
Copy link
Contributor Author

brettnawrocki commented Jan 5, 2024

@dalyd Sorry for the mixup! I read somewhere to add performance infrastructure for genny PRs, but in retrospect I suppose that means core genny code and not genny workloads. 😅

I can add keywords to the workloads as well. Thanks for taking a glance.

@thessem
Copy link
Collaborator

thessem commented Jan 8, 2024

Sorry about the confusion @brettnawrocki, I've got an eye on using DEVPROD-3142 to simplify the "who should review this" question with a CODEOWNERS file, once I get a little time to look at code quality infrastructure.

Copy link
Contributor

@jimoleary jimoleary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't started looking at the actual workloads yet but I'm concerned about everything in the templates directory.

Do we really want this in the repo? And if we are going to add template generation to genny I don't think that the patch command is how it should be implemented.

And I should add for full transparancy that I added some templating support in the past here.

Copy link
Contributor

@jimoleary jimoleary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some general comments to the yml.base file but overall the workload looks ok.

Do you have any idea how noisy the new workload is? We would generally run a new workload 3 to 5 times to determine how noisy it is.

Owner: "@mongodb/sharding"
Description: |
This workload does the following:
Create an unsharded collection.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: either Create a sharded collection with a hashed monotonic int64 shard key. OR Insert 50k documents of around 20kB each using the monotonic loader. .

The monotonic incrementing nature of _id is important w.r.t to the update Filter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

SchemaVersion: 2018-07-01
Owner: "@mongodb/sharding"
Description: |
This workload does the following:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could we move this point up to the top or change the intro to:

This workload does the following (with PauseMigrationsDuringMultiUpdates enabled): (and update the relevant patches so that the comment is correct for the other versions).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Collection: &Collection coll
Namespace: &Namespace test.coll

ApproxDocumentSize: &ApproxDocumentSize 20_000
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think just DocumentSize is fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Operations:
- OperationName: updateMany
OperationCommand:
Filter: { _id: 0 }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: could we half the number of workloads by executing the Filter: {} and Filter: { _id: 0 } in every workload? Essentially add a new phase for Filter: { _id: 0 } to every one.

Although I do appreciate that you are doing exactly what the ticket asks for.

Operations:
- OperationName: updateMany
OperationCommand:
Filter: { _id: 0 }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: you could we half the number of workloads by executing the Filter: {} and Filter: { _id: 0 } in every workload? Essentially add a new phase for Filter: { _id: 0 } to every one.

Although I do appreciate that you are doing exactly what the ticket asks for.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment to highlight the differences for each version?

E.g: Filter: { _id: 0 } # updateMany on a single document. and Filter: { } # updateMany on all documents.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took the first suggestion and did both kinds of updateManys in separate phases per workload. It does make sense that it's better to reuse the setup if we can for both variants. I didn't add the comments since I think it's clear enough now the purpose of each filter based on the name of the metrics.

@brettnawrocki
Copy link
Contributor Author

I haven't started looking at the actual workloads yet but I'm concerned about everything in the templates directory.
Do we really want this in the repo? And if we are going to add template generation to genny I don't think that the patch command is how it should be implemented.
And I should add for full transparancy that I added some templating support in the past here.

I expected this to be somewhat controversial, though as you mention there is some existing precedent for "bring your own code generation" for other workloads in the repo. I see DEVPROD-3567 was filed partially in reaction to this PR. I agree that using patch is not ideal and probably shouldn't be standardized, but for smaller scale usage like this, I think it has the advantage of being flexible and already well understood by the average engineer.

My hope is that a better alternative can be proposed (and implemented if necessary), but that this can be used in the meantime.

Do you have any idea how noisy the new workload is? We would generally run a new workload 3 to 5 times to determine how noisy it is.

The cluster parameter in this workload isn't implemented yet (so enabling it does literally nothing). From the first patch, each workload is run effectively twice and I saw ~5% difference between them.

I planned to run another patch and then rerun the workload a few times to get better numbers, but my patch failed I believe due to BF-31402.

Copy link
Contributor

@jimoleary jimoleary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a suggestion about parameterization rather than code generation.

@@ -15,7 +26,7 @@ GlobalDefaults:
Collection: &Collection coll
Namespace: &Namespace test.coll

ApproxDocumentSize: &ApproxDocumentSize 20_000
DocumentSize: &DocumentSize 20_000
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to parameterized this workload and get rid of the code generation?

Some loadconfig examples of this are:

  1. ClusteredCollection
  2. ClusteredCollectionLargeRecordIds

Essentially you would move this file into src/phases/multi_updates and then pass inNops for the pauseMigrationsDuringMultiUpdates and ShardCollection operations as needed. The operations could be parameterized something like the queries in

OperationCommand: {^Parameter: {Name: "OperationCommand", Default: *Q1_1Query}}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@brettnawrocki
Copy link
Contributor Author

New patch build: https://spruce.mongodb.com/version/65a85231a4cf47ebaffecbdd/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC
I also ran an additional patch to get a larger sample size: https://spruce.mongodb.com/version/65a9504b5623434a2afaad70/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC

Results on Linux Shard Lite (all feature flags) 2022-11

Unsharded, Update One Unsharded, Update All Sharded, Update One Sharded, Update All
Trial 1 130502 125 128031 201
Trial 2 124107 117 130082 198
Trial 3 132235 118 126664 198
Trial 4 128496 121 118581 195
Min 124107 117 118581 195
Max 132235 125 130082 201
Diff 6.55% 6.84% 9.70% 3.08%

Copy link
Contributor

@jimoleary jimoleary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the major refactor.

I hope it wasn't too onerous but the workload looks much cleaner now.

updateMany will update all of the 50k documents per command in the first
phase and only a single document per command in the second phase.

Two parameters are accepted to alter the behavior of this workload:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the documentation!

@brettnawrocki
Copy link
Contributor Author

Thanks for the major refactor.

I hope it wasn't too onerous but the workload looks much cleaner now.

It was no problem, and as I mentioned before, I expected the code generator to be controversial. 😅 I think the end result is definitely more maintainable than the original iteration.

The way LoadConfig and OnlyActiveInPhases can be combined to run some blocks conditionally was not obvious to me, so thanks for pointing that out. I do kind of wish there was some special value for NopInPhasesUpTo that meant "all other phases" instead of needing to pass that in (and potentially modify it in a number of places if another phase is added). Similarly, I wish I could have made the parameters e.g. shardCollection: true|false instead of needing to pass in [] or [1] to indicate the phases active, which again could change if the structure of the workload was updated.

@alicedoherty
Copy link
Contributor

Hey @brettnawrocki, just taking over from Jim on behalf of the performance team. We’d definitely encourage you to open a DEVPROD ticket (with the Assigned Teams field set to “Performance Infrastructure”) if you think having an “all other phases” feature in Genny would be valuable. Also, we have an OnlyRunInInstance: sharded feature that can be used to conditionally shard a collection (the downside of that is until DEVPROD-2049 is resolved you’ll need to exclude your workloads from dry-runs here).

None of this is blocking and I see Jim’s already approved this PR, but just adding this comment in case it adds any value!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants