-
Notifications
You must be signed in to change notification settings - Fork 899
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Controller Options Overrides one-pager #5557
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: max-melentyev <119438383+max-melentyev@users.noreply.github.com>
Different Crossplane controllers may have very different load profile and number of managed resources. | ||
Consider examples in crossplane-contrib/provider-aws: | ||
|
||
- Recomciling route53 records is fast, but having many resources of this type may trigger AWS API |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I my head, polling in providers serve two purposes:
- gradually getting a resource from non-existent, to creating and finally ready/synced, and communicating this state and connection parameters to a xr. This is latency sensitive, there's often a human in the loop waiting for this.
- drift detection and observing and state changes
I think such a change here should take this into consideration. I believe I have read something about it but I'm not on top of all the current activities these days.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Purposes are correct. What do you think is missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One idea is that we add both pollInterval (for resources not ready) and driftInterval (for other resources). But there may be other alternatives.
At least this proposal as it is would allow us to add this interval later without any big change here.
But another question is if you would need per-controller intervals if you have pollInterval and driftInterval as options? Do we need both features, or just one? And if we just have one, which would serve most users/use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's decided to add a driftInterval, I think it makes sense to have it as a separate option. There can be different scenarios and people may want to response to drift faster than others.
Signed-off-by: max-melentyev <119438383+max-melentyev@users.noreply.github.com>
### Providing Configuration | ||
|
||
Controller specific options can be provided with `--controller-option {controller-id}.{option}={value}`, | ||
for example `--controller-option ec2.instance.pollInterval=30m`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another alternative could be more fine-grained controls via annotations on MR metadata, which would override the provider default. Per controller config you're proposing here is something in the middle; even though they have different purposes/use cases, not sure if we need both of these configuration knobs 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, and it may require less changes in providers.
Though it makes resource owner responsible for configuring poll interval in the cases when it shouldn't care of it. For example, in provider-aws we implemented a monitor of AWS event that triggers reconciliation when resource is updated in AWS, and this allows us to increase poll interval for supported resources. Having this configuration as an annotation, will require updating all existing resources, and any CRD controllers that create these resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we'd have something like spec.pollPolicy
(similar to spec.managementPolicies
). I could imagine a few options:
- Whatever the provider default is
- A specific interval (30s, 2h, etc)
- Don't poll at all? This would basically disable drift correction - MRs would only be reconciled if they changed.
I've created a PR with a preview of the feature that relies on options overrides: crossplane-contrib/provider-aws#2048 |
Having a feature that makes it possible to override default poll interval for a specific controller | ||
would allow to avoid these issues without affecting other responsiveness of controllers in the provider. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if there's anything else we'd want to override, apart from poll interval.
I'm thinking things like configuring exponential backoff for retries, or number of worker goroutines for a particular controller.
Interestingly poll interval feels like it could be configured on a per-MR basis, whereas the others are more a per-MR-type / per controller basis.
Description of your changes
We faced an issue with scaling crossplane-contrib/provider-aws for thousands of EC2 instances and route53 records. This PR is a part of the process we plan to address that issues: