Add Controller Options Overrides one-pager #5557

max-melentyev · 2024-04-08T12:51:40Z

Description of your changes

We faced an issue with scaling crossplane-contrib/provider-aws for thousands of EC2 instances and route53 records. This PR is a part of the process we plan to address that issues:

This one-pager.
Add support for PollInterval overrides in crossplane-runtime.
Add support for PollInterval overrides in provider-aws (Allow to override options for specific controllers crossplane-contrib/provider-aws#2030 Discussion there resulted in this one-pager)
Implement optional event-driven reconciliation for ec2 instances and route53 records based on EventBridge events (WIP).

Signed-off-by: max-melentyev <119438383+max-melentyev@users.noreply.github.com>

chlunde · 2024-04-12T09:49:53Z

design/one-pager-controller-options-overrides.md

+Different Crossplane controllers may have very different load profile and number of managed resources.
+Consider examples in crossplane-contrib/provider-aws:
+
+- Recomciling route53 records is fast, but having many resources of this type may trigger AWS API


I my head, polling in providers serve two purposes:

gradually getting a resource from non-existent, to creating and finally ready/synced, and communicating this state and connection parameters to a xr. This is latency sensitive, there's often a human in the loop waiting for this.

drift detection and observing and state changes

I think such a change here should take this into consideration. I believe I have read something about it but I'm not on top of all the current activities these days.

Purposes are correct. What do you think is missing?

One idea is that we add both pollInterval (for resources not ready) and driftInterval (for other resources). But there may be other alternatives.

At least this proposal as it is would allow us to add this interval later without any big change here.

But another question is if you would need per-controller intervals if you have pollInterval and driftInterval as options? Do we need both features, or just one? And if we just have one, which would serve most users/use cases.

If it's decided to add a driftInterval, I think it makes sense to have it as a separate option. There can be different scenarios and people may want to response to drift faster than others.

Signed-off-by: max-melentyev <119438383+max-melentyev@users.noreply.github.com>

turkenh · 2024-04-15T07:26:57Z

design/one-pager-controller-options-overrides.md

+### Providing Configuration
+
+Controller specific options can be provided with `--controller-option {controller-id}.{option}={value}`,
+for example `--controller-option ec2.instance.pollInterval=30m`.


Another alternative could be more fine-grained controls via annotations on MR metadata, which would override the provider default. Per controller config you're proposing here is something in the middle; even though they have different purposes/use cases, not sure if we need both of these configuration knobs 🤔

Sounds good, and it may require less changes in providers.
Though it makes resource owner responsible for configuring poll interval in the cases when it shouldn't care of it. For example, in provider-aws we implemented a monitor of AWS event that triggers reconciliation when resource is updated in AWS, and this allows us to increase poll interval for supported resources. Having this configuration as an annotation, will require updating all existing resources, and any CRD controllers that create these resources.

I wonder if we'd have something like spec.pollPolicy (similar to spec.managementPolicies). I could imagine a few options:

Whatever the provider default is

A specific interval (30s, 2h, etc)

Don't poll at all? This would basically disable drift correction - MRs would only be reconciled if they changed.

max-melentyev · 2024-04-23T14:23:14Z

I've created a PR with a preview of the feature that relies on options overrides: crossplane-contrib/provider-aws#2048

negz · 2024-04-25T21:01:55Z

design/one-pager-controller-options-overrides.md

+Having a feature that makes it possible to override default poll interval for a specific controller
+would allow to avoid these issues without affecting other responsiveness of controllers in the provider.


I wonder if there's anything else we'd want to override, apart from poll interval.

I'm thinking things like configuring exponential backoff for retries, or number of worker goroutines for a particular controller.

Interestingly poll interval feels like it could be configured on a per-MR basis, whereas the others are more a per-MR-type / per controller basis.

Add Controller Options Overrides one-pager

e71125c

Signed-off-by: max-melentyev <119438383+max-melentyev@users.noreply.github.com>

max-melentyev requested review from negz and a team as code owners April 8, 2024 12:51

max-melentyev requested a review from bobh66 April 8, 2024 12:51

chlunde reviewed Apr 12, 2024

View reviewed changes

Update one-pager-controller-options-overrides.md

5280fcd

Signed-off-by: max-melentyev <119438383+max-melentyev@users.noreply.github.com>

turkenh reviewed Apr 15, 2024

View reviewed changes

max-melentyev mentioned this pull request Apr 23, 2024

Add monitor for event-based reconciliation crossplane-contrib/provider-aws#2048

Draft

2 tasks

negz reviewed Apr 25, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Controller Options Overrides one-pager #5557

Add Controller Options Overrides one-pager #5557

max-melentyev commented Apr 8, 2024 •

edited

chlunde Apr 12, 2024

max-melentyev Apr 12, 2024

chlunde Apr 12, 2024

max-melentyev Apr 15, 2024

turkenh Apr 15, 2024

max-melentyev Apr 18, 2024

negz Apr 25, 2024

max-melentyev commented Apr 23, 2024

negz Apr 25, 2024

		Having a feature that makes it possible to override default poll interval for a specific controller
		would allow to avoid these issues without affecting other responsiveness of controllers in the provider.

Add Controller Options Overrides one-pager #5557

Are you sure you want to change the base?

Add Controller Options Overrides one-pager #5557

Conversation

max-melentyev commented Apr 8, 2024 • edited

Description of your changes

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

max-melentyev commented Apr 23, 2024

Choose a reason for hiding this comment

max-melentyev commented Apr 8, 2024 •

edited