pilot: update pods in EndpointShardz when labels change #50432
// Annotations are only used in endpoints in one case, so just compare that one
relevantAnnotationsChanged := old.Annotations[constants.AmbientRedirection] != cur.Annotations[constants.AmbientRedirection]
Oh this is why you wanted to ditch the annotation.
Can you throw a TODO in here and linkref to #50355?
Alternatively, we could just check all annotations and labels, but I'm assuming that's measurably slower.
This isn't actually related to my discussion around the annotation -- even if we do that, we still need this IMO. For instance, if I enroll the entire namespace, then I think the CNI will annotate the pod even with that proposal?
I'm saying we could make the annotation that the CNI sets a label instead (which that issue discusses as an option), and we then wouldn't need to check annotations JUST for ambient status.
If we still choose to do that in #50355, then we would need to update this anyway.
Ah ok, at one point it had been discussed "humans use labels, machines (CNI) use annotations". I don't mind the annotation check here really -- I wouldn't weigh it too much on #50355
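The trade-off in this thread (compare one annotation key, or compare all labels and annotations) can be sketched roughly as below. This is only an illustration: `podMeta`, `narrowChanged`, `broadChanged`, and the literal value of `ambientRedirectionKey` are assumptions standing in for the real Pod type and `constants.AmbientRedirection`, not the code in this PR.

```go
package main

import (
	"fmt"
	"maps"
)

// ambientRedirectionKey stands in for constants.AmbientRedirection.
// The literal value is an assumption made for this sketch.
const ambientRedirectionKey = "ambient.istio.io/redirection"

// podMeta is a hypothetical, trimmed-down view of a Pod's metadata.
type podMeta struct {
	Labels      map[string]string
	Annotations map[string]string
}

// narrowChanged mirrors the approach under review: compare all labels,
// but only the single annotation that affects endpoints.
func narrowChanged(old, cur podMeta) bool {
	labelsChanged := !maps.Equal(old.Labels, cur.Labels)
	relevantAnnotationsChanged := old.Annotations[ambientRedirectionKey] != cur.Annotations[ambientRedirectionKey]
	return labelsChanged || relevantAnnotationsChanged
}

// broadChanged is the alternative raised in review: compare every label
// and every annotation, at the cost of reacting to unrelated annotations.
func broadChanged(old, cur podMeta) bool {
	return !maps.Equal(old.Labels, cur.Labels) || !maps.Equal(old.Annotations, cur.Annotations)
}

func main() {
	old := podMeta{
		Labels:      map[string]string{"app": "a"},
		Annotations: map[string]string{"note": "1"},
	}
	cur := podMeta{
		Labels:      map[string]string{"app": "a"},
		Annotations: map[string]string{"note": "2"},
	}
	// Only an unrelated annotation differs between old and cur.
	fmt.Println(narrowChanged(old, cur), broadChanged(old, cur))
}
```

With only an unrelated annotation changed, the narrow check stays quiet while the broad check would trigger a recompute, which is the extra work the single-key comparison avoids.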
457f635 to ff923fc
/retest
Couple of minor comments
func (c *Controller) recomputeServiceForPod(pod *v1.Pod) {
	allServices := c.services.List(pod.Namespace, klabels.Everything())
	cu := sets.New[model.ConfigKey]()
	services := getPodServices(allServices, pod)
getPodServices is very costly, especially when the number of services is huge.
This function is called extremely rarely -- only if a pod's labels change. We already call this WAY more often in other codepaths (GetProxyServiceTargets). I don't think it's so bad to list a namespace's services every time a pod changes. Even a large cluster probably isn't going to have >1k services in one namespace?
The complexity to optimize this is not worth it IMO -- it would be very complex.
> even a large cluster isn't going to have >1k services in one namespace probably?
Maybe.
> This function is called extremely rarely
I suggest we add a feature flag; this could be a breaking change.
> Suggest we add a feature flag, this could be a breaking change.
How can it be a breaking change? Or do you mean just a performance regression?
A performance regression, I mean.
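To make the cost being debated concrete, here is a rough sketch of what matching one pod against every service in its namespace involves. It is not the actual `getPodServices` implementation: the `service` type and the `selects` and `podServices` helpers are hypothetical, and the rule that an empty selector matches no pods is an assumption of this sketch.

```go
package main

import "fmt"

// service is a hypothetical, minimal stand-in for a Kubernetes Service.
type service struct {
	Name     string
	Selector map[string]string
}

// selects reports whether every key/value pair in selector is present in
// podLabels. An empty selector is treated as selecting nothing here.
func selects(selector, podLabels map[string]string) bool {
	if len(selector) == 0 {
		return false
	}
	for k, v := range selector {
		if got, ok := podLabels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// podServices scans every service in the namespace once, so the cost grows
// linearly with the number of services (times the selector size).
func podServices(all []service, podLabels map[string]string) []string {
	var out []string
	for _, s := range all {
		if selects(s.Selector, podLabels) {
			out = append(out, s.Name)
		}
	}
	return out
}

func main() {
	all := []service{
		{Name: "web", Selector: map[string]string{"app": "web"}},
		{Name: "web-v2", Selector: map[string]string{"app": "web", "version": "v2"}},
		{Name: "db", Selector: map[string]string{"app": "db"}},
	}
	fmt.Println(podServices(all, map[string]string{"app": "web", "version": "v1"}))
	fmt.Println(podServices(all, map[string]string{"app": "web", "version": "v2"}))
}
```

The scan is a single pass over the namespace's services, so it only becomes expensive when both the service count and the rate of pod label changes are high.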
// If labels/annotations updated, trigger proxy push
labelsChanged := !maps.Equal(old.Labels, cur.Labels)
// Annotations are only used in endpoints in one case, so just compare that one
relevantAnnotationsChanged := old.Annotations[constants.AmbientRedirection] != cur.Annotations[constants.AmbientRedirection]
I do not think we should change the redirection behavior on the fly.
In envoy or in all of ambient?
If we do it in ztunnel but not envoy it causes an outage when you switch
All of ambient. Remember, we have some fixed operations based on labels for ambient in the CNI.
You can change the control plane, but you cannot make the CNI redo its work.
This topic has been thoroughly discussed already, and the conclusion was that dynamic changes were a hard requirement for ambient. See #48876 for some of the past discussion.
if len(endpoints) > 0 {
	c.opts.XDSUpdater.EDSCacheUpdate(shard, string(hostname), svc.Namespace, endpoints)
}
cu.Insert(model.ConfigKey{
We do not need to insert here. If a service->endpoint relationship changed, the endpoint handler will do this.
It's not the service->endpoint relationship. Neither the Service nor the Endpoint changed in k8s; the Pod did.
Istio's endpoint representation augments EndpointSlice with Pod info, so this ensures that if only the pod changes, the IstioEndpoint updates.
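The point about augmentation can be illustrated with a small sketch. The types below (`sliceAddress`, `pod`, `istioEndpoint`) and the `buildEndpoints` helper are hypothetical simplifications, not Istio's real structures; the sketch only shows why an endpoint derived from slice data plus pod metadata can go stale when the pod alone changes.

```go
package main

import (
	"fmt"
	"maps"
)

// sliceAddress is a hypothetical stand-in for an address in an EndpointSlice.
type sliceAddress struct {
	IP string
}

// pod is a hypothetical, minimal view of a Pod.
type pod struct {
	IP     string
	Labels map[string]string
}

// istioEndpoint is a simplified stand-in for Istio's endpoint model:
// the address comes from the slice, the labels come from the pod.
type istioEndpoint struct {
	Address string
	Labels  map[string]string
}

// buildEndpoints joins slice addresses with pod metadata. The result depends
// on pod labels even though the slice itself carries no labels.
func buildEndpoints(addrs []sliceAddress, podsByIP map[string]pod) []istioEndpoint {
	out := make([]istioEndpoint, 0, len(addrs))
	for _, a := range addrs {
		ep := istioEndpoint{Address: a.IP}
		if p, ok := podsByIP[a.IP]; ok {
			ep.Labels = maps.Clone(p.Labels)
		}
		out = append(out, ep)
	}
	return out
}

func main() {
	addrs := []sliceAddress{{IP: "10.0.0.1"}}
	pods := map[string]pod{
		"10.0.0.1": {IP: "10.0.0.1", Labels: map[string]string{"version": "v1"}},
	}
	before := buildEndpoints(addrs, pods)

	// Only the pod's labels change; the slice addresses are identical.
	pods["10.0.0.1"] = pod{IP: "10.0.0.1", Labels: map[string]string{"version": "v2"}}
	after := buildEndpoints(addrs, pods)

	fmt.Println(before[0].Labels["version"], after[0].Labels["version"])
}
```

Because no EndpointSlice event fires in this scenario, something on the pod-update path has to rebuild the endpoints, which is the role the recompute in this PR plays.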
/retest
LGTM
Fixes #43694
Fixes #50431