Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix istioctl analyze vs error when a custom cluster domain #51064

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

nicole-lihui
Copy link
Member

@nicole-lihui nicole-lihui commented May 15, 2024

resolved #33174

Please provide a description of this PR:

@nicole-lihui nicole-lihui requested review from a team as code owners May 15, 2024 09:12
@istio-policy-bot istio-policy-bot added area/user experience release-notes-none Indicates a PR that does not require release notes. labels May 15, 2024
@istio-testing istio-testing added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 15, 2024
@@ -61,6 +62,17 @@ func InitServiceEntryHostMap(ctx analysis.Context) map[ScopedFqdn]*v1alpha3.Serv
return true
})

// use meshConfig.trustDomain changed the default domain
// todo: replace by `values.global.proxy.clusterDomain`
ctx.ForEach(gvk.MeshConfig, func(r *resource.Instance) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we get MeshConfig with this function? I am not sure, MeshConig is from configmap

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's a hack for MeshConfig.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the configmap was saved in cache as MeshConfig for analysis during initialization.

@@ -43,7 +43,9 @@ func (ctx *Context) Exists(config.GroupVersionKind, resource.FullName) bool { re
// ForEach implements analysis.Context
func (ctx *Context) ForEach(_ config.GroupVersionKind, fn analysis.IteratorFn) {
for _, r := range ctx.Resources {
fn(r)
if !fn(r) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to @hanxiaop question

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to end the loop when the loop gets meshconfig, so added here

var configClusterLocalDomain string

func SetConfigClusterLocalDomain(domain string) {
configClusterLocalDomain = "svc." + domain
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using globals like this is not safe or acceptable, this will cause data races. note this package is used in long running servers, not just the CLI

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, that's indeed a problem, I'll find a new methods.

meshConfig := r.Message.(*meshconfig.MeshConfig)
if meshConfig.GetTrustDomain() != "" {
SetConfigClusterLocalDomain(meshConfig.GetTrustDomain())
return false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the following questions are related to @hanxiaop question below about !fn(r).

  • Why do we return false when GetTrustDomain() is not empty? Is the goal to stop iterating once we get the trust domain?, hence:
if !fn(r) {
    break
}

as seen in the ctx.ForEach(...) function update you added.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah 😄

I think we need a break here and then add a loop break for the ForEach func.


var configClusterLocalDomain string

func SetConfigClusterLocalDomain(domain string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for my ignorance here but can you explain the reason behind checking for a non-empty Trust Domain(as returned via GetTrustDomain()) and setting Config Cluster Local Domain to that value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

backgraound:

This issue originates from #33174, where the user customized the k8s cluster domain and used several related Istio configurations.

meshConfig.trustDomain
values.global.trustDomain
values.global.proxy.clusterDomain

However, when using istioctl analyze, it consistently shows the error service-name.namespace.svc.customDnsDomain IST0101 Error (Referenced host not found).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why use trusDomain?
Actually, the up configuration keys, values.global.proxy.clusterDomain aligns better semantically.
and then it only used in the sidecar template and is not get value through any runtime configuration. Implementing this would be a complexity task.

so, considering the user scenario(when user need changed cluster domain), they generally configure these parameters simultaneously. As a result, trustDomain can get from meshConfig, and can be used as a reference value.

@@ -61,6 +62,17 @@ func InitServiceEntryHostMap(ctx analysis.Context) map[ScopedFqdn]*v1alpha3.Serv
return true
})

// use meshConfig.trustDomain changed the default domain
// todo: replace by `values.global.proxy.clusterDomain`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly is the todo? Also, does not doing that work now effect the resolution of the issue in any way?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the description above( why use ** trusDomain** ??), I believe the current solution is only a preliminary workaround. A more optimal solution is needed and requires further discussion.

And I'm not sure how the community plans and designs this aspect, and whether there's a need for unification. Additionally, I think it would be helpful to include some usage documentation, as many people might encounter the same issue.

@zirain @howardjohn @hzxuzhonghu @hanxiaop

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a bug fix perspective, trustDomain can already resolved the user's issue.

However, I want to implement a more standardized and user-friendly solution. Therefore, I'll add a TODO to drive this improvement forward.

@@ -43,7 +43,9 @@ func (ctx *Context) Exists(config.GroupVersionKind, resource.FullName) bool { re
// ForEach implements analysis.Context
func (ctx *Context) ForEach(_ config.GroupVersionKind, fn analysis.IteratorFn) {
for _, r := range ctx.Resources {
fn(r)
if !fn(r) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to @hanxiaop question

@istio-testing istio-testing added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 16, 2024
@nicole-lihui nicole-lihui requested a review from zirain May 16, 2024 09:43
@zirain zirain changed the title >fix istioctl analyze vs error when a custom cluster domain fix istioctl analyze vs error when a custom cluster domain May 16, 2024
"istio.io/api/networking/v1alpha3"
"istio.io/istio/pkg/config/analysis"
"istio.io/istio/pkg/config/resource"
"istio.io/istio/pkg/config/schema/gvk"
)

func GetCustomClusterDomain(ctx analysis.Context) string {
cusClusterDomain := ""
// use meshConfig.trustDomain changed the default domain
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trustDomain and cluster domain are 100% orthogonal to each other. There is absolutely no relationship at all (beyond the same default) and they should not be used interchange-ably like this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me try to find a better solution 🤔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/user experience release-notes-none Indicates a PR that does not require release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error IST0101 on VirtualService from analyze when using a cluster dnsDomain != "cluster.local"
8 participants