Improve the kubelet check error reporting in the output of agent status
#6315
8ce68e3 to f9ca015
LGTM, just a small comment on using a type assertion.
pkg/collector/python/kubeutil.go
Outdated
log.Errorf("connection to kubelet failed: %v", err)
return nil
if e, ok := err.(*retry.Error); ok {
We could probably use errors.As here? https://golang.org/pkg/errors/#example_As
Good point!
0a3af81
I like the usage of errors.As()!!! 💯
Small question, other than that LGTM 👍
pkg/util/retry/retrier.go
Outdated
@@ -104,10 +105,10 @@ func (r *Retrier) doTry() *Error {
}
method := r.cfg.AttemptMethod
r.RUnlock()
err := method()
r.lastTryError = method()
Is it safe to write to r.lastTryError while the mutex is unlocked?
What does this PR do?
In combination with DataDog/integrations-core#7495, this change improves the kubelet check error reported in the output of agent status when the agent cannot properly connect to the kubelet.
Motivation
In case the agent cannot properly connect to the kubelet, the useful details were in the logs but the output of the agent status command gave no clue about the reasons.
Here is an example of the agent status output in such a case:
Here is what the output becomes with this PR:
Additional Notes
This would help the investigation of issues like DataDog/integrations-core#2582.
Describe your test plan
Start the agent in a context where it schedules the kubelet check but cannot connect to the kubelet:
docker run --rm --name datadog-agent -e DD_API_KEY=$DD_API_KEY -e KUBERNETES=yes -e DD_KUBERNETES_KUBELET_HOST=1.2.3.4 datadog/agent:7.23.0