metrics: use contexts #9949
base: master
Conversation
@@ -146,27 +146,31 @@ func FinishReportAndFlushUserAction(action string, err error, start time.Time) f
	return wait
}

-func (r *Reporter) reportClusterMetrics(ctx context.Context) {
+func (r *Reporter) reportClusterMetrics(rctx context.Context) {
There’s no need to rename ctx to rctx here.
return
case <-ticker.C:
}
metrics := &Metrics{}
r.internalMetrics(metrics)
externalMetrics(r.env.GetKubeClient(), metrics) //nolint:errcheck
ctx, c := context.WithTimeout(rctx, reportingInterval/2)
If all this is put into the ticker case of the select (which I think makes more sense anyway), the scoping here is more obvious.
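For illustration, a rough sketch of that restructuring, assuming the loop otherwise stays as in the diff (names follow the PR; the Segment call is taken from the suggestion further down, and the field assignments are elided):

for {
	select {
	case <-rctx.Done():
		return
	case <-ticker.C:
		// Everything scoped to a single tick lives inside this case, so the
		// per-iteration ctx and its cancel are visibly bounded to one report.
		ctx, cancel := context.WithTimeout(rctx, reportingInterval)
		metrics := &Metrics{}
		r.internalMetrics(ctx, metrics)
		externalMetrics(ctx, r.env.GetKubeClient(), metrics) //nolint:errcheck
		r.router.reportClusterMetricsToSegment(metrics)
		cancel()
	}
}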
r.internalMetrics(metrics)
externalMetrics(r.env.GetKubeClient(), metrics) //nolint:errcheck
ctx, c := context.WithTimeout(rctx, reportingInterval/2)
r.internalMetrics(ctx, metrics)
Why dedicate half the interval to collecting and half to reporting, instead of letting both share the whole interval? time.Ticker “will adjust the time interval or drop ticks to make up for slow receivers.”
var (
metrics = new(Metrics)
ctx, cancel = context.WithTimeout(ctx, reportingInterval)
)
defer cancel()
r.internalMetrics(ctx, metrics)
externalMetrics(ctx, r.env.GetKubeClient(), metrics) //nolint:errcheck
metrics.ClusterId = r.clusterID
metrics.PodId = uuid.NewWithoutDashes()
metrics.Version = version.PrettyPrintVersion(version.Version)
r.router.reportClusterMetricsToSegment(metrics)
Edit: deleted mistake.
@@ -248,12 +252,9 @@ func inputMetrics(input *pps.Input, metrics *Metrics) {
	}
}

-func (r *Reporter) internalMetrics(metrics *Metrics) {
+func (r *Reporter) internalMetrics(ctx context.Context, metrics *Metrics) {
There is a call to pctx.TODO in here as well (and a bonus typo!).
Also, the mismatch between the TTL and the context lifetime below is a bit surprising.
This avoids a dpanic in the grpc client logger while collecting internal metrics. It also puts a limit on how long metrics will be collected for.
I don't know if we even get these metrics anymore, but at least pachd doesn't die when PACHYDERM_DEVELOPMENT_LOGGER=1.
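For illustration, a hedged sketch of the kind of change this thread describes: internalMetrics receives the bounded ctx from reportClusterMetrics and threads it into its lookups instead of minting a fresh placeholder context. The listRepos call and Repos field below are stand-ins for the real lookups, not the actual code:

func (r *Reporter) internalMetrics(ctx context.Context, metrics *Metrics) {
	// Using the caller's ctx (rather than pctx.TODO()) means collection stops
	// once the reporting deadline passes, and the grpc client logger sees a
	// properly constructed context instead of dpanicking under
	// PACHYDERM_DEVELOPMENT_LOGGER=1.
	repos, err := r.listRepos(ctx) // hypothetical lookup
	if err != nil {
		return
	}
	metrics.Repos = int64(len(repos)) // hypothetical field
}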