
Refactor the Output interface #2430

Open · na-- opened this issue Mar 10, 2022 · 5 comments
Labels
breaking change (for PRs that need to be mentioned in the breaking changes section of the release notes) · evaluation needed (proposal needs to be validated or tested before fully implementing it in k6) · refactor

Comments


na-- commented Mar 10, 2022

I've recently been going back to basics and making some substantial changes and refactors to how the base components of k6 work. One of these components is the current core.Engine. Through #1889, #2426 and another WIP PR, I've been slowly trying to split it apart and ultimately remove it, since that will unlock a lot of flexibility and remove tons of cruft.

During that effort, I've noticed another suboptimal component we have in k6: our outputs. This is the current output.Output API:

k6/output/types.go, lines 58 to 76 in 55c7ccd:

type Output interface {
	// Returns a human-readable description of the output that will be shown in
	// `k6 run`. For extensions it probably should include the version as well.
	Description() string
	// Start is called before the Engine tries to use the output and should be
	// used for any long initialization tasks, as well as for starting a
	// goroutine to asynchronously flush metrics to the output.
	Start() error
	// A method to receive the latest metric samples from the Engine. This
	// method is never called concurrently, so do not do anything blocking here
	// that might take a long time. Preferably, just use the SampleBuffer or
	// something like it to buffer metrics until they are flushed.
	AddMetricSamples(samples []stats.SampleContainer)
	// Flush all remaining metrics and finalize the test run.
	Stop() error
}

And even though I was the one who more or less wrote it, I still think it's quite bad 😅 The biggest reason it's so bad is that it kind of had to be, given the dysfunctional nature of our current Rube Goldberg core.Engine 😅 But I'm really close to getting rid of the Engine completely, and that work has made it obvious how much the Outputs have suffered because of it...

Here's what I actually propose the new Output API should look like:

type Output interface {
	// The output stops when `samples` is closed, and k6 will wait() for it to finish.
	Start(samples chan stats.SampleContainer) (wait func() error, err error)
}

type WithDescription interface {
	Output
	Description() string
}
// ... same single-method optional interfaces: WithThresholds, WithTestRunStop, WithRunStatusUpdates, WithBuiltinMetrics

Besides the fact that the new API is much simpler and more idiomatic Go, the old API actually has a problem for k6's use case. Specifically, buffering is important when we have network IO, but counter-productive when we're just passing metrics around internally in k6. Having these AddMetricSamples() buffers internally probably introduces jitter: their batched processing causes memory churn, as well as small CPU usage spikes at intermittent intervals whenever they are processed. If there are CPU bursts every 50ms, that might affect the actual test execution and metric measurements... 😞
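To make the proposed shape more concrete, here's a rough sketch of what an output could look like against the new interface. This is just an illustration, not code from any real output: fileOutput and writeSample are made-up names, and the error handling is only one possible choice.

package example

import "go.k6.io/k6/stats"

type fileOutput struct {
	// ... whatever the output needs (file handle, encoder, etc.)
}

// writeSample stands in for whatever the output actually does with a single sample.
func (o *fileOutput) writeSample(s stats.Sample) error { return nil }

func (o *fileOutput) Start(samples chan stats.SampleContainer) (func() error, error) {
	done := make(chan error, 1)
	go func() {
		var firstErr error
		// k6 closes the channel when the test run ends, so this loop spans the
		// whole lifetime of the output and no separate Stop() method is needed.
		for sc := range samples {
			for _, s := range sc.GetSamples() {
				if err := o.writeSample(s); err != nil && firstErr == nil {
					firstErr = err // remember the first error, but keep draining
				}
			}
		}
		done <- firstErr
	}()
	return func() error { return <-done }, nil
}

An output that does network IO could still batch samples inside that goroutine and flush them on its own schedule; the point is that k6 itself no longer forces a particular buffering strategy on every output.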

I also think there is a way to do this transition gracefully and support both old and new Output extensions for a few releases (the old ones through an adapter to the new API that emits a warning). We can move the outputs to the new /metrics/ top-level package and make the current /output package an adapter proxy for the new API for a few k6 versions. That proxy can emit a deprecation warning, so users have time to switch their extensions to the new API.
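Such an adapter could look roughly like this. Everything here is a sketch under the assumptions above: oldOutput just mirrors the current interface, and the 50ms flush interval is only meant to emulate the periodic batches old outputs receive today.

package example

import (
	"time"

	"go.k6.io/k6/stats"
)

// oldOutput mirrors the current output.Output interface quoted above.
type oldOutput interface {
	Description() string
	Start() error
	AddMetricSamples(samples []stats.SampleContainer)
	Stop() error
}

// legacyAdapter exposes an old-style output through the proposed new API.
type legacyAdapter struct {
	out oldOutput
}

func (a legacyAdapter) Start(samples chan stats.SampleContainer) (func() error, error) {
	// A deprecation warning for the wrapped output could be logged here.
	if err := a.out.Start(); err != nil {
		return nil, err
	}

	done := make(chan struct{})
	go func() {
		defer close(done)
		var buffer []stats.SampleContainer
		flush := func() {
			if len(buffer) > 0 {
				a.out.AddMetricSamples(buffer)
				buffer = nil // the old output may keep a reference to the slice
			}
		}
		ticker := time.NewTicker(50 * time.Millisecond)
		defer ticker.Stop()
		for {
			select {
			case sc, ok := <-samples:
				if !ok { // the test has ended, flush whatever is left
					flush()
					return
				}
				buffer = append(buffer, sc)
			case <-ticker.C:
				flush()
			}
		}
	}()

	return func() error {
		<-done
		return a.out.Stop()
	}, nil
}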

na-- added the refactor, evaluation needed, and breaking change labels on Mar 10, 2022
mstoykov (Collaborator) commented:

Without looking into the proposal too much:

Given that #1831 will likely have breaking changes for outputs as well, I would prefer that we don't break the interface once only to then break it again. Especially as (from a quick look at this) one of the breaking changes won't benefit current outputs all that much, even if we keep supporting them for some time.

Given that, I would prefer that we don't work on this before we work on #1831, which will likely need changes to the output interface as well.


na-- commented Mar 10, 2022

🤔 Good points 👍 Though I am not sure we'll be able to gracefully deprecate the current outputs without a ton of effort for #1831, so maybe we shouldn't bother with that and should make both breaking changes at once, to rip the band-aid off quickly? 😅


na-- commented Mar 21, 2022

We should probably also add a context to either the current Stop() method or, ideally, to my proposed wait() error function, like this:

type Output interface {
	// The output stops when `samples` is closed, and k6 will wait() for it to finish.
	Start(samples chan stats.SampleContainer) (wait func(context.Context) error, err error)
}

This will allow us to set a soft limit on how long outputs are allowed to take when stopping.
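For example, the core could then bound the shutdown with something like this; the 30-second limit and the stopOutput helper are purely illustrative, nothing like this exists yet:

package example

import (
	"context"
	"time"
)

// stopOutput is a hypothetical helper in the k6 core that enforces the soft limit.
func stopOutput(wait func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	// A well-behaved output should abandon any remaining flushes once ctx is done.
	return wait(ctx)
}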


na-- commented Nov 17, 2022

In the course of removing the Engine (#1889) by rebasing and polishing this old PoC commit from #2438, I started hitting some import cycles due to #2536 😞 😮‍💨 But then, in a classic "Hal fixing a light bulb" moment, I decided that it is the perfect reason to finally refactor the lib.RunStatus mess... 😅

The big problem there is that RunStatus is something quite specific to our cloud execution, but it is unfortunately part of the lib/ packages and used in a bunch of core places... 😞

My current thinking is that, since these run statuses are pretty much specific to our cloud API, they should be moved to the cloudapi/ package and basically not used anywhere else in the code. However, this output interface then becomes a minor problem:

k6/output/types.go, lines 78 to 82 in 5fa71b7:

// WithRunStatusUpdates means the output can receive test run status updates.
type WithRunStatusUpdates interface {
	Output
	SetRunStatus(latestStatus lib.RunStatus)
}

The logical solution would be to have some way to notify the output that the test has finished executing, together with whatever error it finished with. If the err value is nil, the test finished normally. If it isn't, the cloud output (or any other output that cares) can determine the "run status" from it, in a similar way to how we determine exit codes and such things elsewhere in the code 🤔

I'm in the process of making a PR with this refactoring, but I am also commenting in this issue because this will affect how we refactor the Output interface in the future. We either need to have some way to pass outputs this information (e.g. have a Stop(err error) method, not just a Start() one), or we need to provide outputs some way of querying this information themselves.
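To illustrate the first option, combined with the API proposed above, the interface could grow an explicit stop method that carries the error. This is only a sketch of one possible shape, nothing here is decided:

package example

import (
	"context"

	"go.k6.io/k6/stats"
)

// Output is the earlier proposal extended with a Stop() method that receives
// the error (if any) the test run finished with.
type Output interface {
	// The output stops receiving samples when the channel is closed, and k6
	// will wait() for it to finish flushing them.
	Start(samples chan stats.SampleContainer) (wait func(context.Context) error, err error)
	// Stop is called once after the test run ends; testErr is nil if it
	// finished normally, and e.g. the cloud output could map a non-nil error
	// to the appropriate cloud "run status".
	Stop(testErr error) error
}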


codebien commented Dec 5, 2022

> or we need to provide outputs some way of querying this information themselves.

In that case, it could be something like: s/SetRunStatus(latestStatus lib.RunStatus)/SetTestRunError(err error)/. The advantage of the setter is that it decouples the notification from the Stop operation. I don't know if that is actually helpful, though; choosing between the Stop method and the setter will require a deeper analysis of how they would be used.

In the case of Stop, the context would still be required, e.g. Stop(ctx context.Context, err error), because, as we said before, it would be helpful to have a:

> soft limit for how long outputs are allowed to stop.

For example, it would be beneficial for the Prometheus remote write output, which makes an HTTP call during the Stop operation: grafana/xk6-output-prometheus-remote#73 (comment)
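For instance, with a combined Stop(ctx, err), the final flush of such an output could simply inherit the deadline. Everything below (remoteWriteOutput, its fields, flushRemaining) is made up for the example and is not the actual extension's code:

package example

import (
	"context"
	"io"
	"net/http"
	"strings"
)

type remoteWriteOutput struct {
	endpoint string
	client   *http.Client
}

// flushRemaining would serialize whatever samples are still buffered; the
// actual encoding doesn't matter for this sketch.
func (o *remoteWriteOutput) flushRemaining() io.Reader { return strings.NewReader("") }

func (o *remoteWriteOutput) Stop(ctx context.Context, testErr error) error {
	// The last HTTP call inherits the deadline from ctx, so a slow or
	// unreachable remote endpoint can't hang the end of the test run.
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, o.endpoint, o.flushRemaining())
	if err != nil {
		return err
	}
	resp, err := o.client.Do(req)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}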
