
Put metric samples in time series (immutable tag sets) on creation #1831

Closed
na-- opened this issue Feb 1, 2021 · 7 comments · Fixed by #2594
Labels: evaluation needed (proposal needs to be validated or tested before fully implementing it in k6), performance, refactor

Comments

na-- commented Feb 1, 2021

Currently, k6 builds the set of tags for every metric measurement it emits in a very ad-hoc manner, as a map[string]string. For example, building the tag set for the HTTP metrics happens here and here. It is a gradual accretion of key-value pairs, based on all of these things:

This is not too terrible and it mostly works, but it's not very efficient either... One big issue is that all of these string maps put a lot of GC pressure on k6 - we generate unique tag maps for every metric sample, and discard them as soon as we've crunched the data for the end-of-test summary and thresholds in the Engine and/or we've sent it to whatever external output is configured. Another performance problem is that we can't efficiently aggregate the metrics back, since that involves heavy string comparisons, both in the outputs and for any sub-metrics we have.

And finally, we don't really know the total number of unique metric tag sets we have in a test run. Load tests should generally have a small number of them, for easier data analysis - that's why we have things like the name tag and URL grouping. It would be very good UX if k6 could warn users when they exceed a certain number of unique tag sets in their script, since that almost certainly is a mistake.

To solve all of these issues, I think we can probably use some graph-based data structure to keep track of the various tag-sets that exist. The process of building the tag set for a specific metric sample would be walking through the graph and adding nodes when new tag keys or values are observed that weren't present in the graph. At the end of that tag accretion process, instead of a string map, the metric sample would contain a pointer to an immutable node from the graph. And every other metric sample with the same tags will have the exact same pointer. So, comparing metrics (for aggregation in outputs or sub-metric partitioning) should become as easy as comparing the pointers of their tag set objects.

The exact data structure we want to use and how we'll ensure thread-safety (given that tag sets can be generated concurrently by multiple VUs) still need to be determined... Some sort of STM approach seems better than adding locks to every graph node, but I'm not sure how realistic that would be for Go... There seem to be some libraries for it (e.g. https://github.com/anacrolix/stm), but it needs a lot of further evaluation.
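
In lieu of actual code, here's a minimal sketch of such a graph in Go, assuming hypothetical names (TagSetNode, BranchOut) and using a sync.Map per node instead of STM - purely to illustrate the shape of the idea, not a concrete proposal:

```go
// A sketch only: made-up types, and per-node sync.Map instead of STM,
// just to show how pointer identity could replace map comparisons.
package metrics

import "sync"

// TagSetNode is an immutable node in the tag-set graph. Two samples with the
// same tags end up holding the exact same *TagSetNode, so comparing their tag
// sets is a single pointer comparison.
type TagSetNode struct {
	// tags is the full key-value set accumulated on the path from the root to
	// this node; it is never mutated after the node is created.
	tags map[string]string

	// children maps "key=value" edges to child nodes; sync.Map roughly fits
	// the write-once, read-many access pattern of concurrent VUs.
	children sync.Map
}

// BranchOut returns the node reached by adding one more key=value pair,
// creating it the first time that combination is observed.
func (n *TagSetNode) BranchOut(key, value string) *TagSetNode {
	edge := key + "=" + value
	if child, ok := n.children.Load(edge); ok {
		return child.(*TagSetNode)
	}
	tags := make(map[string]string, len(n.tags)+1)
	for k, v := range n.tags {
		tags[k] = v
	}
	tags[key] = value
	child, _ := n.children.LoadOrStore(edge, &TagSetNode{tags: tags})
	return child.(*TagSetNode)
}
```

A Sample would then carry a *TagSetNode instead of a map, so comparing two samples' tag sets becomes a pointer comparison. One caveat: tags would have to be added in a canonical (e.g. sorted) order, otherwise the same final set could end up as two different nodes.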

Also, it's not certain how much we can optimize the Contains() check if a tag set is a subset of another tag set, used for sub-metric updates. It won't be slower, given that the graph nodes will probably contain the full tag sets, but it might not be possible to make it faster, if the alternative is a BFS of the graph. This depends solely on the data structure and algorithms we pick.
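
For comparison, here is a rough sketch of what that Contains() check could look like over two sorted slices of a hypothetical Tag struct - it stays linear in the size of the sets, matching the expectation above:

```go
// A sketch of the subset check over sorted tag slices; Tag is a made-up type.
package metrics

type Tag struct{ Key, Value string }

// Contains reports whether every tag in subset also appears in set. Both
// slices must be sorted by Key (and by Value for duplicate keys).
func Contains(set, subset []Tag) bool {
	i := 0
	for _, want := range subset {
		// Advance through the superset until we reach or pass the wanted tag.
		for i < len(set) && (set[i].Key < want.Key ||
			(set[i].Key == want.Key && set[i].Value < want.Value)) {
			i++
		}
		if i >= len(set) || set[i] != want {
			return false
		}
		i++
	}
	return true
}
```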

Alternatives to graphs should also be explored. We might be able to achieve most of the same benefits by using clever hashing of the tag set values. Or even by having a global set whose values are the concatenated key=value pairs, so that each metric sample's tag set is a subset of the global one. For sure, before we start anything here, we should see how other projects (e.g. Prometheus) deal with similar issues in an efficient way...
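
To illustrate the hashing alternative (not something k6 does today), the sorted key=value pairs could be folded into a stable 64-bit identity with the standard library's FNV-1a; the HashTags name is made up, and collisions would still need an explicit equality check:

```go
// Illustration only: a stable 64-bit identity for a tag set.
package metrics

import (
	"hash/fnv"
	"sort"
)

// HashTags folds a tag set into a 64-bit value; equal sets always produce the
// same hash because the keys are sorted before hashing.
func HashTags(tags map[string]string) uint64 {
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := fnv.New64a()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte{'='})
		h.Write([]byte(tags[k]))
		h.Write([]byte{0}) // separator between pairs, to avoid ambiguous concatenations
	}
	return h.Sum64()
}
```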

Finally, it's not clear if we should try to enforce an upper limit on the number of unique tag sets. Personally, I think not - we should emit a very stern warning if some number is exceeded, say 1000 unique tag set combinations, but not try anything fancier than that (like falling back to string-map-based tag sets).
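
The stern warning could be as simple as something like this sketch (the registry and the threshold constant are placeholders, not existing k6 code):

```go
// A sketch of a warn-once registry of unique tag-set hashes.
package metrics

import (
	"log"
	"sync"
)

const tagSetWarningThreshold = 1000

type tagSetRegistry struct {
	mu     sync.Mutex
	seen   map[uint64]struct{} // unique tag sets, keyed by their hash
	warned bool
}

func newTagSetRegistry() *tagSetRegistry {
	return &tagSetRegistry{seen: make(map[uint64]struct{})}
}

// track records a tag-set hash and warns exactly once if the number of unique
// sets exceeds the threshold, without rejecting or limiting anything.
func (r *tagSetRegistry) track(hash uint64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.seen[hash] = struct{}{}
	if !r.warned && len(r.seen) > tagSetWarningThreshold {
		r.warned = true
		log.Printf("warning: more than %d unique metric tag sets were generated, this is probably a mistake", tagSetWarningThreshold)
	}
}
```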

na-- added the performance, refactor and evaluation needed labels on Feb 1, 2021
na-- commented Feb 1, 2021

#1832 (or something like it) is a prerequisite for the current issue.

mstoykov commented Feb 1, 2021

As an addition:
While I was thinking about the same thing, I realized that, at least in the cloud aggregation (and probably everywhere else), we bucket metrics by time. So for that matter the "tag set" could also be per time bucket, as the major reason for this change IMO is that comparing map[string]string is highly inefficient. This would have the benefit that, if the number of tag sets grows unbounded, we would at least be dropping the old ones. And the drawback that the tag sets would be rebuilt for each time bucket. After thinking about it, I personally decided that the rebuilding would probably be more costly, and that having an unbounded number of tag sets is probably not a useful feature anyway, as I am fairly certain no system can handle 1 million tag sets in a way where it matters that it does.

I was also thinking of a simple ordered [][2]string where the strings are interned. This has the benefit that only the string interning is unbounded, and we could still apply the trick above. This again requires comparing strings, but at least it is now O(n), where n is the number of elements, so... that is much better ;).
Prometheus basically has []struct{name, value string}, which is better because, from what I've learned, Go is much happier keeping struct{name, value string} on the stack than [2]string: https://github.com/prometheus/prometheus/blob/adc8807851b87257964fab57e064771520751872/pkg/labels/labels.go#L38-L44. But as far as I know, they use hashes for most of the bucketing later on (citation needed).
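
For illustration, the interned, ordered representation could look roughly like this sketch (the Label struct mirrors the Prometheus type linked above; the interner is a made-up helper):

```go
// A sketch of ordered labels backed by a simple string interner.
package metrics

import "sync"

// Label mirrors the struct{name, value string} layout linked above.
type Label struct {
	Name, Value string
}

// interner deduplicates strings so every distinct tag key or value is stored
// only once; only the set of distinct strings grows without bound.
type interner struct {
	mu sync.Mutex
	m  map[string]string
}

func newInterner() *interner {
	return &interner{m: make(map[string]string)}
}

func (i *interner) intern(s string) string {
	i.mu.Lock()
	defer i.mu.Unlock()
	if canonical, ok := i.m[s]; ok {
		return canonical
	}
	i.m[s] = s
	return s
}

// newLabel builds a label from interned strings; comparing an ordered []Label
// still compares strings, but it is O(n) in the number of labels.
func (i *interner) newLabel(name, value string) Label {
	return Label{Name: i.intern(name), Value: i.intern(value)}
}
```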

I am of the opinion that the most bang for our buck will come from a complete refactoring (through many smaller ones, probably including some of the above) of how metrics are handled.

Currently we emit metrics (generate objects that move through the system) even if those objects will never be processed, for example because no output is configured. If we can have a system where metrics are not emitted for no reason, and where we know when they have been processed, we can also reuse them (and from my profiling, the GC is having problems with those as well).

We will also probably want to aggregate metrics before they reach any output, effectively cutting their lifetime short and probably removing the need for some of the other optimizations - you would just traverse the graph of metrics in the "aggregator" to reach the bucket where they will be aggregated. Obviously the aggregator will then emit "new" metrics with "new" tag sets, but arguably there will be a much smaller set of those. Still, always doing this work, even when there isn't an output, seems wasteful.

But we kind of have to do it, as even now we always aggregate the metrics that will be shown in the summary output or are needed for thresholds: https://github.com/loadimpact/k6/blob/46d53d99deba98add23079df93579408fa7ea094/core/engine.go#L405-L428. This, though, IMO depends heavily on #1832. So it might be worth doing that first and seeing how the linked code behaves then, especially if we change it to always create submetrics for new tag sets.

na-- commented Feb 1, 2021

Not sure about time buckets being part of the tag set. They would just needlessly add cardinality to whatever we have. And time buckets depend on the actual "user" of the metrics - the cloud output will need very different buckets than something like #1136, and the summary and normal thresholds don't need any time buckets at all. So unless I am missing something, time and tag sets seem like orthogonal concepts that shouldn't be mixed. They can easily be stacked, as they currently are in the cloud aggregation, only where necessary.

👍 for string interning. Having a graph with immutable nodes would achieve that, but so would having a global set of key=value tags. Though you're right that we should probably use [2]string or a simple struct{name, value string} for that - no need to concatenate tags needlessly.

na-- commented Jul 12, 2021

Stated another way, this issue is somewhat about assigning every metric sample to a particular time series at the moment of its measurement. Relevant links:

na-- changed the title from "Immutable metric tag sets" to "Immutable metric tag sets (attach every metric sample to a time series)" on Jul 12, 2021
na-- changed the title from "Immutable metric tag sets (attach every metric sample to a time series)" to "Immutable metric tag sets (attach metric samples to time series on creation)" on Jul 12, 2021
yorugac commented Oct 12, 2021

Just adding some observations from digging into Prometheus code.

I confirm that there is hashing of labels involved in Prometheus processing:
https://github.com/prometheus/prometheus/blob/adc8807851b87257964fab57e064771520751872/pkg/labels/labels.go#L136-L160

[As far as I saw] those hashes are involved in some tricky staleness logic, and for caching Prometheus primarily relies on a map[string]cacheEntry with the original scraped string as the key. But probably more interesting here is the stripeSeries structure:
https://github.com/prometheus/prometheus/blob/e245edf5a4c281499060ab1bd6d473b576094c04/tsdb/head.go#L1288-L1304

It seems they bucket hashmaps by ID and keep multiple locks after all. And there are two mechanisms for lookup:

```go
func (s *stripeSeries) getByID(id uint64) *memSeries
func (s *stripeSeries) getByHash(hash uint64, lset labels.Labels) *memSeries
```

But I suspect that IDs are considered more reliable than labels as I haven't seen getByHash used in main processing.

This logic appears to be somewhat tied to how the TSDB is handled, but then its purpose is to move samples from memory to storage. Still, this can be classified as "clever hashing of the tag set values", with the primary difference that k6 doesn't have such IDs now. OTOH, metric names are unique in k6.
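
Translated into a rough sketch (with made-up memSeries and Label stand-ins, and ignoring the getByID path), the stripe idea is a fixed array of shards, each with its own lock and a hash-keyed map, where the full label comparison only resolves hash collisions:

```go
// A sketch of a stripeSeries-like structure; the types are illustrative only.
package metrics

import "sync"

const stripeCount = 16 // a power of two, so hash&(stripeCount-1) picks a shard

type Label struct{ Name, Value string }

type memSeries struct {
	labels []Label // assumed sorted
}

type stripeSeries struct {
	locks  [stripeCount]sync.RWMutex
	series [stripeCount]map[uint64][]*memSeries // hash -> colliding series
}

// getByHash finds the series with the given label-set hash, falling back to a
// full label comparison to resolve collisions.
func (s *stripeSeries) getByHash(hash uint64, lset []Label) *memSeries {
	i := hash & (stripeCount - 1)
	s.locks[i].RLock()
	defer s.locks[i].RUnlock()
	for _, series := range s.series[i][hash] {
		if labelsEqual(series.labels, lset) {
			return series
		}
	}
	return nil
}

func labelsEqual(a, b []Label) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}
```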

And yes, this definitely comes up in #1761. It doesn't make sense to add much complex logic there at this stage, but it doesn't currently process labels in an efficient way either.

na-- changed the title from "Immutable metric tag sets (attach metric samples to time series on creation)" to "Put metric samples in time series (immutable tag sets) on creation" on Nov 10, 2021
codebien commented

I explored this issue a bit (so many things to take into consideration 😱):

> Prometheus basically has []struct{name, value string}, which is better because, from what I've learned, Go is much happier keeping struct{name, value string} on the stack than [2]string

@mstoykov from a basic escape analysis, it seems they both depend on the required memory size and there isn't per-type dedicated memory management, but for sure it would require a more complex investigation. If you have a reference for it, that would be helpful.
If we don't find an objective (or at least concrete) benefit for [2]string, then I think a struct has the advantage of keeping the code much more readable. I would prefer to read and use tag.Key rather than tag[0] all around the code, or *Tag if a pointer is required.

So, from my understanding of the previous comments, an eventual PR should introduce 3 different types:

  • Tag struct{Key, Value string}: it should allow us to have a pointer to the string pair in one shot
  • TagSet []*Tag: it should be the sorted type used by a Sample, and if needed for supporting aggregations it could generate the hash. Would we cache the hash somewhere?

> Also, it's not certain how much we can optimize the Contains() check if a tag set is a subset of another tag set, used for sub-metric updates.

@na-- yes, I don't think we can have something faster than O(log n) + O(len(subset)). I'm considering the case ABC.contains(BC) == true.

  • TagMap sync.Map (map[hash()]Tag): it contains the dataset of unique tag sets and the real allocations. Considering the "immutability" and the reads-over-writes access pattern, sync.Map should be a better fit here.
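
Putting those three types together, a rough sketch could look like this (keying TagMap by the hash of the whole set, rather than per tag, is my own reading of the proposal; the Hash helper is hypothetical and collision handling is omitted):

```go
// A sketch of the proposed Tag / TagSet / TagMap types.
package metrics

import (
	"hash/fnv"
	"sync"
)

// Tag is a single key=value pair, kept as a struct for readability
// (tag.Key rather than tag[0]).
type Tag struct {
	Key, Value string
}

// TagSet is the sorted slice of tag pointers attached to a Sample.
type TagSet []*Tag

// Hash folds the sorted pairs into a 64-bit value usable as a lookup key for
// aggregation; it could also be cached next to the set.
func (ts TagSet) Hash() uint64 {
	h := fnv.New64a()
	for _, t := range ts {
		h.Write([]byte(t.Key))
		h.Write([]byte{'='})
		h.Write([]byte(t.Value))
		h.Write([]byte{0})
	}
	return h.Sum64()
}

// TagMap holds the single real allocation of every unique tag set, keyed by
// its hash; sync.Map suits the read-mostly access pattern.
type TagMap struct {
	m sync.Map // uint64 -> TagSet
}

// GetOrAdd returns the canonical TagSet for ts, storing ts on first use.
func (tm *TagMap) GetOrAdd(ts TagSet) TagSet {
	canonical, _ := tm.m.LoadOrStore(ts.Hash(), ts)
	return canonical.(TagSet)
}
```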

oleiade commented Mar 24, 2022

@codebien As discussed in private, while implementing support for submetrics in thresholds validation, I had somewhat of an intuition: wouldn't it simplify the metrics package types if Metric had a Parent that could be nil, instead of having a dedicated Submetric type?

I believe it could simplify metrics/submetrics manipulation throughout the codebase. No idea if it's actually a good idea or what impact it would have, though 🤔
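
For illustration, that could look something like this sketch (field names are hypothetical):

```go
// A tiny sketch of the Parent idea: a nil Parent marks a top-level metric,
// anything else marks a submetric.
package metrics

type Metric struct {
	Name   string
	Parent *Metric           // nil for top-level metrics
	Tags   map[string]string // the tag set a submetric is narrowed down to
	// ... sinks, thresholds, etc.
}

// IsSubmetric reports whether m was derived from another metric by narrowing
// it down to a specific tag set.
func (m *Metric) IsSubmetric() bool {
	return m.Parent != nil
}
```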
