Support Scripted Metrics Aggregation #2646

Closed
doubret opened this issue Jan 15, 2015 · 105 comments
Labels
Feature:Aggregations Aggregation infrastructure (AggConfig, esaggs, ...) Feature:elasticsearch Team:Visualizations Visualization editors, elastic-charts and infrastructure

Comments

@doubret

doubret commented Jan 15, 2015

Replaced original description ~ @timroes

This ticket tracks implementing Elasticsearch's Scripted Metrics Aggregation (SMA) into Kibana.

SMA lets you compute a fully custom metric in a map/reduce style: a map script turns each individual document into a value (based on any fields in that document), and combine and reduce scripts then fold all those values into one metric result. An example of a scripted metric aggregation (taken from the documentation) could look as follows:

"scripted_metric": {
  "init_script" : "params._agg.transactions = []",
  "map_script" : "params._agg.transactions.add(doc.type.value == 'sale' ? doc.amount.value : -1 * doc.amount.value)", 
  "combine_script" : "double profit = 0; for (t in params._agg.transactions) { profit += t } return profit",
  "reduce_script" : "double profit = 0; for (a in params._aggs) { profit += a } return profit"
}
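
To make the four phases easier to follow, here is a minimal Python sketch that simulates them outside Elasticsearch. It is an illustration only: the function names mirror the script names in the request above, while the `shards` sample data and the `scripted_metric` driver function are hypothetical and not part of any Elasticsearch or Kibana API.

```python
from typing import Dict, List

Doc = Dict[str, object]


def init_script() -> Dict[str, List[float]]:
    # Mirrors init_script: every shard starts with an empty transaction list.
    return {"transactions": []}


def map_script(state: Dict[str, List[float]], doc: Doc) -> None:
    # Mirrors map_script: sales count positively, everything else negatively.
    amount = float(doc["amount"])
    state["transactions"].append(amount if doc["type"] == "sale" else -amount)


def combine_script(state: Dict[str, List[float]]) -> float:
    # Mirrors combine_script: one partial profit per shard.
    return sum(state["transactions"])


def reduce_script(shard_results: List[float]) -> float:
    # Mirrors reduce_script: merge the per-shard partial results.
    return sum(shard_results)


def scripted_metric(shards: List[List[Doc]]) -> float:
    # Hypothetical driver that runs the phases in order.
    shard_results = []
    for docs in shards:
        state = init_script()
        for doc in docs:
            map_script(state, doc)
        shard_results.append(combine_script(state))
    return reduce_script(shard_results)


# Hypothetical sample documents, split across two shards.
shards = [
    [{"type": "sale", "amount": 80}, {"type": "cost", "amount": 10}],
    [{"type": "sale", "amount": 130}, {"type": "cost", "amount": 30},
     {"type": "sale", "amount": 50}],
]
profit = scripted_metric(shards)
print(profit)
```

The point of the split is that only the small combined value per shard travels to the coordinating node, not every mapped document.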

Do not confuse with Bucket script aggregation!

If you are simply looking for, e.g., the ratio of two metric values, this is not the right ticket to track. To combine the results of multiple metric aggregations into a new value per bucket (e.g. metric1 / metric2 * 100), you would use the Bucket Script Aggregation available in Elasticsearch.

The progress for supporting that aggregation in Kibana is tracked in #4707.
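
For comparison, a bucket script request for the `metric1 / metric2 * 100` case could be built roughly as follows. This is a sketch under assumptions: the helper `ratio_request`, the field names passed to it, and the choice of a `terms` parent aggregation are all hypothetical; only the `bucket_script` structure (`buckets_path` plus `script`) follows the Elasticsearch request format.

```python
import json


def ratio_request(numerator_field: str, denominator_field: str,
                  bucket_field: str) -> dict:
    """Build a search body with a per-bucket ratio (hypothetical helper)."""
    return {
        "size": 0,  # only aggregation results are needed, no hits
        "aggs": {
            "buckets": {
                # bucket_script needs a multi-bucket parent aggregation
                "terms": {"field": bucket_field},
                "aggs": {
                    "metric1": {"sum": {"field": numerator_field}},
                    "metric2": {"sum": {"field": denominator_field}},
                    "ratio": {
                        "bucket_script": {
                            # map script variables to sibling metric aggs
                            "buckets_path": {"m1": "metric1", "m2": "metric2"},
                            "script": "params.m1 / params.m2 * 100",
                        }
                    },
                },
            }
        },
    }


# Field names below are placeholders for illustration.
body = ratio_request("bytes", "requests", "host.keyword")
print(json.dumps(body, indent=2))
```

Unlike a scripted metric, the script here never sees individual documents; it only combines values that other aggregations have already computed for each bucket.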

Original description

Hi,

Are you planning to support scripted metric aggregation in the near future?
I would like this to compute a ratio between two aggregations.
The first aggregation is NbClicks, the second is NbPrints, and I would like to output the ratio NbPrints / NbClicks.

With a scripted metric aggregation it would be easy to do, but if you know a way to do it without one, please let me know.

Thanks.

@rashidkpc
Contributor

Would be a great enhancement. I'd love to see a better script debugger in Kibana to complement this.

@doubret
Author

doubret commented Jan 15, 2015

Should be simple to implement, considering that it is somewhat similar to scripted fields (at least in terms of interface).

@timroes
Contributor

timroes commented Jan 28, 2015

Would also like to see this a lot!

@loren
Contributor

loren commented Feb 11, 2015

This would be huge. I just got the scripted metric aggregation working in Sense (yay!) and I'd love to be able to hook it up in Kibana for the win.

@the-fine

+1

@rashidkpc
Contributor

Unfortunately, due to the Groovy issues, we don't really have a secure way to accomplish this until we have a safe language that has loops and such.

@hsm3

hsm3 commented Mar 11, 2015

I need this. Note that with many other similar products (NewRelic, Librato), you can submit a metric along with a "count" of how many samples that metric covers, and then the tool can do proper averages over a bucket of those items.

@gauravkumar37

+1
I am storing pre-aggregated data in ES, sum() and count(), and want to calculate the average.
The average for an individual document can be calculated by a script dividing sum by count.
However, to see the global average, I would need sum(sum())/sum(count()), which I currently cannot do without scripted metrics aggregation.
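
A small worked example shows why the global average has to be computed as sum(sum())/sum(count()) rather than by averaging the per-document averages. The two documents below are made-up values for illustration only.

```python
# Hypothetical pre-aggregated documents: each stores a sum and a sample count.
docs = [
    {"sum": 100.0, "count": 10},  # individual average: 10.0
    {"sum": 90.0, "count": 1},    # individual average: 90.0
]

# Averaging the per-document averages ignores how many samples each covers.
per_doc_averages = [d["sum"] / d["count"] for d in docs]
average_of_averages = sum(per_doc_averages) / len(per_doc_averages)

# The global average weights every underlying sample equally.
global_average = sum(d["sum"] for d in docs) / sum(d["count"] for d in docs)

print(average_of_averages)  # 50.0
print(global_average)       # 190 / 11, roughly 17.27
```

The single-sample document dominates the naive result, while the weighted form gives it the same weight as any other sample.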

@ryaanwells

+1
Is this something we can do in the front end in Kibana? Say I'm processing the two columns doubret mentioned (NbPrints and NbClicks): then we already have the data we require.

Maybe the introduction of a new "Metrics" aggregation which is a "Ratio" of two current aggregations that have been processed by ES? Or it could be two newly specified aggregations (field and aggregation type) that we would aggregate in ES behind the scenes and then surface the ratio metric.

@jlew-cs

jlew-cs commented Apr 16, 2015

@rashidkpc Regarding the Groovy security issues:
Would it be possible to sidestep the problem by using file-based (or id-based) scripted aggregations (installed in ES config/scripts) rather than issuing dynamic script aggregation requests? This seems safe, and better than nothing.

@fabiangebert

+1

@zijian1981

+1

@vikrim1

vikrim1 commented Apr 28, 2015

+1

@mimes70

mimes70 commented Apr 30, 2015

+1

@fabiangebert

Can anyone give instructions on how to implement this? My expectation would be that you can save scripted aggs in the objects list and use them from the dropdown in the visualize function.

@jbarata

jbarata commented May 14, 2015

+1

@nambrot

nambrot commented May 19, 2015

+1

@fabiangebert

I started here with a very basic implementation of scripted_metric in the visualize editor, hope that helps:
https://github.com/fabiangebert/kibana

@lfroment-datasweet

@timroes, you are absolutely right. Formula works on the output dataset given by ES and is computed "within" Kibana.

@timroes
Contributor

timroes commented May 17, 2018

@lfroment-datasweet It seems we have forgotten to backport that entry. I am now backporting it to 6.x and 6.3, so it will appear on the list of known plugins from 6.3 onwards.

@fbaligand
Contributor

You're right @timroes, datasweet-formula brings a feature that is much more similar to bucket script aggregation.

@georgezoto

georgezoto commented May 17, 2018

@lfroment-datasweet happy to see some more progress. I noticed an issue with visualizations created in datasweet when displayed in kibana_dashboard_only_user mode though. I have opened an issue in datasweet's repository. It is currently breaking my dashboard.

Also, are there any plans to support more chart types like region maps or pie charts through your plugin? This might be a current limitation of Kibana, which does not allow multiple metrics to be defined for these kinds of charts, but I might be mistaken.

@timroes timroes added Team:Visualizations Visualization editors, elastic-charts and infrastructure and removed Team:Visualizations Visualization editors, elastic-charts and infrastructure Feature:Visualizations Generic visualization features (in case no more specific feature label is available) labels Sep 16, 2018
@tr4l

tr4l commented Jan 2, 2019

+1

If I'm correct, this should be the same request to _msearch as aggregating by terms, with the terms field made optional when there is a script inside the "JSON Input" field.

The terms aggregation generates something like:

   "aggs":{
      "2":{
         "terms":{
            "field":"data.field.keyword",
            "size":10,
            "order":{
               "_count":"desc"
            },
            "script":{
               "lang":"painless",
               "source":"myComplexFormulaThatIdontWantToUpdateDocumentWith"
            }
         }
      }
   }

Which doesn't work (and should not work).
However, if I manually edit the request to remove the "field" entry from the terms aggregation (leaving only the script):

   "aggs":{
      "2":{
         "terms":{
            "size":10,
            "script":{
               "lang":"painless",
               "source":"myComplexFormulaThatIdontWantToUpdateDocumentWith"
            }
         }
      }
   }

In that case the request to _msearch works.

As this is not supported yet, does anyone have an alternative (a plugin? manually editing the visualization?) for this?
To add a piece of context, the formula will change regularly based on some experiment, and I really don't want to update my document each time.

@fbaligand
Contributor

@tr4l

Today, there are 2 alternatives :

  • kibana-enhanced-table plugin for “table” visualizations
  • kibana-datasweet-formula plugin for other visualisations (it works also for table visualizations)

@tr4l

tr4l commented Jan 2, 2019

@tr4l

Today, there are 2 alternatives :

  • kibana-enhanced-table plugin for “table” visualizations
  • kibana-datasweet-formula plugin for other visualisations (it works also for table visualizations)

It looks like both plugins work with the result provided by ES, whereas what I need is to aggregate based on the result of my script.

Let's say I have a Person document, with First Name and Last Name.
I want to know the most common Full Names using an aggregation with a script that concatenates both of them.
Of course I can update my documents and add a FullName field, but that's not my goal.

@fbaligand
Contributor

Well, to do that in Kibana, the easiest way is to create a Kibana Scripted Field (in Management) named "full_name", whose script is:
doc['first_name'].value + ' ' + doc['last_name'].value

Then you create a visualization (say Tag Cloud), and you do a "Terms" bucket aggregation based on "full_name" field.
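
To illustrate what that combination computes, here is a rough Python sketch of the same logic run locally: derive the full name per document, then count the most frequent values the way a "Terms" bucket aggregation would. The `people` sample data and the `full_name` helper are hypothetical; this does not call Kibana or Elasticsearch.

```python
from collections import Counter

# Hypothetical Person documents with the two source fields.
people = [
    {"first_name": "Ada", "last_name": "Lovelace"},
    {"first_name": "Alan", "last_name": "Turing"},
    {"first_name": "Ada", "last_name": "Lovelace"},
]


def full_name(doc: dict) -> str:
    # Same concatenation as the scripted field above.
    return doc["first_name"] + " " + doc["last_name"]


# Equivalent of a terms aggregation on full_name, size 10, ordered by count.
top_names = Counter(full_name(p) for p in people).most_common(10)
print(top_names)
```

The value is computed per document at query time, so the stored documents never need a FullName field.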

@silviodc

Hi everyone,

Are there any plans to include this in Kibana in the next release?
It seems to be in high demand in the community, and the issue dates from 2015!

@anelson-vidscale

I was able to connect the output of my scripted metric to a Vega visualization.
You might find that approach will unblock you and allow you to visualize the output from the scripted_metric.

In other news, I hear that supporting triple quoted strings is coming. I can't wait to push readable code into code reviews.

@Sagesh

Sagesh commented Jan 23, 2020

+1

@hendrikmuhs
Contributor

Another alternative for this is to use a transform. Transforms have been available since 7.2 under the basic license.

  1. create a transform that does the scripted metric aggregation
  2. query the output of the transform and create visualization, etc.

The benefit of transform is less load at query time and likely a more responsive dashboard. However, the price tag is the extra storage you need for the output index of the transform.
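
As a rough sketch of step 1, a transform definition wrapping the scripted metric from the issue description might look like the body built below. Several things here are assumptions rather than anything stated in this thread: the index names, the `store_id` group-by field, and the aggregation name `profit` are placeholders, and the scripts use the `state`/`states` variables of 7.x instead of the older `params._agg`/`params._aggs` form shown at the top of this issue. The exact endpoint for submitting the body depends on the Elasticsearch version, so it is left out.

```python
import json

# Hypothetical transform definition; all index and field names are placeholders.
transform_body = {
    "source": {"index": "transactions"},
    "dest": {"index": "profit_by_store"},  # output index queried in step 2
    "pivot": {
        "group_by": {
            "store": {"terms": {"field": "store_id"}},
        },
        "aggregations": {
            "profit": {
                "scripted_metric": {
                    # 7.x-style scripts: per-shard `state`, merged `states`
                    "init_script": "state.transactions = []",
                    "map_script": (
                        "state.transactions.add(doc.type.value == 'sale' "
                        "? doc.amount.value : -1 * doc.amount.value)"
                    ),
                    "combine_script": (
                        "double profit = 0; "
                        "for (t in state.transactions) { profit += t } "
                        "return profit"
                    ),
                    "reduce_script": (
                        "double profit = 0; "
                        "for (a in states) { profit += a } "
                        "return profit"
                    ),
                }
            }
        },
    },
}

print(json.dumps(transform_body, indent=2))
```

Once the transform has run, the destination index holds one document per group with a plain numeric `profit` field, which regular Kibana visualizations can read without any scripting.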

@fbaligand
Contributor

Nice alternative @hendrikmuhs

@SolomonShorser-OICR

@hendrikmuhs According to the documentation for 7.5, transforms are "in beta and [are] subject to change" (https://www.elastic.co/guide/en/elasticsearch/reference/7.5/transform-overview.html). They're also marked as an X-pack feature. Do you know how stable transforms are? Have they changed much since being introduced in 7.2?

@hendrikmuhs
Contributor

@SolomonShorser-OICR Note that I am one of the authors of transform. It is an X-Pack feature, but starting with 6.3 X-Pack has been opened up and Elasticsearch distributions bundle it. We now like to say it's a commercial feature. In the case of transform, however, it's licensed as 'basic', which is a free license. That means you can use transform for free without any limitation.

Improvements have been made in every version; soon 7.6 will add cross-cluster search support. Prior to 7.5, the name was data frame transform.

I (and my co-workers) will happily answer more questions, however I think this issue is not the right place. I only wanted to give a quick pointer to anyone trying to solve a use case that requires visualization of scripted metric results.

Let's use https://discuss.elastic.co/ for further questions (my name there: @Hendrik_Muhs)

@timroes
Contributor

timroes commented Oct 5, 2020

Elasticsearch decided to discourage usage of scripted metric aggregations across the stack (see elastic/elasticsearch#63096 for more details on why). So we'll be closing this issue, since there are no longer plans to implement this.

Please don't confuse scripted metric aggregation with bucket script aggregation which can be used to do calculations, like average(bytes) / requests and which is tracked via #4707.
