Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Histogram support aggregation #1364

Closed
wants to merge 1 commit into from

Conversation

alimousazy
Copy link
Contributor

@alimousazy alimousazy commented Jun 11, 2016

Required for all PRs:

  • CHANGELOG.md updated
  • Sign CLA (if not already signed)
  • README.md updated (if adding a new plugin)

Adding Histogram (aggregation) support. User can list down metric for aggregation, supported methods
1- Percentile 90 80 ... 20 etc.
2- Count.
3- Mean.
4- Variance.

Method used to calculate these values has a very low memory print and specially designed for streaming data. More information can be found here http://jmlr.org/papers/volume11/ben-haim10a/ben-haim10a.pdf and here https://www.vividcortex.com/blog/2013/07/08/streaming-approximate-histograms/.

Note: This method of Streaming Approximate Histograms result quantile approximations there for it may not lead for exact values. User can optimize accuracy by increase bucket size.

I'm thinking about introducing middleware plugins which process data between inputs and output this will make the agent code clean and allow user to provide additional processing to the pipeline, If you agree on this I will move the code and put it inside a folder called middleware. Creating interface for middle ware to add and retrieve metrics using channels

@alimousazy
Copy link
Contributor Author

alimousazy commented Jun 14, 2016

I just moved the Histogram code to new type of plugins (which I called filter) this will make it easier to write test cases against it and will allow future add of these types of plugins which act like middleware between input and the output, some of use cases for middleware/filter plugins will be adding sampling, bandwidth control, control tag and fields included in metrics, shaping metrics like adding/renaming tags, fields and even metric names plus editing metric values . adding aggregation support , split metrics to multi metric based on fields, combine metrics to one metrics and a lot more.

@alimousazy alimousazy force-pushed the master branch 2 times, most recently from b0c2322 to 710f97c Compare June 14, 2016 19:38
@alimousazy
Copy link
Contributor Author

alimousazy commented Jun 14, 2016

I will add more error handling and Unit test once we agree one filter implementation.

@alimousazy alimousazy force-pushed the master branch 7 times, most recently from b95f146 to d1d0adf Compare June 15, 2016 20:22
@alimousazy
Copy link
Contributor Author

@sparrc any idea on this ? what you think about the filter plugins ?

@sparrc
Copy link
Contributor

sparrc commented Jun 21, 2016

@alimousazy you'll need to give me more time, I'm very swamped with other features at the moment.

@alimousazy
Copy link
Contributor Author

alimousazy commented Jun 21, 2016

sure. take your time.

@alimousazy alimousazy reopened this Jun 21, 2016
@jadbox
Copy link

jadbox commented Jun 23, 2016

Will this also handle sum/aggregate count?

@alimousazy
Copy link
Contributor Author

alimousazy commented Jun 23, 2016

@jadbox it does handle
1- Percentile 90 80 ... 20 etc.
2- Count.
3- Mean.
4- Variance.
Sum can be added easily if you want I can do that.

Here is an example usage if you can merge pull and compile it from source

[[filter.histogram]]
  bucketsize = 20  
  flush_interval = "1m"
  [filter.histogram.metrics]
    (replace with your metric name) = [0.90]

You can add more percentile if you want like [0.90, 0.50]
Note: this might change after telegraf team review the pull request.

I'm planning to add more filter in the future (which intercept metrics between input and output plugins) some of these filters might be [[filter.rename]] ,[filter.rename.tags],[[filter.must_have]], [[filter.sample_metrics]], [[filter.only_pass_if]], [[filter.bandwidth]] which help shape metrics before emitting for final distention.

@jadbox
Copy link

jadbox commented Jun 23, 2016

@alimousazy "Sum can be added easily if you want I can do that." YES, please!

My data "firehose" looks like the following (abstractly):

userID | timestamp | doesActionA | doesActionB
joe, 1466550440, 50, 20
joe, 1466550440, 10, 15

and I want to aggregate in telegraph before sending to Influx:

# aggregate into 1s blocks, and send each block to Influx
joe, 1466550440, 60, 35

My pipeline does 100k/s in messages so micro-aggregates before going into Influx is needed.
I appreciate the guidance as I'm an Influx/Telegraf newbie.

@alimousazy
Copy link
Contributor Author

alimousazy commented Jun 23, 2016

if joe is used as name of metric then configuration should be like this

[[filter.histogram]]
  bucketsize = 20  
  flush_interval = "1m"
  [filter.histogram.metrics]
    joe = [0.90, 0.25]

*Note: you can tone aggregation interval by modifying flush_interval (I may change flush interval to aggregation interval) , If you don't need percentile just leave the array empty.

The output will be similar to this

{"fields":{ "doesActionA_count":3,"doesActionA_mean":1,"doesActionA_p90":1,"doesActionA_variance":0 "doesActionB_count":3,"doesActionB_mean":1,"doesActionB_p90":1,"doesActionB_variance":0 },"name":"joe","tags":{tags___},"timestamp":1466718240}

@jadbox
Copy link

jadbox commented Jun 23, 2016

@alimousazy Thanks, that's helpful. I guess what I'm really looking for is just aggregation... not histogram aggregation. I'll take this concern to #380

@alimousazy
Copy link
Contributor Author

I'm planing to add support for all the metric types mentioned here http://metrics.dropwizard.io/3.1.0/getting-started/, once this get reviewed and we agree on first version

@alimousazy alimousazy force-pushed the master branch 2 times, most recently from fafb725 to 8f7bbdb Compare July 13, 2016 21:05
@sparrc
Copy link
Contributor

sparrc commented Jul 14, 2016

@alimousazy the histogram filter should do all by default. Since Filters in general should have a plugin structure and should thus appear in the default config file. Can you explain how that is going to look and exactly how one would configure the filters for generically aggregating on fields, tags, measurement names, etc.?

@alimousazy
Copy link
Contributor Author

alimousazy commented Jul 14, 2016

@sparrc Filters are middleware which stand between inputs and outputs (Pipes) . here is how histogram filter configuration. (Note that filter are independent from input and output )

Filters will look like this in configuration files (Note the configuration is flexible and we can change it)

[[filter.histogram]]
  bucketsize = 20  
  flush_interval = "1m"
  [filter.histogram.metrics]
    (replace with your metric name) = [0.90]

here is the interface that filters should implemented

package telegraf

type Filter interface {
        // SampleConfig returns the default configuration of the Input
        SampleConfig() string

        // Description returns a one-sentence description on the Input
        Description() string

        //create pipe for filter
        Pipe(in chan Metric) (out chan Metric)

        // start the filter
        Start(shutdown chan struct{})
}

we may add other filters (middleware) in the future like adding sampling, bandwidth control, control tag and fields included in metrics, shaping metrics like adding/renaming tags, fields and even metric names plus editing metric values , split metrics to multi metric based on fields, combine metrics to one, alerting and anomaly detection.

Please check the code for more detail 8f7bbdb

@sparrc
Copy link
Contributor

sparrc commented Jul 14, 2016

I'll need to do a fuller review, but like I said initially this is going to need to support matching metrics on more than just measurement name, and all name matching will need to support glob filters, much of which we already have implemented in internal.

filter := a.Config.Filters[name]
log.Printf("Filter %s is enabled", name)
metricStream = filter.Pipe(metricStream)
go func(f telegraf.Filter) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this in a goroutine?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filter suppose to work in is own context like other plugins it may have timer inside and i don't want to block the agent code from collecting metrics or flushing them. consider filter as pipe with input and output (black box). Histogram filter for example has timer for flushing metrics on specific intervals

@alimousazy
Copy link
Contributor Author

alimousazy commented Jul 16, 2016

I have to add command line option to generate default filters (Disabled) when generating configuration file ) please let me know what command line option you prefer for such operation ? also I'm going to work on adding glob matching on metric value and tags value ( I think it is not possible to do matching on field value since it is numeric ), is that what you expect ?

@alimousazy alimousazy force-pushed the master branch 2 times, most recently from 829b0d5 to 38849c9 Compare July 21, 2016 05:21
@alimousazy
Copy link
Contributor Author

Command line option added

 -filter-filter     filter the filter plugins to enable, separator is :
  -filter-list       print all the available filters

I will work on adding globing and unit test

@alimousazy
Copy link
Contributor Author

alimousazy commented Jul 29, 2016

@sparrc I added a new way to match metric, User can now specify Tags or Measurement to match against (Glob enabled) , rollup name, or function to apply on data set beside if they want to keep the old metric @jadbox.

User also can pass the original metric and only create rollup for aggregation like in second example

example
   rollup = [
      "(Name new) (Tag interface en*) (Functions mean 0.90)",
      "(Name cpu_value) (Measurements cpu) (Functions mean sum) (pass)",
    ]

Please tell me what you think

@sparrc
Copy link
Contributor

sparrc commented Sep 21, 2016

I'm closing this in favor of #1777

@sparrc sparrc closed this Sep 21, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants