Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configuring histograms in the API (hints?) #2229

Closed
mateuszrzeszutek opened this issue Dec 17, 2021 · 16 comments · Fixed by #3216
Closed

Configuring histograms in the API (hints?) #2229

mateuszrzeszutek opened this issue Dec 17, 2021 · 16 comments · Fixed by #3216
Assignees
Labels
area:api Cross language API specification issue enhancement New feature or request spec:metrics Related to the specification/metrics directory

Comments

@mateuszrzeszutek
Copy link
Member

What are you trying to achieve?

I'm working on the micrometer->OTel bridge instrumentation in the javaagent. Some of the micrometer instruments translate to OTel concepts in a rather straightforward way, but some of them are pretty different. I had the most problems with histograms (used in e.g. micrometer Timer).
Histograms in micrometer are fully configurable via the micrometer API. Micrometer supports calculating percentiles on the client side, setting histogram buckets, rotating/expiring histogram buckets, and some other options that I don't fully comprehend yet. Meanwhile, you can't configure histograms in OTel metrics API at all - you can use Views, but that's in the SDK; and pretty much the only thing you can configure in a view are explicit buckets.
For now, in my instrumentation I can just fall back to the micrometer way of collecting histogram data (which basically is two sets of gauges with different tags for percentiles/count buckets) - but perhaps we should introduce a way to configure these things in the metrics API (hints)?

Additional context.

Micrometer bridge PR: open-telemetry/opentelemetry-java-instrumentation#4919

CC @jsuereth

@mateuszrzeszutek mateuszrzeszutek added the spec:metrics Related to the specification/metrics directory label Dec 17, 2021
@bogdandrutu
Copy link
Member

See #2232 (comment)

@reyang reyang added the area:api Cross language API specification issue label Jan 11, 2022
@reyang reyang added this to the Metrics Future Release milestone Jan 11, 2022
@reyang
Copy link
Member

reyang commented Jan 11, 2022

Related to #1753 (Metrics Hint API).

@reyang reyang added the enhancement New feature or request label Jan 11, 2022
@weyert
Copy link

weyert commented Jun 21, 2022

The inability to define histogram bucket sizes in the API is quite a inconvenience. As the consumer of a library (e.g. express metrics, host metrics) need to be aware:

  • Library has histograms
  • Need to know the expected bucket sizes of the histograms
  • Need to define the buckets in the SDK for the histogram requires the name of the metric

Quite a few steps when you just want to 'throw' a Express middleware which creates Express metrics like http request/response size and http duration or the hos metrics.

I made this mistake and now I am having a mismatch of bucket sizes for the same metrics in Prometheus not sure how to solve that at the moment

@jsuereth
Copy link
Contributor

@jmacd
Copy link
Contributor

jmacd commented Nov 29, 2022

To get Lightstep onto using the OTel APIs I had to add several off-spec metrics features features (mainly the synchronous gauge aggregation), which would ideally be expressed via hints. The powerful thing about an API-level hint is that it allows configuring things when the instrument is created as opposed to when the SDK is started. For the Lightstep metrics SDK we support configuring the exponential histogram size, e.g., https:/lightstep/otel-launcher-go/tree/main/lightstep/sdk/metric#metric-instrument-hints-api

@asafm
Copy link
Contributor

asafm commented Dec 4, 2022

I left, and comment in Slack, and I'm also leaving it here.

When you create a bucket-based histogram, you use the API, in which you don’t specify it’s a bucket-based histogram.
You have to do that via a view, which in the Java SDK seems to be done in the initialization of your app.
So it seems that defining a bucket-based histogram, with the buckets you’d like, is split into two separate locations - the actual file where the logic using the histogram is and the app in a file in which you define the view and specify the name of the histogram.
I’m contrasting this with how I’m used to working with metrics in DropWizard / Prometheus Simple Client and other metrics libraries in which all is done in place.

I understand the power of overriding buckets/aggregation type for a given instrument if you do not own that instrument - i.e., in a library you use in your app. Yet, as a user - be it the library writer or just you, the app developer: you just create an instrument, like a histogram, and you would like to define its aggregation in the exact location when it's created - it's the natural place for this. My only guess is that 80% of the time, you'd be doing that, and only 20% of the time you'll override the aggregation type.
Moreover, as an application developer, you need to start creating constants to share the metric name, so if the metric name is changed, its aggregation type associated with it will also be matched (in the view declaration). Seem like there is incidental complexity here.

Therefore I think enabling the user to specify the aggregation type and configure it seems pretty important to exist in the API itself.

@jack-berg
Copy link
Member

As I'm reading through the feedback I'm noticing that virtually all of it is related to the lack of ability to define histogram configuration options at the point of instrumentation.

What if we narrow the scope (as the PR title suggests) and instead of designing a hint API just fix the histogram instrument?

When you think about it, API users are actually already making "hints" to the SDK by their choice of instrument. By using a counter you're hinting that a sum is probably sufficient even though the SDK could keep a full histogram to track the distribution of measurements. By recording a set of attributes you're hinting at the set of dimensions that are likely to be useful, even though the SDK may drop some to reduce cardinality.

Today, histogram creation accepts name, description (optional), unit (optional), and value_type as arguments. We could add an additional optional argument called something like histogram_aggregation_options.

histogram_aggregation_options could consist of the type of histogram to use (exponential bucket or explicit bucket) as well as options for the selected type (max size for exponential, bucket boundaries for explicit).

The main question would be to determine how these options interact with the SDK's logic for determining aggregation, which involves MetricReader's default output aggregation and Views. I would suggest solving this by adjusting the definition of the Default Aggregation to incorporate the options specified by the API caller when creating the histogram instrument.

The result would be:

  • If the instrument matches one or more view's selection criteria, use the aggregation specified by the view.
  • Else if the metric reader (or associated metric exporter) specifies a default output aggregation for the instrument type, use it.
  • Else use the default aggregation, which incorporates the options specified by the API caller when creating the instrument.

What do folks think? To me this seems like a pretty tractable way to provide a lot of the value, without boiling the ocean.

@tsloughter
Copy link
Member

When you think about it, API users are actually already making "hints" to the SDK by their choice of instrument.

I just wanted to second this important point. There is no guarantee your histogram will even be aggregated as a histogram, adding additional option to instrument creation that is passed to the default instrument type aggregation seems clean and clear, and what people are asking for.

@asafm
Copy link
Contributor

asafm commented Dec 22, 2022

The result would be:

If the instrument matches one or more view's selection criteria, use the aggregation specified by the view.
Else if the metric reader (or associated metric exporter) specifies a default output aggregation for the instrument type, use it.
Else use the default aggregation, which incorporates the options specified by the API caller when creating the instrument.

The way I see it:
Specifying it on the instrumentation level presents the default - meaning, if no other override mechanism was specified, this should be used.
I guess my problem is with the next in the chain, as you mentioned: the reader. When you say, "This is the default for a certain instrumentation type", for me, it means: if no aggregation were defined, we'd default to what I'm configuring here in the reader. So I wouldn't see it as an override to what the user says but as a default.

Next is the view mechanism, which, by definition, is an override (if you use the same name, that is).

@tsloughter
Copy link
Member

The reader basically defines what an instrument's aggregation is, so the defaults are what the reader considers defaults. The view is the only thing that can define a different aggregation.

@asafm
Copy link
Contributor

asafm commented Dec 26, 2022

One question popped into my mind: Will Hints be typed? If so, will the aggregations allowed be constrained only to the ones specified here, i.e., Explicit Bucket and Exponential Bucket? I'm asking because I was thinking of creating an SDK for my purposes which will require using the Summary type of aggregation, and I wanted to know if I can rely on the API for creating instruments for my needs.

@jack-berg
Copy link
Member

I'm asking because I was thinking of creating an SDK for my purposes which will require using the Summary type of aggregation, and I wanted to know if I can rely on the API for creating instruments for my needs.

If you're referring to an aggregation that produces summary type metrics, that wouldn't be possible because there currently is no summary aggregation. However, you can produce a histogram that looks very similar to a summary by configuring an empty list of bucket boundaries, putting all measurements in a single bucket and providing: min, max, sum, count.

@asafm
Copy link
Contributor

asafm commented Dec 27, 2022

If you're referring to an aggregation that produces summary type metrics, that wouldn't be possible because there currently is no summary aggregation.

My thinking was creating my own SDK (my flavor, so it doesn't match the SDK spec). The protocol supports a Summary, as can also be seen in the Java SDK io.opentelemetry.sdk.metrics.data.SummaryPointData.
What you suggest only support min/max and lack quantiles :)

@asafm
Copy link
Contributor

asafm commented Jan 4, 2023

@jack-berg How can we move your idea forward of amending the Histogram interface to have a way to configure aggregation and its options/configuration?

@jack-berg
Copy link
Member

@jack-berg How can we move your idea forward of amending the Histogram interface to have a way to configure aggregation and its options/configuration?

Someone needs to open a PR to the speec with the proposed changes. Often helps to have a prototype implementation to drive discussion. There will likely be a debate on whether the histogram API can be changed in isolation or should be considered within a wider (and still ambiguous) "hint API". Assuming the PR to update the spec is merged, language SDKs can go and implement the changes. I'm interested in this topic but juggling quite a few things so can't commit on driving it on a particular timeline.

jack-berg added a commit that referenced this issue Apr 8, 2023
Fixes #2229.
Related to #3061 (lays groundwork but does not resolve).
Related to #2977, which may use this new API to have
`http.server.duration` report in seconds instead of ms without changing
/ breaking default bucket boundaries.

Summary of the change:
- Proposes a new parameter to optionally include when creating
instruments, called "advice".
- For the moment, advice only has one parameter for specifying the
bucket boundaries of explicit bucket histogram.
- Advice can be expanded with additional parameters in the future (e.g.
default attributes to retain). The parameters may be general (aka
applicable to all instruments) or specific to a particular instrument
kind, like bucket boundaries.
- Advice parameters can influence the [default
aggregation](https:/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#default-aggregation),
which is used if there is no matching view and if the reader does not
specify a preferred aggregation.
- Not clear that all advice will be oriented towards configuring
aggregation, so I've intentionally left the scope of what they can
influence open ended.

I've prototyped this in java
[here](open-telemetry/opentelemetry-java#5217).
Example usage:
```
DoubleHistogram doubleHistogram =
        meterProvider
            .get("meter")
            .histogramBuilder("histogram")
            .setUnit("foo")
            .setDescription("bar")
            .setAdvice(
                advice -> advice.setBoundaries(Arrays.asList(10.0, 20.0, 30.0)))
            .build();
```

Advice could easily be changed to "hint" with everything else being
equal. I thought "advice" clearly described what we're trying to
accomplish, which is advice / recommend the implementation in providing
useful output with minimal configuration.

---------

Co-authored-by: Reiley Yang <[email protected]>
@asafm
Copy link
Contributor

asafm commented Apr 16, 2023

🥳 Brilliant work @jack-berg

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:api Cross language API specification issue enhancement New feature or request spec:metrics Related to the specification/metrics directory
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants