Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DoubleGaugeHistogram. #236

Closed
wants to merge 3 commits into from

Conversation

imccarten1
Copy link

This value is intended to be used with a GAUGE style DoubleHistogram.

This value is intended to be used with a GAUGE style DoubleHistogram.

// GAUGE is an AggregationTemporality for a metric of type GAUGE. Essentially
// it means that aggregation over time is not being used.
AGGREGATION_TEMPORALITY_GAUGE = 3;
Copy link
Member

@bogdandrutu bogdandrutu Nov 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting because it does not apply to other places except where you wanted which is for Histogram to support OpenCensus and OpenMetrics concept of gauge Histograms.

I want others to comment, maybe we have a different type DoubleGaugeHistrogram

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally I think adding a DoubleGaugeHistogram would be a tidier solution, but the lack of it seemed so odd, especially in comparison to the old open census types, that I assumed leaving it out was a deliberate decision. I'm happy to add it, instead of the aggregation_temporality_gauge, if people are ok with a new metric type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I like more a new type.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the pull request to instead add a metric type DoubleGaugeHistogram

@imccarten1 imccarten1 changed the title Add AGGREGATION_TEMPORALITY_GAUGE. Add DoubleGaugeHistogram. Nov 15, 2020
@jmacd
Copy link
Contributor

jmacd commented Nov 16, 2020

I think we need a clear explanation for what a Gauge Histogram is. Is this the sort of data you would get from a ValueObserver configured to reduce label dimensions, so that multiple gauge-like values are being assembled into a histogram?

I believe we've discussed using the "Instantaneous" temporality, which I like, but I believe we first have to argue why AggregationTemporality=DELTA does not apply. This is not as easy as it sounds.

In my own proposals to address this (what I believe to be the same) topic, I added a new two-value enum called "Structure", which could be either "Adding" or "Grouping".

@imccarten1
Copy link
Author

Yes, a DoubleGaugeHistogram is essentially supposed to be multiple DoubleGauge measurements grouped into a histogram. The specific metric that I was trying to process that triggered this pull request was a distribution showing the time since the VM image was built across multiple VMs.

The other reason for adding this is to allow handling converting to and from OpenCensus metrics in a more consistent and logical way. DoubleGaugeHistogram is intended to provide an equivalent to OC type MetricDescriptor_GAUGE_DISTRIBUTION. Currently in OpenTelemetry Collector the function OCSliceToMetrics silently drops metrics of type GAUGE_DISTRIBUTION, while MetricsToOC converts a metric of type DoubleHistogram with the AggregationTemporality set to something other than cumulative to an OC metric of type MetricDescriptor_GAUGE_DISTRIBUTION. The current setup doesn't make a lot of sense since the conversion only works one way, and the aggregation_temporality in never supposed to be set to unspecified and converting aggregation_temporality_delta as a gauge doesn't make sense.

@bogdandrutu
Copy link
Member

I believe we've discussed using the "Instantaneous" temporality, which I like, but I believe we first have to argue why AggregationTemporality=DELTA does not apply. This is not as easy as it sounds.

I spent time trying to understand this, but I think I got it. As @imccarten1 mentioned the usual case for the GaugeHistogram is usually when you aggregate for example multiple Gauge (points, or metrics) into a distribution (histogram). There are more other examples like (distribution of temperature values recorded by some iot devices in an area, requests size in a queue at a specific time, etc.).

After @jmacd comment I am more convinced that this are different than "DELTA". The gauge histogram implies that values are not related with previous reported points and cannot be "merged" with a previous reported point to increase the time "window" like we do in a "delta" case.

To clarify with an example, in case of a gauge histogram, let's report the size of requests in a queue:

t0 -> [R1 - size 10; R2 - size 15; R3 - size 12]
t1 -> [R2 - size 15; R4 - size 17]

Merging this is impossible because "measurements" in this case are removed from the distribution, and then the merged histogram will be wrong. Also some values may be reported in multiple points, others in just one.

I am not convinced if using "Instantaneous" temporality vs a different type is better. The reason is that there are other aggregations like Sum that have temporality, which arguably have the same behavior if I SUM bunch of gauge metrics. The reason I am not 100% confident is because a Instantaneous/Sum may be confusing compare with a Gauge metric.

@jmacd
Copy link
Contributor

jmacd commented Nov 19, 2020

I've spent some time thinking about this and didn't reach any new conclusions.
I agree with @bogdandrutu's argument that DELTA temporality gives an impression that you can add these up.
In ways, this is asking for a new compound point type, a first-class histogram value, that users can enter as a single event, and I'm inclined to resist this interpretation.

In Bogdan's example, it's the set of sizes currently in a queue containing uniquely-identified elements. We can model this as an instantaneous metric event that carries a histogram-valued point, but there are other ways we might instrument this situation.

Using a ValueObserver: in the callback, iterate through the queue and emit one value per entry (works because of uniqueness), using a label to distinguish their identity. Thus, we'll have labels R1, R2, R3, ... Now, we can configure a histogram aggregation of those values to remove the identity label, the result would be a histogram with neither DELTA nor CUMULATIVE temporality. I would prefer we not add a new type, still think of this as INSTANTANEOUS or NONE temporality.

But if we are introducing a NONE or INSTANTANEOUS temporality for this case, we could've done the same for Sum vs Gauge points: Sum points are DELTA or CUMULATIVE temporality, while Gauge points are NONE or INSTANTANEOUS temporality. This relates to my proposal over the summer that we combine Sum and Gauge into a single "Scalar" type and use Structure to distinguish them.

Comment on lines +232 to +244
// DoubleGaugeHistogram represents the type of a double histogram metric that
// always exports the "current value" for every data point. It should be used
// for an "unknown" aggregation.
//
// A DoubleGaugeHistogram does not support different aggregation temporalities.
// Given the aggregation is unknown, points cannot be combined using the same
// aggregation, regardless of aggregation temporalities. Therefore,
// AggregationTemporality is not included. Consequently, this also means
// "StartTimeUnixNano" is ignored for all data points.
message DoubleGaugeHistogram {
repeated DoubleHistogramDataPoint data_points = 1;
}

Copy link
Member

@bogdandrutu bogdandrutu Dec 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description needs to be re-written. Aggregation is known, but temporality is unknown (INSTANTANEOUS).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@imccarten1 can you fix this?

Base automatically changed from master to main January 28, 2021 18:32
@jmacd
Copy link
Contributor

jmacd commented Feb 24, 2021

Now that #257 proposes removing histogram types, I'm even more included to use the same histogram type and vary temporality to represent GaugeHistogram. The temporality I would call "Instantaneous" would be used with Histogram to identify GaugeHistogram.

I propose we create a new issue and PR to introduce this "instantaneous" temporality mode. The data model spec will say (like Prometheus) that when an instant vector of Gauge values is converted to a histogram point, the resulting data point should have instantaneous temporality. Suggest closing this after creating a new issue?

@jmacd
Copy link
Contributor

jmacd commented Mar 15, 2021

Thank you @imccarten1 I will close this to keep our attention focused on more urgent PRs. When we return to this topic, I support the idea of calling GaugeHistogram a new kind of temporality, not a new data point. See #274.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants