-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Correctly handle gauge distributions in OCSliceToMetrics #2099
Conversation
Previously these metrics were silently dropped.
|
|
d69a7a7
to
f7f5c1b
Compare
|
Codecov Report
@@ Coverage Diff @@
## master open-telemetry/opentelemetry-collector#2099 +/- ##
==========================================
+ Coverage 92.15% 92.17% +0.02%
==========================================
Files 279 279
Lines 17086 17144 +58
==========================================
+ Hits 15746 15803 +57
Misses 921 921
- Partials 419 420 +1
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
If someone else can confirm what should be the "temporality" set to (see comment) it would be great.
metric.InitEmpty() | ||
metric.SetDataType(pdata.MetricDataTypeDoubleHistogram) | ||
metric.DoubleHistogram().InitEmpty() | ||
return pdata.MetricDataTypeDoubleHistogram |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also return pdata.MetricDataTypeDoubleHistogram
for case ocmetrics.MetricDescriptor_CUMULATIVE_DISTRIBUTION:
below.
Is that correct to treat both types the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, the cumulative one sets
histo.SetAggregationTemporality(pdata.AggregationTemporalityCumulative)
Possible values are
opentelemetry-collector/consumer/pdata/metric.go
Lines 24 to 28 in 0c44bc2
const ( | |
AggregationTemporalityUnspecified = AggregationTemporality(otlpmetrics.AggregationTemporality_AGGREGATION_TEMPORALITY_UNSPECIFIED) | |
AggregationTemporalityDelta = AggregationTemporality(otlpmetrics.AggregationTemporality_AGGREGATION_TEMPORALITY_DELTA) | |
AggregationTemporalityCumulative = AggregationTemporality(otlpmetrics.AggregationTemporality_AGGREGATION_TEMPORALITY_CUMULATIVE) | |
) |
Not sure how the "Unspecified" is treated, do we need to set it to delta here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverse conversion for reference:
opentelemetry-collector/translator/internaldata/metrics_to_oc.go
Lines 249 to 254 in c365946
case pdata.MetricDataTypeDoubleHistogram: | |
hd := metric.DoubleHistogram() | |
if hd.AggregationTemporality() == pdata.AggregationTemporalityCumulative { | |
return ocmetrics.MetricDescriptor_CUMULATIVE_DISTRIBUTION | |
} | |
return ocmetrics.MetricDescriptor_GAUGE_DISTRIBUTION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the temporality value, there are other GAUGE type metrics such as MetricDescriptor_GAUGE_DOUBLE which also leave the AggregationTemporality unset in descriptorTypeToMetrics(). I was also able to get correctly formatted results in an end to end test of the change that sent metrics to Stackdriver.
Returning pdata.MetricDataTypeDoubleHistogram for both DISTRIBUTION type metrics seems to make sense based on the reverse operation in metrics_to_oc. The alternative would be creating a new MetricDataType in pdata which seems like a large change that might generate push back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@imccarten1 this is tricky, because indeed for GAUGE_DOUBLE we do not set temporality because the Gauge in otlp proto does not have a temporality see https:/open-telemetry/opentelemetry-proto/blob/master/opentelemetry/proto/metrics/v1/metrics.proto#L153
I think the current way to "not" add temporality is wrong, if you want OTLP to support gauge histograms, please create a PR that adds a temporality called "GAUGE" in the proto, or document that missing temporality means "gauge" otherwise this is an abuse of the proto.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've created a pull request to add the proto value AGGREGATION_TEMPORALITY_GAUGE. Please take a look. open-telemetry/opentelemetry-proto#236
This PR was marked stale due to lack of activity. It will be closed in 7 days. |
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
Description: internaldata.OCSliceToMetrics has been updated to correctly handle metrics of type MetricDescriptor_GAUGE_DISTRIBUTION. Previously metrics with this type were silently dropped.
Link to tracking Issue: Fixes open-telemetry/opentelemetry-collector-contrib#10584
Testing: I've added unit tests that handle type gauge distribution to metrics_to_oc_test.go and oc_to_metrics_test.go. I also successfully manually tested exporting metrics of this type from a custom receiver to Stackdriver.
Documentation: no documentation updated