Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an option to limit length of values of attributes and metric values #1130

Merged
Show file tree
Hide file tree
Changes from 20 commits
Commits
Show all changes
40 commits
Select commit Hold shift + click to select a range
4507656
Add an option to limit span attribute length
jtmalinowski Oct 22, 2020
deca25b
Change default value of OTEL_SPAN_ATTRIBUTE_SIZE_LIMIT to empty
jtmalinowski Oct 22, 2020
748eae0
Fix wording in Span attribute limiting section
jtmalinowski Oct 22, 2020
e7e54b3
Introduce metric label value size limit
jtmalinowski Oct 23, 2020
10af7ba
Spec compliance for truncating metric label values
jtmalinowski Oct 23, 2020
ed9bc5f
Limits docs improvements
jtmalinowski Oct 23, 2020
dabb292
Fix traces limiting requirement
jtmalinowski Oct 26, 2020
4102464
Expand span attribute limits to all attributes
jtmalinowski Oct 27, 2020
a6da2b9
Merge remote-tracking branch 'upstream/master' into feature/span-attr…
jtmalinowski Oct 27, 2020
c175613
Wording improvements
jtmalinowski Oct 27, 2020
5259377
More accurate naming for attribute truncation env variables
jtmalinowski Oct 28, 2020
d4c28eb
Merge remote-tracking branch 'upstream/main' into feature/span-attrib…
jtmalinowski Jun 15, 2021
df01ff4
Merge remote-tracking branch 'upstream/main' into feature/span-attrib…
jtmalinowski Jun 15, 2021
5ea1f00
fix: typos
jtmalinowski Jun 15, 2021
47c86a3
fix: merge mistakes
jtmalinowski Jun 15, 2021
8bd3a9b
fix: PR feedback
jtmalinowski Jun 16, 2021
c2ed17c
fix: typo in link
jtmalinowski Jun 16, 2021
05173b9
fix: PR feedback
jtmalinowski Jun 21, 2021
5b763d3
fix: PR feedback
jtmalinowski Jun 21, 2021
b180054
chore: changelog
jtmalinowski Jun 21, 2021
1434084
Merge remote-tracking branch 'upstream/main' into feature/span-attrib…
jtmalinowski Jun 28, 2021
77a0688
Merge remote-tracking branch 'upstream/main' into feature/span-attrib…
jtmalinowski Jul 5, 2021
5bde7ba
Merge remote-tracking branch 'upstream/main' into feature/span-attrib…
jtmalinowski Jul 8, 2021
cf73a3b
Update specification/sdk-environment-variables.md
jtmalinowski Jul 12, 2021
50556d1
Update specification/common/common.md
jtmalinowski Jul 12, 2021
f3c5243
Merge remote-tracking branch 'upstream/main' into feature/span-attrib…
jtmalinowski Jul 13, 2021
89f4b3a
fix: clarify priority of limits
jtmalinowski Jul 13, 2021
992b132
Merge branch 'main' into feature/span-attribute-size-limit
jtmalinowski Jul 16, 2021
f428739
fix: nerf logging requirements
jtmalinowski Jul 17, 2021
6f9992a
fix: grammar
jtmalinowski Jul 20, 2021
dd0b245
Merge remote-tracking branch 'upstream/main' into feature/span-attrib…
jtmalinowski Jul 20, 2021
e3d7b73
Fixes from Yuri
jtmalinowski Jul 27, 2021
e526784
Merge remote-tracking branch 'upstream/main' into feature/span-attrib…
jtmalinowski Jul 27, 2021
243b07d
fix: clarify metrics exemption
jtmalinowski Jul 27, 2021
cd51664
fix: link
jtmalinowski Jul 27, 2021
ff664a2
fix: language
jtmalinowski Jul 27, 2021
53522b6
Merge branch 'main' into feature/span-attribute-size-limit
jtmalinowski Jul 28, 2021
684a0a6
Merge remote-tracking branch 'upstream/main' into feature/span-attrib…
jtmalinowski Jul 29, 2021
d33eadf
Merge branch 'main' into feature/span-attribute-size-limit
jtmalinowski Jul 30, 2021
49b7ec9
Merge branch 'main' into feature/span-attribute-size-limit
jtmalinowski Aug 3, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,10 @@ release.

### Traces

- Add generalized attribute count and attribute value length limits and relevant
environment variables.
([#1130](https:/open-telemetry/opentelemetry-specification/pull/1130))

### Metrics

### Logs
Expand Down
4 changes: 4 additions & 0 deletions spec-compliance-matrix.md
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,7 @@ formats is required. Implementing more than one format is optional.
| [IdGenerators](specification/trace/sdk.md#id-generators) | | + | + | | + | + | | | + | + | | + |
| [SpanLimits](specification/trace/sdk.md#span-limits) | X | + | + | | + | + | | | | - | | + |
| [Built-in `SpanProcessor`s implement `ForceFlush` spec](specification/trace/sdk.md#forceflush-1) | | | | | + | + | | | + | + | | |
| [Attribute Limits](specification/common/common.md#attribute-limits) | X | | | | | | | | | | | |

## Baggage

Expand Down Expand Up @@ -141,10 +142,13 @@ Note: Support for environment variables is optional.
|OTEL_TRACES_EXPORTER | - | + | | + | + | + | | - | - | | |
|OTEL_METRICS_EXPORTER | - | + | | + | - | - | | - | - | - | - |
|OTEL_SPAN_ATTRIBUTE_COUNT_LIMIT | - | + | | + | + | - | | + | - | | |
|OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT | | | | | | | | | | | |
|OTEL_SPAN_EVENT_COUNT_LIMIT | - | + | | + | + | - | | + | - | | |
|OTEL_SPAN_LINK_COUNT_LIMIT | - | + | | + | + | - | | + | - | | |
|OTEL_TRACES_SAMPLER | - | + | | + | + | + | | - | - | | |
|OTEL_TRACES_SAMPLER_ARG | - | + | | + | + | + | | - | - | | |
|OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT | | | | | | | | | | | |
|OTEL_ATTRIBUTE_COUNT_LIMIT | | | | | | | | | | | |

## Exporters

Expand Down
40 changes: 39 additions & 1 deletion specification/common/common.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ Table of Contents
</summary>

- [Attributes](#attributes)
- [Attribute Limits](#attribute-limits)

</details>

Expand Down Expand Up @@ -40,4 +41,41 @@ indices that are kept in sync (e.g., two attributes `header_keys` and `header_va
both containing an array of strings to represent a mapping
`header_keys[i] -> header_values[i]`).

See [Attribute and Label Naming](attribute-and-label-naming.md) for naming guidelines.
### Attribute Limits

Execution of erroneous code can result in unintended attributes. If there is no
jtmalinowski marked this conversation as resolved.
Show resolved Hide resolved
limits placed on attributes, they can quickly exhaust available memory, resulting
in crashes that are difficult to recover from safely.

By default an SDK SHOULD apply truncation as per the list of
[configurable parameters](#attribute-limits-configuration) below.

If an SDK provides a way to:

- set an attribute value length limit and the limit is set, then for each
jtmalinowski marked this conversation as resolved.
Show resolved Hide resolved
jtmalinowski marked this conversation as resolved.
Show resolved Hide resolved
jtmalinowski marked this conversation as resolved.
Show resolved Hide resolved
attribute value:
- if it is a string, if it exceeds that limit (counting any character in it as
1), SDK Spans MUST truncate that value, so that its length is at most equal
MrAlias marked this conversation as resolved.
Show resolved Hide resolved
jtmalinowski marked this conversation as resolved.
Show resolved Hide resolved
to the limit,
- if it is an array of strings, then apply the above rule to each of the
values separately,
- otherwise a value MUST NOT be truncated;
jtmalinowski marked this conversation as resolved.
Show resolved Hide resolved
- set a limit of unique attribute keys, and the limit is set:
jtmalinowski marked this conversation as resolved.
Show resolved Hide resolved
jtmalinowski marked this conversation as resolved.
Show resolved Hide resolved
- for each unique attributes key, addition of which would result in exceeding
the limit, SDK MUST discard that key/value pair.

There SHOULD be a log emitted to indicate to the user that an attribute was
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the sentiment of this paragraph -- we should tell the user when we discard or modify their telemetry data and we should be careful not to overwhelm a telemetry collection system by providing too many error messages about bad data.

This, to me, is the wrong place to specify error handling behavior. I'd like to see statements about error handling in general in the library guidelines part of the specification. For errors like this which may occur as a result of user data, I typically configure them to print once per minute or so. I believe we should use metrics to count occurrences of excessive metric value length and/or count.

Both of these outputs, logs and/or metrics about these events, are a form of "meta" telemetry, and the specification's SDK guidelines should say we aim to (1) do this well, (2) at low cost. For meta-telemetry logs this means once rarely per log statement, for metrics this means a requirement to have low cardinality, for example.

I propose removing this paragraph and filing an issue to update the library guidelines with general guidance on handling errors of this nature.

Copy link
Contributor Author

@jtmalinowski jtmalinowski Jul 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmacd Given that this is in line with existing spec here https:/open-telemetry/opentelemetry-specification/blob/main/specification/trace/sdk.md#span-limits, do you think we could leave the paragraph as is, create the issue, and once we reach a consensus, I'll create a PR which will address this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmacd please feel free to amend the issue #1787 if you'd like to describe the problem in a different way

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmacd does the above work for you?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think solving this issue with logging is not the ideal solution. It assumes that logging is always available, which is not always the case - many languages have no standard log framework, and even if they do, using std log is not always what the rest of the application is using. In other words, logger must always be treated as a dependency, which may or may not be available.

In addition, the throttling of the logs effectively requires keeping a state per span. So why not report truncations as part of the span itself?

Regarding #1787, why not leave this paragraph out of this PR until an agreement on #1787 is reached? We are changing the spec, after all, it's better to leave the behavior unspecified than to change it.

Copy link
Contributor Author

@jtmalinowski jtmalinowski Jul 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yurishkuro I have changed "SHOULD" to "SHOULD NOT" to satisfy this criteria:

  • your ask to not prescribe a solution
  • don't immediately invalidate existing solutions, which may already be logging
  • don't make it hard for implementors in case they really needed to log (until we have a prescribed solution)

This isn't exactly what you asked for, but I think it does the job.

EDIT: Also, I'd appreciate if you double posted these comments about logging on the issue, otherwise it'll be hard to make any progress there.

truncated or discarded. To prevent excessive logging, the log MUST NOT be
emitted more than once per record on which an attribute is set. If a record is
embedded in another record (e.g. Events within a Span), then the log SHOULD NOT
be emitted more than once per parent record.

If the SDK implements the limits above, it MUST provide a way to change these
limits programmatically. Names of the configuration options SHOULD be the same as
in the list below.

<a name="attribute-limits-configuration"></a>
jtmalinowski marked this conversation as resolved.
Show resolved Hide resolved
**Configurable parameters:**

* `AttributeCountLimit` (Default=128) - Maximum allowed attribute count per record;
jtmalinowski marked this conversation as resolved.
Show resolved Hide resolved
* `AttributeValueLengthLimit` (Default=Infinity) - Maximum allowed attribute value length;
7 changes: 7 additions & 0 deletions specification/metrics/sdk.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ Table of Contents
</summary>

* [MeterProvider](#meterprovider)
* [Measurement Limits](#measurement-limits)
* [MeasurementProcessor](#measurementprocessor)
* [MetricProcessor](#metricprocessor)
* [MetricExporter](#metricexporter)
Expand All @@ -40,6 +41,12 @@ TODO:
* Configure timing (related to [issue
1432](https:/open-telemetry/opentelemetry-specification/issues/1432)).

## Measurement Limits

Measurement attributes are exempt from the
[common rules of attribute limits](../common/common.md#attribute-limits) due to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's confusing to have this here and not in the common section.

the experimental status of the Metrics SDK specification.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is true.

Our data model is stable and in it, the attributes contribute to the identity of a metric. If we truncate attributes we need to denote how that changes the identity of the point.

I'm not sure the SDK exiting experimental status will change this point


## MeasurementProcessor

`MeasurementProcessor` is an interface which allows hooks when a
Expand Down
31 changes: 25 additions & 6 deletions specification/sdk-environment-variables.md
Original file line number Diff line number Diff line change
Expand Up @@ -72,17 +72,36 @@ Depending on the value of `OTEL_TRACES_SAMPLER`, `OTEL_TRACES_SAMPLER_ARG` may b
| OTEL_BSP_MAX_QUEUE_SIZE | Maximum queue size | 2048 | |
| OTEL_BSP_MAX_EXPORT_BATCH_SIZE | Maximum batch size | 512 | Must be less than or equal to OTEL_BSP_MAX_QUEUE_SIZE |

## Span Collection Limits
## Attribute Limits

See the SDK [Attribute Limits](common/common.md#attribute-limits) section for the definition of the limits.

If an SDK does not implement truncation mechanism for all implementations of
attributes, then it SHOULD NOT offer an unscoped (e.g.
`OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT`, as opposed to scoped, e.g.
`OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT`) global environment variable for such a
limit. SDKs SHOULD only offer environment variables for the types of attributes,
for which that SDK implements truncation mechanism. For each implemented limit
SDK MUST first try to use the unscoped variable, if it isn't set, only then try
jtmalinowski marked this conversation as resolved.
Show resolved Hide resolved
to use the scoped variable.

| Name | Description | Default | Notes |
| --------------------------------- | ------------------------------------ | ------- | ----- |
| OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT | Maximum allowed attribute value size | | Empty value is treated as infinity. Non-integer and negative values are invalid. |
jtmalinowski marked this conversation as resolved.
Show resolved Hide resolved
| OTEL_ATTRIBUTE_COUNT_LIMIT | Maximum allowed span attribute count | 128 | |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the diff with OTEL_SPAN_ATTRIBUTE_COUNT_LIMIT?

Copy link
Contributor Author

@jtmalinowski jtmalinowski Jul 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applies to all attributes, except Metrics (temporarily exempted due to experimental status), so effectively that (for now) means spans and events. EDIT: also resources

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we go ahead: how does this interact with the rest of the related environment variables? What if the user sets OTEL_SPAN_ATTRIBUTE_COUNT_LIMIT and also OTEL_ATTRIBUTE_COUNT_LIMIT? I'm guessing what you expect here, but it's not clarified anywhere.

For what is worth, if you remove this, we can totally merge this (and have this single environment variable done as a follow up, etc).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So previously there was a paragraph about this in sdk-environment-variables, but there was an error in one of the sentences, and because it also applies to programmatic configuration, I moved that paragraph here https:/open-telemetry/opentelemetry-specification/pull/1130/files#diff-500cb43b9725373c25ea03660c293e59681e6e2f6ba3eb588188ed3b83dcbdcdR79.

I hope the wording is clear and we can merge.


## Span Limits <a name="span-collection-limits"></a>

**Status**: [Stable](document-status.md)

See the SDK [Span Limits](trace/sdk.md#span-limits) section for the definition of the limits.

| Name | Description | Default | Notes |
| ------------------------------- | ------------------------------------ | ------- | ----- |
| OTEL_SPAN_ATTRIBUTE_COUNT_LIMIT | Maximum allowed span attribute count | 128 | |
| OTEL_SPAN_EVENT_COUNT_LIMIT | Maximum allowed span event count | 128 | |
| OTEL_SPAN_LINK_COUNT_LIMIT | Maximum allowed span link count | 128 | |
| Name | Description | Default | Notes |
| -------------------------------------- | ------------------------------------ | ------- | ----- |
| OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT | Maximum allowed attribute value size | | Empty value is treated as infinity. Non-integer and negative values are invalid. |
| OTEL_SPAN_ATTRIBUTE_COUNT_LIMIT | Maximum allowed span attribute count | 128 | |
| OTEL_SPAN_EVENT_COUNT_LIMIT | Maximum allowed span event count | 128 | |
| OTEL_SPAN_LINK_COUNT_LIMIT | Maximum allowed span link count | 128 | |

## OTLP Exporter

Expand Down
14 changes: 5 additions & 9 deletions specification/trace/sdk.md
Original file line number Diff line number Diff line change
Expand Up @@ -319,20 +319,16 @@ Optional parameters:

## Span Limits

Erroneous code can add unintended attributes, events, and links to a span. If
these collections are unbounded, they can quickly exhaust available memory,
resulting in crashes that are difficult to recover from safely.
Span attributes MUST adhere to the [common rules of attribute limits](../common/common.md#attribute-limits).

To protect against such errors, SDK Spans MAY discard attributes, links, and
events that would increase the number of elements of each collection beyond
the configured limit.
SDK Spans MAY also discard links and events that would increase the number of
elements of each collection beyond the configured limit.

If the SDK implements the limits above it MUST provide a way to change these
limits, via a configuration to the TracerProvider, by allowing users to
configure individual limits like in the Java example bellow.

The name of the configuration options SHOULD be `AttributeCountLimit`,
`EventCountLimit` and `LinkCountLimit`. The options MAY be bundled in a class,
The name of the configuration options SHOULD be `EventCountLimit` and `LinkCountLimit`. The options MAY be bundled in a class,
which then SHOULD be called `SpanLimits`. Implementations MAY provide additional
configuration such as `AttributePerEventCountLimit` and `AttributePerLinkCountLimit`.
jtmalinowski marked this conversation as resolved.
Show resolved Hide resolved

Expand All @@ -350,7 +346,7 @@ public final class SpanLimits {

**Configurable parameters:**

* `AttributeCountLimit` (Default=128) - Maximum allowed span attribute count;
* [all common options applicable to attributes](../common/common.md#attribute-limits-configuration)
* `EventCountLimit` (Default=128) - Maximum allowed span event count;
* `LinkCountLimit` (Default=128) - Maximum allowed span link count;
* `AttributePerEventCountLimit` (Default=128) - Maximum allowed attribute per span event count;
Expand Down