Capture all request and response headers #1061

pavolloffay · 2020-10-06T16:40:16Z

What are you trying to achieve?

I want to capture selected/all HTTP request and response headers.

The instrumentation libraries should be configurable to capture selected or all HTTP request/response headers.

To move this forward we probably need to define:

- data semantics, e.g. http.request.header.<header-name> / http.response.header.<header-name>
- configuration for instrumentations, e.g. OTEL_INSTRUMENTATION_CAPTURE_REQUEST_HEADERS=<name1,name2> and OTEL_INSTRUMENTATION_CAPTURE_REQUEST_HEADERS_ALL=bool
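To make the configuration idea concrete, here is a minimal sketch of how an instrumentation might read the two variables proposed above. The variable names are only the ones suggested in this issue, not an existing setting, and `load_capture_config` / `should_capture` are hypothetical helper names.

```python
import os

# Proposed (not standardized) variable names taken from this issue.
ENV_NAMES = "OTEL_INSTRUMENTATION_CAPTURE_REQUEST_HEADERS"
ENV_ALL = "OTEL_INSTRUMENTATION_CAPTURE_REQUEST_HEADERS_ALL"


def load_capture_config(environ=os.environ):
    """Return (capture_all, names) parsed from the proposed variables."""
    capture_all = environ.get(ENV_ALL, "").strip().lower() == "true"
    raw = environ.get(ENV_NAMES, "")
    # Header names are case-insensitive, so the allow-list is stored lowercased.
    names = {n.strip().lower() for n in raw.split(",") if n.strip()}
    return capture_all, names


def should_capture(header_name, capture_all, names):
    """Decide whether a header with this name should be recorded."""
    return capture_all or header_name.lower() in names
```

The allow-list is matched case-insensitively here; whether the recorded attribute key keeps the original casing is a separate question.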

Additional context.

Related issues:


Oberon00 · 2020-10-06T16:50:15Z

Actually, what you suggest for semantic conventions sounds great already. We just have to specify the details, e.g. "header-name" should be used as-is with any dashes and casing preserved (e.g. http.response.header.Content-Type) and values should be string arrays (because headers with the same name can appear multiple times).
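A small sketch of that mapping, assuming the instrumentation sees headers as (name, value) pairs. `headers_to_attributes` is a hypothetical helper, not part of any SDK.

```python
def headers_to_attributes(headers, prefix="http.request.header."):
    """Map (name, value) pairs to attribute keys with string-list values.

    Names keep their dashes and casing; repeated names collect into one list.
    """
    attributes = {}
    for name, value in headers:
        attributes.setdefault(prefix + name, []).append(value)
    return attributes


attrs = headers_to_attributes(
    [("Accept", "text/html"), ("Accept", "application/json"), ("X-Request-Id", "abc")]
)
```

One consequence of preserving casing: "Accept" and "accept" in the same request would end up under two different attribute keys.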

For the instrumentation conventions, that also sounds reasonable in general, but I don't think we have anything in the spec already that only affects instrumentations and not even the SDK, so I'm not sure where we would put them. It may be best to split these into a separate issue.

Oberon00 · 2021-02-22T15:04:31Z

I created open-telemetry/build-tools#29 to add support in the semantic convention generator for this, but we can go ahead and write the markdown by hand for now.

mateuszrzeszutek · 2021-08-30T12:26:22Z

Hey,
We (Splunk) are very interested in introducing this to HTTP semantic conventions. Before I file a PR I wanted to ask one thing: why "header-name" should be used as-is with [...] casing preserved? HTTP headers have case-insensitive names, wouldn't it make it a bit easier on the user to have them all lowercased?

Oberon00 · 2021-08-30T13:26:00Z

> I wanted to ask one thing: why "header-name" should be used as-is with [...] casing preserved? HTTP headers have case-insensitive names, wouldn't it make it a bit easier on the user to have them all lowercased?

I simply suggested what is most information-preserving. If you lowercase everything, it may be simpler but it loses information.
I would rather do such normalization (lowercasing) on the backend/collector if desired.

While HTTP headers ought to be case-insensitive, there might be issues with hashing/signing or bad server implementations that only work if the headers use canonical casing. Since one use case of telemetry data is to debug issues like "why did this HTTP request fail but that other one didn't", such subtle differences in input may be relevant. Differences in casing may also hint at different client implementations, even if these clients send no/same User-Agent headers.
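As a rough illustration of the backend-side normalization mentioned above, assuming attributes arrive as a plain dict with string-list header values. This is a sketch of the idea only, not how the collector does or would implement it, and `lowercase_header_keys` is a hypothetical name.

```python
HEADER_PREFIXES = ("http.request.header.", "http.response.header.")


def lowercase_header_keys(attributes):
    """Lowercase the header-name part of header attribute keys."""
    normalized = {}
    for key, values in attributes.items():
        for prefix in HEADER_PREFIXES:
            if key.startswith(prefix):
                new_key = prefix + key[len(prefix):].lower()
                # Keys that differ only by casing collapse; keep all values.
                normalized.setdefault(new_key, []).extend(values)
                break
        else:
            # Non-header attributes pass through unchanged.
            normalized[key] = values
    return normalized
```

Doing this late keeps the original casing available up to the point where someone decides it is no longer needed.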

mateuszrzeszutek · 2021-08-30T14:10:55Z

Got it -- that makes sense, thanks for the explanation!

pavolloffay · 2021-08-30T14:37:23Z

Lowercasing works as well; we have actually added lowercasing to all our agents.

I agree that agents can capture data in the original form and lowercasing can be done in the ingestion pipeline (e.g. collector). I am not sure whether the OTEL collector can easily change attribute keys without a performance impact.

yurishkuro · 2021-08-30T15:37:51Z

> and values should be string arrays

I'd say they should be string arrays if the instrumentation has access to them in such form. If it only has access to comma-separated string, perhaps just output as a single string?

yurishkuro · 2021-08-30T15:39:00Z

> http.request.header.<header-name>

another option would be to use nested structures

mateuszrzeszutek · 2021-08-30T16:00:36Z

> another option would be to use nested structures

What do you mean by "nested structures"?

> I'd say they should be string arrays if the instrumentation has access to them in such form. If it only has access to comma-separated string, perhaps just output as a single string?

The type would still be string[], but if instrumentation does not have access to the list/array of header values it just sets a single-item array -- did I understand that correctly?

Oberon00 · 2021-08-30T16:12:35Z

> I'd say they should be string arrays if the instrumentation has access to them in such form. If it only has access to comma-separated string, perhaps just output as a single string?

Theoretically allowing both strings and string arrays shouldn't be a problem but in practice it is, because some SIGs, e.g. Java, have types encoded in their generated semantic conventions, and "union types" pose a problem. That's also the reason we have both process.command_line (string) and process.command_args (string array), see #831.

Putting a comma-separated string as a single array item should be allowed though (or mixing split and concatenated headers in the same array).
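In code, that rule could look roughly like the sketch below, where `as_header_values` is a hypothetical helper. It deliberately does not split on commas, since some header values legitimately contain them (a Date value such as "Tue, 15 Nov 1994 08:12:31 GMT", for instance).

```python
def as_header_values(raw):
    """Return header values as a list of strings (the string[] type)."""
    if isinstance(raw, str):
        # Only a combined string is available: keep it as one item, unsplit.
        return [raw]
    # Already a sequence of values: copy it into a list of strings.
    return [str(v) for v in raw]
```

Either way the attribute type stays a string array, so generated, typed semantic convention constants do not need a union type.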

Oberon00 · 2021-08-30T16:15:16Z

> another option would be to use nested structures

These are only allowed for log attributes, not tracing or metrics.
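For comparison, a sketch of how a nested header map relates to the flat, prefix-encoded form discussed in this issue. The nested shape and the `flatten_headers` helper are assumptions for illustration only.

```python
def flatten_headers(nested, prefix="http.request.header."):
    """Encode a {header-name: [values]} map into flat, prefixed attribute keys."""
    return {prefix + name: list(values) for name, values in nested.items()}


nested = {"Accept": ["text/html"], "X-Request-Id": ["abc"]}
flat = flatten_headers(nested)
```

Both forms carry the same information; the flat one only needs string-array attribute values.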

yurishkuro · 2021-08-30T16:19:04Z

> > another option would be to use nested structures
>
> These are only allowed for log attributes, not tracing or metrics.

I understand why for metrics, but for traces this does not sound like a reasonable or necessary restriction.

Oberon00 · 2021-08-30T16:26:14Z

Since the choice between nesting and encoding in the name is more of a detail for the issue at hand, let's keep further discussion on allowing nested structures in #376.

pavolloffay added the spec:trace label (Related to the specification/trace directory) Oct 6, 2020

Oberon00 added the area:semantic-conventions (Related to semantic conventions) and release:after-ga (Not required before GA release, and not going to work on before GA) labels Oct 6, 2020

Oberon00 mentioned this issue Oct 6, 2020: Capture request and response bodies open-telemetry/semantic-conventions#857 (Open)

Oberon00 mentioned this issue Feb 22, 2021: Support for "map" conventions based on prefixes open-telemetry/build-tools#29 (Closed)

mateuszrzeszutek mentioned this issue Aug 30, 2021: Capture HTTP request/response headers as span attributes #1898 (Merged)

mateuszrzeszutek mentioned this issue Sep 10, 2021: Support map values and nested values for attributes #376 (Closed)

yurishkuro closed this as completed in #1898 Sep 21, 2021
