Allow multiple tracestate entries for one vendor #137

discostu105 · 2018-07-27T10:48:20Z

Use Case

Multi-tenant tracing.

Currently, TraceContext elegantly solves the problem of different tracing systems that interact with each other. Via tracestate, "foreign" vendor information is transported throughout the trace. This is great, but a system traced by only one vendor can have a similar problem. If services are monitored by different tenants/customers of that vendor, the trace breaks (assuming the vendor relies on tracestate information).

Concrete example
A 3-tier trace where the first and the last tier are monitored by the same tenant (A) and the in-between tier is monitored by a different tenant (B), with both tenants using the same APM vendor.

(A) -> (B) -> (A)

1. tracestate="vendor1=tenant:0xaaa;importantData:0xfff"
2. tracestate="vendor1=tenant:0xbbb;importantData:0xeee"
3. tracestate="vendor1=tenant:0xaaa;importantData:0xccc"  // trace for tenant A is broken

Since, with the current design, a vendor can use only one key in tracestate, tier B overwrites tenant A's entry and the trace for tenant A breaks.
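
The breakage can be reproduced with a minimal Python sketch. The helper names `parse_tracestate`, `format_tracestate` and `propagate` are hypothetical, and the value format is the illustrative one from the example above. The only behaviour assumed is that every tier writes its entry under the single key `vendor1`.

```python
from typing import Dict


def parse_tracestate(header: str) -> Dict[str, str]:
    """Parse a tracestate header into an ordered key -> value mapping."""
    entries: Dict[str, str] = {}
    for member in header.split(","):
        member = member.strip()
        if not member:
            continue
        key, _, value = member.partition("=")
        entries[key] = value
    return entries


def format_tracestate(entries: Dict[str, str]) -> str:
    """Serialize the mapping back into a header value."""
    return ",".join(f"{key}={value}" for key, value in entries.items())


def propagate(header: str, key: str, value: str) -> str:
    """Hypothetical helper: write this tier's entry under its one vendor key,
    replacing any earlier entry with that key and moving it to the front."""
    entries = parse_tracestate(header)
    entries.pop(key, None)
    return format_tracestate({key: value, **entries})


hop1 = propagate("", "vendor1", "tenant:0xaaa;importantData:0xfff")
hop2 = propagate(hop1, "vendor1", "tenant:0xbbb;importantData:0xeee")

# Tenant A's data is no longer in the header that reaches the third tier.
print(hop2)
print("0xaaa" in hop2)
```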

Possible Solutions

Vendor needs to encode multi-tenant support into a single tracestate entry.

1. tracestate="vendor1=(tenant:0xaaa;importantData:0xfff)"
2. tracestate="vendor1=(tenant:0xbbb;importantData:0xeee)|(tenant:0xaaa;importantData:0xfff)"
3. tracestate="vendor1=(tenant:0xaaa;importantData:0xfff)|(tenant:0xbbb;importantData:0xeee)"
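
A rough sketch of what each vendor would have to implement for this option, assuming the illustrative `(...)|(...)` value syntax shown above (it is not defined by any spec, and `parse_multi_tenant_value` is a hypothetical name):

```python
from typing import Dict, List


def parse_multi_tenant_value(value: str) -> List[Dict[str, str]]:
    """Split one tracestate value into per-tenant groups of fields.

    Assumes groups are wrapped in parentheses and separated by "|",
    and fields inside a group look like "name:data" separated by ";".
    """
    groups = []
    for group in value.split("|"):
        group = group.strip().strip("()")
        fields = {}
        for field in group.split(";"):
            name, _, data = field.partition(":")
            fields[name] = data
        groups.append(fields)
    return groups


value = "(tenant:0xbbb;importantData:0xeee)|(tenant:0xaaa;importantData:0xfff)"
groups = parse_multi_tenant_value(value)

# Tenant A picks its own group out of the vendor's single entry.
tenant_a = next(g for g in groups if g["tenant"] == "0xaaa")
print(tenant_a["importantData"])
```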

Vendor is allowed to have multiple tracestate entries with the same key.

1. tracestate="vendor1=tenant:0xaaa;importantData:0xfff"
2. tracestate="vendor1=tenant:0xbbb;importantData:0xeee, vendor1=tenant:0xaaa;importantData:0xfff"
3. tracestate="vendor1=tenant:0xaaa;importantData:0xfff, vendor1=tenant:0xbbb;importantData:0xeee"

Vendor is allowed to encode dynamic info into the tracestate key.

1. tracestate="vendor1-0xaaa=importantData:0xfff"
2. tracestate="vendor1-0xbbb=importantData:0xeee, vendor1-0xaaa=importantData:0xfff"
3. tracestate="vendor1-0xaaa=importantData:0xfff, vendor1-0xbbb=importantData:0xeee"
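
A minimal sketch of this option, assuming the `<vendor>-<tenant>` key layout from the example. The function names `update_tracestate` and `lookup` are hypothetical. Each tier only replaces the entry for its own tenant and moves it to the front, so entries of other tenants survive.

```python
from typing import Optional


def update_tracestate(header: str, vendor: str, tenant: str, value: str) -> str:
    """Put this tenant's entry first and keep all other entries in order."""
    key = f"{vendor}-{tenant}"
    members = [m.strip() for m in header.split(",") if m.strip()]
    kept = [m for m in members if m.partition("=")[0] != key]
    return ", ".join([f"{key}={value}"] + kept)


def lookup(header: str, vendor: str, tenant: str) -> Optional[str]:
    """Return the value stored for this tenant, or None if there is none."""
    key = f"{vendor}-{tenant}"
    for member in header.split(","):
        k, _, v = member.strip().partition("=")
        if k == key:
            return v
    return None


hop1 = update_tracestate("", "vendor1", "0xaaa", "importantData:0xfff")
hop2 = update_tracestate(hop1, "vendor1", "0xbbb", "importantData:0xeee")

# At the third tier, tenant A still finds the data it wrote at the first tier.
print(lookup(hop2, "vendor1", "0xaaa"))

hop3 = update_tracestate(hop2, "vendor1", "0xaaa", "importantData:0xfff")
print(hop3)
```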

Thoughts

Solution 1 would not need any change to the current tracestate spec. But it would require every vendor to solve a problem that tracestate already solves elegantly (for multiple vendors).
Solution 2: The current spec says "only one entry per key". This would need to be changed. It might have implications for the data type we use for the HTTP header. It also bears the risk of implementations getting this wrong (using a Map instead of a MultiMap).
Solution 3: Could be a legit solution. We'd need to define whether keys can be arbitrary, or whether there is still a defined format (<vendor>-<dynamicdata>) where <vendor> should still be a registered name (Provide vendor/open source product name register for tracestate #58).
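
The Map-versus-MultiMap risk mentioned for Solution 2 can be illustrated with a short Python sketch. `parse_members` is a hypothetical helper, and the header is the second hop from the Solution 2 example.

```python
from collections import defaultdict
from typing import List, Tuple


def parse_members(header: str) -> List[Tuple[str, str]]:
    """Split a tracestate header into (key, value) pairs, keeping duplicates."""
    pairs = []
    for member in header.split(","):
        member = member.strip()
        if member:
            key, _, value = member.partition("=")
            pairs.append((key, value))
    return pairs


header = (
    "vendor1=tenant:0xbbb;importantData:0xeee, "
    "vendor1=tenant:0xaaa;importantData:0xfff"
)
pairs = parse_members(header)

# A plain map keeps only the last value seen for a duplicated key.
as_map = dict(pairs)

# A multimap (here: key -> list of values) keeps every entry.
as_multimap = defaultdict(list)
for key, value in pairs:
    as_multimap[key].append(value)

print(len(as_map["vendor1"].split(";")), len(as_multimap["vendor1"]))
```

An implementation that stores entries in a plain map would silently drop tenant B's entry here, which is exactly the failure mode described above.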

There's been an issue by @SergeyKanzhelev , which I believe might overlap with this and maybe serves the same use-case: #117.

codefromthecrypt · 2018-07-27T10:54:06Z

why do the keys have to be registered names? When described earlier I would have expected folks to have multiple keys for a given format for use cases like this. A "vendor" key would only be used when there is a consistent single-tenant environment. If it follows, isn't there an option 0 which is to simply use different keys but the same format, regardless of which vendor that format is? The implication is that there is a mapping (ex configuration management) between keys and how to parse them. I'd expect a configuration management problem to exist independent of this anyway. Regardless, we should enumerate the choices, I think option 0 is one, do you agree?

codefromthecrypt · 2018-07-27T11:08:02Z

another monkey wrench for you is multiple concurrent tenants. The feature is to copy the same data to another place, and propagate hints about this, e.g. how long to do so. It doesn't matter so much if copying the trace is for the purpose of a second tenant (where primary is "admin" for example), or just a data retention related detail, or any other reason. Such a feature requires another tracestate entry or a different header to achieve, unless there is strict coordination between it and the primary trace state. Use cases I've heard so far imply there is no strict implication that the secondary observers are tightly coupled to the primary. For example "edge" could have a value which includes how many hops to retain a copy of the traceparent data into that namespace. It could also not read the tracestate at all, simply tracking propagated info such as service names in headers. Main idea is that there are probably many cases where a formal and constrained registry of names will limit the applicability to brownfield sites who have non-vendor code present. Whatever we do about tenancy, I think it is probably safe to assume that unregistered tracestate names will be used for the convenience of sites.

discostu105 · 2018-07-27T12:14:40Z

@adriancole do I understand correctly that your suggestion would basically be Solution 3, but without the requirement to have this in any defined format? So, any vendor can choose arbitrary keys?

IMHO there is no good way to enforce "registered" keys (or key-prefixes) anyway, so I'm not sure if it should even go into the standard. However, I believe it would make sense to share a list of keys (or key-prefixes) to avoid potential key-clashes.

codefromthecrypt · 2018-07-27T12:57:16Z

I agree with all you said. That we should avoid thrashing on well-known, but understand people will use different keys and knowing this allows flexibility to specify less.

codefromthecrypt · 2018-07-27T12:58:11Z

sorry meant to say "should avoid clashing on well-known" (though we should also avoid thrashing :P)

yurishkuro · 2018-07-27T22:04:48Z

@discostu105 why does the fact that (B) is a different tenant require changes to the propagated context? Your example of "importantData" is very vague, it's hard to draw any conclusions or generalizations.

discostu105 · 2018-07-30T07:41:20Z

@yurishkuro sure, let me try to clarify.

At Dynatrace, we're currently propagating more information than what's currently possible via traceparent. Some examples of fields we propagate:

agentId, tagId (these combined could arguably be merged into spanId)
tenantId (identifies the tenant/customer)
clusterId (identifies the backend)
serverId (id of the server-node inside the cluster that is supposed to correlate this trace)
...

A more detailed description is here: https://docs.google.com/document/d/1O40xdY0pLVQG4ChlhSsfU7m6HRCH1GJHNhLLD_Ko8tg/edit

The point is that we have information we need to propagate that does not fit into traceparent. To do that across multiple tenants/clusters, we would need one of the above solutions.

yurishkuro · 2018-07-30T23:25:29Z

and all that information needs to be present in the tracestate in case the control returns to the first tenant?

discostu105 · 2018-07-31T06:04:10Z

yes. otherwise we cannot (efficiently) correlate the two tiers of tenant A.

The goal is that tenant A sees a connection between hops (1.) -> (3.). The second hop (tenant B) should be transparent. Of course it's clear from tracestate that a "foreign" tenant was in-between and we could employ logic to indicate that in our analysis, but it's important for us that the (1.)->(3.) connection is there.

discostu105 · 2018-08-07T00:07:59Z

After discussion in Seattle:

To make parsing easier, we came up with the idea to put the vendor name before the "=" sign and prefix it with a "starting-character" ("@").
examples:
- <dynamicdata>@vendor1=<value>
- fw5-29a-3039@dt=1;FEC354CD6;1;2;727c
Searching for @dt= would be more robust for parsing (searching the string), compared to the search for dt-. (or is it?)
There are no objections against using tracestate like this.
@AloisReitbauer suggested it could make sense to add this as a possible usage example of tracestate, in case other vendors have that same problem to solve.
The exact format of the tracestate key does not need to be specified. If vendors want to interop, they need to exchange their key format anyway to be able to call each other for querying spans.
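
As a rough illustration of the `<dynamicdata>@<vendor>` key layout discussed above, here is a minimal Python sketch. `find_vendor_entries` is a hypothetical helper, and the second header is made up for the comparison. It splits the header into members and checks each key, rather than searching the raw string for `@dt=`, so a value that happens to contain `@dt=` is not mistaken for a key.

```python
from typing import List, Tuple


def find_vendor_entries(header: str, vendor: str) -> List[Tuple[str, str]]:
    """Return (dynamicdata, value) for every key of the form <dynamicdata>@<vendor>."""
    suffix = "@" + vendor
    found = []
    for member in header.split(","):
        key, sep, value = member.strip().partition("=")
        if sep and key.endswith(suffix):
            found.append((key[: -len(suffix)], value))
    return found


header = "fw5-29a-3039@dt=1;FEC354CD6;1;2;727c,other=opaque"
print(find_vendor_entries(header, "dt"))

# A plain substring search would also match inside another vendor's value.
tricky = "other=x@dt=1"
print("@dt=" in tricky, find_vendor_entries(tricky, "dt"))
```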

codefromthecrypt · 2018-08-08T00:37:11Z

I don't know if really we can deduce there are no objections due to the lack of participation. I think only a few parties are present at the workshop.. just keep this in mind.

Widening the character set and suggesting a pattern seems a bit before the fact. Personally, I don't like adding @ as it is a distraction.

What is hard for me to understand now, is this.. currently you have one header right? How is it that you can use x-dynatrace yet not be able to use one state entry? Can you please show how what you are asking to do works now (with normal headers) and how that can't be done in one trace state?

codefromthecrypt · 2018-08-08T00:39:35Z

Ex it would seem you would have exactly the same problem today, perhaps I missed you calling it out, but how are you able to have multiple tenants today without clobbering and somehow you can't apply this to tracestate? Or is the point that you would like to today or what? I think we should really focus the use case before a: widening characters sets, and much more than encouraging a specific practice. Ex this whole idea seems experimental in how it is described, which is fine, just not sure we should be documenting things that might thrash in practice.

discostu105 · 2018-08-08T05:16:31Z

Adrian, you are right. Dynatrace currently uses just one header (X-dynatrace), and multi-tenant tracing currently does not work for us (we re-start the trace). This is why we are seeking a solution to this problem. Tracestate is a natural fit for this problem. As discussed earlier, putting dynamic information into the key does not violate the spec, so this shall be the solution to go with.

Allowing the @ character in tracestate key names is of course open to discussion. At the workshop in Seattle it was the solution we jointly came up with. There was general interest in the use-case and the solution for it, although no other vendor has attempted this solution yet. There was consensus that the use-case is valid enough that this solution shall be documented as a possible usage of tracestate.

I also agree with you that on GitHub it's sometimes not very transparent who actually signed off on and agreed with something (a PR like this). Even more so when participants were on-site and just verbally agreed.

codefromthecrypt · 2018-08-08T05:29:29Z

Thanks for the transparency! appreciated. Also, at least I feel a little less crazy as I didn't know that we were doing something new here. Makes more sense now.

SergeyKanzhelev · 2018-08-09T21:34:29Z

this is explained now. closing the issue

SergeyKanzhelev added this to the 3. FPWD milestone Aug 6, 2018

SergeyKanzhelev mentioned this issue Aug 7, 2018

Allow @ in tracestate key names for multi-tenant vendors #153

Merged

SergeyKanzhelev closed this as completed Aug 9, 2018
