Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unclear whether Span.flags represents span's parent or own flags #507

Open
Oberon00 opened this issue Sep 19, 2023 · 10 comments
Open

Unclear whether Span.flags represents span's parent or own flags #507

Oberon00 opened this issue Sep 19, 2023 · 10 comments
Assignees

Comments

@Oberon00
Copy link
Member

Oberon00 commented Sep 19, 2023

In the PR description of #503 it sounds like it should be the parent flags, and that is also the only thing that would make sense in the link usage, but from the definition it reads more like the spans own flags (NB: Conceptually IMHO the span does not even have flags on its own and each inject call might result in different flags, not right now probably but it might with future w3c flags, or you might want to inject with sampled=0 despite the span actually being sampled to suppress downstream spans while still having a trace ID for logging, etc.)

Please clarify the description, and also clarify what this means for local child spans.

@tigrannajaryan
Copy link
Member

@jmacd can you take this?

@AlexanderWert
Copy link
Member

AlexanderWert commented Sep 20, 2023

@Oberon00
Aren't we mixing up two things here?

The flags defined on a span (or a link) in opentelemetry-proto is a 32 bit field that conveys information about that span in general.

Now, the least significant 8 bits of that field represent the W3C TraceFlags which are being propagated from the parent (so those bits describe information coming from the parent span, or the linked span in case of a link). But, the remaining bits can be used to describe any kind of information on the current span (for example the is_parent_remote information on that span, or any other span information we might come up with in the future that we want to encode into the flags for efficiency reasons (instead of defining a new explicit field)).

@tigrannajaryan
Copy link
Member

But, the remaining bits can be used to describe any kind of information on the current span

+1. This was precisely my intent.

@Oberon00
Copy link
Member Author

Now, the least significant 8 bits of that field represent the W3C TraceFlags which are being propagated from the parent (so those bits describe information coming from the parent span, or the linked span in case of a link).

Here #484 (comment) @yurishkuro asserted they describe the current span, so definitely this needs to be clarified:

Sampled flag also applies to the current span

@Oberon00
Copy link
Member Author

To be more concrete: Let's say I have an AlwaysOn sampler, ignoring the parent sampled flag. So I create a sampled span with the remote context with sampled=0 as parent. What should the sampled flag in Span.flags be set to?

@yurishkuro
Copy link
Member

imo Span.flags should be flags of the current span. Some of them may refer to parent semantically, like "was my parent remote?", but even that is from the perspective of the current span. It doesn't make sense to me to capture the actual parent's view in the span data for the current span - that would be done in the parent span's data already.

@Oberon00
Copy link
Member Author

Oberon00 commented Sep 20, 2023

What do we do with the flags we added on the link then? It would be inconsistent to store the unmodified incoming flags on the link message but the "span's own flags" on the span message. Also, it might be interesting if the parent was sampled, so if we really want to store the spans own flags in Span.flags, then I'd suggest we would need an additional Span.parent_flags.

(And conceptually, I believe it is best to think of the W3C flags to belong neither to the parent nor to the child, but something in-between)

@jmacd
Copy link
Contributor

jmacd commented Sep 20, 2023

Thank you @Oberon00 for raising this question, in connection with #484.

I agree with the comments above that the flags of the span describe the span, not the parent.

I see what you mean "would be inconsistent", but I think I disagree. In the Span Link, flags consistently means the trace trace flags of the parents not selected.

We are starting to recognize (across the tracing effort) that we need to address situations where there are multiple propagators and multiple eligible parents (for example). The choice of a parent establishes the trace ID and semantically, the random flag ought to be equal between parent and child context. For the sampled flag, as you say, there is something we may wish to know.

Typically, the parent and child sampled flags are the same, since it's what the ParentBased sampler gives. When an non-root span uses one of the other samplers, there can be a change of sampled flag, but this is a rare event. I would suggest that we are most interested knowing in when the span's sampled flag is different than its parent, which is to say to use bit 10 (after #484 claims bits 8 and 9) to indicate the XOR of span-sampled and parent-sampled.

To be explicit, here's why this is useful as a consumer:

  • If the span is sampled and the parent is unsampled, it means the span is a sub-trace root. Consumers should not expect to locate the parent span in the data store. We think this could mean one of the links is sampled.
  • If the span is not sampled and the parent is sampled, typically the span will not be recorded. We recognize that spans can be recorded even when the context they establish are unsampled, for example, because they help complete one of the traces described by a (sampled) Link.

For example, here's how we might record a span with sampled=false and the hypothetical parent_sampled_xor=true:

To, to close this issue, and to unblock #484, I propose we update #484 to answer the question and refer to this discussion. The Span flags unambiguously refer to the span and the trace context it creates. We may follow this with a PR to introduce parent_sampled_xor if we agree on this subject.

@Oberon00
Copy link
Member Author

Why the xor? To save a few bits on the wire when they are mostly zero?

@jmacd
Copy link
Contributor

jmacd commented Sep 21, 2023

Discussed in sampling SIG (today), we rejected the complexity of this idea (to use xor). A simpler plan is to simply allocate one new bit (number 10) to record a copy of the parent sampled bit. My primary reasoning was conceptual -- the important thing to me is when the sampled bit changes from parent to child, so that is the condition I would flag.

In #484, we are proposing to add 2 bits to encode a single property--on the one hand, this feels wasteful of a bit (e.g., maybe we add data format version somewhere, say via the schema_url, to know which bits are valid)--however it serves a second purpose. When we use two exclusive bits in #484, we ensure that 1 bit is always set, so we know the difference between legacy spans (all flags are 0) and new spans.

I was considering adding guidance to the protocol to say that when flags is entirely zero, consumers should make default assumptions. When flags are entirely 0 for a recorded-and-exported span, we should assume the trace was sampled and that its parent was sampled, since that is the common interpretation today.

By suggesting a single bit as the XOR of the sampled flag, we are able to use a single bit to encode the parent-sampled state--i.e., XOR achieves the same thing that using two bits does, in #484, we do not need a second bit to say the parent was unsampled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants