Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API: Sampling Decision may need to update tracestate, not only attributes #66

Closed
SergeyKanzhelev opened this issue May 29, 2019 · 13 comments
Labels
area:api Cross language API specification issue area:sampling Related to trace sampling
Milestone

Comments

@SergeyKanzhelev
Copy link
Member

No description provided.

@bogdandrutu bogdandrutu changed the title Sampling Decision may need to update tracestate, not only attributes API: Sampling Decision may need to update tracestate, not only attributes May 30, 2019
@tedsuo
Copy link
Contributor

tedsuo commented May 30, 2019

@SergeyKanzhelev can you show an example of the API you would like, or the requirements it needs . to have?

Do we want the sampler to modify TraceState directly? Or return it?

(Personally, I think this should be done under the hood, at least if there is only "one right way" to do this.)

@SergeyKanzhelev
Copy link
Member Author

SergeyKanzhelev commented May 30, 2019

Typical example of such need is when sampler need to read and set if missing some tracestate key/value pair that indicates details of sampling decision. Like a sampling rate that was used by this sampler at this point of time.

Currently the decision sampler returns contain a set of attributes associated with the decision. However, those attributes would not be propagated down the downstream services.

  Decision shouldSample(
      @Nullable SpanContext parentContext,
      @Nullable Boolean hasRemoteParent,
      TraceId traceId,
      SpanId spanId,
      String name,
      List<Span> parentLinks);

  interface Decision {
    /** Return sampling decision whether span should be sampled or not. */
    boolean isSampled();

    /** Return tags which will be attached to the span.
     * @return attributes added to span. These attributes should be added to the span only for root
     *     span or when sampling decision {@link #isSampled()} changes from false to true. */
    Map<String, AttributeValue> attributes();
  }

Perhaps Decisions may be extended to also return a new Tracestate class. That tracestate is easy to construct as incoming Tracestate can be used in the TracestateBuilder. And if not null - it should replace Tracestate in SpanContext.

  interface Decision {
    /** Return sampling decision whether span should be sampled or not. */
    boolean isSampled();

    /** Return tags which will be attached to the span.
     * @return attributes added to span. These attributes should be added to the span only for root
     *     span or when sampling decision {@link #isSampled()} changes from false to true. */
    Map<String, AttributeValue> attributes();

    /** If not null - should be used to overwrite the Tracestate in the SpanContext. **/
    Tracestate tracestate()
  }

@SergeyKanzhelev
Copy link
Member Author

SergeyKanzhelev commented May 31, 2019

@bogdandrutu Any opinions on this?

I also want to add that the approach of moving the Tracestate manipulation to separate callback (which is also a common request) has a problem that now sampler that needs Tracestate manipulation will need to exchange information with that callback. And there actually no means to carry any information from one sampler to that callback except, maybe, attributes of a span (and a intermittent call to toSpanData to get that attribute value). Which will complicate a lot of logic

@SergeyKanzhelev SergeyKanzhelev transferred this issue from open-telemetry/opentelemetry-java Jun 3, 2019
@SergeyKanzhelev SergeyKanzhelev added the area:api Cross language API specification issue label Jun 3, 2019
@SergeyKanzhelev SergeyKanzhelev added this to the API revision: 07-2019 milestone Jun 3, 2019
@jmacd
Copy link
Contributor

jmacd commented Jun 3, 2019

Background question: why should the sampling decision output attributes?
In looking at the prototype Java SDK, I see that there are TraceOptions which include the "is_sampled" bit. I feel the sampling rate belongs there. are TraceOptions serialized into Tracestate?

@SergeyKanzhelev
Copy link
Member Author

examples of attribute may be algo used to make a decision and sampling priority. Sampling rate is not a common-enough concept to make it strongly typed or include into TraceOptions. However I believe it's a common scenario to put it into Tracestate.

TraceOptions are serialized into trace-flags: https://w3c.github.io/trace-context/#trace-flags

@jmacd
Copy link
Contributor

jmacd commented Jun 4, 2019

I have a terminology question. I was under the impression that "Attribute" referred to properties of a span, and that "Tags" referred to contextual properties that would propagate along with trace context. In this thread, I believe we're talking about context tags which a sampling algorithm would output for propagation.

I feel somehow that context tags should be reserved for applications to use, not for tracing systems to propagate internal state. I see the sense in allowing some vendor-specific state in Tracestate, but I worry that we're imagining something generally useful, where all we really need is one floating-point number alongside "is_sampled" to propagate a sampling rate, for systems that use fixed-probability sampling. Sampling algorithms may get quite complex, but I feel that for the most part sampling decisions should be recorded along with span data, not propagated via context.

That said, there are plenty of reasons to allow tracing systems to make decisions about propagating attributes from one node to another to facilitate distributed trace joins (e.g., see Pivot Tracing). In that sense, we should disentangle these things:

  1. tracing systems sometimes want to know sampling rates downstream
  2. tracing systems sometimes want to inject tags for themselves to use downstream in sampling decisions.

I'd suggest that an optional sampling-rate field in trace-flags or Tracestate will address (1), and an independent-of-sampling mechanism to allow tracing implementations to decide on the contents of a correlation context will address (2). A tracing system could want (2) even when it's not sampling.

@SergeyKanzhelev
Copy link
Member Author

this issue is specifically about sampler and how sampler will propagate it's decision and extra arguments to downstream components. Correlation Context will be reserved for apps, so Tracestate is a native place for it.

@jmacd
Copy link
Contributor

jmacd commented Jun 4, 2019

As I understand it, Tracestate contains Entries of key:value attributes that permit the sampling implementation to propagate its decision and extra arguments to downstream components. Your proposal above allows the sampling implementation to the span ("Return tags which will be attached to the span."). That's the part I'm confused about, since I believe Span attributes should be reserved for apps (like Correlation context).

@jmacd
Copy link
Contributor

jmacd commented Jun 5, 2019

In other words, I now see that the sampling decision should have a chance to modify Tracestate. But if the sampling decision intends to record any key-value data, I think it should do so in the Tracestate entries.

@rochdev
Copy link
Member

rochdev commented Jun 5, 2019

Just wanted to mention that we have a constraint with client sampling to not do allocations when a trace will be dropped. The way we do it right now is that we take the decision as early as possible (before the root span is even created) and then use a no-op span for any span that is a part of the that trace. The no-op span is a singleton that is reused for all dropped traces. Then propagation is done from the no-op span which will always propagate a decision to drop.

I'm not familiar with TraceState so hopefully I'm not deviating from the scope of this issue.

@jmacd jmacd mentioned this issue Jun 7, 2019
@lizthegrey
Copy link
Member

#180 is now salient, so cross-linking it.

I'm okay with using Tracestate rather than TraceOptions but I feel uncomfortable with the status quo of not propagating the info at all right now.

@iredelmeier iredelmeier added the area:sampling Related to trace sampling label Jul 30, 2019
@SergeyKanzhelev SergeyKanzhelev removed this from the API revision: 07-2019 milestone Sep 27, 2019
@bogdandrutu bogdandrutu added this to the Alpha v0.3 milestone Sep 30, 2019
@jmacd
Copy link
Contributor

jmacd commented Jan 22, 2020

Closing this as it is discussed under open-telemetry/oteps#79

@jmacd jmacd closed this as completed Jan 22, 2020
@Oberon00
Copy link
Member

@jmacd To me it looks like the only relation of this issue to open-telemetry/oteps#79 is that both are in the area of sampling...

Closing this issue probably still makes sense because it seems to be stale. @lizthegrey what do you think? I think this was related to https://www.honeycomb.io/blog/dynamic-sampling-by-example/

carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this issue Oct 21, 2024
* Proposal to separate context propagation from observability

* cleanup description for Extract

* commas

Co-Authored-By: Christian Neumüller <[email protected]>

* Update text/0000-separate-context-propagation.md

Co-Authored-By: Christian Neumüller <[email protected]>

* RFC proposal: A layered approach to data formats

* whitespace

Co-Authored-By: Reiley Yang <[email protected]>

* Capitalization

Co-Authored-By: Reiley Yang <[email protected]>

* whitespace

Co-Authored-By: Reiley Yang <[email protected]>

* CleanBaggage -> ClearBaggage

* move function descriptions to new line

* Add Optional subheader

* cleanup rough edits

* clean up advice on pre-existing context implementations

* Better context descriptions

* remove data format file

* remove git diff message

* improved code sytnax

* stop stuttering

* Update text/0000-separate-context-propagation.md

Co-Authored-By: Reiley Yang <[email protected]>

* spacing

* Refine propagation

* Remove registry concept
* Add explicit chaining

* Add RFC ID number from PR

* remove RFC status line

* slight calrification for GetHTTPExtractor

* add global propagators

* Clean up motivation

* Clean up explanbation intro

* Clarify context types

* Fix ChainHTTPInjector and ChainHTTPExtractor

* typo

* Reference Trace-Context, not just traceparent

* Bagge context cleanup

* stronger language around context access

* Update text/0042-separate-context-propagation.md

Co-Authored-By: Christian Neumüller <[email protected]>

* clean up tradeoffs

* v2.0 of this OTEP

* Update OTEP number for new submission

* remove image file for unused diagram

* Update text/0066-separate-context-propagation.md

Co-Authored-By: alrex <[email protected]>

* Link to Erlang prototype

* whitespace

* ToC

* Update text/0066-separate-context-propagation.md

Co-Authored-By: Tristan Sloughter <[email protected]>

* more context examples

* typo

* Update text/0066-separate-context-propagation.md

Co-Authored-By: Yusuke Tsutsumi <[email protected]>

* Apply suggestions from code review

Co-Authored-By: Yuri Shkuro <[email protected]>

* Renamed aspects to "cross-cutting concerns"

* injectors are a list instead of chained

* clean up API representation

* cleanup examples

* typo

* remove correlations from proposal

* Update text/0066-separate-context-propagation.md

Co-Authored-By: Yuri Shkuro <[email protected]>

* remove the term "chaining"

* remove the terms  upstream and downstream

* improved architecturural explanation

* whitespace

* update link to Go prototype

* Removed the Baggage API; replaced it with Correlations.

* Fix the intro paragraph for Correlations

* git merges ate my homework
* some sentences were out of order

* Clarify that correltions must be propagated

* Clarify risks

* removed extra header

* Clarify definition of aspect-oriented programming

* Fix RemoveCorrelation

* Spelling

* Update text/0066-separate-context-propagation.md

Co-Authored-By: Sergey Kanzhelev <[email protected]>

* Clarifying details

* Update python prototype

* Update text/0066-separate-context-propagation.md

Co-Authored-By: Tyler Yahn <[email protected]>

* Update text/0066-separate-context-propagation.md

Co-Authored-By: Tyler Yahn <[email protected]>

* Update text/0066-separate-context-propagation.md

Co-Authored-By: Tyler Yahn <[email protected]>

* Clarify that the APIs and example code are not meant as final.  Add C# prototype

* Inject returns headers, not context

* Update text/0066-separate-context-propagation.md

Co-Authored-By: Mauricio Vásquez <[email protected]>

* spelling

* remove spurious go comment

Co-authored-by: Christian Neumüller <[email protected]>
Co-authored-by: Reiley Yang <[email protected]>
Co-authored-by: alrex <[email protected]>
Co-authored-by: Tristan Sloughter <[email protected]>
Co-authored-by: Yusuke Tsutsumi <[email protected]>
Co-authored-by: Yuri Shkuro <[email protected]>
Co-authored-by: Sergey Kanzhelev <[email protected]>
Co-authored-by: Tyler Yahn <[email protected]>
Co-authored-by: Mauricio Vásquez <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:api Cross language API specification issue area:sampling Related to trace sampling
Projects
None yet
Development

No branches or pull requests

8 participants