Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Context interaction in different packages like trace/baggage is unclear #1019

Closed
bogdandrutu opened this issue Sep 25, 2020 · 30 comments · Fixed by #1063
Closed

Context interaction in different packages like trace/baggage is unclear #1019

bogdandrutu opened this issue Sep 25, 2020 · 30 comments · Fixed by #1063
Assignees
Labels
priority:p1 Highest priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:baggage Related to the specification/baggage directory spec:context Related to the specification/context directory spec:trace Related to the specification/trace directory

Comments

@bogdandrutu
Copy link
Member

PS: This is written as a last moment issue before GA, and may not be 100% cleared. But it raises a concern that we cannot GA without understanding and fixing it.

What are you trying to achieve?

Simplify API, and consistency across packages. Issue #455 was closed without fixing this problem.

What did you expect to see?

Fewer APIs that do the same thing.

Additional context

Right now, as defined by the specs there are multiple ways to install a Span in the active Context, and some require a Tracer instance, or some require the global Context instance:

Besides the fact that currently there are multiple ways to get the current Context (or current Span), the specs does not provide a way to extract a Span form a given Context object.

Proposal

The current proposal is to separate the Context interaction into two parts:

  1. Adding/Extracting a new entry to a given Context instance.
    This functionality is important for Propagators and for Tracer implementations. User can also use this functionality when manually propagating a Context object.
    This is a per package concern, trace and baggage must expose a way to achieve this. It is recommended to not directly expose the Context.Key, see here.
  2. Interaction with automatic Context propagation mechanism (such as a thread local implementation).
    This functionality is important only for languages that support this, and it is important to be consistent across implementations, as well as be able to do propagation without requiring a Tracer or BaggageManager.
    Tracer implementation may need access to this mechanism to implement the "implicit parent from context" functionality.

The proposed solution is:

  • For "Adding/Extracting a new entry to a given Context instance.":
    • Change the current "Trace Context Utilities" to expose two new methods WithSpan and FromContext (names to be decided).
  • For "Interaction with automatic Context propagation mechanism.":
    • Remove all current public helpers from Tracer and Trace Context Utilities that interact with "current Context".
      Things like Get the currently active span and Set the currently active span should be removed.
    • Initially user will have to call into 2 apis to Get the currently active span and Set the currently active span: the context API to get the active Context and the helper to extract/add a Span to the Context

Alternative Consideration

Don't do anything

This is not going to work because of missing functionality and user confusion by having multiple ways to achieve the same thing.

Try to just add "Adding/Extracting a new entry to a given Context instance."

This may help with the missing functionality that currently is not defined in the specs, but will increase the user's confusion when it comes to interacting with the Context and otel APIs.

Keep provide helper functionality like Get the currently active span and Set the currently active span

The reason to remove this, is because currently they are defined in two places (with a may) which makes things inconsistent across different languages, and also proves that we do not have a good understanding of where this helpers should leave. Because of the imminent GA deadline, I suggest to remove this and add them later (adding new functionality is backwards compatible).

@bogdandrutu bogdandrutu added spec:context Related to the specification/context directory spec:baggage Related to the specification/baggage directory spec:trace Related to the specification/trace directory priority:p1 Highest priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA labels Sep 25, 2020
@tsloughter
Copy link
Member

Right now, as defined by the specs there are multiple ways to install a Span in the active Context, and some require a Tracer instance, or some require the global Context instance:

https:/open-telemetry/opentelemetry-specification/blob/master/specification/trace/api.md#tracing-context-utilities
https:/open-telemetry/opentelemetry-specification/blob/master/specification/trace/api.md#tracer-operations
https:/open-telemetry/opentelemetry-specification/blob/master/specification/context/context.md#optional-global-operations

Only the first two are about installing a Span in the Context.

I assumed the operations were removed from "tracer-operations" when they were added to the new "tracing-context-utilities", maybe it is just a mixup that they still exist in the "tracer-operations"?

If it is removed from "tracer-operations" then there is only 1 documented way to do it.

@bogdandrutu
Copy link
Member Author

Only the first two are about installing a Span in the Context.

They are to install a Span into an "active/current Context", but we still lack a ctx = WithSpan(Context, Span) operation, that can be used by the Propagator to add the Span to a given Context instance.

If it is removed from "tracer-operations" then there is only 1 documented way to do it.

That may be good, but was thinking that we can postpone this and think more, once we have APIs to interact with a Context instance, we can ask users to combine that with the GetCurrentContext API from Context api to get the current Span, this may be too ugly but still a valid option.

@tigrannajaryan
Copy link
Member

@bogdandrutu I advise you to add examples of how you would use the proposed API. Without examples it is difficult to fully understand the proposal. I can assume/imagine how it would be used, but my assumptions may be wrong.

Examples will help facilitate the discussion. Show a few use cases.

@tsloughter
Copy link
Member

I don't think there should be a WithSpan used by propagators. The propagated trace context is for a remote span that shouldn't be the active span in a local Context. In Erlang we do:

-spec w3c_propagators() -> {otel_propagation:http_extractor(), otel_propagation:http_injector()}.
w3c_propagators() ->
    ToText = fun otel_propagation_http_w3c:inject/2,
    FromText = fun otel_propagation_http_w3c:extract/2,
    Injector = otel_ctx:http_injector(?TRACER_CTX, ToText),
    Extractor = otel_ctx:http_extractor(?EXTERNAL_SPAN_CTX, FromText),
    {Extractor, Injector}.

So for injecting a different Context key is used (?TRACER_CTX) than for extracting (?EXTERNAL_SPAN_CTX), the extracted span is then used in the StartSpan operation to be a parent.

@bogdandrutu
Copy link
Member Author

So for injecting a different Context key is used (?TRACER_CTX) than for extracting (?EXTERNAL_SPAN_CTX), the extracted span is then used in the StartSpan operation to be a parent.

@tsloughter (I told you to not post more erlang code because I find bugs):

@bogdandrutu
Copy link
Member Author

bogdandrutu commented Sep 25, 2020

@tigrannajaryan the current Java API example (we still use grpc.Context there but that is in process of changing, so the CurrentContext is not present but it is based on the specs) based on the specs:

public interface Tracer {
  Span getCurrentSpan();
  // Scope is just an auto-closable instance.
  Scope withSpan(Span span);
}

public final class TracingContextUtils {
  private static final Context.Key<Span> CONTEXT_SPAN_KEY =
      Context.key("opentelemetry-trace-span-key");

  // THIS IS NOT IN THE SPECS NOW
  public static Context withSpan(Span span, Context context) {
    return context.withValue(CONTEXT_SPAN_KEY, span);
  }

  public static Span getCurrentSpan() {
    return getSpan(Context.current());
  }

  // THIS IS NOT IN THE SPECS NOW
  public static Span getSpan(Context context) {
    Span span = CONTEXT_SPAN_KEY.get(context);
    return span == null ? DefaultSpan.getInvalid() : span;
  }

  public static Scope currentContextWith(Span span) {
    return ContextUtils.withScopedContext(withSpan(span, Context.current()));
  }

  private TracingContextUtils() {}
}

public final class CurrentContext {
  public static Span getCurrent() {
    ...
  }

  public static Token attach() {
     ...
  }

  public static detach(t Token) {
     ...
  }

  public static Scope withContext(Context context);
}

So based on the current API (defined by the specs), I can get the active Span in 3 ways:

Span current = TracingContextUtils.getCurrentSpan()
Span current = tracer.getCurrentSpan();
Span current = TracingContextUtils.getSpan(CurrentContext.getCurrent());

EDITED

Indeed as @tsloughter pointed the TracingContextUtils.getSpan and TracingContextUtils.withSpan can be optional if we expose the TracingContextUtils.CONTEXT_SPAN_KEY, but as the specs says we should not do that, and even if that is the case the third way to get the active Span becomes Span current = CurrentContext.getCurrent().get(TracingContextUtils.CONTEXT_SPAN_KEY);

@tsloughter
Copy link
Member

@bogdandrutu the keys aren't exposed to the user, they are macros internal to the module only.

@tsloughter
Copy link
Member

And it looks like 2 keys can still be used, it is just named PropagatedSpan now.

@tigrannajaryan
Copy link
Member

tigrannajaryan commented Sep 25, 2020

@bogdandrutu if I read your proposal correctly you suggest to remove the first 2 ways and only allow the last one.

That looks like a reasonable, composable API to me.

I would still want to see API usage examples. In my mind there are a few important use cases:

  1. Creating a span from an incoming request.
  2. Creating a span in a client instrumentation library in the current execution context (that potentially may be from an incoming request or may already have a parent span in the current context).
  3. Creating a span in a server instrumentation library from an incoming request, then creating a child span in application's request handler code.
  4. Async processing, i.e. creating a child span in a different thread or execution unit but in the same execution context of the same process.

There are probably more.

It would help a lot to see what is the shape of the code for these uses cases with your proposed API.

I generally agree with you that it is bad to have 3 slightly different ways to achieve the same thing.

@bogdandrutu
Copy link
Member Author

@tsloughter:

  • For the keys not being exposed: how can someone implement a propagator outside that module? We need to allow that, so you either expose the keys or helper methods as proposed by me.
  • For the second key: the PropagationSpan is a proper Span so no need to have a second key for that, same key can be used. Also that removes potential bugs when the two values stored go out of sync.

@tsloughter
Copy link
Member

The keys don't have to be used directly by the user to implement a propagator (at least in this implementation). The keys are used by otel_ctx:http_extractor so the user only has to provide a module that implements extract. This reminds me to add that function :), right now w3c and b3 duplicate each others code because I hadn't decided on exposing the keys through functions or doing this function based method.

But damn, didn't read that PR right... Don't think I like a new form of Span for this, but I guess easy enough to switch to.

It feels like a lot of stuff has been merged quickly in the last couple days in order to meet the freeze even though it changes stuff that had gone through a lot of discussion already, without similar discussion time.

@bogdandrutu
Copy link
Member Author

@tsloughter completely agree with your assessment regarding merging things. I was keep saying this in a lot of meetings that I am worried about.

Regarding otel_ctx:http_extractor (if I understand correctly is a helper function for http requests) that may work for http request, but you have to do something for grpc, thrift, kafka, etc. So unless you provide helpers for all the possible frameworks, you will have a problem at one moment that one user will have use a framework, will be able to get the bytes sent on the wire, will construct a SpanContext using a custom propagator, then cannot do anything with it.

@tsloughter
Copy link
Member

Right, this is just for the "Text" propagator -- hasn't been renamed since it was renamed in the spec. There is Text and Binary, right?

@bogdandrutu
Copy link
Member Author

bogdandrutu commented Sep 25, 2020

@tigrannajaryan I will try to shape these examples here (mostly copying from Java codebase):

Creating a span from an incoming request.
Creating a span in a client instrumentation library in the current execution context (that potentially may be from an incoming request or may already have a parent span in the current context).
Creating a span in a server instrumentation library from an incoming request, then creating a child span in application's request handler code.
Async processing, i.e. creating a child span in a different thread or execution unit but in the same execution context of the same process.

// Wrapper for remote calls.
public void interceptCall(args) {
  TextMapPropagator textFormat = OpenTelemetry.getPropagators().getTextMapPropagator();;

  // Extract the context entries from the metadata of the gRPC request
  Context extractedContext = textFormat.extract(Context.current(), headers, getter);

  Span span =
      tracer
          .spanBuilder("helloworld.Greeter/SayHello")
          .setParent(extractedContext)
          .setSpanKind(Span.Kind.SERVER)
          .startSpan();

  // Here a helper like TracingContextUtils.currentContextWith will remove this extra line
  appContext = TracingContextUtils.withSpan(extractedContext, span)
  try (Scope scope = CurrentContext.withContext(appContext)) {
     doApplicationWork(args)
  } finally {
    span.end()
  }
}

public void doApplicationWork(args) {
  // Here a helper like TracingContextUtils.getCurrentSpan will remove this extra call to CurrentContext.getCurrent()
  Span currentSpan = TracingContextUtils.getSpan(CurrentContext.getCurrent())
  // Do some work
  currentSpan.setAttribute("foo", "bar")
  // Do more work

  // Create a Span with the initial server span as a parent.
  Span span = tracer.spanBuilder("ScheduleAsyncWork").startSpan();
  try {
    // Here a helper like TracingContextUtils.currentContextWith will remove this extra line
    context = TracingContextUtils.withSpan(CurrentContext.getCurrent(), span)
    // Here it is important to wrap the Runnable with the entire context not just trace Span.
    executor.execute(CurrentContext.wrap(context, doAsyncApplicationWork(args))))
  } finally {
    span.end()
  }
}

public Runnable doAsyncApplicationWork(args) {
  return new Runnable {
      @Override
      public void run() {
          // Here a helper like TracingContextUtils.getCurrentSpan will remove this extra call to CurrentContext.getCurrent()
          Span currentSpan = TracingContextUtils.getSpan(CurrentContext.getCurrent())
          currentSpan.AddEvent("Happily executing the async work")

          // Create a Span with the "ScheduleAsyncWork" span as a parent.
          Span span = tracer.spanBuilder("ScheduleAsyncWork").startSpan();
          // Here a helper like TracingContextUtils.currentContextWith will remove this extra line
          Context context = TracingContextUtils.withSpan(CurrentContext.getCurrent(), span)
          try (Scope scope = CurrentContex.withContext(context) {
            // Here the new Span is active.
          } finally {
            span.end()
          }
      }
  }
}

EDITED
@tigrannajaryan as you can see there are bunch of places where helpers may improve the codebase, the only problem that right now I don't know how to solve (and main reason to not expose this yet) is that CurrentContext right now is global, and I heard complains explicitly in java world that it may need to not be. So by limiting where the APIs interact with CurrentContext to only this class may allow us to not have that as a global, I may be wrong, but this is why I think removing from the helper is beneficial.

@tigrannajaryan
Copy link
Member

@bogdandrutu do you think we need to postpone the trace spec freeze until we resolve this? We may need a bit time to do that.

@jkwatson
Copy link
Contributor

If I read this correctly, there will only be one method left on the Tracer interface (in Java, at least), which is to create a new spanBuilder. As an API user (not author/maintainer), I think I'm going to wonder if the Tracer is really pulling any weight any more, if all it does is let me create/start a new span, and doesn't actually provide any value around "tracing" beyond being a span factory.

I definitely agree that things are muddled at the moment, and it's unclear what the "right" way to do manage the Context is. But, I'm not sure if completely gutting the Tracer is the right way to approach it. I feel like, as an API user, that having the Tracer be the place to do "tracing", which by its very nature involves having to manage the propagation of the context, feels like a more natural approach.

My instincts as an API designer tell me that we should make the Tracer the one-stop-shop for doing tracing, both creating spans, and propagating them, rather than moving all of that to the more wonkily named "TracingContextUtils".

I do like the idea of a CurrentContext class that is the place for dealing with the context itself, outside of tracing concerns, although I might just want to call it a ContextManager at that point.

Those are my rough thoughts at this point. And, yes, I don't think this is something we can just rush through on a Friday night. :)

@tsloughter
Copy link
Member

I would still want to see API usage examples. In my mind there are a few important use cases:

@tigrannajaryan these are similar to (more complementary to, I guess) the examples I'm hoping to get for languages, https://docs.google.com/document/d/1kEW71SYInRB44elJl116vrzYL3_HXr133JomMokZ-PI/edit#

Maybe there should be a checkbox per example in the GA spec table? They are going to be useful not just for discussing and finalizing the API but for users as documentation as well.

@anuraaga
Copy link
Contributor

I have been brainstorming integrating trace context methods into Span itself

static Span.current()
Scope Span.makeCurrent()
static Span fromContext(Context ctx)
Context addToContext(Context ctx)

I think this could be a simpler Java API where static util methods on an unrelated class are not so idiomatic - when working with a span, use the span. Or the context. But it's not common to have a third # party do things I think. Does the spec preclude merging pieces together (Span and TraceContextUtils) in this sort of way?

@bogdandrutu
Copy link
Member Author

bogdandrutu commented Sep 27, 2020

@anuraaga I don't think you can implement the addToContext unless you expose publicly the context key which is against the spec. So that has to be static as well.

Which means you have the same functionality in the Span class instead of the helper. I would say at that point that becomes only a personal preference. Also this issue is not about necessary where the functionality lives, it is about duplicate exposed functionality and missing functionality first, then about where to expose it.

That being said I think your proposal should be consider as one of the possibility to expose the functionality.

@anuraaga
Copy link
Contributor

unless you expose publicly the context key

Ah for both of these non-static methods, I'm thinking of using default methods. If the key is package private, I think it'll not have that problem.

then about where to expose it.

But this is just what I wanted to confirm :) Sounds good then.

@tsloughter
Copy link
Member

I made an example/test propagator to hopefully better show how we are doing it with anonymous functions in anonymous functions: https:/open-telemetry/opentelemetry-erlang/blob/0188ee39790f91df8a3a106beea9cf2fddd5af5e/apps/opentelemetry_api/test/custom_propagator.erl#L41

For the injector the context key is passed along with a function that will return the content to include in the "text map".

For the extractor the carrier and the current value of the key in the context are passed to the function and the returned updated value is inserted into the context by the helper.

So the key isn't exposed but it is passed to and stored in the list of propagators.

@carlosalberto
Copy link
Contributor

carlosalberto commented Sep 28, 2020

My instincts as an API designer tell me that we should make the Tracer the one-stop-shop for doing tracing, both creating spans, and propagating them, rather than moving all of that to the more wonkily named "TracingContextUtils".

+1 on this comment from @jkwatson - I think I mentioned it in the past, but in OpenTracing we still had Tracer.activeSpan() which was clearly documented as a shortcut to ScopeManager.active() (which had a similar role to Context).

@bogdandrutu I see the reason to have less methods, that indeed may create confusion, but I'd be up for:

  1. TracingContextUtils.getSpan(Context ctx) - the official way
  2. Tracer.getCurrentSpan() - with a clear, explicit mention it's merely a shortcut to TracingContextUtils.getSpan(ctx).

@tsloughter
Copy link
Member

+1 on this comment from @jkwatson - I think I mentioned it in the past, but in OpenTracing we still had Tracer.activeSpan() which was clearly documented as a shortcut to ScopeManager.active() (which had a similar role to Context).

+1 to this +1.

I think this is also a case of implementation bleeding into spec. "Utils" is used to simply describe functions that are not dependent to the stateful "Tracer", right?

@jkwatson
Copy link
Contributor

I think this is also a case of implementation bleeding into spec. "Utils" is used to simply describe functions that are not dependent to the stateful "Tracer", right?

That's an interesting interpretation. So, in Java, for example, you think it would be fine to have these methods just be static methods on the Tracer class. That sounds much better to me than what we have at the moment. I think that could use some clarification, if generally agreed upon.

@mwear
Copy link
Member

mwear commented Sep 28, 2020

My 2 cents is that span / context interactions should be handled through the tracer. If the operations don't require a specific tracer instance, then they can be static methods on the tracer. For Ruby, we have two methods, that are working well for us.

  • Tracer.current_span(context = Context.current) - returns the current span from the given context, or the current context if
  • Tracer.span_with_context(span, parent_context = Context.current) - Inserts a span into a new context derived from the parent context

I don't know that we need to introduce a new namespace such as ContextUtils for these methods. Having them on the tracer would be in line with the remove span proposal: open-telemetry/oteps#68.

@tsloughter
Copy link
Member

Regarding the context propagation and not exposing keys:

Currently propagators are spec'ed and in most languages written as the individual propagator implementations (like b3, w3c tracecontext and w3c baggage) must handle Context. I'm arguing that propagators are per-concern (tracing, baggage, etc) and this is separate from the propagator implementation.

Meaning a Propagator still takes a Context but w3c and b3 implementations are serialization functions passed to the Propagator, not Propagators themselves.

W3CTracePropagator = Tracer.getTextMapPropagator(W3CTraceSerializer)
B3TracePropagator = Tracer.getTextMapPropagator(B3TraceSerializer)
W3CBaggagePropagator = Baggage.getTextMapPropagator(W3CBaggageSerializer)

set_global_text_map_propagators([W3CTracerPropagator, B3TracerPropagator, W3CBaggagePropagator])

W3CTraceSerializer implements 2 methods, inject and extract.

Tracer.getTextMapPropagator generates a TextMap Context Propagator based on the Tracer's context key and the passed in serializer. Same for Baggage.getTextMapPropagator.

This boils down to what today are called Propagators become "serializers" (or whatever) and a Propagator is instead a specific instance of a TextMapPropagator that is created by Tracer or Baggage (each which know the context key to get the particular value they are injecting from or extracting to).

@jmacd
Copy link
Contributor

jmacd commented Sep 28, 2020

@bogdandrutu I noticed you wrote "THIS IS NOT IN THE SPECS NOW" twice in

#1019 (comment)

though I would point out that these APIs to set and get the current span exist in the OTel-Go library and they feel natural. I support this aspect of your proposal.

I believe this idea has been previously referred to as the idea that Spans be created in the "detached" state, forcing you to use withSpan() to build a context after starting a child. Is this your intent?

In the OTel-Go SDK there are three context keys (baggage, span context, remote span context). Are you hoping to reduce the number of context keys overall in the move to keep them in one place (baggage and remote span context are often updated together)? From reading the above comments, it looks like there is support for keeping these Context methods in the trace package, not a "Utils" package. Maybe baggage should move back into trace, so that all these APIs are in the same place?

@tsloughter
Copy link
Member

Baggage should remain separate, it could be utilized even without traces but just to propagate some baggage or use it with metrics.

@tsloughter
Copy link
Member

@jmacd also, apparently a separate context key isn't supposed to be used for remote extracted spans anymore? This is what Bogdan was saying in this issue. Is Go still using the separate key? Erlang is. Go is also not OOP so wondering if "PropagatedSpan" doesn't work well in it either.

@andrewhsu
Copy link
Member

from the spec sig mtg today, will assess prototypes by next spec sig in order to move on a decision

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority:p1 Highest priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:baggage Related to the specification/baggage directory spec:context Related to the specification/context directory spec:trace Related to the specification/trace directory
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants