Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ParentReferenceType. #906

Closed

Conversation

carlosalberto
Copy link
Contributor

Fixes #65

This adds a new ParentReferenceType field to both Span (direct parent usage) and Link (additional parents usage), in order to support OpenTracing's CHILD_OF and FOLLOWS_FROM references.

The current design is not enough, as described in this comment. Also, in order to facilitate (hopefully) fast merge, I decided to go with the names used by OpenTracing.

The first question (mostly for @yurishkuro ) is whether we should keep this ParentReferenceType enum open for future types - @yurishkuro mentioned this in the aforementioned comment that we should probably stick to support what exists now (if we decide to keep it open, we should probably rename its name to not have the Parent suffix).

The second question (mostly for @bogdandrutu) is whether, under this paradigm, Links are always expected to be FOLLOWS_FROM (i.e. the parent does not depend on their outcome).

@carlosalberto carlosalberto requested review from a team September 1, 2020 13:40
@carlosalberto carlosalberto added spec:trace Related to the specification/trace directory area:api Cross language API specification issue labels Sep 1, 2020
@Oberon00
Copy link
Member

Oberon00 commented Sep 1, 2020

Would it not be enough to add a semantic convention for this? Since this is only for backwards compatibility with OpenTracing, the OpenTracing shim would be enough as API, wouldn't it?

@Oberon00
Copy link
Member

Oberon00 commented Sep 1, 2020

If we care about sync vs async, we would need to put this information in the parent not the child, because only the parent knows if it waits for the child (sync) or not (async). EDIT: This is already in the comment of @yurishkuro linked in the PR description #65 (comment), the paragraph that starts with:

Another odd thing about child-of and follows-from is that it's the child span that defines this reference type, even though it talks about parent's dependency on child outcome.

I don't think we should repeat this design mistake in OpenTelemetry (a semantic convention for backwards compatibility is fine though).

@Oberon00 Oberon00 added area:semantic-conventions Related to semantic conventions area:span-relationships Related to span relationships release:required-for-ga Must be resolved before GA release, or nice to have before GA priority:p1 Highest priority level labels Sep 1, 2020
depends on the child `Span` or not, and can be used to calculate the critical
path.

These are the possible `ParentReferenceKind`s:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As also explained in the comment the value names are not very. In both cases the Span is a child Span, it is just that parent waits or not for the result of the child Span.

So if we really need to support something that express this property names like: waits_for_result and ignores_result (or something like that are more meaningful).

I would like to understand how important is to know this information? Can this be added later? Are we breaking any real backend if this is not supported?

I think we should have a longer debate on this, I have some ideas about how to solve this:

  • I would suggest always using links when parent does not wait for any result from the child. This is because the child Span can finish long time after the parent (this can also happen when parent waits for result, but less likely).
  • In cases like batches we encourage using links as well, so we need probably to distinguish between these cases.
  • In OpenCensus my initial idea was to use MessageEvents to describe the relationship between messages exchanged by the child and parent, with the idea that it may actually be a bidirectional stream so there may be more waiting moments not just one.

So to summarize: Can we ignore this for the moment and add it later?

Copy link
Member

@Oberon00 Oberon00 Sep 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the linked issue. We decided that we cannot go GA without having OpenTracing compatibility here. So we need to add some way of representing this information, if only for backward compatibility reasons.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we really need an OTEP :)

@bogdandrutu
Copy link
Member

@Oberon00 actually the "waiting" relationship can be on both sides, think about a bidirectional stream of data, parent may wait for messages at one time, child may also wait for messages. I think during the lifetime of a span may be more than one waiting time, so it is not clear that we can solve this with only one relationship.

I don't think it is necessary odd to have it only on the child as long as we have all the data we need

@anuraaga
Copy link
Contributor

anuraaga commented Sep 1, 2020

This may be silly so tell me, but from my understanding

  • Sync, e.g., wait for response is client span. Even if the logic itself is async, such as using Java CompletableFuture, it's still client span

  • Async is producer span. Send and get back to work. The consumer should be connected so the trace can show failure if the async steps failed, but the producer already did it's job by emitting its span and propagating its ID.

Isn't this enough to capture the relationship without a link from the parent? The way I think about it is the parent should know enough about what it's doing that a special model shouldn't be needed, it's the child that is more ambiguous (possibly batch). I read through the attached issue and couldn't find any clarification. Looking forward to be able to understand more of what's going on here :)

@Oberon00
Copy link
Member

Oberon00 commented Sep 1, 2020

@bogdandrutu

actually the "waiting" relationship can be on both sides, think about a bidirectional stream of data

The by far more common case is a single request-response cycle however, where the parent span does a request that is fulfilled by the child span without any intermediate case. For the more complex case of a bidirectional stream, we would need events and links on them (which seems reasonable enough: Why not add spanId and span+traceId as additional primitive attribute types?), see #65 (comment)

@bogdandrutu
Copy link
Member

@Oberon00 but as was pointed in the comments in the issue you need to know even for one request one response when the waiting for the response ends in order to calculate the critical path. So the problem of determining when the response is received is still there.

@Oberon00
Copy link
Member

Oberon00 commented Sep 1, 2020

Yeah, but anyway, this issue is about backward compatibility, for which we need it on the child side anyway.

@carlosalberto
Copy link
Contributor Author

It was discussed in the Spec SIG call today that we should, if possible, delay this feature, so we can careful study the requirements and come up with a full, proper solution.

How important is this for Jaeger specifically, @yurishkuro ? If this something the Jaeger ecosystem absolutely needs? if that's the case, we may consider doing it via semantic conventions, in order to express the child_of and follows_from reference types.

@yurishkuro
Copy link
Member

It's not terribly important. The one function that I recall that somewhat depends on child-of vs. follows-from is the clock skew adjustment done in Jaeger UI, but there are cases where it breaks even for child-of.

@carlosalberto
Copy link
Contributor Author

@yurishkuro Thanks. In that case I will close this PR but we will keep the related issue (#65 ) so we can work on this later on. Let me know if that doesn't sound right to you ;)

@yurishkuro
Copy link
Member

sgtm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:api Cross language API specification issue area:semantic-conventions Related to semantic conventions area:span-relationships Related to span relationships priority:p1 Highest priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:trace Related to the specification/trace directory
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sync and Async children (FOLLOWS_FROM)
5 participants