End-to-end example applications #936

Merged: 37 commits, Aug 10, 2020

alanwest
Member

@alanwest alanwest commented Jul 28, 2020

This PR in part addresses #854 for building out a slightly more sophisticated sample application.


At this point it is uncertain how much further we want to expand on the idea of a more complex suite of example applications hosted in this repository. Ideally, we can benefit from a more full-fledged example like eShopOnContainers, which we wouldn't necessarily be responsible for maintaining. So, in the meantime, this example may just provide a stopgap solution for demonstrating some of the more complex OpenTelemetry concepts: namely, context propagation, messaging systems, and the combination of both an HTTP service and a non-HTTP service.


A fair amount of good conversation has already taken place on this PR.

  • I believe there is a consensus that demonstrating context propagation would be valuable.
  • Less of a consensus may exist that demonstrating messaging systems is valuable.

Some people feel that demonstrating some messaging system concepts would be nice. Though, this does understandably make the example a fair bit more complicated with the usage of docker and relatively complex boilerplate for the messaging components.

In an effort to reduce the cognitive distraction of the messaging boilerplate and hopefully emphasize the context propagation, I have attempted to separate the OpenTelemetry logic into the MessageReceiver and MessageSender classes.

An alternative has been suggested by @reyang to just focus on the context propagation:

How about a simple child-process approach - the parent creates child process, passing in tracecontext as the single argument or environment variable (or two, one for traceparent and the other for tracestate), child process picks up the context for subsequent actions? In this way there is no need to have docker at all, and it is possible to have a single executable that runs everywhere (acting as both a client and server, child and parent).


Questions

  • Do we want to move forward with this PR knowing that something like eShopOnContainers may be a better platform for this kind of example in the future?
  • If we do merge this PR but later remove it in favor of eShopOnContainers, would we still want a simpler example demonstrating context propagation like the one Reiley suggested? If so, then does it make sense to just build the simpler example now?
  • If we do merge this PR, do we want to continue to expand upon it? For example, I think it would be good to add some gRPC sample applications. Though, these could be stand-alone. Or does it make sense to incorporate it into this example? It was noted that Python has a more complex example that includes some gRPC components.

@alanwest alanwest requested a review from a team July 28, 2020 00:28
@alanwest alanwest changed the title Suite of example applications demonstrating context propagation with … End-to-end example applications Jul 28, 2020
@reyang
Member

reyang commented Jul 28, 2020

I have two general pieces of feedback:

  1. Do we want to take a dependency on RabbitMQ and keep maintaining this example moving forward?
  2. Do we think RabbitMQ would eventually have its propagator/instrumentation?


namespace WorkerService
{
public class RabbitMqConsumer : BackgroundService
Member

It seems we have more code showing how to use RabbitMq than OpenTelemetry itself.
Wonder if there is something simpler, that helps people to focus on the custom propagation part.

Member Author

Yes, I had a similar thought. The applications required a fair amount of boilerplate to get RabbitMQ functionality in place. I can spend a bit of time refactoring things a bit to see if I can get the OpenTelemetry part of it all better highlighted.

Member

Agree that it takes some time to actually get to the OTel tracing/propagation part in the jungle of RabbitMq and boilerplate code.

Member

@CodeBlanch CodeBlanch Jul 29, 2020

Tough crowd, eh? Nice work @alanwest! Messaging is one of the more complex/advanced cases, and I actually think this does a great job of capturing it as concisely as possible. I'm sure there's a bit of room for improvement, but maybe we just need another sample doing something more trivial like calling a web service?

Member Author

Tough crowd, eh?

😄 not at all, I appreciate the input. Working this afternoon to see what I can do to better highlight the OTel pieces. Stay tuned!

maybe we just need another sample doing something more trivial like calling a web service?

You thinking like a call demonstrating HttpClient instrumentation or something?

Member

You thinking like a call demonstrating HttpClient instrumentation or something?

That's what I was thinking. But we do kind of already have one in there:

// Making an http call here to serve as an example of
// how dependency calls will be captured and treated
// automatically as child of incoming request.
var res = httpClient.GetStringAsync("http://google.com").Result;

Member Author

I've made an effort to separate the OpenTelemetry logic from the RabbitMQ 🌴 🐰 jungle 🐇 🌴.

Basically there are two classes now that are pretty succinct and I think put a pretty good focus on the context propagation:

  • The WebApi has a MessageSender class
  • The WorkerService has a MessageProcessor class

Member Author

You thinking like a call demonstrating HttpClient instrumentation or something?

That's what I was thinking. But we do kind of already have one in there:

I'm definitely a fan of a more turnkey, holistic example that showcases a number of things. How about after we get some consensus on the direction of this sample, we can discuss what else may make sense to take on, and then do it in smaller follow up PRs?

@alanwest
Member Author

I have two general pieces of feedback:

  1. Do we want to take a dependency on RabbitMQ and keep maintaining this example moving forward?
  2. Do we think RabbitMQ would eventually have its propagator/instrumentation?

I would hope that RabbitMQ will eventually have its own instrumentation. I'd want to revise this example when that happens. In discussion with @cijothomas, we figured it would be beneficial for now to demonstrate something a little more complicated that 1) shows how to instrument a non-HTTP application and 2) performs manual context propagation. I'm definitely open to other ideas.

@reyang
Member

reyang commented Jul 28, 2020

I would hope that RabbitMQ will eventually have its own instrumentation. I'd want to revise this example when that happens.

That's great!

In discussion with @cijothomas, we figured it would be beneficial for now to demonstrate something a little more complicated that 1) shows how to instrument a non-HTTP application and 2) performs manual context propagation. I'm definitely open to other ideas.

How about a simple child-process approach - the parent creates child process, passing in tracecontext as the single argument or environment variable (or two, one for traceparent and the other for tracestate), child process picks up the context for subsequent actions? In this way there is no need to have docker at all, and it is possible to have a single executable that runs everywhere (acting as both a client and server, child and parent).
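For illustration, the child-process idea could look roughly like the sketch below. It is not code from this PR: the class name, the TRACEPARENT and TRACESTATE variable names, and the getter/setter delegates are assumptions, and it relies on a version of System.Diagnostics.DiagnosticSource that provides ActivityContext.TryParse.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for illustration; not part of this PR.
public static class ProcessTraceContext
{
    // Assumed variable names; the suggestion above only mentions traceparent and tracestate.
    public const string TraceParentVariable = "TRACEPARENT";
    public const string TraceStateVariable = "TRACESTATE";

    // Parent side: write the activity's W3C context through the supplied setter,
    // e.g. (name, value) => startInfo.Environment[name] = value.
    public static bool Inject(Activity activity, Action<string, string> setVariable)
    {
        if (activity == null || activity.IdFormat != ActivityIdFormat.W3C)
        {
            return false; // No W3C context to hand to the child.
        }

        // For W3C activities, Id is already in traceparent format.
        setVariable(TraceParentVariable, activity.Id);
        if (!string.IsNullOrEmpty(activity.TraceStateString))
        {
            setVariable(TraceStateVariable, activity.TraceStateString);
        }

        return true;
    }

    // Child side: read the variables through the supplied getter,
    // e.g. Environment.GetEnvironmentVariable.
    public static bool TryExtract(Func<string, string> getVariable, out ActivityContext context)
    {
        context = default;
        string traceParent = getVariable(TraceParentVariable);
        if (string.IsNullOrEmpty(traceParent))
        {
            return false;
        }

        return ActivityContext.TryParse(traceParent, getVariable(TraceStateVariable), out context);
    }
}
```

In the parent, Inject would be called with a setter over ProcessStartInfo.Environment before starting the child. In the child, TryExtract(Environment.GetEnvironmentVariable, out var parentContext) would yield a context that can be passed as the parent to ActivitySource.StartActivity.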

@alanwest
Member Author

How about a simple child-process approach - the parent creates child process, passing in tracecontext as the single argument or environment variable (or two, one for traceparent and the other for tracestate), child process picks up the context for subsequent actions? In this way there is no need to have docker at all, and it is possible to have a single executable that runs everywhere (acting as both a client and server, child and parent).

If all we ultimately want to demonstrate is context propagation, then yes I think something simpler like this would be a good way to go. Though, the concept here is to provide a more holistic example incorporating OpenTelemetry and a number of dependencies. I've noted that ultimately eShopOnContainers would be a far better place to showcase this kind of thing. Though in the meantime, perhaps doing something small in the repo could be useful?

Before simplifying things too much, I'd like to hear some others' input. I think this PR is a good place to take a pause and let people chime in with other ideas. I'm definitely open to evolving things in a way that people think the community will find most useful.

@reyang
Member

reyang commented Jul 28, 2020

If all we ultimately want to demonstrate is context propagation, then yes I think something simpler like this would be a good way to go. Though, the concept here is to provide a more holistic example incorporating OpenTelemetry and a number of dependencies. I've noted that ultimately eShopOnContainers would be a far better place to showcase this kind of thing. Though in the meantime, perhaps doing something small in the repo could be useful?

Before simplifying things too much, I'd like to hear some others' input. I think this PR is a good place to take a pause and let people chime in with other ideas. I'm definitely open to evolving things in a way that people think the community will find most useful.

In OpenTelemetry Python there is a fairly complex demo app FYI.

@cijothomas
Member

If all we ultimately want to demonstrate is context propagation, then yes I think something simpler like this would be a good way to go. Though, the concept here is to provide a more holistic example incorporating OpenTelemetry and a number of dependencies. I've noted that ultimately eShopOnContainers would be a far better place to showcase this kind of thing. Though in the meantime, perhaps doing something small in the repo could be useful?
Before simplifying things too much, I'd like to hear some others' input. I think this PR is a good place to take a pause and let people chime in with other ideas. I'm definitely open to evolving things in a way that people think the community will find most useful.

In OpenTelemetry Python there is a fairly complex demo app FYI.

If we can leverage https://github.com/dotnet-architecture/eShopOnContainers as the proper example, then we wouldn't have to maintain a complex example in this repo forever. I think it makes sense to have a less complex example in this repo that still demonstrates the concepts (especially manual propagation), until we go GA and the eShopOnContainers owners can be approached.

@codecov

codecov bot commented Jul 28, 2020

Codecov Report

Merging #936 into master will decrease coverage by 1.47%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #936      +/-   ##
==========================================
- Coverage   77.01%   75.54%   -1.48%     
==========================================
  Files         221      221              
  Lines        6166     6208      +42     
==========================================
- Hits         4749     4690      -59     
- Misses       1417     1518     +101     
Impacted Files Coverage Δ
.../OpenTelemetry/Trace/CompositeActivityProcessor.cs 0.00% <0.00%> (-68.34%) ⬇️
src/OpenTelemetry/Sdk.cs 63.77% <0.00%> (-30.96%) ⬇️
src/OpenTelemetry/Trace/SimpleActivityProcessor.cs 43.47% <0.00%> (-26.09%) ⬇️
...try.Exporter.OpenTelemetryProtocol/OtlpExporter.cs 31.25% <0.00%> (-18.75%) ⬇️
...rc/OpenTelemetry.Exporter.Jaeger/JaegerExporter.cs 73.17% <0.00%> (-17.08%) ⬇️
...Exporter.Jaeger/Implementation/JaegerUdpBatcher.cs 86.86% <0.00%> (-5.11%) ⬇️
...ensions.Hosting/OpenTelemetryServicesExtensions.cs 73.33% <0.00%> (-5.05%) ⬇️
src/OpenTelemetry/Trace/TracerProviderSdk.cs 100.00% <0.00%> (ø)
...Exporter.Jaeger/TracerProviderBuilderExtensions.cs 100.00% <0.00%> (ø)
...Exporter.ZPages/TracerProviderBuilderExtensions.cs 100.00% <0.00%> (ø)
... and 5 more

this.connectionFactory = new ConnectionFactory()
{
HostName = Environment.GetEnvironmentVariable("RABBIT_HOSTNAME") ?? "localhost",
UserName = "guest",
Member

Should we switch username & password to also use EnvVar if available? Just to be good internet citizens and encourage people to not hardcode their credentials.

UserName = Environment.GetEnvironmentVariable("RABBIT_USERNAME") ?? "guest",
Password = Environment.GetEnvironmentVariable("RABBIT_PASSWORD") ?? "guest",

Member Author

👍 Good point.

{
try
{
string activityName = $"{nameof(RabbitMqService)}.{nameof(PublishMessage)}";
Member

The spec says operationName should be <destination name> <operation name>, for example $"{QueueName} send" for the publish here. I just noticed that yesterday, so it might be new?
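A minimal sketch of that naming convention, assuming a hypothetical helper class and queue name (neither is part of this PR):

```csharp
using System.Diagnostics;

// Hypothetical helper for illustration; not part of this PR.
public static class MessagingActivityNames
{
    // Builds "<destination name> <operation name>", e.g. "sample-queue send".
    public static string Create(string destinationName, string operationName)
        => $"{destinationName} {operationName}";

    // Starts a producer activity named after the queue being published to.
    // Returns null when nothing is listening to the source.
    public static Activity StartSend(ActivitySource source, string queueName)
        => source.StartActivity(Create(queueName, "send"), ActivityKind.Producer);
}
```

The consumer side would presumably use the same shape with its own operation name in place of "send".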

Member Author

Ah cool, great call out. I'm really glad to see there's guidance here.

{
services.AddHostedService<RabbitMqConsumer>();

// TODO: Determine if this can be done here in a WorkerService. It does not seem to work... doing this in the RabbitMqConsumer for now.
Member

The TODO is really interesting! It should be here right? Wonder if we have a bug somewhere?

Member Author

Yea, I'll circle back in a bit and see what's going on here.

{
var parentContext = this.textFormat.Extract(ea.BasicProperties, ExtractTraceContextFromBasicProperties);

string activityName = $"{nameof(RabbitMqConsumer)}.{nameof(ExecuteAsync)}";
Member

Same comment here for the activityName/operationName.

var parentContext = this.textFormat.Extract(ea.BasicProperties, ExtractTraceContextFromBasicProperties);

string activityName = $"{nameof(RabbitMqConsumer)}.{nameof(ExecuteAsync)}";
using (var activity = activitySource.StartActivity(activityName, ActivityKind.Server, parentContext))
Member

@CodeBlanch CodeBlanch Jul 29, 2020

@reyang Being an expert on the spec, should the span/activity created when consuming messages use the propagated context as parent or add as a link ("follows-from")? I would really like us to provide spec-correct guidance where possible.

Member

@reyang reyang Jul 29, 2020

Short answer: I think the spec is not clear on this yet.

Long answer:

"follows-from" is an OpenTracing concept, which hasn't been formalized in the OpenTelemetry specification:

Most of the existing backends don't have support for links, and I bet they already pass the same trace id over the messaging system. It also looks like the W3C TraceContext for MQTT is going to do the same thing. I do think this could expose issues for batch processing and retry; using links instead of following the parent context seems to offer more flexibility/power.
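To make the two options concrete, here is a rough sketch using only System.Diagnostics. The class and source names are hypothetical and this is not the PR's code; it only contrasts parenting with linking.

```csharp
using System.Diagnostics;

// Hypothetical sketch for illustration; not part of this PR.
public static class ConsumerActivityFactory
{
    private static readonly ActivitySource Source = new ActivitySource("Examples.Consumer");

    // Option 1: the propagated context becomes the parent, so the consumer
    // activity joins the producer's trace.
    public static Activity StartAsChild(string name, ActivityContext propagated)
        => Source.StartActivity(name, ActivityKind.Consumer, propagated);

    // Option 2: the propagated context is recorded as a link only.
    // Passing default(ActivityContext) means the parent falls back to
    // Activity.Current, or a new trace is started if there is none.
    public static Activity StartWithLink(string name, ActivityContext propagated)
        => Source.StartActivity(
            name,
            ActivityKind.Consumer,
            default(ActivityContext),
            tags: null,
            links: new[] { new ActivityLink(propagated) });
}
```

Both methods return null when no listener is sampling the source, so callers would still need a null check. With batch processing, option 2 extends naturally to one link per message in the batch.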

Member Author

Thanks, this is great context. For the immediate purposes I think it makes sense to settle on a parent/child relationship since that's the only thing that most backends support at the moment - if, in particular, Zipkin and/or Jaeger introduce support for visualizing links, then I think it could be cool to demonstrate that.

That aside, I am curious how the spec conversations will evolve. My gut says that there will be value in both use cases. That is, a parent/child relationship may make sense sometimes and links other times. Though that could make configuration and especially auto-instrumentation complex.

@alanwest
Member Author

alanwest commented Aug 7, 2020

@CodeBlanch I've pushed some changes that should address the issues you were encountering. Both apps can be spun up in any order now without punishing you 👊, or the need for explaining things in the README.

I've also expanded the PR description to include a synopsis of the overall conversation.

@cijothomas @reyang @MikeGoldsmith

@CodeBlanch
Member

@alanwest LGTM! My vote is to merge. We can always change it, remove it, or move it later. We can have other examples too or add more to this one? The TODO in there still worries me, but doesn't need to be resolved on this PR. @reyang is working on improving the whole build pattern anyway.

@cijothomas Defer to you to merge or make a call on this.

@alanwest
Member Author

alanwest commented Aug 7, 2020

Oh yes, thank you! I meant to comment on that. I was thinking of leaving the TODO in for now and investigating the potential bug in a follow-up. I'm away from a computer today, but will update with some more details soon.

@reyang
Member

reyang commented Aug 7, 2020

I think we should move forward and merge this 👍.

Member

@reyang reyang left a comment

LGTM.

@@ -186,6 +186,20 @@ Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "logs", "logs", "{3862190B-E
docs\logs\logging-correlation.md = docs\logs\logging-correlation.md
EndProjectSection
EndProject
<<<<<<< HEAD
Contributor

This looks like a leftover merge conflict marker?

Member Author

Oh nice catch! Apparently, Visual Studio was like this is fine.

.AddActivitySource(nameof(MessageSender))
.UseZipkinExporter(b =>
{
var zipkinHostName = Environment.GetEnvironmentVariable("ZIPKIN_HOSTNAME") ?? "localhost";
Member

On a followup PR, we should simply get zipkinhostname from this.Configuration. And set this in appsettings.json.

Member Author

This is mainly for running with docker-compose. I can double-check, but the Zipkin hostname as seen by the other containers is different from the one used when hitting it at localhost from a browser.

{
this.messageReceiver = messageReceiver;

this.tracerProvider = Sdk.CreateTracerProvider((builder) =>
Member

Let's come back to this. If Program.cs sets up everything, then Worker can simply retrieve TracerProvider from DI in its constructor.

Member Author

Ok. Yea, I may have been doing something wrong here. I'll follow up and clean this up.

Member Author

Ok, this is fixed now.

Member

@cijothomas cijothomas left a comment

LGTM, except we need to show the activity?. null check in the examples.
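As a small sketch of why that null check matters: ActivitySource.StartActivity returns null when no listener is interested in the source, so the null-conditional operator keeps the example safe. The class, source, and tag names below are hypothetical and not taken from the PR.

```csharp
using System.Diagnostics;

// Hypothetical sketch for illustration; not part of this PR.
public static class NullSafeActivityExample
{
    private static readonly ActivitySource Source = new ActivitySource("Examples.NullCheck");

    // Returns true when an activity was actually created and tagged.
    public static bool Process(string messageBody)
    {
        // A using statement accepts a null resource, so no extra guard is needed here.
        using (Activity activity = Source.StartActivity("process message", ActivityKind.Consumer))
        {
            // Without the "?." this line would throw when activity is null.
            activity?.SetTag("example.message_length", messageBody.Length);
            return activity != null;
        }
    }
}
```

With no listener or tracer provider subscribed to the source, Process still runs the message handling and simply skips the tagging.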

@cijothomas cijothomas merged commit bd738f0 into open-telemetry:master Aug 10, 2020
@alanwest alanwest deleted the alanwest/end-to-end-example branch August 24, 2020 17:34