Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenTelemetry.Shims.OpenTracing Invalid 'SpanContext' when not sampled #4087

Closed
konczykl opened this issue Jan 19, 2023 · 4 comments
Closed
Labels
bug Something isn't working
Milestone

Comments

@konczykl
Copy link

konczykl commented Jan 19, 2023

Bug Report

List of [all OpenTelemetry NuGet packages]:

<PackageReference Include="OpenTelemetry" Version="1.4.0-rc.1" />
<PackageReference Include="OpenTelemetry.Exporter.Console" Version="1.4.0-rc.1" />
<PackageReference Include="OpenTelemetry.Exporter.Jaeger" Version="1.4.0-rc.1" />
<PackageReference Include="OpenTelemetry.Exporter.Prometheus.AspNetCore" Version="1.4.0-rc.1" />
<PackageReference Include="OpenTelemetry.Extensions.Hosting" Version="1.4.0-rc.1" />
<PackageReference Include="OpenTelemetry.Instrumentation.AspNetCore" Version="1.0.0-rc9.10" />
<PackageReference Include="OpenTelemetry.Instrumentation.Hangfire" Version="1.0.0-beta.4" />
<PackageReference Include="OpenTelemetry.Instrumentation.SqlClient" Version="1.0.0-rc9.10" />
<PackageReference Include="OpenTelemetry.Shims.OpenTracing" Version="1.0.0-rc9.11" />

Runtime version:

  • net7.0

Symptom

When using shim package

What is the expected behavior?
Return some noop instance.

What is the actual behavior?
ArgumentException thrown from SpanShim ctor.

Reproduce

Reproduced error:
https:/konczykl/OpenTracingShimError

Additional Context

Problem exists if we are not sampling spans - underlying activity is not created resulting in default SpanContext.

using (var parent = tracer.BuildSpan("parent").StartActive())
using (var child = tracer.BuildSpan("child").StartActive())
{

}

Parent span's activity is set as PropagationData and gets created but child's activity is null.

 public SpanShim(TelemetrySpan span)
{
    Guard.ThrowIfNull(span);

    if (!span.Context.IsValid) // here Context is default -> not valid
    {
        throw new ArgumentException($"Invalid '{nameof(SpanContext)}'", nameof(span.Context));
    }

    this.Span = span;
    this.spanContextShim = new SpanContextShim(this.Span.Context);
}
@konczykl konczykl added the bug Something isn't working label Jan 19, 2023
@TimothyMothra
Copy link
Contributor

I spent some time today digging into this.

First, I put together a smaller repro sample as a unit test for easier debugging:

using System.Diagnostics;
using Microsoft.Extensions.DependencyInjection;
using OpenTelemetry.Context.Propagation;
using OpenTelemetry.Shims.OpenTracing;
using OpenTelemetry.Trace;
using OpenTracing;
using Xunit;


[Fact]
public void Investigate4087_AlwaysOff()
{
	string serviceName = "Investigate4087";

	var exportedItems = new List<Activity>();

	var services = new ServiceCollection();

	services.AddOpenTelemetry()
		.WithTracing(b => b
			.AddSource(serviceName)
			.AddInMemoryExporter(exportedItems)
			.SetSampler(new AlwaysOffSampler())); // this fails
			//.SetSampler(new AlwaysOnSampler())); // this works

	services.AddSingleton<ITracer>(sp =>
	{
		var tracerProvider = sp.GetRequiredService<TracerProvider>();
		var tracer = new TracerShim(tracerProvider.GetTracer(serviceName), Propagators.DefaultTextMapPropagator);
		return tracer;
	});

	IServiceProvider serviceProvider = services.BuildServiceProvider();
	var tracer = serviceProvider.GetRequiredService<ITracer>();

	using (var parent = tracer.BuildSpan("parent").StartActive())
	using (var child = tracer.BuildSpan("child").StartActive())
	{
	}
}        

Part of the cause is in StartSpanHelper(), this method tries to start a new activity and return a TelemetrySpan.
This method is failing to create an activity and instead returns the TelemetrySpan.NoopInstance.

var activity = this.ActivitySource.StartActivity(name, activityKind, parentContext.ActivityContext, initialAttributes?.Attributes ?? null, activityLinks, startTime);
if (activity == null)
{
return TelemetrySpan.NoopInstance;
}

The TelemetrySpan.NoopInstance uses all default values including:

  • TraceId {00000000000000000000000000000000}
  • SpanId {0000000000000000}

Here, the Span Context is Invalid because of the default values and fails here:

public bool IsValid => IsTraceIdValid(this.TraceId) && IsSpanIdValid(this.SpanId);

public SpanShim(TelemetrySpan span)
{
Guard.ThrowIfNull(span);
if (!span.Context.IsValid)
{
throw new ArgumentException($"Invalid '{nameof(SpanContext)}'", nameof(span.Context));
}

@TimothyMothra
Copy link
Contributor

While digging into the relationship between Samplers and StartActivity I found this:

if (this.sampler is AlwaysOnSampler)
{
listener.Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
!Sdk.SuppressInstrumentation ? ActivitySamplingResult.AllDataAndRecorded : ActivitySamplingResult.None;
this.getRequestedDataAction = this.RunGetRequestedDataAlwaysOnSampler;
}
else if (this.sampler is AlwaysOffSampler)
{
listener.Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
!Sdk.SuppressInstrumentation ? PropagateOrIgnoreData(options.Parent) : ActivitySamplingResult.None;
this.getRequestedDataAction = this.RunGetRequestedDataAlwaysOffSampler;
}

In summary, when using the AlwaysOffSampler, the PropogateOrIgnoreData() is effectively determining to not create a new Activity.
This eventually results in the ArgumentException detailed above.

--
This sounds related to #3290. I'll follow up.

@pjanotti
Copy link
Contributor

@TimothyMothra @open-telemetry/dotnet-maintainers While some other aspects are mentioned on this issue it seems that we can close it as fixed via #4668

@TimothyMothra
Copy link
Contributor

@pjanotti Confirmed!

@open-telemetry/dotnet-maintainers I think this Issue could be assigned to the current milestone and closed. :)

@pellared pellared added this to the 1.6.0 milestone Jul 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants