Tail Based sampler regardless of exporter #2108

Closed
ovidiubuligan opened this issue Apr 20, 2023 · 4 comments

@ovidiubuligan

Is your feature request related to a problem?
My main problem is that traces in our app generate 100k+ spans.
Why 100,000+ spans? Some requests in our app are actually jobs that run a foreach loop issuing 100k SQL statements. For example:

main_span
   child_span1
     child_span2
       sql_span
       sql_span
       foreach_span
           sql_span
           ...
           100k more sql_span
           ...
           sql_span
 

A trace with 100k+ spans also crashes the viewer (e.g. Jaeger, Grafana Tempo).

Describe the solution you'd like
I would like to avoid sampling every sql_span and keep only the ones under foreach_span that have errors=true.
This behavior should be implementable regardless of the exporter. Why is sampling architecturally tied to the exporter? I am using the OTLPExporter.

Describe alternatives you've considered
The only mention of tail-based sampling in the otel-cpp repo is the tail-based sampler work in the ETW exporter, but I am not using the ETW exporter.
#1780

Additional context
I checked whether this is possible in dotnet, and it looks like it is, as in this more complex example:
[Example / proof of concept to achieve a combination of head-based sampling + a basic form of tail-based sampling at a span level.](https:/open-telemetry/opentelemetry-dotnet/pull/4206/files#top)
https:/open-telemetry/opentelemetry-dotnet/pull/4206/files#diff-69b0075317ec479742b05e73254e27ce62963f65ba370b4b5c2520e49143c620R32

@marcalff
Member

@ovidiubuligan

In this example, dotnet uses a custom span processor to do the filtering.

For opentelemetry-cpp, the same can be done by implementing a custom span processor.

Please see class opentelemetry::sdk::trace::SpanProcessor

https:/open-telemetry/opentelemetry-cpp/blob/main/sdk/include/opentelemetry/sdk/trace/processor.h

The span processor handles each span before sending it (or not) to an exporter; this is independent of the exporter used.
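To illustrate where such a filter sits, here is a minimal sketch using simplified stand-in types. `SpanRecord`, `Exporter`, `CollectingExporter`, `ErrorOnlySqlProcessor` and `RunForeachJob` are all hypothetical names made up for this illustration; they are not the opentelemetry-cpp API. The sketch also assumes the processor can read a span's name and error flag, and it filters by span name only rather than by parent span.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a finished span; NOT an opentelemetry-cpp type.
struct SpanRecord {
  std::string name;
  bool error;
};

// Any backend could sit behind an interface like this (assumption).
class Exporter {
public:
  virtual ~Exporter() = default;
  virtual void Export(const SpanRecord &span) = 0;
};

// Test double that only remembers what it was asked to export.
class CollectingExporter : public Exporter {
public:
  void Export(const SpanRecord &span) override { exported.push_back(span); }
  std::vector<SpanRecord> exported;
};

// Sits between the tracer and the exporter and decides per finished span.
class ErrorOnlySqlProcessor {
public:
  explicit ErrorOnlySqlProcessor(std::shared_ptr<Exporter> exporter)
      : exporter_(std::move(exporter)) {
    assert(exporter_ != nullptr);
  }

  void OnEnd(const SpanRecord &span) {
    // Drop successful sql_span entries; forward everything else.
    if (span.name == "sql_span" && !span.error) {
      ++dropped_;
      return;
    }
    exporter_->Export(span);
  }

  std::size_t dropped() const { return dropped_; }

private:
  std::shared_ptr<Exporter> exporter_;
  std::size_t dropped_ = 0;
};

// Simulates the foreach job: every `fail_every`-th SQL span is an error.
// Returns how many spans the processor dropped.
std::size_t RunForeachJob(ErrorOnlySqlProcessor &processor,
                          std::size_t iterations, std::size_t fail_every) {
  assert(fail_every > 0);
  for (std::size_t i = 1; i <= iterations; ++i) {
    processor.OnEnd(SpanRecord{"sql_span", i % fail_every == 0});
  }
  processor.OnEnd(SpanRecord{"foreach_span", false});
  return processor.dropped();
}
```

The processor only talks to the `Exporter` interface, so swapping the backend does not change the filtering logic. Restricting the filter to children of foreach_span would additionally need the parent span id, which this sketch leaves out.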

@lalitb
Member

lalitb commented Apr 21, 2023

Just to add, otel-cpp doesn't provide readable spans. The Recordable interface design allows for an optimized implementation of the trace pipeline by avoiding intermediate copying of data to an immutable/readable Span. The only getter method available inside a span processor for reading spans is Recordable::SpanData(), which needs the corresponding exporter to deserialize the span data, and none of the existing exporters implement this method.

This means that if you plan to write a custom span processor for tail sampling, you have to ensure that the exporter being used also implements Recordable::SpanData().
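One conceivable way to work with write-only recordables, which this thread does not confirm and which is untested against the real SDK, is for the processor to hand out a wrapper that forwards every setter to the exporter's recordable while keeping its own copy of the few fields the decision needs. The sketch below uses simplified stand-ins: `Recordable`, `Exporter` and `StatusCode` are cut-down models, and `PeekingRecordable`, `TailFilterProcessor` and `CountingExporter` are hypothetical names. The real interfaces in opentelemetry::sdk::trace have many more methods and different signatures.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Simplified stand-ins (assumptions), not the real SDK declarations.
enum class StatusCode { kUnset, kOk, kError };

// Write-only: setters but no getters, mirroring the constraint above.
class Recordable {
public:
  virtual ~Recordable() = default;
  virtual void SetName(const std::string &name) = 0;
  virtual void SetStatus(StatusCode code) = 0;
};

class Exporter {
public:
  virtual ~Exporter() = default;
  virtual std::unique_ptr<Recordable> MakeRecordable() = 0;
  virtual void Export(std::unique_ptr<Recordable> span) = 0;
};

// Forwards each setter to the exporter's recordable and keeps a private
// copy of the fields needed for the keep/drop decision.
class PeekingRecordable : public Recordable {
public:
  explicit PeekingRecordable(std::unique_ptr<Recordable> inner)
      : inner_(std::move(inner)) {
    assert(inner_ != nullptr);
  }
  void SetName(const std::string &name) override {
    name_ = name;
    inner_->SetName(name);
  }
  void SetStatus(StatusCode code) override {
    status_ = code;
    inner_->SetStatus(code);
  }
  const std::string &name() const { return name_; }
  StatusCode status() const { return status_; }
  std::unique_ptr<Recordable> ReleaseInner() { return std::move(inner_); }

private:
  std::unique_ptr<Recordable> inner_;
  std::string name_;
  StatusCode status_ = StatusCode::kUnset;
};

class TailFilterProcessor {
public:
  explicit TailFilterProcessor(std::shared_ptr<Exporter> exporter)
      : exporter_(std::move(exporter)) {}

  std::unique_ptr<PeekingRecordable> MakeRecordable() {
    return std::unique_ptr<PeekingRecordable>(
        new PeekingRecordable(exporter_->MakeRecordable()));
  }

  // Returns true if the span was forwarded, false if it was dropped.
  bool OnEnd(std::unique_ptr<PeekingRecordable> span) {
    if (span->name() == "sql_span" && span->status() != StatusCode::kError) {
      return false;  // successful SQL span: discard
    }
    exporter_->Export(span->ReleaseInner());
    return true;
  }

private:
  std::shared_ptr<Exporter> exporter_;
};

// Minimal exporter used only to exercise the sketch.
class CountingExporter : public Exporter {
  struct Blob : Recordable {
    void SetName(const std::string &) override {}
    void SetStatus(StatusCode) override {}
  };

public:
  std::unique_ptr<Recordable> MakeRecordable() override {
    return std::unique_ptr<Recordable>(new Blob());
  }
  void Export(std::unique_ptr<Recordable>) override { ++exported; }
  int exported = 0;
};
```

The exporter still receives its own recordable type untouched, so this approach would not require any getter on the exporter side. The trade-off is an extra copy of each captured field per span.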

The ETW exporter is a bit different from the others: it uses a custom (header-only) SDK, so it can provide flexible tail sampling.

If you are using the OTLP exporter, the recommended option would be to use the tail-sampling feature provided by the collector: https://opentelemetry.io/blog/2022/tail-sampling/

@ovidiubuligan
Author

ovidiubuligan commented Apr 21, 2023

@lalitb I will analyze this in more detail starting next week. Regarding OTel Collector tail sampling, the main issue is that foreach_span is not sent to the collector until it has finished (this is the behavior in all SDKs). The collector's tail sampler would need to wait roughly as long as foreach_span takes before making sampling decisions, which can be 1 hour. That most likely means huge memory consumption on the collector and questionable scalability.

I think I will be left with the in-process sampler solution.

It looks like the OTLP exporter implements only Recordable:

 class OtlpRecordable final : public opentelemetry::sdk::trace::Recordable
{

https:/open-telemetry/opentelemetry-cpp/blob/main/exporters/otlp/include/opentelemetry/exporters/otlp/otlp_recordable.h

@lalitb
Member

lalitb commented May 17, 2023

Closing based on the above discussion. Please reopen if there are further questions.
