[tools] Stress test improvements #5381

Merged: 9 commits, Feb 23, 2024

@CodeBlanch (Member) commented Feb 22, 2024

Some improvements to the stress tests I made to help in testing #5364.

Changes

  • There is now a CLI which allows you to do this type of thing: OpenTelemetry.Tests.Stress.Metrics.exe -t Counter -v -o -e -d 300
  • While the test is running you will now get some statistics on the console by default.
  • Introduced some class hierarchy instead of linking in the Skeleton/Program from the demo test into the actual tests.

Options

Shared by all signals:

  -c, --concurrency      The concurrency (number of executing threads) for the stress test. Default value: Environment.ProcessorCount.

  -p, --internal_port    The prometheus http listener port where prometheus will be exposed for retrieving internal metrics while the stress test is running. Set to '0' to disable. Default value: 9464.

  -d, --duration         The duration for the stress test to run in seconds. If set to '0' or a negative value the stress test will run until canceled. Default value: 0.

Metrics:

  -t, --type             The metrics stress test type to run. Valid values: [Histogram, Counter]. Default value: Histogram.

  -m, --metrics_port     The prometheus http listener port where prometheus will be exposed for retrieving test metrics while the stress test is running. Set to '0' to disable. Default value: 9185.

  -v, --view             Whether or not a view should be configured to filter tags for the stress test. Default value: False.

  -o, --otlp             Whether or not an OTLP exporter should be added for the stress test. Default value: False.

  -i, --interval         The OTLP exporter export interval in milliseconds. Default value: 5000.

  -e, --exemplars        Whether or not to enable exemplars for the stress test. Default value: False.
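
The --concurrency default and the --duration rule described above could be wired up roughly as follows. This is a minimal sketch, not the PR's implementation: the StressOptionsSketch class, its property names, and the CreateStopSource helper are hypothetical. Only the default values and the "0 or negative runs until canceled" rule come from the option descriptions.

```csharp
using System;
using System.Threading;

// Hypothetical options holder; names are illustrative, not taken from the PR.
public sealed class StressOptionsSketch
{
    // -c, --concurrency: defaults to the number of logical processors.
    public int Concurrency { get; set; } = Environment.ProcessorCount;

    // -d, --duration: seconds; 0 or a negative value means "run until canceled".
    public int DurationSeconds { get; set; } = 0;

    // Returns a source that cancels itself once the duration elapses, or one
    // that never cancels on its own when the duration is 0 or negative.
    public CancellationTokenSource CreateStopSource()
    {
        var cts = new CancellationTokenSource();
        if (this.DurationSeconds > 0)
        {
            cts.CancelAfter(TimeSpan.FromSeconds(this.DurationSeconds));
        }

        return cts;
    }
}
```

Under this sketch a caller would hand the returned token to its worker loops and, in the run-until-canceled case, call Cancel() on the source from something like a Console.CancelKeyPress handler.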

@CodeBlanch CodeBlanch requested a review from a team February 22, 2024 23:58
codecov bot commented Feb 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.22%. Comparing base (6250307) to head (93953fd).
Report is 97 commits behind head on main.

@@            Coverage Diff             @@
##             main    #5381      +/-   ##
==========================================
- Coverage   83.38%   83.22%   -0.16%     
==========================================
  Files         297      279      -18     
  Lines       12531    11961     -570     
==========================================
- Hits        10449     9955     -494     
+ Misses       2082     2006      -76     
Flag                              Coverage Δ
unittests                         ?
unittests-Solution-Experimental   83.22% <ø> (?)
unittests-Solution-Stable         82.97% <ø> (?)

see 48 files with indirect coverage changes

@@ -145,10 +181,13 @@ public static void Stress(int concurrency = 0, int prometheusPort = 0)
 var cntCpuCyclesTotal = GetCpuCycles();
 var cpuCyclesPerLoopTotal = cntLoopsTotal == 0 ? 0 : cntCpuCyclesTotal / cntLoopsTotal;
 Console.WriteLine("Stopping the stress test...");
-Console.WriteLine($"* Total Runaway Time (seconds) {watchForTotal.Elapsed.TotalSeconds:n0}");
+Console.WriteLine($"* Total Runway Time (Seconds) {watchForTotal.Elapsed.TotalSeconds:n0}");

Member

Update the README file as well.

Member Author

Forgot to ask, was "Runaway" intentional? In the running output we have "Runway" so it seemed like a typo.

Member

I don't know what the perfect term is; I just picked one that seems to fit the intention: http://osr507doc.xinuos.com/en/HANDBOOK/runaway_proc.html (e.g. if the user keeps this process running, it will consume lots of system resources).

The comment here is more about emphasizing that "whatever term we choose, we have to be consistent".

Member Author

I just switched it to "Running Time" in both spots. Seemed less likely to cause any confusion but I don't absolutely love it 😄

Let me do README updates in a follow-up as this is getting a bit big already?

Contributor

> Forgot to ask, was "Runaway" intentional? In the running output we have "Runway" so it seemed like a typo.

#3496 (comment) 😀

Contributor

@rajkumar-rangaraj rajkumar-rangaraj left a comment

LGTM

@CodeBlanch CodeBlanch merged commit 73b6e30 into open-telemetry:main Feb 23, 2024
37 checks passed
@CodeBlanch CodeBlanch deleted the stress-test-improvements branch February 23, 2024 23:50