Update cluster CSV parser tool and test thresholds #2468

Closed
2 tasks done
Selutario opened this issue Jan 24, 2022 · 3 comments · Fixed by #2631
Comments

@Selutario
Contributor

Selutario commented Jan 24, 2022

In #2032, a tool was developed to iterate over, load, and process CSVs containing resource-usage data for the wazuh-clusterd process.

Since then, several developments (wazuh/wazuh#10767, wazuh/wazuh#10807, wazuh/wazuh#10920, wazuh/wazuh#11364) have given the Wazuh cluster multiprocessing capabilities.

This has boosted its performance, but it also means that new child processes are created, and they use resources too. The tool does not take these child processes into account, so it needs to be updated.

In addition, once the tool aggregates information from all processes, the resource thresholds set in the performance test (#1939) will probably be far too low, so they will need to be updated as well.

  • Update CSV parser tool.
  • Update cluster performance test.
@yanazaeva
Contributor

yanazaeva commented Mar 1, 2022

Issue update

A method was added to the CSVParser class to sum the resource usage of the parent process and its child processes. It is not yet clear how best to present the CPU total, since a plain sum can be confusing when it exceeds 100%.
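As a rough illustration of the aggregation described above, the sketch below sums the samples of every process that share a timestamp. The column names (`Timestamp`, `PID`, `USS(KB)`, `CPU(%)`, `FD`), the sample data, and the function `sum_processes` are assumptions made for this example; this is not the actual CSVParser implementation. It also shows one possible way to present the CPU total: next to the raw sum, a second value divided by an assumed core count, which stays within 0-100%.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV layout: one row per process per sampling instant.
SAMPLE = """Timestamp,PID,USS(KB),CPU(%),FD
2022/03/02 12:00:00,100,90000,70.0,40
2022/03/02 12:00:00,101,20000,45.0,10
2022/03/02 12:00:05,100,91000,60.0,40
2022/03/02 12:00:05,101,21000,30.0,12
"""

METRICS = ("USS(KB)", "CPU(%)", "FD")


def sum_processes(csv_text, n_cores=1):
    """Sum the metrics of all processes (parent and children) per timestamp."""
    totals = defaultdict(lambda: dict.fromkeys(METRICS, 0.0))
    for row in csv.DictReader(io.StringIO(csv_text)):
        for metric in METRICS:
            totals[row["Timestamp"]][metric] += float(row[metric])
    # Optional presentation: CPU relative to the whole machine (0-100%).
    for values in totals.values():
        values["CPU(% of machine)"] = values["CPU(%)"] / n_cores
    return dict(totals)


# n_cores=4 is an assumed value for the example machine.
aggregated = sum_processes(SAMPLE, n_cores=4)
for timestamp, values in sorted(aggregated.items()):
    print(timestamp, values)
```

Keeping both figures lets the thresholds stay expressed as a plain sum while reports can show the normalized value.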

10 workers - 50000 agents

In order to establish the new thresholds, we will use the data from the following issue:

In addition, we will use the data extracted from an environment created yesterday with 50000 agents and 10 workers. This information can be obtained here.

Tasks

  • Setup phase:
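| Task | Node | Metric | Real value | Old threshold | New threshold |
|---|---|---|---|---|---|
| Agent info sync | Worker | Mean (s) | 1.942 | 10 | 4 |
| Agent info sync | Worker | Max (s) | 10.083 | 40 | 21 |
| Agent info sync | Master | Mean (s) | 0.774 | 5 | 1.5 |
| Agent info sync | Master | Max (s) | 4.222 | 30 | 8.5 |
| Integrity check | Worker | Mean (s) | 3.547 | 15 | 7.5 |
| Integrity check | Worker | Max (s) | 12.757 | 50 | 26 |
| Integrity check | Master | Mean (s) | 1.49 | 10 | 3 |
| Integrity check | Master | Max (s) | 7.831 | 50 | 16 |
| Integrity sync | Worker | Mean (s) | 0.71 | 20 | 1.5 |
| Integrity sync | Worker | Max (s) | 4.923 | 50 | 10 |
| Integrity sync | Master | Mean (s) | 1.771 | 30 | 3.6 |
| Integrity sync | Master | Max (s) | 10.102 | 80 | 20.2 |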
  • Stable phase:
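| Task | Node | Metric | Real value | Old threshold | New threshold |
|---|---|---|---|---|---|
| Agent info sync | Worker | Mean (s) | 2.024 | 1.43 | 4.1 |
| Agent info sync | Worker | Max (s) | 2.585 | 10 | 5.1 |
| Agent info sync | Master | Mean (s) | 0.67 | 0.5242 | 1.4 |
| Agent info sync | Master | Max (s) | 1.462 | 10 | 3 |
| Integrity check | Worker | Mean (s) | 3.356 | 3.32 | 6.7 |
| Integrity check | Worker | Max (s) | 4.45 | 9 | 9 |
| Integrity check | Master | Mean (s) | 1.25 | 1.45 | 2.5 |
| Integrity check | Master | Max (s) | 2.269 | 9 | 4.5 |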

Resources

  • Setup phase
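| Resource | Node | Metric | Real value | Old threshold | New threshold |
|---|---|---|---|---|---|
| USS (KB) | Worker | Mean | 94531 | 153600 | 103984 |
| USS (KB) | Worker | Max | 188680 | 307200 | 208700 |
| USS (KB) | Worker | Reg. coef. | 298 | 1024 | 330 |
| USS (KB) | Master | Mean | 147485 | 1048576 | 168979 |
| USS (KB) | Master | Max | 285784 | 2097152 | 315000 |
| USS (KB) | Master | Reg. coef. | 497 | 102400 | 550 |
| CPU (%) | Worker | Mean | 11.06 | 10 | 12.166 |
| CPU (%) | Worker | Max | 54.2 | 110 | 60 |
| CPU (%) | Worker | Reg. coef. | 0.011 | 0.5 | 0.05 |
| CPU (%) | Master | Mean | 42.37 | 60 | 47 |
| CPU (%) | Master | Max | 120 | 110 | 132 |
| CPU (%) | Master | Reg. coef. | 0.045 | 1.5 | 0.05 |
| FD | Worker | Mean | 59.85 | 30 | 66 |
| FD | Worker | Max | 72.6 | 150 | 79.2 |
| FD | Worker | Reg. coef. | -0.05 | 1 | 0.015 |
| FD | Master | Mean | 90.38 | 100 | 100 |
| FD | Master | Max | 117 | 200 | 128.7 |
| FD | Master | Reg. coef. | -0.25 | 5 | 0.01 |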
  • Stable phase
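| Resource | Node | Metric | Real value | Old threshold | New threshold |
|---|---|---|---|---|---|
| USS (KB) | Worker | Mean | 130176 | 117506.60276 | 145000 |
| USS (KB) | Worker | Max | 130176 | 189895.2 | 201480 |
| USS (KB) | Worker | Reg. coef. | 7477 | 30 | 8224 |
| USS (KB) | Master | Mean | 179239 | 240663.67742 | 197162 |
| USS (KB) | Master | Max | 186164 | 309394.8 | 204781 |
| USS (KB) | Master | Reg. coef. | 190.54 | 30 | 209.6 |
| CPU (%) | Worker | Mean | 12.76 | 15 | 14.036 |
| CPU (%) | Worker | Max | 34.8 | 50 | 38.28 |
| CPU (%) | Worker | Reg. coef. | 0.749 | 0.05 | 0.8239 |
| CPU (%) | Master | Mean | 46.38 | 47 | 51.018 |
| CPU (%) | Master | Max | 73.7 | 70 | 81.07 |
| CPU (%) | Master | Reg. coef. | -0.081 | 0.05 | 0.0891 |
| FD | Worker | Mean | 63.04 | 16 | 70 |
| FD | Worker | Max | 64 | 20 | 70.4 |
| FD | Worker | Reg. coef. | 2.47 | 0.0040 | 2.717 |
| FD | Master | Mean | 45.4 | 26.77512 | 50 |
| FD | Master | Max | 48 | 65 | 52.8 |
| FD | Master | Reg. coef. | -0.01 | 0.001 | 0.05 |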
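
Each resource above is summarized by its mean, its maximum, and a regression coefficient. The issue does not define that coefficient, so the sketch below assumes it is the least-squares slope of the samples against their sample index. The helper names (`slope`, `summarize`, `exceeded`) and the sample values are hypothetical; only the three thresholds come from the stable-phase master FD rows.

```python
def slope(values):
    """Least-squares slope of values against their sample index (needs >= 2 samples)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def summarize(values):
    """Reduce a series of samples to the three statistics used in the tables."""
    return {
        "mean": sum(values) / len(values),
        "max": max(values),
        "reg_cof": slope(values),
    }


def exceeded(stats, thresholds):
    """Return the names of the statistics that are above their threshold."""
    return [name for name, value in stats.items() if value > thresholds[name]]


# New thresholds for master FD in the stable phase (10 workers).
fd_thresholds = {"mean": 50, "max": 52.8, "reg_cof": 0.05}

# Made-up samples: one series oscillates, the other drifts upward.
steady = [45, 46, 45, 46, 45]
growing = [45, 45, 46, 45, 46]

print(exceeded(summarize(steady), fd_thresholds))
print(exceeded(summarize(growing), fd_thresholds))
```

With so few samples a single extra file descriptor is enough to push the slope over a tight threshold, which may be why the regression thresholds needed the largest adjustments.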

Below we can find the results for both environments:

(wazuh_qa_env) yanazaeva@pop-os:~/git/wazuh-qa/deps/wazuh_testing$ python3.9 -m pytest ../../tests/performance/test_cluster/test_cluster_performance/test_cluster_performance.py --artifacts_path=/docs/threshold_issue/rc2/50000A10W --n_workers=10 --n_agents=50000 -s
=============================================================================================== test session starts ===============================================================================================
platform linux -- Python 3.9.5, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
rootdir: /home/yanazaeva/git/wazuh-qa
plugins: testinfra-5.0.0, metadata-1.11.0, html-3.1.1
collected 1 item                                                                                                                                                                                                  

../../tests/performance/test_cluster/test_cluster_performance/test_cluster_performance.py Setup phase took 0:22:09s (2021/12/23 14:27:43 - 2021/12/23 14:49:52).
Stable phase took 0:00:41s (2021/12/23 14:49:52 - 2021/12/23 14:50:33).
.

================================================================================================ 1 passed in 0.35s ================================================================================================
(wazuh_qa_env) yanazaeva@pop-os:~/git/wazuh-qa/deps/wazuh_testing$ python3.9 -m pytest ../../tests/performance/test_cluster/test_cluster_performance/test_cluster_performance.py --artifacts_path=/docs/threshold_issue/pipeline_50000A_10W/50000A10W --n_workers=10 --n_agents=50000 -s
=============================================================================================== test session starts ===============================================================================================
platform linux -- Python 3.9.5, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
rootdir: /home/yanazaeva/git/wazuh-qa
plugins: testinfra-5.0.0, metadata-1.11.0, html-3.1.1
collected 1 item                                                                                                                                                                                                  

../../tests/performance/test_cluster/test_cluster_performance/test_cluster_performance.py Setup phase took 0:19:04s (2022/03/02 11:57:11 - 2022/03/02 12:16:15).
Stable phase took 0:05:34s (2022/03/02 12:16:15 - 2022/03/02 12:21:49).
.

================================================================================================ 1 passed in 0.39s ================================================================================================

25 workers - 50000 agents

Using the highest values extracted from these two issues, we changed the thresholds for this environment as follows:

Tasks

  • Setup phase:
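| Task | Node | Metric | Real value | Old threshold | New threshold |
|---|---|---|---|---|---|
| Agent info sync | Worker | Mean (s) | 3.81 | 20 | 8 |
| Agent info sync | Worker | Max (s) | 24.431 | 60 | 50 |
| Agent info sync | Master | Mean (s) | 1.55 | 10 | 3.1 |
| Agent info sync | Master | Max (s) | 15.228 | 50 | 31 |
| Integrity check | Worker | Mean (s) | 6.618 | 30 | 13.5 |
| Integrity check | Worker | Max (s) | 27.152 | 150 | 55 |
| Integrity check | Master | Mean (s) | 4.106 | 20 | 8.3 |
| Integrity check | Master | Max (s) | 24.132 | 150 | 50 |
| Integrity sync | Worker | Mean (s) | 1.6 | 20 | 3.2 |
| Integrity sync | Worker | Max (s) | 10.574 | 150 | 22 |
| Integrity sync | Master | Mean (s) | 5.421 | 40 | 11 |
| Integrity sync | Master | Max (s) | 26.903 | 250 | 54 |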
  • Stable phase:
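| Task | Node | Metric | Real value | Old threshold | New threshold |
|---|---|---|---|---|---|
| Agent info sync | Worker | Mean (s) | 1.149 | 2.05 | 3.3 |
| Agent info sync | Worker | Max (s) | 4.252 | 10 | 8.5 |
| Agent info sync | Master | Mean (s) | 0.49 | 0.37 | 1 |
| Agent info sync | Master | Max (s) | 2.46 | 10 | 5 |
| Integrity check | Worker | Mean (s) | 2.98 | 4.21 | 6 |
| Integrity check | Worker | Max (s) | 4.92 | 9 | 10 |
| Integrity check | Master | Mean (s) | 1.45 | 2.19 | 3 |
| Integrity check | Master | Max (s) | 3.228 | 9 | 6.5 |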

Resources

  • Setup phase
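| Resource | Node | Metric | Real value | Old threshold | New threshold |
|---|---|---|---|---|---|
| USS (KB) | Worker | Mean | 105871 | 262144 | 116458 |
| USS (KB) | Worker | Max | 189728 | 524288 | 208700 |
| USS (KB) | Worker | Reg. coef. | 756 | 1024 | 831 |
| USS (KB) | Master | Mean | 196930 | 2097152 | 216623 |
| USS (KB) | Master | Max | 972660 | 4194304 | 1069926 |
| USS (KB) | Master | Reg. coef. | 302 | 524288 | 332.64 |
| CPU (%) | Worker | Mean | 8.499 | 10 | 9.35 |
| CPU (%) | Worker | Max | 43.5 | 110 | 47.85 |
| CPU (%) | Worker | Reg. coef. | 0.027 | 0.5 | 0.06 |
| CPU (%) | Master | Mean | 72.141 | 75 | 79.42 |
| CPU (%) | Master | Max | 182.7 | 110 | 200.1 |
| CPU (%) | Master | Reg. coef. | 0.635 | 1.5 | 0.12 |
| FD | Worker | Mean | 63.32 | 20 | 70 |
| FD | Worker | Max | 66 | 150 | 72.6 |
| FD | Worker | Reg. coef. | 0.017 | 1 | 0.02 |
| FD | Master | Mean | 93.94 | 200 | 103.4 |
| FD | Master | Max | 145.0 | 500 | 160 |
| FD | Master | Reg. coef. | -0.23 | 10 | 0.05 |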
  • Stable phase
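| Resource | Node | Metric | Real value | Old threshold | New threshold |
|---|---|---|---|---|---|
| USS (KB) | Worker | Mean | 143019 | 113246.83736 | 157321 |
| USS (KB) | Worker | Max | 212060 | 190260.4 | 233266 |
| USS (KB) | Worker | Reg. coef. | 825 | 30 | 910 |
| USS (KB) | Master | Mean | 284936 | 734003 | 313429 |
| USS (KB) | Master | Max | 507268 | 1048576 | 557994 |
| USS (KB) | Master | Reg. coef. | -839 | 30 | 850 |
| CPU (%) | Worker | Mean | 11.50 | 15 | 12.65 |
| CPU (%) | Worker | Max | 40.4 | 50 | 44.44 |
| CPU (%) | Worker | Reg. coef. | 0.09 | 0.05 | 0.12 |
| CPU (%) | Master | Mean | 48.03 | 84.33028 | 53 |
| CPU (%) | Master | Max | 95 | 100 | 104.5 |
| CPU (%) | Master | Reg. coef. | -0.09 | 0.05 | 0.12 |
| FD | Worker | Mean | 65.046 | 16 | 72 |
| FD | Worker | Max | 66 | 20 | 72.6 |
| FD | Worker | Reg. coef. | 0.3 | 0.0040 | 0.33 |
| FD | Master | Mean | 52.36 | 44.385 | 59 |
| FD | Master | Max | 64 | 75 | 70.5 |
| FD | Master | Reg. coef. | -0.09 | 0.001 | 0.11 |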
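
The approach of taking the highest value seen across runs and leaving some headroom can be sketched as follows. The 10% margin and the function `propose_threshold` are assumptions for illustration. That margin reproduces several resource rows above (for example, the setup-phase worker USS mean goes from 105871 to 116458), but the issue does not state the rule, and other rows were evidently rounded or adjusted by hand.

```python
def propose_threshold(observed, margin=0.10):
    """Highest observed value plus a relative safety margin (hypothetical rule)."""
    return max(observed) * (1 + margin)


# Setup-phase worker USS mean in KB (25 workers). The first value is the
# "Real value" from the table; the second is made up to stand in for the
# lower result of another run.
uss_mean_runs = [105871, 101200]
proposed = round(propose_threshold(uss_mean_runs))
print(proposed)

# Setup-phase worker CPU max (%), a single run from the table.
print(round(propose_threshold([43.5]), 2))
```

A relative margin keeps the thresholds close enough to the measured values to catch regressions, while a second run guards against setting them from an unusually quiet execution.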

We can see the output of the test below:

(wazuh_qa_env) yanazaeva@pop-os:~/git/wazuh-qa/deps/wazuh_testing$ python3.9 -m pytest ../../tests/performance/test_cluster/test_cluster_performance/test_cluster_performance.py --artifacts_path=/docs/threshold_issue/rc3/50000A25W --n_workers=25 --n_agents=50000 -s
=============================================================================================== test session starts ===============================================================================================
platform linux -- Python 3.9.5, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
rootdir: /home/yanazaeva/git/wazuh-qa
plugins: testinfra-5.0.0, metadata-1.11.0, html-3.1.1
collected 1 item                                                                                                                                                                                                  

../../tests/performance/test_cluster/test_cluster_performance/test_cluster_performance.py Setup phase took 0:18:26s (2022/01/21 13:59:33 - 2022/01/21 14:17:59).
Stable phase took 0:09:55s (2022/01/21 14:17:59 - 2022/01/21 14:27:54).
.

================================================================================================ 1 passed in 0.63s ================================================================================================
(wazuh_qa_env) yanazaeva@pop-os:~/git/wazuh-qa/deps/wazuh_testing$ python3.9 -m pytest ../../tests/performance/test_cluster/test_cluster_performance/test_cluster_performance.py --artifacts_path=/docs/threshold_issue/rc4/50000A-25W --n_workers=25 --n_agents=50000 -s
=============================================================================================== test session starts ===============================================================================================
platform linux -- Python 3.9.5, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
rootdir: /home/yanazaeva/git/wazuh-qa
plugins: testinfra-5.0.0, metadata-1.11.0, html-3.1.1
collected 1 item                                                                                                                                                                                                  

../../tests/performance/test_cluster/test_cluster_performance/test_cluster_performance.py Setup phase took 0:21:27s (2022/02/15 15:42:54 - 2022/02/15 16:04:21).
Stable phase took 0:08:53s (2022/02/15 16:04:21 - 2022/02/15 16:13:14).
.

================================================================================================ 1 passed in 0.67s ================================================================================================

@davidjiglesias
Member

Hey team! Please add your planning poker estimate with ZenHub @AdriiiPRodri @Selutario @yanazaeva

@yanazaeva
Contributor

Issue update

The thresholds were updated according to these results:
50000A10w.zip
50000A25W.zip
