
Remoted process at 100% of CPU usage in a Wazuh master node without agents #3369

Closed
QU3B1M opened this issue Sep 27, 2022 · 0 comments · Fixed by #3565
QU3B1M commented Sep 27, 2022

| Target version | Related issue | Related PR/dev branch |
|----------------|---------------|-----------------------|
| 4.5            | #14733        | In Progress           |

Description

This change should fix a remoted performance issue: when the cluster has many multigroups configured, cluster synchronization consumes excessive CPU and takes a long time to finish.

Configurations

Proposed test cases

  • Cluster synchronization does not make the nodes' CPU usage collapse

    Tier 1 test cases
    • No node's CPU should collapse after restarting the master node with 200 multigroups
      1. Install and set up a Wazuh cluster of one master and two workers
      /var/ossec/bin/cluster_control -l
      
      NAME     TYPE    VERSION  ADDRESS        
      master   master  4.3.8    192.168.56.10  
      worker2  worker  4.3.8    192.168.56.12  
      worker1  worker  4.3.8    192.168.56.11  
      
      2. Check the system CPU usage of every node
      3. Add 3 agents and 7 groups, combining group assignments to get 200 multigroups

      This can be done by inserting the records directly into the database (sqlite3 /var/ossec/queue/db/global.db, tables: agent, group, belongs)

      4. Restart the master node
      /var/ossec/bin/wazuh-control restart
      
      5. Check that the system CPU usage of every node is not too high

      It depends on the machine's CPU, but it should not be at 100% usage
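The record-insertion step above can be sketched against an in-memory database. The column layout below (agent id/name, group id/name, belongs id_agent/id_group) is a simplified approximation of the relevant global.db tables, assumed for illustration only; on a real node the inserts would target sqlite3 /var/ossec/queue/db/global.db, whose actual schema has more columns.

```python
import itertools
import sqlite3

# Assumed, simplified approximation of the relevant global.db tables.
# "group" is a reserved word in SQLite, so the table name must be quoted.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE agent ("id" INTEGER PRIMARY KEY, "name" TEXT);
    CREATE TABLE "group" ("id" INTEGER PRIMARY KEY, "name" TEXT);
    CREATE TABLE belongs ("id_agent" INTEGER, "id_group" INTEGER);
""")

# 3 agents and 7 groups, as in the test case.
agents = [(i, f"agent{i}") for i in range(1, 4)]
groups = [(i, f"group{i}") for i in range(1, 8)]
conn.executemany("INSERT INTO agent VALUES (?, ?)", agents)
conn.executemany('INSERT INTO "group" VALUES (?, ?)', groups)

# Each distinct set of groups an agent belongs to forms one multigroup,
# so assigning a different group combination per agent multiplies the count.
combos = [c for r in range(2, 8) for c in itertools.combinations(range(1, 8), r)]
for (agent_id, _), combo in zip(agents, combos):
    conn.executemany("INSERT INTO belongs VALUES (?, ?)",
                     [(agent_id, gid) for gid in combo])

# Count the distinct group combinations, i.e. the multigroups.
rows = conn.execute(
    "SELECT id_agent, group_concat(id_group) FROM belongs GROUP BY id_agent"
).fetchall()
multigroups = len({groups_csv for _, groups_csv in rows})
print(multigroups)  # one multigroup per distinct group combination
```

Scaling the same idea to more agents and combinations is how the 200/300/400 multigroup targets of the test cases would be reached.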

    • No node's CPU should collapse after updating a file in each group with 200 multigroups
      1. Install and set up a Wazuh cluster of one master and two workers
      /var/ossec/bin/cluster_control -l
      
      NAME     TYPE    VERSION  ADDRESS        
      master   master  4.3.8    192.168.56.10  
      worker2  worker  4.3.8    192.168.56.12  
      worker1  worker  4.3.8    192.168.56.11  
      
      2. Check the system CPU usage of every node
      3. Add 3 agents and 7 groups, combining group assignments to get 200 multigroups

      This can be done by inserting the records directly into the database (sqlite3 /var/ossec/queue/db/global.db, tables: agent, group, belongs)

      4. Add a new file in each group directory
      echo "testing" > /var/ossec/etc/shared/<groupName>/<fileName>.txt
      
      5. Check that the system CPU usage of every node is not at 100% (the update can take up to 10 seconds)
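The file-drop step above can be looped over every group directory at once. This is a sketch: on a real node SHARED_DIR would be /var/ossec/etc/shared, while here a temporary directory with two stand-in group folders is used so the loop can be run anywhere.

```shell
# Sketch of the file-update step: write a test file into every group's
# shared directory so the cluster has to synchronize it.
SHARED_DIR=$(mktemp -d)                    # on a real node: /var/ossec/etc/shared
mkdir -p "$SHARED_DIR/default" "$SHARED_DIR/web"   # stand-in group directories
for group_dir in "$SHARED_DIR"/*/; do
  echo "testing" > "${group_dir}test_sync.txt"
done
ls "$SHARED_DIR"/*/test_sync.txt
```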
    Tier 2 test cases
    • No node's CPU should collapse after a worker restart with 200 multigroups
      1. Install and set up a Wazuh cluster of one master and two workers
      /var/ossec/bin/cluster_control -l
      
      NAME     TYPE    VERSION  ADDRESS        
      master   master  4.3.8    192.168.56.10  
      worker2  worker  4.3.8    192.168.56.12  
      worker1  worker  4.3.8    192.168.56.11  
      
      2. Check the system CPU usage of every node
      3. Add 3 agents and 7 groups, combining group assignments to get 200 multigroups

      This can be done by inserting the records directly into the database (sqlite3 /var/ossec/queue/db/global.db, tables: agent, group, belongs)

      4. Restart a worker node
      /var/ossec/bin/wazuh-control restart
      
      5. Check that the system CPU usage of every node is the same as before

      There may be variations, but it should not be high
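The repeated "check the system CPU usage" steps can be scripted so each node reports the CPU of the relevant daemons. The daemon names below are the standard Wazuh process names; the snippet assumes a procps-style ps (Linux), and on a machine where the daemons are absent it simply reports them as not running.

```shell
# Sketch: sample per-process CPU usage of the cluster-related Wazuh daemons.
for proc in wazuh-remoted wazuh-clusterd; do
  ps -C "$proc" -o pid=,pcpu=,comm= || echo "$proc not running"
done
```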

    • No node's CPU should collapse after restarting the master node with 300 multigroups
      1. Install and set up a Wazuh cluster of one master and two workers
      /var/ossec/bin/cluster_control -l
      
      NAME     TYPE    VERSION  ADDRESS        
      master   master  4.3.8    192.168.56.10  
      worker2  worker  4.3.8    192.168.56.12  
      worker1  worker  4.3.8    192.168.56.11  
      
      2. Check the system CPU usage of every node
      3. Add 3 agents and 7 groups, combining group assignments to get 300 multigroups

      This can be done by inserting the records directly into the database (sqlite3 /var/ossec/queue/db/global.db, tables: agent, group, belongs)

      4. Restart the master node
      /var/ossec/bin/wazuh-control restart
      
      5. Check that the system CPU usage of every node is not too high

      It depends on the machine's CPU, but it should not be at 100% usage

    • No node's CPU should collapse after restarting the master node with 400 multigroups
      1. Install and set up a Wazuh cluster of one master and two workers
      /var/ossec/bin/cluster_control -l
      
      NAME     TYPE    VERSION  ADDRESS        
      master   master  4.3.8    192.168.56.10  
      worker2  worker  4.3.8    192.168.56.12  
      worker1  worker  4.3.8    192.168.56.11  
      
      2. Check the system CPU usage of every node
      3. Add 3 agents and 7 groups, combining group assignments to get 400 multigroups

      This can be done by inserting the records directly into the database (sqlite3 /var/ossec/queue/db/global.db, tables: agent, group, belongs)

      4. Restart the master node
      /var/ossec/bin/wazuh-control restart
      
      5. Check that the system CPU usage of every node is not too high

      It depends on the machine's CPU, but it should not be at 100% usage

Considerations

@QU3B1M QU3B1M self-assigned this Sep 27, 2022
@damarisg damarisg added this to the Development 4.5 milestone Sep 27, 2022
@QU3B1M QU3B1M changed the title Tests development - Remoted process at 100% of CPU usage in a Wazuh master node without agents Remoted process at 100% of CPU usage in a Wazuh master node without agents Oct 4, 2022
@TomasTurina TomasTurina linked a pull request Nov 4, 2022 that will close this issue
QU3B1M added a commit that referenced this issue Nov 4, 2022
QU3B1M added a commit that referenced this issue Nov 7, 2022