[BUG] Performance about Service & NetworkPolicy #4605

Closed
zsxsoft opened this issue Oct 14, 2024 · 4 comments · Fixed by #4626
Labels
bug (Something isn't working), network policy, performance (Anything that can make Kube-OVN faster)

Comments

@zsxsoft

zsxsoft commented Oct 14, 2024

Kube-OVN Version

v1.12.26

Kubernetes Version

v1.27.4

Operation-system/Kernel Version

TencentOS Server 4.0
6.6.6-2401.0.1.tl4.4.x86_64

Description

I have a cluster with ~300 Pods and ~100 NetworkPolicies. I've noticed that every time I create a Service, a significant number of UpdateNp logs are added to the kube-ovn-controller.log, and at the same time, the Dashboard shows a Work Queue Latency reaching about 1 minute.

Then I checked the code:
https://github.com/kubeovn/kube-ovn/blob/v1.12.26/pkg/controller/network_policy.go#L855-L878

The above code seems to indicate that whenever a Service is created, all Pods within the corresponding Namespace are retrieved, and then all NetworkPolicies are matched against them to fill the 'UpdateNp' queue. This not only results in O(n^2) time complexity but, in my cluster, is also equivalent to updating all NetworkPolicies.
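To make the fan-out concrete, here is a minimal sketch of the pattern as I read it. It is not the Kube-OVN code: `Pod`, `NetworkPolicy`, `matches`, and `naiveEnqueue` are hypothetical, simplified stand-ins, and the selector logic is reduced to plain label equality.

```go
package main

import "fmt"

// Pod and NetworkPolicy are hypothetical, simplified stand-ins,
// not the Kubernetes or Kube-OVN types.
type Pod struct {
	Name   string
	Labels map[string]string
}

type NetworkPolicy struct {
	Name     string
	Selector map[string]string // an empty selector matches every Pod
}

// matches reports whether every selector key/value pair is present in labels.
func matches(selector, labels map[string]string) bool {
	for k, want := range selector {
		if got, ok := labels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// naiveEnqueue mimics the behavior described above: for every Pod in the
// namespace, every matching policy is appended to the queue, so the same
// policy is enqueued once per matching Pod (pods x policies comparisons).
func naiveEnqueue(pods []Pod, nps []NetworkPolicy) []string {
	var queue []string
	for _, pod := range pods {
		for _, np := range nps {
			if matches(np.Selector, pod.Labels) {
				queue = append(queue, np.Name)
			}
		}
	}
	return queue
}

func main() {
	pods := []Pod{
		{Name: "a", Labels: map[string]string{"app": "web"}},
		{Name: "b", Labels: map[string]string{"app": "web"}},
		{Name: "c", Labels: map[string]string{"app": "db"}},
	}
	nps := []NetworkPolicy{
		{Name: "allow-web", Selector: map[string]string{"app": "web"}},
		{Name: "deny-all", Selector: map[string]string{}},
	}
	queue := naiveEnqueue(pods, nps)
	fmt.Println("enqueued events:", len(queue), queue)
}
```

With ~300 Pods and ~100 NetworkPolicies, a loop of this shape performs on the order of 30,000 comparisons per Service event and can enqueue the same policy many times.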

#1167

Since Pods already match and respond to UpdateNp requests when they are created, and no additional operations are performed on NetworkPolicies during the creation of a Service, I don't quite understand the purpose here.
Moreover, in large-scale clusters, creating/deleting Services should be a relatively common operation. In this case, I'm afraid that all requests for creating/deleting Pods will be blocked after the creation of a Service.

Is there any solution?

Steps To Reproduce

  1. Create a large number of NetworkPolicies and Pods
  2. Create/Delete a Service.

Current Behavior

/

Expected Behavior

/

@zsxsoft zsxsoft added the bug Something isn't working label Oct 14, 2024
@dosubot dosubot bot added network policy performance Anything that can make Kube-OVN faster labels Oct 14, 2024

dosubot bot commented Oct 14, 2024

Hey @zsxsoft, I'm here to help you with any bugs, questions, or becoming a contributor! 😊

To address the performance issue related to Service creation and NetworkPolicy updates in Kube-OVN, you can consider the following optimizations:

  1. Increase Probe Interval: Adjusting the default probe interval for large clusters can help reduce the load and improve performance. This change has been implemented in a recent update [1].

  2. Optimize Protocol Checks: The protocol check in NetworkPolicy updates has been streamlined to reduce latency, which is part of the recent optimizations [1].

  3. Review Pull Request #1167 ("When netpol is added to a workload, the workload's POD can be accessed using service"): This pull request specifically addresses the O(n^2) complexity issue by optimizing how Services interact with NetworkPolicies. It suggests that since Pods already handle UpdateNp requests upon creation, the additional processing during Service creation might be unnecessary. Reviewing the changes in this pull request could provide insights into further optimizations.

  4. Consider Caching Mechanisms: Implementing caching mechanisms could help reduce the need to re-evaluate all NetworkPolicies when a Service is created, thus improving performance.

These steps aim to mitigate the high latency and potential blocking of operations in large clusters.


@oilbeater
Collaborator

The purpose is to add the svc ClusterIP to the ACLs generated from NetworkPolicies. The current implementation does generate lots of duplicate events. I think we can use a set to store all the NetworkPolicies that need to be updated and then generate the related events from it to reduce the duplication.
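A minimal sketch of the set idea, for illustration only. This is not the actual patch: `dedupKeys` and the `namespace/name` key strings are hypothetical, and a plain Go map stands in for the set.

```go
package main

import (
	"fmt"
	"sort"
)

// dedupKeys collapses repeated NetworkPolicy keys so that each policy
// would be enqueued once, no matter how many Pods matched it.
// Hypothetical helper; not taken from the Kube-OVN code base.
func dedupKeys(keys []string) []string {
	seen := make(map[string]struct{}, len(keys))
	for _, k := range keys {
		seen[k] = struct{}{}
	}
	out := make([]string, 0, len(seen))
	for k := range seen {
		out = append(out, k)
	}
	sort.Strings(out) // sorted only to keep the example output stable
	return out
}

func main() {
	// Keys as they might be collected while walking the Pods of a namespace.
	keys := []string{
		"default/allow-web", "default/deny-all",
		"default/allow-web", "default/deny-all",
		"default/deny-all",
	}
	unique := dedupKeys(keys)
	fmt.Println(len(keys), "collected keys,", len(unique), "unique:", unique)
}
```

Collecting keys into a set first bounds the number of generated events by the number of NetworkPolicies rather than by Pods times NetworkPolicies. The matching loop itself still costs the same.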

@oilbeater
Collaborator

@zsxsoft can you try the patch in #4626?

@zsxsoft
Author

zsxsoft commented Oct 17, 2024

I updated my workflow to work around this issue temporarily, so I can't test the patch for now.
