Reset avc_cache_threshold to 512 as higher values cause performance issues #9923

fcami · 2018-09-05T12:55:21Z

Reset avc_cache_threshold to 512 which is the default in RHEL7.
Higher values cause performance problems and any improvement is marginal for most workloads.
Rationale: SELinuxProject/selinux-kernel#34 (comment)
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1548428

openshift-ci-robot · 2018-09-05T12:55:28Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: fcami
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: jmencak

If they are not already assigned, you can assign the PR to them by writing /assign @jmencak in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

roles/tuned/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

papr-bot · 2018-09-05T12:55:32Z

Can one of the admins verify this patch?
I understand the following commands:

bot, add author to whitelist
bot, test pull request
bot, test pull request once

eparis · 2018-09-05T13:24:33Z

I'd like to see a value in the 4k-16k range. 512 is a bit inadequate when you have 200 containers on a node. Though clearly 64k goes way too far in the other direction.

jeremyeder · 2018-09-05T13:58:44Z

@fcami we need to be data-driven. Let's close this PR until we have run the tests. I created a JIRA card for our team to re-qualify this setting. Until then, there is no reason at all that a customer could not adjust that value if they run into issues. OK?

So, we will post a PR to openshift-ansible to change this setting once those tests are complete (in the 4.0 timeframe).
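A minimal sketch of the manual adjustment mentioned in this comment, assuming the selinuxfs path that appears in the title of the follow-up PR (`/sys/fs/selinux/avc/cache_threshold`). The helper names `read_threshold` and `write_threshold` are hypothetical and are not part of openshift-ansible or tuned. On a real host the write typically needs root, and the value does not persist across reboots.

```python
from pathlib import Path

# Assumption: path as written in the follow-up PR title; it only exists
# on hosts where selinuxfs is mounted.
AVC_THRESHOLD = Path("/sys/fs/selinux/avc/cache_threshold")


def read_threshold(path: Path = AVC_THRESHOLD) -> int:
    """Return the current AVC cache threshold as an integer."""
    return int(path.read_text().strip())


def write_threshold(value: int, path: Path = AVC_THRESHOLD) -> None:
    """Write a new threshold (hypothetical helper; not persistent)."""
    if value <= 0:
        raise ValueError("threshold must be a positive integer")
    path.write_text(f"{value}\n")
```

Both helpers take the path as a parameter, so they can be exercised against a scratch file before being pointed at a live node.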

eparis · 2018-09-05T14:10:36Z

/ok-to-test

We have data. 65k bad. 512 better. I'd rather, if we're stabbing in the dark, go in the middle. But I can wait until 4.0.

fcami · 2018-09-05T14:13:44Z

@jeremyeder I'm concerned that 64K is much too high a default. Maybe 512 is too low, but then, could we ship 3.11 with an intermediate value? It is not easy to find out that SELinux is the root cause when a cluster with much churn uses too much CPU kernel-side.

DanyC97 · 2018-09-05T20:21:16Z

My $0.02 as a user of Origin running a few prod clusters with 2k pods each.

@jeremyeder how would you expect a user to get to the bottom of this issue without any metrics or any KB containing the info provided in the BZ above?

IMO, I agree with @fcami and @eparis: sort it out now and maybe get a KB published ASAP rather than wait for 4.0.
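On the question of metrics, one possible starting point is the kernel's per-CPU AVC counters. The sketch below assumes they are exposed at `/sys/fs/selinux/avc/cache_stats` as a header line (`lookups hits misses allocations reclaims frees`) followed by one row per CPU. That layout is an assumption to verify on the target kernel, and `parse_cache_stats` is a hypothetical helper name. It only sums the raw counters; interpreting them is left to the operator.

```python
from typing import Dict


def parse_cache_stats(text: str) -> Dict[str, int]:
    """Sum the per-CPU AVC counters into one total per column.

    Assumes the first non-empty line is a header of column names and
    every following line holds one integer per column.
    """
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return {}
    header, rows = lines[0], lines[1:]
    totals = {name: 0 for name in header}
    for row in rows:
        for name, value in zip(header, row):
            totals[name] += int(value)
    return totals
```

Sampling the totals at two points in time and comparing the deltas would show how the counters move under pod churn.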

fcami · 2018-09-05T20:37:27Z

Updated to 4K as per @eparis's comment "in the 4k-16k range".

jeremyeder · 2018-09-06T15:18:06Z

Just so you're all aware, this setting has been in place since before 3.0 GA and we've only rarely heard about any concerns. So I don't see how this is an emergency.

Regardless, I've already assigned the JIRA research card to a member of my team @ekuric and discussed with him having a recommendation in place for 3.11, before 3.11 code-freeze next Wednesday. He will update this issue accordingly.

Feel free to write a Kbase article to address releases prior to 3.11.

sdodson · 2018-09-06T16:25:25Z

/hold
Whenever @eparis and @jeremyeder say this is good to go, I'm fine pulling it in. I just want to avoid someone on my team slapping lgtm on this and it going in.

fcami · 2018-09-06T16:44:59Z

Closing since this will be fixed in 3.11 as per previous comments.

openshift-ci-robot added the size/XS label (denotes a PR that changes 0-9 lines, ignoring generated files) Sep 5, 2018

openshift-ci-robot requested review from jeremyeder and jmencak September 5, 2018 12:55

openshift-ci-robot added the needs-ok-to-test label (indicates a PR that requires an org member to verify it is safe to test) Sep 5, 2018

openshift-ci-robot removed the needs-ok-to-test label Sep 5, 2018

Lower avc_cache_threshold from 65536 to 4096 - 512 being RHEL7's default. Too high values cause performance problems. Rationale: SELinuxProject/selinux-kernel#34 (comment) Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1548428

4ee264a

openshift-ci-robot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) Sep 6, 2018

fcami closed this Sep 6, 2018

fcami deleted the avc_cache_threshold branch September 6, 2018 16:45

jeremyeder mentioned this pull request Sep 12, 2018

reducing /sys/fs/selinux/avc/cache_threshold to 8192 instead of 65535 #10027

Merged
