No way to enable audit logging in rke2 outside of cis-X.Y profiles #1183

Closed

mitchtys opened this issue Jun 24, 2021 · 14 comments
Labels: kind/dev-validation, kind/internal

@mitchtys

Is your feature request related to a problem? Please describe.

Currently the only way to get k8s audit logs in rke2 is to add something akin to the following when building an rke2 cluster:

profile: cis-1.5|cis-1.6

However, if one doesn't want a CIS-hardened cluster, there is no intuitive/easy way to enable audit logs.

The best workaround seems to be to append the apiserver args manually:

kube-apiserver-arg:
  - audit-log-format=json
  - audit-log-maxage=5
  - audit-log-maxbackup=5
  - audit-log-maxsize=100
  - audit-log-path=/var/lib/rancher/rke2/server/logs/audit.log
  - audit-policy-file=/etc/rancher/rke2/audit-policy.yaml

Touch /var/lib/rancher/rke2/server/logs/audit.log

Then restart rke2-server.

I tested with this spam policy config:

# cat /etc/rancher/rke2/audit-policy.yaml 
---
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
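For reference, the same audit.k8s.io/v1 Policy format also supports more selective rules than Metadata-for-everything. The snippet below is only an illustrative sketch; the specific filters in it are assumptions, not anything tested as part of this issue:

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Assumed noise filter: skip health and version probes entirely
- level: None
  nonResourceURLs:
  - /healthz*
  - /version
# Assumed example: record request bodies for write operations
- level: Request
  verbs: ["create", "update", "patch", "delete"]
# Everything else at Metadata, as in the policy above
- level: Metadata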

Describe the solution you'd like

Regardless of the exact mechanism, there should be a way to enable audit logs outside of CIS mode such that the file pointed to by audit-log-path can be written to. As far as controlling backups etc. goes, just using the apiserver args should be fine.

Describe alternatives you've considered

Tried manually adding static mounts and pointing audit-log-path to a new file (this fails, since the apiserver needs to be able to write to it as a file, and without a mount of the parent dir it fails).

One thought was to cat the audit log file out to the audit-log-path file but I doubt that would work.

Other chicanery that ultimately didn't work well and isn't worth polluting electrons with.

Additional context

@brandond (Member)

As discussed in Slack, we have a --audit-policy-file flag, but it only works when a CIS profile is enabled. We should modify this to set up audit logging when this flag is set to a non-empty string, and have the CIS profile code simply set this to the default value if not already set by the user.

There are some other restrictions - audit logging currently only works properly if the audit log file is set to /var/lib/rancher/rke2/server/logs/audit.log, due to how we set up the mounts in the kube-apiserver static pod. That should be fine if we document it properly.
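If that change lands as described, enabling audit logging without a CIS profile should presumably come down to a config.yaml along these lines (a sketch of the proposed behavior, not final syntax):

# /etc/rancher/rke2/config.yaml -- sketch, assuming the proposal above
# No "profile:" line, i.e. no CIS hardening.
audit-policy-file: /etc/rancher/rke2/audit-policy.yaml
# Note: the log itself would still need to live at
# /var/lib/rancher/rke2/server/logs/audit.log due to the static pod mounts.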

@mitchtys (Author)

Sounds good. Presumably if --audit-policy-file is set, that would be ordered after any kube-apiserver-arg values, correct? That way, if someone did specify both, --audit-policy-file would take precedence.

@brandond (Member)

Yeah, user args take precedence over those set by the supervisor.
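In other words, a user-supplied kube-apiserver-arg wins over the same flag set by the supervisor. A hypothetical illustration (the values here are made up, not rke2 defaults):

# /etc/rancher/rke2/config.yaml -- sketch
audit-policy-file: /etc/rancher/rke2/audit-policy.yaml
kube-apiserver-arg:
  # Hypothetical override: even if the supervisor sets its own audit-log-maxage,
  # this user-supplied value is what ends up on the apiserver command line.
  - audit-log-maxage=30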

@brandond (Member)

brandond commented Aug 6, 2021

@cjellick did this need to get backported to 1.21? I only see a PR for master.

@bmdepesa added the kind/dev-validation label on Aug 12, 2021
@galal-hussein (Contributor)

Validated with master commit id 9bb5c27f29cf3dd831653543d274c948a225385

Steps to validate:

  • Install rke2 server
  • Edit the config.yaml to include:

audit-policy-file: /tmp/testaudit.yaml

  • Start rke2 server

I can see that /tmp/testaudit.yaml is filled with the following:

apiVersion: audit.k8s.io/v1
kind: Policy
metadata:
  creationTimestamp: null
rules:
- level: None

  • Change the level to Metadata
  • Restart rke2 server

I can see logs in /var/lib/rancher/rke2/server/logs/audit.log

@dereknola (Member)

dereknola commented Feb 8, 2022

/backport v1.21.10+rke2r1

@eHildy

eHildy commented Sep 8, 2022

As discussed in Slack, we have a --audit-policy-file flag, but it only works when a CIS profile is enabled. We should modify this to set up audit logging when this flag is set to a non-empty string, and have the CIS profile code simply set this to the default value if not already set by the user.

There are some other restrictions - audit logging currently only works properly if the audit log file is set to /var/lib/rancher/rke2/server/logs/audit.log, due to how we set up the mounts in the kube-apiserver static pod. That should be fine if we document it properly.

@brandond this is actually an issue on STIG'd systems. The partition STIG says that all audit logs must be on a separate partition that is mounted at /var/log/audit.

We've been trying to change this location by adding an extra mount to kube-apiserver and changing the log path. It works, but only until you restart the node.

@brandond (Member)

brandond commented Sep 8, 2022

We've been trying to change this location by adding an extra mount to kube-apiserver and changing the log path. It works, but only until you restart the node.

@eHildy How are you doing this? You're not modifying the apiserver static pod manifest directly, are you? That is not supported. You should use the kube-apiserver-arg and kube-apiserver-extra-mount options to change the contents of the apiserver static pod.
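For illustration, a minimal sketch of that supported route with the custom path discussed in this thread might look like the following (brandond posts a fuller example further down):

# /etc/rancher/rke2/config.yaml -- sketch only
kube-apiserver-arg:
  - audit-log-path=/var/log/audit/rke2.log
kube-apiserver-extra-mount:
  - /var/log/audit:/var/log/audit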

@eHildy

eHildy commented Sep 8, 2022

We've been trying to change this location by adding an extra mount to kube-apiserver and changing the log path. It works, but only until you restart the node.

@eHildy How are you doing this? You're not modifying the apiserver static pod manifest directly, are you? That is not supported. You should use the kube-apiserver-arg and kube-apiserver-extra-mount options to change the contents of the apiserver static pod.

@brandond That's how we're doing it, yes. Seems to work fine until you restart nodes. Then you get:

command failed" err="ensureLogFile: open /var/log/audit/rke2.log: read-only file system

even when the file perms on the dirs and the logs themselves are exactly the same: 755.

@eHildy

eHildy commented Sep 8, 2022

@brandond I do apologize for mentioning that issue on this closed issue. I'm just remarking that the location is fine unless you're trying to accredit a system for the government and they enforce STIG partition rules; audit log files can't be in that location in that case.

@brandond (Member)

brandond commented Sep 8, 2022

Which way are you doing it - passing additional args to RKE2, or manually modifying the static pod manifests?

The comment you quoted is very old, and shouldn't be relevant since the PRs linked above were merged. You should be able to put it anywhere, if you configure RKE2 correctly.

@eHildy

eHildy commented Sep 8, 2022

@brandond we use a yaml file to create a cluster in Rancher and set kube-apiserver-extra-mount to - /var/log/audit:/var/log/audit.

Then we set kube-apiserver-arg in that same yaml file to - audit-log-path=/var/log/audit/rke2.log

It works when you provision the cluster but dies after a node reboots, though not necessarily immediately. Some nodes survive several restarts before it happens, but once it does, that node is toast: you can't get rke2-server.service to fire up again.

@brandond (Member)

brandond commented Sep 8, 2022

I can't reproduce any errors with unwritable log files, even after rebooting the node multiple times. I'm testing this with the following config.yaml:

token: token
audit-policy-file: /etc/rancher/rke2/audit.yaml
kube-apiserver-arg: audit-log-path=/var/log/audit/rke2.log
kube-apiserver-extra-mount: /var/log/audit:/var/log/audit

This isn't the sort of thing that would just break itself over time. Since this is a hardened system, I am guessing that you're also running SELinux? Have you confirmed that you have added SELinux policies to allow the apiserver pod to write to the audit log in your custom location? If it's getting broken by a reboot, I suspect that something is running and resetting file contexts (or something else like that) as the system comes up. Do you see anything in the SELinux audit log?

@eHildy

eHildy commented Sep 9, 2022

@brandond thanks again for the help, btw.
There's nothing of note in the SELinux logs, and we even turned SELinux off. This seems to be a volume mount issue, so we're going to play with switching those mounts around.

BTW, these are CentOS 7 nodes, so it may happen for you if you try it there.

A little more info: we use the same volume mount in kubelet and it has no trouble at all:

kube-apiserver-arg:
  - anonymous-auth=false
  - audit-log-path=/var/log/audit/rke2.log
  - enable-admission-plugins=NodeRestriction,PodSecurityPolicy,ValidatingAdmissionWebhook
  - request-timeout=60s
  - tls-min-version=VersionTLS12
  - feature-gates=JobTrackingWithFinalizers=true,DynamicKubeletConfig=false
  - tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
kube-apiserver-extra-mount:
  - /var/log/audit:/var/log/audit
kube-controller-manager-arg:
  - tls-min-version=VersionTLS12
  - feature-gates=JobTrackingWithFinalizers=true,DynamicKubeletConfig=false
kube-scheduler-arg:
  - tls-min-version=VersionTLS12
etcd-arg:
  - auto-tls=false
  - peer-auto-tls=false
kubelet-arg:
  - eviction-hard=imagefs.available<5%,nodefs.available<5%
  - eviction-minimum-reclaim=imagefs.available=10%,nodefs.available=10%
  - image-gc-high-threshold=85
  - image-gc-low-threshold=80
  - streaming-connection-idle-timeout=5m
  - log-file=/var/log/audit/kubelet.log
  - feature-gates=DynamicKubeletConfig=false
kubelet-extra-mount:
  - /var/log/audit:/var/log/audit
