Egress network policy blocks readiness probe #6476

Closed

ErikEngerd opened this issue Jul 31, 2022 · 14 comments

Comments

@ErikEngerd

I have deployed the nginx ingress controller using helm.

NAME    NAMESPACE       REVISION        UPDATED                                         STATUS          CHART                   APP VERSION
nginx   nginx           53              2022-07-03 20:30:34.187933627 +0200 CEST        deployed        nginx-ingress-0.13.2    2.2.2      

When securing my cluster with network policies, I noticed that the nginx ingress controller was failing its readiness probe. It fails even with only a single egress network policy present. That network policy is:

---
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: default-allow-nothing
  namespace: nginx
spec:
  podSelector: {}
  policyTypes:
    - Egress

The output of kubectl describe for the affected pod is:

Warning  Unhealthy  5m28s (x22 over 5m47s)  kubelet            Readiness probe failed: Get "http://10.200.208.50:8081/nginx-ready": dial tcp 10.200.208.50:8081: connect: connection refused


What did you expect to happen?

Upon deleting an nginx pod, it should come back and pass its readiness probe, since the probe connects to the pod's own IP. Even when adding explicit egress rules for TCP port 8081, it still fails.
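
For illustration, the kind of explicit egress rule I tried for the probe port looked roughly like this (a sketch from memory; the policy name is illustrative and the exact manifest may have differed):

---
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-egress-readiness
  namespace: nginx
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - ports:
        - protocol: TCP
          port: 8081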

In general network policies should not affect readiness probes.

How can we reproduce it (as minimally and precisely as possible)?

helm repo add nginx-stable https://helm.nginx.com/stable
helm install -n nginx --create-namespace nginx nginx-stable/nginx-ingress -f values.yaml

Wait until nginx is successfully running and all pods are running.

Now create the network policy:

kubectl apply -f networkpolicy.yaml # the policy shown above

Then delete one of the nginx pods.
The new pod now fails its readiness probe.

Anything else we need to know?

The network policy above is the only one I used for this test; it does not allow communication to any other pods in the system. My full set of network policies does include egress and ingress rules for the pods that the controller needs to reach; the policy above is just a minimal reproduction.

Kubernetes version

$ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.3", GitCommit:"aef86a93758dc3cb2c658dd9657ab4ad4afc21cb", GitTreeState:"clean", BuildDate:"2022-07-13T14:30:46Z", GoVersion:"go1.18.3", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.7", GitCommit:"42c05a547468804b2053ecf60a3bd15560362fc2", GitTreeState:"clean", BuildDate:"2022-05-24T12:24:41Z", GoVersion:"go1.17.10", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

kubeadm install on centos 7

OS version

Server OS version:

# On Linux:
$ cat /etc/os-release
NAME="CentOS Stream"
VERSION="8"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Stream 8"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_SUPPORT_PRODUCT_VERSION="CentOS Stream"

$ uname -a
Linux panther 4.18.0-301.1.el8.x86_64 #1 SMP Tue Apr 13 16:24:22 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux



Install tools

helm was used in the example


Container runtime (CRI) and version (if applicable)

# docker -v
Docker version 20.10.6, build 370c289


Related plugins (CNI, CSI, ...) and versions (if applicable)

CNI: calico

> kubectl exec -it -n kube-system calicoctl -- calicoctl version
Client Version:    v3.23.1
Git commit:        967e24543
Cluster Version:   v3.23.1
Cluster Type:      typha,kdd,k8s,operator,bgp,kubeadm

@ErikEngerd (Author)

A simple deployment with an httpd container and a readiness check keeps working even with a NetworkPolicy that disables all egress:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpd
  labels:
    app: httpd
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpd
  template:
    metadata:
      labels:
        app: httpd
    spec:
      containers:
        - name: httpd
          image: httpd:2.4
          ports: 
            - containerPort: 80
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /
              port: 80
              scheme: HTTP
            periodSeconds: 1
            successThreshold: 1
            timeoutSeconds: 1
---
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: default-allow-nothing
spec:
  podSelector: {}
  policyTypes:
    - Egress

It also works when replacing the deployment with a daemonset, which is what I was using for nginx.

@ErikEngerd (Author)

The following values.yaml was used for installing calico.

The values.yaml used for nginx installation:

controller:
  ## The name of the Ingress Controller daemonset or deployment.
  ## Autogenerated if not set or set to "".
  # name: nginx-ingress

  ## The kind of the Ingress Controller installation - deployment or daemonset.
  kind: daemonset

  ## Deploys the Ingress Controller for NGINX Plus.
  nginxplus: false

  # Timeout in milliseconds which the Ingress Controller will wait for a successful NGINX reload after a change or at the initial start.
  nginxReloadTimeout: 60000

  ## Support for App Protect
  appprotect:
    ## Enable the App Protect module in the Ingress Controller.
    enable: false
    ## Sets log level for App Protect. Allowed values: fatal, error, warn, info, debug, trace
    # logLevel: fatal

  ## Support for App Protect Dos
  appprotectdos:
    ## Enable the App Protect Dos module in the Ingress Controller.
    enable: false
    ## Enable debugging for App Protect Dos.
    debug: false
    ## Max number of nginx processes to support.
    maxWorkers: 0
    ## Max number of ADMD instances.
    maxDaemons: 0
    ## RAM memory size to consume in MB.
    memory: 0

  ## Enables the Ingress Controller pods to use the host's network namespace.
  hostNetwork: false

  ## Enables debugging for NGINX. Uses the nginx-debug binary. Requires error-log-level: debug in the ConfigMap via `controller.config.entries`.
  nginxDebug: false

  ## The log level of the Ingress Controller.
  logLevel: 1

  ## A list of custom ports to expose on the NGINX ingress controller pod. Follows the conventional Kubernetes yaml syntax for container ports.
  customPorts: []

  image:
    ## The image repository of the Ingress Controller.
    repository: nginx/nginx-ingress

    ## The tag of the Ingress Controller image.
    tag: "2.2.2"

    ## The pull policy for the Ingress Controller image.
    pullPolicy: IfNotPresent

  config:
    ## The name of the ConfigMap used by the Ingress Controller.
    ## Autogenerated if not set or set to "".
    # name: nginx-config

    ## The annotations of the Ingress Controller configmap.
    annotations: {}

    ## The entries of the ConfigMap for customizing NGINX configuration.
    entries: # CUSTOM
      ssl-ciphers: "ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384"
      ssl-protocols: "TLSv1.2 TLSv1.3"
      ssl-dh-param: "ingress-nginx/lb-dhparam"
      hsts: "true"
      hsts-max-age: "63072000"
      hsts-include-subdomains: "true"
      ssl-reject-handshake: "true"
      enable-ocsp: "true"
      client-max-body-size: 1000m
      #proxy-protocol: "True"
      #real-ip-header: "proxy_protocol"
      #set-real-ip-from: "192.168.178.1/24"
      #server-snippets: |
      #  more_set_headers 'Server: ';

  ## It is recommended to use your own TLS certificates and keys
  defaultTLS:
    ## The base64-encoded TLS certificate for the default HTTPS server. If not specified, a pre-generated self-signed certificate is used.
    ## Note: It is recommended that you specify your own certificate.
    cert: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN2akNDQWFZQ0NRREFPRjl0THNhWFhEQU5CZ2txaGtpRzl3MEJBUXNGQURBaE1SOHdIUVlEVlFRRERCWk8KUjBsT1dFbHVaM0psYzNORGIyNTBjbTlzYkdWeU1CNFhEVEU0TURreE1qRTRNRE16TlZvWERUSXpNRGt4TVRFNApNRE16TlZvd0lURWZNQjBHQTFVRUF3d1dUa2RKVGxoSmJtZHlaWE56UTI5dWRISnZiR3hsY2pDQ0FTSXdEUVlKCktvWklodmNOQVFFQkJRQURnZ0VQQURDQ0FRb0NnZ0VCQUwvN2hIUEtFWGRMdjNyaUM3QlBrMTNpWkt5eTlyQ08KR2xZUXYyK2EzUDF0azIrS3YwVGF5aGRCbDRrcnNUcTZzZm8vWUk1Y2Vhbkw4WGM3U1pyQkVRYm9EN2REbWs1Qgo4eDZLS2xHWU5IWlg0Rm5UZ0VPaStlM2ptTFFxRlBSY1kzVnNPazFFeUZBL0JnWlJVbkNHZUtGeERSN0tQdGhyCmtqSXVuektURXUyaDU4Tlp0S21ScUJHdDEwcTNRYzhZT3ExM2FnbmovUWRjc0ZYYTJnMjB1K1lYZDdoZ3krZksKWk4vVUkxQUQ0YzZyM1lma1ZWUmVHd1lxQVp1WXN2V0RKbW1GNWRwdEMzN011cDBPRUxVTExSakZJOTZXNXIwSAo1TmdPc25NWFJNV1hYVlpiNWRxT3R0SmRtS3FhZ25TZ1JQQVpQN2MwQjFQU2FqYzZjNGZRVXpNQ0F3RUFBVEFOCkJna3Foa2lHOXcwQkFRc0ZBQU9DQVFFQWpLb2tRdGRPcEsrTzhibWVPc3lySmdJSXJycVFVY2ZOUitjb0hZVUoKdGhrYnhITFMzR3VBTWI5dm15VExPY2xxeC9aYzJPblEwMEJCLzlTb0swcitFZ1U2UlVrRWtWcitTTFA3NTdUWgozZWI4dmdPdEduMS9ienM3bzNBaS9kclkrcUI5Q2k1S3lPc3FHTG1US2xFaUtOYkcyR1ZyTWxjS0ZYQU80YTY3Cklnc1hzYktNbTQwV1U3cG9mcGltU1ZmaXFSdkV5YmN3N0NYODF6cFErUyt1eHRYK2VBZ3V0NHh3VlI5d2IyVXYKelhuZk9HbWhWNThDd1dIQnNKa0kxNXhaa2VUWXdSN0diaEFMSkZUUkk3dkhvQXprTWIzbjAxQjQyWjNrN3RXNQpJUDFmTlpIOFUvOWxiUHNoT21FRFZkdjF5ZytVRVJxbStGSis2R0oxeFJGcGZnPT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=

    ## The base64-encoded TLS key for the default HTTPS server. Note: If not specified, a pre-generated key is used.
    ## Note: It is recommended that you specify your own key.
    key: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcEFJQkFBS0NBUUVBdi91RWM4b1JkMHUvZXVJTHNFK1RYZUprckxMMnNJNGFWaEMvYjVyYy9XMlRiNHEvClJOcktGMEdYaVN1eE9ycXgrajlnamx4NXFjdnhkenRKbXNFUkJ1Z1B0ME9hVGtIekhvb3FVWmcwZGxmZ1dkT0EKUTZMNTdlT1l0Q29VOUZ4amRXdzZUVVRJVUQ4R0JsRlNjSVo0b1hFTkhzbysyR3VTTWk2Zk1wTVM3YUhudzFtMApxWkdvRWEzWFNyZEJ6eGc2clhkcUNlUDlCMXl3VmRyYURiUzc1aGQzdUdETDU4cGszOVFqVUFQaHpxdmRoK1JWClZGNGJCaW9CbTVpeTlZTW1hWVhsMm0wTGZzeTZuUTRRdFFzdEdNVWozcGJtdlFmazJBNnljeGRFeFpkZFZsdmwKMm82MjBsMllxcHFDZEtCRThCay90elFIVTlKcU56cHpoOUJUTXdJREFRQUJBb0lCQVFDZklHbXowOHhRVmorNwpLZnZJUXQwQ0YzR2MxNld6eDhVNml4MHg4Mm15d1kxUUNlL3BzWE9LZlRxT1h1SENyUlp5TnUvZ2IvUUQ4bUFOCmxOMjRZTWl0TWRJODg5TEZoTkp3QU5OODJDeTczckM5bzVvUDlkazAvYzRIbjAzSkVYNzZ5QjgzQm9rR1FvYksKMjhMNk0rdHUzUmFqNjd6Vmc2d2szaEhrU0pXSzBwV1YrSjdrUkRWYmhDYUZhNk5nMUZNRWxhTlozVDhhUUtyQgpDUDNDeEFTdjYxWTk5TEI4KzNXWVFIK3NYaTVGM01pYVNBZ1BkQUk3WEh1dXFET1lvMU5PL0JoSGt1aVg2QnRtCnorNTZud2pZMy8yUytSRmNBc3JMTnIwMDJZZi9oY0IraVlDNzVWYmcydVd6WTY3TWdOTGQ5VW9RU3BDRkYrVm4KM0cyUnhybnhBb0dCQU40U3M0ZVlPU2huMVpQQjdhTUZsY0k2RHR2S2ErTGZTTXFyY2pOZjJlSEpZNnhubmxKdgpGenpGL2RiVWVTbWxSekR0WkdlcXZXaHFISy9iTjIyeWJhOU1WMDlRQ0JFTk5jNmtWajJTVHpUWkJVbEx4QzYrCk93Z0wyZHhKendWelU0VC84ajdHalRUN05BZVpFS2FvRHFyRG5BYWkyaW5oZU1JVWZHRXFGKzJyQW9HQkFOMVAKK0tZL0lsS3RWRzRKSklQNzBjUis3RmpyeXJpY05iWCtQVzUvOXFHaWxnY2grZ3l4b25BWlBpd2NpeDN3QVpGdwpaZC96ZFB2aTBkWEppc1BSZjRMazg5b2pCUmpiRmRmc2l5UmJYbyt3TFU4NUhRU2NGMnN5aUFPaTVBRHdVU0FkCm45YWFweUNweEFkREtERHdObit3ZFhtaTZ0OHRpSFRkK3RoVDhkaVpBb0dCQUt6Wis1bG9OOTBtYlF4VVh5YUwKMjFSUm9tMGJjcndsTmVCaWNFSmlzaEhYa2xpSVVxZ3hSZklNM2hhUVRUcklKZENFaHFsV01aV0xPb2I2NTNyZgo3aFlMSXM1ZUtka3o0aFRVdnpldm9TMHVXcm9CV2xOVHlGanIrSWhKZnZUc0hpOGdsU3FkbXgySkJhZUFVWUNXCndNdlQ4NmNLclNyNkQrZG8wS05FZzFsL0FvR0FlMkFVdHVFbFNqLzBmRzgrV3hHc1RFV1JqclRNUzRSUjhRWXQKeXdjdFA4aDZxTGxKUTRCWGxQU05rMXZLTmtOUkxIb2pZT2pCQTViYjhibXNVU1BlV09NNENoaFJ4QnlHbmR2eAphYkJDRkFwY0IvbEg4d1R0alVZYlN5T294ZGt5OEp0ek90ajJhS0FiZHd6NlArWDZDODhjZmxYVFo5MWpYL3RMCjF3TmRKS2tDZ1lCbyt0UzB5TzJ2SWFmK2UwSkN5TGhzVDQ5cTN3Zis2QWVqWGx2WDJ1VnRYejN5QTZnbXo5aCsKcDNlK2JMRUxwb3B0WFhNdUFRR0xhUkcrYlNNcjR5dERYbE5ZSndUeThXczNKY3dlSTdqZVp2b0ZpbmNvVlVIMwphdmxoTUVCRGYxSjltSDB5cDBwWUNaS2ROdHNvZEZtQktzVEtQMjJhTmtsVVhCS3gyZzR6cFE9PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo=

    ## The secret with a TLS certificate and key for the default HTTPS server.
    ## The value must follow the following format: `<namespace>/<name>`.
    ## Used as an alternative to specifying a certificate and key using `controller.defaultTLS.cert` and `controller.defaultTLS.key` parameters.
    ## Format: <namespace>/<secret_name>
    secret:

  wildcardTLS:
    ## The base64-encoded TLS certificate for every Ingress/VirtualServer host that has TLS enabled but no secret specified.
    ## If the parameter is not set, for such Ingress/VirtualServer hosts NGINX will break any attempt to establish a TLS connection.
    cert: ""

    ## The base64-encoded TLS key for every Ingress/VirtualServer host that has TLS enabled but no secret specified.
    ## If the parameter is not set, for such Ingress/VirtualServer hosts NGINX will break any attempt to establish a TLS connection.
    key: ""

    ## The secret with a TLS certificate and key for every Ingress/VirtualServer host that has TLS enabled but no secret specified.
    ## The value must follow the following format: `<namespace>/<name>`.
    ## Used as an alternative to specifying a certificate and key using `controller.wildcardTLS.cert` and `controller.wildcardTLS.key` parameters.
    ## Format: <namespace>/<secret_name>
    secret:

  ## The node selector for pod assignment for the Ingress Controller pods.
  nodeSelector: {}

  ## The termination grace period of the Ingress Controller pod.
  terminationGracePeriodSeconds: 30

  ## The resources of the Ingress Controller pods.
  resources: {}
    # limits:
    #   cpu: 100m
    #   memory: 64Mi
    # requests:
    #   cpu: 100m
    #   memory: 64Mi

  ## The tolerations of the Ingress Controller pods.
  tolerations: []

  ## The affinity of the Ingress Controller pods.
  affinity: {}

  ## The volumes of the Ingress Controller pods.
  volumes: []
  # - name: extra-conf
  #   configMap:
  #     name: extra-conf

  ## The volumeMounts of the Ingress Controller pods.
  volumeMounts: []
  # - name: extra-conf
  #   mountPath: /etc/nginx/conf.d/extra.conf
  #   subPath: extra.conf

  ## InitContainers for the Ingress Controller pods.
  initContainers: []
  # - name: init-container
  #   image: busybox:1.34
  #   command: ['sh', '-c', 'echo this is initial setup!']

  ## Extra containers for the Ingress Controller pods.
  extraContainers: []
  # - name: container
  #   image: busybox:1.34
  #   command: ['sh', '-c', 'echo this is a sidecar!']

  ## The number of replicas of the Ingress Controller deployment.
  replicaCount: 1

  ## A class of the Ingress Controller.

  ## IngressClass resource with the name equal to the class must be deployed. Otherwise,
  ## the Ingress Controller will fail to start.
  ## The Ingress Controller only processes resources that belong to its class - i.e. have the "ingressClassName" field resource equal to the class.

  ## The Ingress Controller processes all the resources that do not have the "ingressClassName" field for all versions of kubernetes.
  ingressClass: nginx

  ## New Ingresses without an ingressClassName field specified will be assigned the class specified in `controller.ingressClass`.
  setAsDefaultIngress: true

  ## Namespace to watch for Ingress resources. By default the Ingress Controller watches all namespaces.
  watchNamespace: ""

  ## Enable the custom resources.
  enableCustomResources: true

  ## Enable preview policies. This parameter is deprecated. To enable OIDC Policies please use controller.enableOIDC instead.
  enablePreviewPolicies: false

  ## Enable OIDC policies.
  enableOIDC: false

  ## Enable TLS Passthrough on port 443. Requires controller.enableCustomResources.
  enableTLSPassthrough: false

  ## Enable cert manager for Virtual Server resources. Requires controller.enableCustomResources.
  enableCertManager: false

  globalConfiguration:
    ## Creates the GlobalConfiguration custom resource. Requires controller.enableCustomResources.
    create: false

    ## The spec of the GlobalConfiguration for defining the global configuration parameters of the Ingress Controller.
    spec: {}
      # listeners:
      # - name: dns-udp
      #   port: 5353
      #   protocol: UDP
      # - name: dns-tcp
      #   port: 5353
      #   protocol: TCP

  ## Enable custom NGINX configuration snippets in Ingress, VirtualServer, VirtualServerRoute and TransportServer resources.
  enableSnippets: false

  ## Add a location based on the value of health-status-uri to the default server. The location responds with the 200 status code for any request.
  ## Useful for external health-checking of the Ingress Controller.
  healthStatus: false

  ## Sets the URI of health status location in the default server. Requires controller.healthStatus.
  healthStatusURI: "/nginx-health"

  nginxStatus:
    ## Enable the NGINX stub_status, or the NGINX Plus API.
    enable: true

    ## Set the port where the NGINX stub_status or the NGINX Plus API is exposed.
    port: 8080

    ## Add IPv4 IP/CIDR blocks to the allow list for NGINX stub_status or the NGINX Plus API. Separate multiple IP/CIDR by commas.
    allowCidrs: "127.0.0.1"

  service:
    ## Creates a service to expose the Ingress Controller pods.
    create: true

    ## The type of service to create for the Ingress Controller.
    type: LoadBalancer

    ## The externalTrafficPolicy of the service. The value Local preserves the client source IP.
    externalTrafficPolicy: Local

    ## The annotations of the Ingress Controller service.
    annotations: {}

    ## The extra labels of the service.
    extraLabels: {}

    ## The static IP address for the load balancer. Requires controller.service.type set to LoadBalancer. The cloud provider must support this feature.
    loadBalancerIP: ""

    ## The list of external IPs for the Ingress Controller service.
    externalIPs: []

    ## The IP ranges (CIDR) that are allowed to access the load balancer. Requires controller.service.type set to LoadBalancer. The cloud provider must support this feature.
    loadBalancerSourceRanges: []

    ## The name of the service
    ## Autogenerated if not set or set to "".
    # name: nginx-ingress

    httpPort:
      ## Enables the HTTP port for the Ingress Controller service.
      enable: true

      ## The HTTP port of the Ingress Controller service.
      port: 80

      ## The custom NodePort for the HTTP port. Requires controller.service.type set to NodePort.
      nodePort: ""

      ## The HTTP port on the POD where the Ingress Controller service is running.
      targetPort: 80

    httpsPort:
      ## Enables the HTTPS port for the Ingress Controller service.
      enable: true

      ## The HTTPS port of the Ingress Controller service.
      port: 443

      ## The custom NodePort for the HTTPS port. Requires controller.service.type set to NodePort.
      nodePort: ""

      ## The HTTPS port on the POD where the Ingress Controller service is running.
      targetPort: 443

    ## A list of custom ports to expose through the Ingress Controller service. Follows the conventional Kubernetes yaml syntax for service ports.
    customPorts: []

  serviceAccount:
    ## The name of the service account of the Ingress Controller pods. Used for RBAC.
    ## Autogenerated if not set or set to "".
    # name: nginx-ingress

    ## The name of the secret containing docker registry credentials.
    ## Secret must exist in the same namespace as the helm release.
    imagePullSecretName: ""

  reportIngressStatus:
    ## Updates the address field in the status of Ingress resources with an external address of the Ingress Controller.
    ## You must also specify the source of the external address either through an external service via controller.reportIngressStatus.externalService,
    ## controller.reportIngressStatus.ingressLink or the external-status-address entry in the ConfigMap via controller.config.entries.
    ## Note: controller.config.entries.external-status-address takes precedence over the others.
    enable: true

    ## Specifies the name of the service with the type LoadBalancer through which the Ingress Controller is exposed externally.
    ## The external address of the service is used when reporting the status of Ingress, VirtualServer and VirtualServerRoute resources.
    ## controller.reportIngressStatus.enable must be set to true.
    ## The default is autogenerated and matches the created service (see controller.service.create).
    # externalService: nginx-ingress

    ## Specifies the name of the IngressLink resource, which exposes the Ingress Controller pods via a BIG-IP system.
    ## The IP of the BIG-IP system is used when reporting the status of Ingress, VirtualServer and VirtualServerRoute resources.
    ## controller.reportIngressStatus.enable must be set to true.
    ingressLink: ""

    ## Enable Leader election to avoid multiple replicas of the controller reporting the status of Ingress resources. controller.reportIngressStatus.enable must be set to true.
    enableLeaderElection: true

    ## Specifies the name of the ConfigMap, within the same namespace as the controller, used as the lock for leader election. controller.reportIngressStatus.enableLeaderElection must be set to true.
    ## Autogenerated if not set or set to "".
    # leaderElectionLockName: "nginx-ingress-leader-election"

    ## The annotations of the leader election configmap.
    annotations: {}

  pod:
    ## The annotations of the Ingress Controller pod.
    annotations: {}

    ## The additional extra labels of the Ingress Controller pod.
    extraLabels: {}

  ## The PriorityClass of the ingress controller pods.
  priorityClassName:

  readyStatus:
    ## Enables readiness endpoint "/nginx-ready". The endpoint returns a success code when NGINX has loaded all the config after startup.
    enable: true

    ## Set the port where the readiness endpoint is exposed.
    port: 8081

  ## Enable collection of latency metrics for upstreams. Requires prometheus.create.
  enableLatencyMetrics: false

rbac:
  ## Configures RBAC.
  create: true

prometheus:
  ## Expose NGINX or NGINX Plus metrics in the Prometheus format.
  create: true

  ## Configures the port to scrape the metrics.
  port: 9113

  ## Specifies the namespace/name of a Kubernetes TLS Secret which will be used to protect the Prometheus endpoint.
  secret: ""

  ## Configures the HTTP scheme used.
  scheme: http

nginxServiceMesh:
  ## Enables integration with NGINX Service Mesh.
  ## Requires controller.nginxplus
  enable: false

  ## Enables NGINX Service Mesh workload to route egress traffic through the Ingress Controller.
  ## Requires nginxServiceMesh.enable
  enableEgress: false


@ErikEngerd (Author)

The difference between nginx and my example deployment is that nginx has the CAP_NET_BIND_SERVICE capability and uses host ports. Also interesting is that only the first readiness probe fails, but not subsequent ones.

@ErikEngerd (Author)

For completeness, here is the YAML of the nginx pods:

apiVersion: v1
items:
- apiVersion: v1
  kind: Pod
  metadata:
    annotations:
      cni.projectcalico.org/containerID: 589ce9e7e7af1b4e7575c30ee054c04c156ce76f336387a066f2a1ab4f0080c2
      cni.projectcalico.org/podIP: 10.200.208.39/32
      cni.projectcalico.org/podIPs: 10.200.208.39/32
      prometheus.io/port: "9113"
      prometheus.io/scheme: http
      prometheus.io/scrape: "true"
    creationTimestamp: "2022-07-31T19:36:46Z"
    generateName: nginx-nginx-ingress-
    labels:
      app: nginx-nginx-ingress
      controller-revision-hash: c8f5dbd5c
      pod-template-generation: "3"
    name: nginx-nginx-ingress-jxsq9
    namespace: nginx
    ownerReferences:
    - apiVersion: apps/v1
      blockOwnerDeletion: true
      controller: true
      kind: DaemonSet
      name: nginx-nginx-ingress
      uid: 1cf1afa7-7e81-41fa-9d24-28ccd2f1fb9f
    resourceVersion: "12371415"
    uid: 396e9cb9-d320-4dc1-8286-f1fd31e58578
  spec:
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchFields:
            - key: metadata.name
              operator: In
              values:
              - cobra
    containers:
    - args:
      - -nginx-plus=false
      - -nginx-reload-timeout=60000
      - -enable-app-protect=false
      - -enable-app-protect-dos=false
      - -nginx-configmaps=$(POD_NAMESPACE)/nginx-nginx-ingress
      - -default-server-tls-secret=$(POD_NAMESPACE)/nginx-nginx-ingress-default-server-tls
      - -ingress-class=nginx
      - -health-status=false
      - -health-status-uri=/nginx-health
      - -nginx-debug=false
      - -v=1
      - -nginx-status=true
      - -nginx-status-port=8080
      - -nginx-status-allow-cidrs=127.0.0.1
      - -report-ingress-status
      - -external-service=nginx-nginx-ingress
      - -enable-leader-election=true
      - -leader-election-lock-name=nginx-nginx-ingress-leader-election
      - -enable-prometheus-metrics=true
      - -prometheus-metrics-listen-port=9113
      - -prometheus-tls-secret=
      - -enable-custom-resources=true
      - -enable-snippets=false
      - -enable-tls-passthrough=false
      - -enable-preview-policies=false
      - -enable-cert-manager=false
      - -enable-oidc=false
      - -ready-status=true
      - -ready-status-port=8081
      - -enable-latency-metrics=false
      env:
      - name: POD_NAMESPACE
        valueFrom:
          fieldRef:
            apiVersion: v1
            fieldPath: metadata.namespace
      - name: POD_NAME
        valueFrom:
          fieldRef:
            apiVersion: v1
            fieldPath: metadata.name
      image: nginx/nginx-ingress:2.2.2
      imagePullPolicy: IfNotPresent
      name: nginx-nginx-ingress
      ports:
      - containerPort: 80
        hostPort: 80
        name: http
        protocol: TCP
      - containerPort: 443
        hostPort: 443
        name: https
        protocol: TCP
      - containerPort: 9113
        name: prometheus
        protocol: TCP
      - containerPort: 8081
        name: readiness-port
        protocol: TCP
      readinessProbe:
        failureThreshold: 3
        httpGet:
          path: /nginx-ready
          port: readiness-port
          scheme: HTTP
        periodSeconds: 1
        successThreshold: 1
        timeoutSeconds: 1
      resources: {}
      securityContext:
        allowPrivilegeEscalation: true
        capabilities:
          add:
          - NET_BIND_SERVICE
          drop:
          - ALL
        runAsUser: 101
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      volumeMounts:
      - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        name: kube-api-access-g4kx2
        readOnly: true
    dnsPolicy: ClusterFirst
    enableServiceLinks: true
    nodeName: cobra
    preemptionPolicy: PreemptLowerPriority
    priority: 0
    restartPolicy: Always
    schedulerName: default-scheduler
    securityContext: {}
    serviceAccount: nginx-nginx-ingress
    serviceAccountName: nginx-nginx-ingress
    terminationGracePeriodSeconds: 30
    tolerations:
    - effect: NoExecute
      key: node.kubernetes.io/not-ready
      operator: Exists
    - effect: NoExecute
      key: node.kubernetes.io/unreachable
      operator: Exists
    - effect: NoSchedule
      key: node.kubernetes.io/disk-pressure
      operator: Exists
    - effect: NoSchedule
      key: node.kubernetes.io/memory-pressure
      operator: Exists
    - effect: NoSchedule
      key: node.kubernetes.io/pid-pressure
      operator: Exists
    - effect: NoSchedule
      key: node.kubernetes.io/unschedulable
      operator: Exists
    volumes:
    - name: kube-api-access-g4kx2
      projected:
        defaultMode: 420
        sources:
        - serviceAccountToken:
            expirationSeconds: 3607
            path: token
        - configMap:
            items:
            - key: ca.crt
              path: ca.crt
            name: kube-root-ca.crt
        - downwardAPI:
            items:
            - fieldRef:
                apiVersion: v1
                fieldPath: metadata.namespace
              path: namespace
  status:
    conditions:
    - lastProbeTime: null
      lastTransitionTime: "2022-07-31T19:36:46Z"
      status: "True"
      type: Initialized
    - lastProbeTime: null
      lastTransitionTime: "2022-07-31T19:36:48Z"
      status: "True"
      type: Ready
    - lastProbeTime: null
      lastTransitionTime: "2022-07-31T19:36:48Z"
      status: "True"
      type: ContainersReady
    - lastProbeTime: null
      lastTransitionTime: "2022-07-31T19:36:46Z"
      status: "True"
      type: PodScheduled
    containerStatuses:
    - containerID: docker://2bdddc89560d07c8dc2937d68cb21f81c9055d3a3ffa34872eff248d07983854
      image: nginx/nginx-ingress:2.2.2
      imageID: docker-pullable://nginx/nginx-ingress@sha256:b6dec42d12651b12ced39434336b1022029c8534488aa2084b46f138c9f700ba
      lastState: {}
      name: nginx-nginx-ingress
      ready: true
      restartCount: 0
      started: true
      state:
        running:
          startedAt: "2022-07-31T19:36:47Z"
    hostIP: 192.168.178.50
    phase: Running
    podIP: 10.200.208.39
    podIPs:
    - ip: 10.200.208.39
    qosClass: BestEffort
    startTime: "2022-07-31T19:36:46Z"
- apiVersion: v1
  kind: Pod
  metadata:
    annotations:
      cni.projectcalico.org/containerID: 5ab3d7dd14fb1280d74d5311012ea3ec1fd0831d70867eaa3c34cc03fdd6b095
      cni.projectcalico.org/podIP: 10.200.24.45/32
      cni.projectcalico.org/podIPs: 10.200.24.45/32
      prometheus.io/port: "9113"
      prometheus.io/scheme: http
      prometheus.io/scrape: "true"
    creationTimestamp: "2022-07-31T13:33:27Z"
    generateName: nginx-nginx-ingress-
    labels:
      app: nginx-nginx-ingress
      controller-revision-hash: c8f5dbd5c
      pod-template-generation: "3"
    name: nginx-nginx-ingress-zxgrd
    namespace: nginx
    ownerReferences:
    - apiVersion: apps/v1
      blockOwnerDeletion: true
      controller: true
      kind: DaemonSet
      name: nginx-nginx-ingress
      uid: 1cf1afa7-7e81-41fa-9d24-28ccd2f1fb9f
    resourceVersion: "12312486"
    uid: 1e4e79b5-1454-4b83-9986-d707e859a069
  spec:
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchFields:
            - key: metadata.name
              operator: In
              values:
              - weasel
    containers:
    - args:
      - -nginx-plus=false
      - -nginx-reload-timeout=60000
      - -enable-app-protect=false
      - -enable-app-protect-dos=false
      - -nginx-configmaps=$(POD_NAMESPACE)/nginx-nginx-ingress
      - -default-server-tls-secret=$(POD_NAMESPACE)/nginx-nginx-ingress-default-server-tls
      - -ingress-class=nginx
      - -health-status=false
      - -health-status-uri=/nginx-health
      - -nginx-debug=false
      - -v=1
      - -nginx-status=true
      - -nginx-status-port=8080
      - -nginx-status-allow-cidrs=127.0.0.1
      - -report-ingress-status
      - -external-service=nginx-nginx-ingress
      - -enable-leader-election=true
      - -leader-election-lock-name=nginx-nginx-ingress-leader-election
      - -enable-prometheus-metrics=true
      - -prometheus-metrics-listen-port=9113
      - -prometheus-tls-secret=
      - -enable-custom-resources=true
      - -enable-snippets=false
      - -enable-tls-passthrough=false
      - -enable-preview-policies=false
      - -enable-cert-manager=false
      - -enable-oidc=false
      - -ready-status=true
      - -ready-status-port=8081
      - -enable-latency-metrics=false
      env:
      - name: POD_NAMESPACE
        valueFrom:
          fieldRef:
            apiVersion: v1
            fieldPath: metadata.namespace
      - name: POD_NAME
        valueFrom:
          fieldRef:
            apiVersion: v1
            fieldPath: metadata.name
      image: nginx/nginx-ingress:2.2.2
      imagePullPolicy: IfNotPresent
      name: nginx-nginx-ingress
      ports:
      - containerPort: 80
        hostPort: 80
        name: http
        protocol: TCP
      - containerPort: 443
        hostPort: 443
        name: https
        protocol: TCP
      - containerPort: 9113
        name: prometheus
        protocol: TCP
      - containerPort: 8081
        name: readiness-port
        protocol: TCP
      readinessProbe:
        failureThreshold: 3
        httpGet:
          path: /nginx-ready
          port: readiness-port
          scheme: HTTP
        periodSeconds: 1
        successThreshold: 1
        timeoutSeconds: 1
      resources: {}
      securityContext:
        allowPrivilegeEscalation: true
        capabilities:
          add:
          - NET_BIND_SERVICE
          drop:
          - ALL
        runAsUser: 101
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      volumeMounts:
      - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        name: kube-api-access-f7knt
        readOnly: true
    dnsPolicy: ClusterFirst
    enableServiceLinks: true
    nodeName: weasel
    preemptionPolicy: PreemptLowerPriority
    priority: 0
    restartPolicy: Always
    schedulerName: default-scheduler
    securityContext: {}
    serviceAccount: nginx-nginx-ingress
    serviceAccountName: nginx-nginx-ingress
    terminationGracePeriodSeconds: 30
    tolerations:
    - effect: NoExecute
      key: node.kubernetes.io/not-ready
      operator: Exists
    - effect: NoExecute
      key: node.kubernetes.io/unreachable
      operator: Exists
    - effect: NoSchedule
      key: node.kubernetes.io/disk-pressure
      operator: Exists
    - effect: NoSchedule
      key: node.kubernetes.io/memory-pressure
      operator: Exists
    - effect: NoSchedule
      key: node.kubernetes.io/pid-pressure
      operator: Exists
    - effect: NoSchedule
      key: node.kubernetes.io/unschedulable
      operator: Exists
    volumes:
    - name: kube-api-access-f7knt
      projected:
        defaultMode: 420
        sources:
        - serviceAccountToken:
            expirationSeconds: 3607
            path: token
        - configMap:
            items:
            - key: ca.crt
              path: ca.crt
            name: kube-root-ca.crt
        - downwardAPI:
            items:
            - fieldRef:
                apiVersion: v1
                fieldPath: metadata.namespace
              path: namespace
  status:
    conditions:
    - lastProbeTime: null
      lastTransitionTime: "2022-07-31T13:33:27Z"
      status: "True"
      type: Initialized
    - lastProbeTime: null
      lastTransitionTime: "2022-07-31T13:33:33Z"
      status: "True"
      type: Ready
    - lastProbeTime: null
      lastTransitionTime: "2022-07-31T13:33:33Z"
      status: "True"
      type: ContainersReady
    - lastProbeTime: null
      lastTransitionTime: "2022-07-31T13:33:27Z"
      status: "True"
      type: PodScheduled
    containerStatuses:
    - containerID: docker://a9da6fe6f36af5a0b56a20967cb41f0ee10df9d2b40d290b9827d569c813dff0
      image: nginx/nginx-ingress:2.2.2
      imageID: docker-pullable://nginx/nginx-ingress@sha256:b6dec42d12651b12ced39434336b1022029c8534488aa2084b46f138c9f700ba
      lastState: {}
      name: nginx-nginx-ingress
      ready: true
      restartCount: 0
      started: true
      state:
        running:
          startedAt: "2022-07-31T13:33:32Z"
    hostIP: 192.168.178.91
    phase: Running
    podIP: 10.200.24.45
    podIPs:
    - ip: 10.200.24.45
    qosClass: BestEffort
    startTime: "2022-07-31T13:33:27Z"
kind: List
metadata:
  resourceVersion: ""

@caseydavenport (Member)

@ErikEngerd thanks for raising this. I'm looking at it now. I definitely would not expect egress policy to impact a pod's ability to access its own host, perhaps unless you have host endpoints created.

I think you missed posting your Calico deployment values.yaml / manifest - do you know if FELIX_DEFAULTENDPOINTTOHOSTACTION is set on the calico/node DaemonSet? Normally we set that to "ACCEPT" by default, which ensures this packet path is functioning.
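
For example, something like this should show the value (the DaemonSet lives in calico-system for operator-based installs, kube-system for manifest-based installs):

# Show the Felix env var and its value on the calico-node DaemonSet
kubectl get daemonset calico-node -n calico-system -o yaml \
  | grep -A1 FELIX_DEFAULTENDPOINTTOHOSTACTION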

@caseydavenport (Member)

caseydavenport commented Aug 3, 2022

Get "http://10.200.208.50:8081/nginx-ready": dial tcp 10.200.208.50:8081: connect: connection refused

Calico network policy doesn't reject connections, so it's unlikely this is a result of Calico's network policy dropping the traffic. We blackhole, so the symptom would be a timeout rather than a "connection refused" error.

This suggests to me that the nginx pod is actually failing to serve its readiness endpoint for some reason. Do you know if the nginx pod requires egress access in some manner in order to successfully start its readiness endpoint?
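
A couple of checks that might help narrow it down (a sketch only; the pod name is a placeholder, and curl may not be present in the nginx-ingress image):

# Does anything answer on the readiness port from inside the pod itself?
kubectl exec -n nginx <nginx-ingress-pod> -- curl -sS http://127.0.0.1:8081/nginx-ready

# The controller logs usually show whether it is stuck waiting on the API server at startup
kubectl logs -n nginx <nginx-ingress-pod>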

@ErikEngerd (Author)

I am using the following calico version

> kubectl exec -it -n kube-system calicoctl -- calicoctl version
Client Version:    v3.23.1
Git commit:        967e24543
Cluster Version:   v3.23.1
Cluster Type:      typha,kdd,k8s,operator,bgp,kubeadm

I installed calico using the tigera operator:

curl -v https://docs.projectcalico.org/archive/v3.19/manifests/tigera-operator.yaml > tigera-operator.yaml
kubectl apply -f tigera-operator.yaml

Then I applied the following custom resource

# This section includes base Calico installation configuration.
# For more information, see: https://projectcalico.docs.tigera.io/v3.23/reference/installation/api#operator.tigera.io/v1.Installation
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Configures Calico networking.
  calicoNetwork:
    nodeAddressAutodetectionV4:
      cidrs:
        - "192.168.178.0/24"
    # Note: The ipPools section cannot be modified post-install.
    ipPools:
    - blockSize: 26
      cidr: 10.200.0.0/16
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()

@ErikEngerd (Author)

Looking at the calico daemonset, it appears that the setting is already as you describe:

> k describe daemonset -n calico-system calico-node
Name:           calico-node                                                                              
Selector:       k8s-app=calico-node                                                                                
Node-Selector:  kubernetes.io/os=linux
Labels:         <none>
Annotations:    deprecated.daemonset.template.generation: 1
Desired Number of Nodes Scheduled: 3
Current Number of Nodes Scheduled: 3
Number of Nodes Scheduled with Up-to-date Pods: 3
Number of Nodes Scheduled with Available Pods: 3
Number of Nodes Misscheduled: 0
Pods Status:  3 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           k8s-app=calico-node
  Annotations:      hash.operator.tigera.io/cni-config: 3956e91e3004cfa053f9095050ed56f1cf12b904
                    hash.operator.tigera.io/tigera-ca-private: 1d32553cc9d339cd6761fda4c38aaadf18829ae5
  Service Account:  calico-node
  Init Containers:
   flexvol-driver:
    Image:        docker.io/calico/pod2daemon-flexvol:v3.23.1
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:
      /host/driver from flexvol-driver-host (rw)
   install-cni:
    Image:      docker.io/calico/cni:v3.23.1
    Port:       <none>
    Host Port:  <none>
    Command:
      /opt/cni/bin/install
    Environment:
      CNI_CONF_NAME:            10-calico.conflist
      SLEEP:                    false
      CNI_NET_DIR:              /etc/cni/net.d
      CNI_NETWORK_CONFIG:       <set to the key 'config' of config map 'cni-config'>  Optional: false
      KUBERNETES_SERVICE_HOST:  10.96.0.1
      KUBERNETES_SERVICE_PORT:  443
    Mounts:
      /host/etc/cni/net.d from cni-net-dir (rw)
      /host/opt/cni/bin from cni-bin-dir (rw)
  Containers:
   calico-node:
    Image:      docker.io/calico/node:v3.23.1
    Port:       <none>
    Host Port:  <none>
    Liveness:   http-get http://localhost:9099/liveness delay=0s timeout=10s period=10s #success=1 #failure=3
    Readiness:  exec [/bin/calico-node -bird-ready -felix-ready] delay=0s timeout=5s period=10s #success=1 #failure=3
    Environment:
      DATASTORE_TYPE:                     kubernetes
      WAIT_FOR_DATASTORE:                 true
      CLUSTER_TYPE:                       k8s,operator,bgp
      CALICO_DISABLE_FILE_LOGGING:        false
      FELIX_DEFAULTENDPOINTTOHOSTACTION:  ACCEPT
      FELIX_HEALTHENABLED:                true
      FELIX_HEALTHPORT:                   9099
      NODENAME:                            (v1:spec.nodeName)
      NAMESPACE:                           (v1:metadata.namespace)
      FELIX_TYPHAK8SNAMESPACE:            calico-system
      FELIX_TYPHAK8SSERVICENAME:          calico-typha
      FELIX_TYPHACAFILE:                  /etc/pki/tls/certs/tigera-ca-bundle.crt
      FELIX_TYPHACERTFILE:                /node-certs/tls.crt
      FELIX_TYPHAKEYFILE:                 /node-certs/tls.key
      FELIX_TYPHACN:                      typha-server
      CALICO_MANAGE_CNI:                  true
      CALICO_IPV4POOL_CIDR:               10.200.0.0/16
      CALICO_IPV4POOL_VXLAN:              CrossSubnet
      CALICO_IPV4POOL_BLOCK_SIZE:         26
      CALICO_IPV4POOL_NODE_SELECTOR:      all()
      CALICO_NETWORKING_BACKEND:          bird
      IP:                                 autodetect
      IP_AUTODETECTION_METHOD:            cidr=192.168.178.0/24
      IP6:                                none
      FELIX_IPV6SUPPORT:                  false
      KUBERNETES_SERVICE_HOST:            10.96.0.1
      KUBERNETES_SERVICE_PORT:            443
    Mounts:
      /etc/pki/tls/certs/ from tigera-ca-bundle (ro)
      /host/etc/cni/net.d from cni-net-dir (rw)
      /lib/modules from lib-modules (ro)
      /node-certs from node-certs (ro)
      /run/xtables.lock from xtables-lock (rw)
      /var/lib/calico from var-lib-calico (rw)
      /var/log/calico/cni from cni-log-dir (rw)
      /var/run/calico from var-run-calico (rw)
      /var/run/nodeagent from policysync (rw)
  Volumes:
   lib-modules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:  
   xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  FileOrCreate
   policysync:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/nodeagent
    HostPathType:  DirectoryOrCreate
   tigera-ca-bundle:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      tigera-ca-bundle
    Optional:  false
   node-certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  node-certs
    Optional:    false
   var-run-calico:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/calico
    HostPathType:  
   var-lib-calico:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/calico
    HostPathType:  
   cni-bin-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /opt/cni/bin
    HostPathType:  
   cni-net-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/cni/net.d
    HostPathType:  
   cni-log-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log/calico/cni
    HostPathType:  
   flexvol-driver-host:
    Type:               HostPath (bare host directory volume)
    Path:               /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds
    HostPathType:       DirectoryOrCreate
  Priority Class Name:  system-node-critical
Events:
  Type    Reason            Age   From                  Message
  ----    ------            ----  ----                  -------
  Normal  SuccessfulCreate  11m   daemonset-controller  Created pod: calico-node-8h5hj
  Normal  SuccessfulCreate  11m   daemonset-controller  Created pod: calico-node-4bwgx
  Normal  SuccessfulCreate  11m   daemonset-controller  Created pod: calico-node-kkdv2

@ErikEngerd (Author)

If I apply the following network policy, the readiness probe still fails. This egress rule allows traffic to 192.168.178.0/24, which is the network of the nodes.

---
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: default-allow-nothing
  namespace: nginx
spec:
  podSelector: {}
  policyTypes:
   # - Ingress
    - Egress
  egress:
    - to:
      - ipBlock:
          cidr: 192.168.178.0/24
      

@ErikEngerd (Author)

As far as the readiness probe is concerned, it is a standard HTTP check based on the container port 8081. Excerpt from one of the nginx pods above:

      ports:
      - containerPort: 80
        hostPort: 80
        name: http
        protocol: TCP
      - containerPort: 443
        hostPort: 443
        name: https
        protocol: TCP
      - containerPort: 9113
        name: prometheus
        protocol: TCP
      - containerPort: 8081
        name: readiness-port
        protocol: TCP
      readinessProbe:
        failureThreshold: 3
        httpGet:
          path: /nginx-ready
          port: readiness-port
          scheme: HTTP
        periodSeconds: 1
        successThreshold: 1
        timeoutSeconds: 1

It uses containerPort 8081, which is not a host port like 80 and 443. Or perhaps the implementation of the 8081 health check relies on host port 80 or 443.
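
One way to check that guess would be to mimic the kubelet probe directly from the node that runs the pod, using the pod IP from the output above (a sketch; substitute your own pod IP):

# Run on the node hosting the pod; 10.200.208.39 is one of the pod IPs listed earlier
curl -v http://10.200.208.39:8081/nginx-ready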

@ErikEngerd (Author)

I have done some more experimentation and I think I have figured it out now.
The following issue provided a hint: kubernetes/ingress-nginx#5058
Apparently, when nginx starts up it tries to connect to the API server. Therefore, I had to open egress to port 6443 on the API server, which listens on an IP address in 192.168.178.0/24.

The following network policy then works:
---
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: default-allow-nothing
  namespace: nginx
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  egress:
    - to:
      - ipBlock:
          cidr: 192.168.178.0/24
      ports:
        - port: 6443

@ErikEngerd (Author)

I think the issue can be closed. The only annoying thing is that network policies don't really allow the use case of allowing egress to the API server without hardcoding IPs in the egress rule, since the API server is not listening on a cluster IP. It would be nice if this could be addressed in the standard somehow.
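
For what it's worth, the addresses that end up hardcoded in such a rule can at least be read from the kubernetes Endpoints object rather than guessed:

# Lists the API server address(es) and port that the egress rule needs to allow
kubectl get endpoints kubernetes -n default -o wide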

@caseydavenport (Member)

The only annoying thing is that network policies don't really allow the use case of allowing egress to the API server without hardcoding IPs in the egress rule, since the api server is not listening on a cluster IP

@ErikEngerd have you looked at using Services in egress rules?

You can use a Calico policy as described here: https://projectcalico.docs.tigera.io/security/service-policy

e.g.

  egress:
    - action: Allow
      destination:
        services:
          name: kubernetes
          namespace: default
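
If it helps, a minimal sketch of what the full namespaced Calico policy might look like (the policy name here is illustrative):

apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: allow-egress-to-apiserver
  namespace: nginx
spec:
  selector: all()
  types:
    - Egress
  egress:
    # Allow egress to the API server by matching the kubernetes Service
    # instead of hardcoding node IPs.
    - action: Allow
      destination:
        services:
          name: kubernetes
          namespace: default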

@ErikEngerd (Author)

ErikEngerd commented Aug 3, 2022

I am aware that Calico provides much more advanced network policies than the standard, but I am a bit reluctant to become dependent on Calico-specific functionality. Also, some services such as GKE Autopilot do not allow you to choose a specific network provider (even though they appear to use Calico under the hood).
