
ServiceMonitor not deployed with Operator v2 #13321

Closed
JakeSCahill opened this issue Sep 7, 2023 · 1 comment · Fixed by #13590
Assignees
Labels
area/k8s kind/bug Something isn't working P0 Needs done immediately!

Comments


JakeSCahill commented Sep 7, 2023

Version & Environment

redpanda version: v23.2.8
operator version: v23.2.7

Tested on kind

What went wrong?

Deploying the following Redpanda resource does not result in a ServiceMonitor being deployed:

apiVersion: cluster.redpanda.com/v1alpha1
kind: Redpanda
metadata:
  name: redpanda
spec:
  chartRef: {}
  clusterSpec:
    monitoring:
      enabled: true
      scrapeInterval: 30s

The same config works when using plain Helm without the Operator.

What should have happened instead?

A ServiceMonitor resource should be created.

How to reproduce the issue?

Install Operator v2 and deploy the following resource:

apiVersion: cluster.redpanda.com/v1alpha1
kind: Redpanda
metadata:
  name: redpanda
spec:
  chartRef: {}
  clusterSpec:
    monitoring:
      enabled: true
      scrapeInterval: 30s

Additional information

scrapeInterval is also required by the Redpanda CRD, but it is optional in the Helm chart because the chart defaults it to 30s.
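
One way to make scrapeInterval optional in the CRD as well would be an OpenAPI default in its schema, so the API server fills in the value when the field is omitted. A minimal sketch (the field path and default are assumptions mirroring the chart, not taken from the actual CRD source):

```yaml
# Hypothetical fragment of the Redpanda CRD's OpenAPI v3 schema.
# With `default` set, omitting scrapeInterval yields "30s",
# matching the Helm chart's behavior.
monitoring:
  type: object
  properties:
    enabled:
      type: boolean
    scrapeInterval:
      type: string
      default: "30s"   # assumed default, mirroring the chart
```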

alejandroEsc commented
This seems more like a case where you need to make sure the ServiceMonitor CRD is installed. When you have Prometheus installed, that CRD should be available AND the object will be deployed:

kubectl -n redpanda describe servicemonitor redpanda
Name:         redpanda
Namespace:    redpanda
Labels:       app.kubernetes.io/component=redpanda
              app.kubernetes.io/instance=redpanda
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=redpanda
              helm.sh/chart=redpanda-5.4.7
              helm.toolkit.fluxcd.io/name=redpanda
              helm.toolkit.fluxcd.io/namespace=redpanda
Annotations:  meta.helm.sh/release-name: redpanda
              meta.helm.sh/release-namespace: redpanda
API Version:  monitoring.coreos.com/v1
Kind:         ServiceMonitor
Metadata:
  Creation Timestamp:  2023-09-21T14:33:33Z
  Generation:          1
  Resource Version:    3060
  UID:                 a5846517-f911-46e2-ba49-8df32b6f5ded
Spec:
  Endpoints:
    Interval:     30s
    Path:         /public_metrics
    Scheme:       https
    Target Port:  admin
    Tls Config:
      Insecure Skip Verify:  true
  Selector:
    Match Labels:
      app.kubernetes.io/instance:       redpanda
      app.kubernetes.io/name:           redpanda
      monitoring.redpanda.com/enabled:  true
Events:                                 <none>

The above was created with the values file given. As part of the solution here, I will add some API changes to account for items that can be defaulted. However, it should be noted that the Helm chart has the following:

{{- if and (.Capabilities.APIVersions.Has "monitoring.coreos.com/v1") .Values.monitoring.enabled }}

This checks whether the cluster is able to accept the object before trying to install it, and then checks whether you want it installed. Personally I think this is nice, but it can leave customers in an unexpected state: they think they are being monitored but are not, and since no warning is given, they will continue until they realize they do not have the cluster they expect. I suspect that an upgrade may fix this.
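
For reference, that conditional wraps the entire ServiceMonitor template, so when the CRD is absent the manifest silently renders to nothing. A simplified sketch of how such a guarded template typically looks (the template body and helper name are assumptions, not the chart's exact source):

```yaml
# Sketch of a capability-guarded ServiceMonitor template
# (illustrative, not the redpanda chart's exact source).
{{- if and (.Capabilities.APIVersions.Has "monitoring.coreos.com/v1") .Values.monitoring.enabled }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: {{ .Release.Name }}   # naming helper assumed
spec:
  endpoints:
    - interval: {{ .Values.monitoring.scrapeInterval }}
      path: /public_metrics
{{- end }}
# If the CRD is not installed, the whole block renders to nothing,
# with no warning to the user.
```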

Personally I would prefer to fail fast, especially if the expectation is that the ServiceMonitor is set up when they explicitly enable it. I will ask the team for further feedback.
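
A fail-fast variant could use Helm's `fail` template function so rendering aborts when monitoring is enabled but the CRD is missing, rather than silently skipping the object. A minimal sketch (the guard placement and message are assumptions, not an existing chart change):

```yaml
# Hypothetical fail-fast guard: abort the render instead of silently
# skipping the ServiceMonitor when the CRD is not installed.
{{- if .Values.monitoring.enabled }}
{{- if not (.Capabilities.APIVersions.Has "monitoring.coreos.com/v1") }}
{{- fail "monitoring.enabled is true but the ServiceMonitor CRD (monitoring.coreos.com/v1) is not installed" }}
{{- end }}
{{- end }}
```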
