Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[helm] strategy.type=Recreate not fully supported by all deployments #35227

Open
2 tasks done
jiajie-chen opened this issue Oct 27, 2023 · 4 comments
Open
2 tasks done
Labels
area:helm-chart Airflow Helm Chart good first issue kind:bug This is a clearly a bug

Comments

@jiajie-chen
Copy link

jiajie-chen commented Oct 27, 2023

Official Helm Chart version

1.11.0 (latest released)

Apache Airflow version

v2.7.1

Kubernetes Version

v1.27.4 (minikube v1.31.2)

Helm Chart configuration

This causes errors:

workers:
  strategy:
    type: Recreate
# disable persistence to install `Deployment`
  persistence:
    enabled: false
logs:
  persistence:
    enabled: false

But this works:

scheduler:
  strategy:
    type: Recreate
# disable persistence to install `Deployment`
logs:
  persistence:
    enabled: false

Docker Image customizations

Using Helm chart defaults.

What happened

When installing the Helm chart with workers.strategy.type=Recreate and persistence disabled, the installation fails.

Helm raises the error:

Error: INSTALLATION FAILED: 1 error occurred:
        * Deployment.apps "airflow-worker" is invalid: spec.strategy.rollingUpdate: Forbidden: may not be specified when strategy `type` is 'Recreate'

This also applies for triggerer and dagProcessor.

However, the scheduler and webserver are able to install with Recreate.

What you think should happen instead

Setting strategy.type=Recreate should behave the same for all Airflow Deployments.

The final spec should show something like:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: airflow-worker
  namespace: airflow
spec:
  strategy:
    type: Recreate
    # no `rollingUpdate`

How to reproduce

Run an installation on a new cluster/namespace (will error):

helm install airflow apache-airflow/airflow \
  --namespace airflow --create-namespace \
  --set 'workers.strategy.type=Recreate' \
  --set 'workers.persistence.enabled=false' \
  --set 'logs.persistence.enabled=false'

However, this doesn't cause errors:

helm install airflow apache-airflow/airflow \
  --namespace airflow --create-namespace \
  --set 'scheduler.strategy.type=Recreate' \
  --set 'logs.persistence.enabled=false'

Anything else

Setting the strategy to this achieves the same functionality as Recreate:

workers:
  strategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: "100%"

But a consistent fix would be ideal - either not supporting Recreate or allowing it for all deployments.

Edit: using Helm's default key deletion works as well (may not work if using Airflow as a subchart - see below)

workers:
  strategy:
    type: Recreate
    rollingUpdate: null

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@jiajie-chen jiajie-chen added area:helm-chart Airflow Helm Chart kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Oct 27, 2023
@hakuno
Copy link
Contributor

hakuno commented Oct 31, 2023

It is because it has a default value right there:

  strategy:
    rollingUpdate:
      maxSurge: "100%"
      maxUnavailable: "50%"

Your error message says that

Deployment.apps "airflow-worker" is invalid: spec.strategy.rollingUpdate: Forbidden: may not be specified when strategy type is 'Recreate'

Please, can you add this? Or something like that.

--set 'workers.strategy.rollingUpdate={}'

Thanks in advance.

@jiajie-chen
Copy link
Author

jiajie-chen commented Oct 31, 2023

@hakuno

Thanks for the advice! I did see that by default workers.strategy.rollingUpdate has a value and scheduler.strategy has null value ~, when recreating the issue and submitting a ticket.

Trying with --set 'workers.strategy.rollingUpdate={}' raised the same INSTALLATION FAILED error from Helm. Were you able to get this --set to work on your end?

I also tried --set 'workers.strategy.rollingUpdate=null' which did work. (updated my ticket with this info)

However, I think it would be nice to have consistent configuration defaults for strategy.
It can be confusing to remember that:

  • workers/triggerer/dagProcessor have a default strategy, so use Helm's merge semantics to delete strategy.rollingUpdate if needed
  • scheduler/webserver has a null default strategy

@jiajie-chen
Copy link
Author

I'm also concerned that using workers.strategy.rollingUpdate=null may not work if the Airflow chart is used as a subchart by a user.

It looks like Helm has a bug with default key deletion for subcharts:

I will try to create a reproduction of the issue with Airflow as a subchart, when I have time.

@hakuno
Copy link
Contributor

hakuno commented Nov 2, 2023

@jiajie-chen

However, I think it would be nice to have consistent configuration defaults for strategy.

Yes, I think so.

@eladkal eladkal added good first issue and removed needs-triage label for new issues that we didn't triage yet labels Nov 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:helm-chart Airflow Helm Chart good first issue kind:bug This is a clearly a bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants