Create ExecutionMode.KUBERNETES example DAG & setup CI #535

Closed
4 tasks done
Tracked by #1103
tatiana opened this issue Sep 14, 2023 · 6 comments · Fixed by #1127
Labels: area:ci, area:testing, epic-assigned

Comments

@tatiana
Collaborator

tatiana commented Sep 14, 2023

To keep our documentation from becoming outdated or incompatible with the latest version of Cosmos, as described in #534, we should:

@tatiana tatiana added this to the 1.2.0 milestone Sep 14, 2023
@tatiana tatiana changed the title Create LoadMethod.KUBERNETES example DAG & setup CI Create ExecutionMode.KUBERNETES example DAG & setup CI Sep 14, 2023
@qimumu9406

@tatiana thank you! The outdated documentation is blocking our POC with Cosmos. Looking forward to the updated documentation.

tatiana added a commit that referenced this issue Sep 27, 2023
Fix behaviour when using `ExecutionMode.KUBERNETES`, broken between the
Cosmos releases 1.0.0 and 1.1.1.

Update the documentation to be representative of the 1.x Cosmos
interface:

https://astronomer.github.io/astronomer-cosmos/getting_started/kubernetes.html

Add unit tests to avoid regressions on these fixes.

Part of the documentation fixes are made in:
astronomer/cosmos-example#4

As a next step, we must ensure integration tests for running Cosmos on
K8s to avoid this breaking change moving forward (issue #535).

Closes: #493
Closes: #548
Closes: #534

Co-authored-by: Pádraic Slattery <[email protected]> (who created PR #551)
@tatiana
Collaborator Author

tatiana commented Sep 27, 2023

An option could be to bring the Kubernetes example DAG into the astronomer-cosmos repo and automate the steps we describe in our docs with GitHub Actions, using: https://github.com/marketplace/actions/kubernetes-kind-cluster

@tatiana tatiana modified the milestones: 1.2.0, 1.3.0 Oct 10, 2023
@tatiana tatiana added area:testing area:ci labels Nov 8, 2023
@tatiana tatiana modified the milestones: 1.3.0, 1.4.0 Dec 7, 2023
@dosubot dosubot bot added the stale label Mar 9, 2024

dosubot bot commented Mar 9, 2024

Hi, @tatiana,

I'm helping the Cosmos team manage their backlog and am marking this issue as stale. The issue involved creating an example DAG for ExecutionMode.KUBERNETES, adding it to a specified folder, referencing it in the documentation, setting up credentials for a K8s cluster for CI, and updating information for the CI to run the example DAG as part of integration tests. It seems that the issue has been resolved by bringing the Kubernetes example DAG to be part of the astronomer-cosmos repo and automating the steps described in the documentation using GitHub Actions.

Could you please confirm if this issue is still relevant to the latest version of the Cosmos repository? If it is, please let the Cosmos team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

Thank you for your understanding and cooperation. If you have any further questions or need assistance, feel free to reach out.

@dosubot dosubot bot closed this as not planned Mar 16, 2024
@dosubot dosubot bot removed the stale label Mar 16, 2024
@tatiana tatiana reopened this May 13, 2024
@tatiana
Collaborator Author

tatiana commented May 13, 2024

This ticket is still relevant, since approximately 30% of Cosmos users use this mode.

@tatiana tatiana modified the milestones: 1.4.0, 1.5.0 May 13, 2024
@tatiana
Collaborator Author

tatiana commented May 17, 2024

It seems GitHub Actions would allow us to spin up a KinD cluster, so we could automate running the DAG that our example mentions (it currently lives in a separate repo):
https://github.com/marketplace/actions/kubernetes-kind-cluster
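
As a rough sketch of what such automation could look like, the setup steps can be written as a list of commands that runs the same way locally and in CI. The cluster name, image tag and chart reference below are hypothetical placeholders, not values taken from the Cosmos repo. The release name `postgres` and the StatefulSet name `postgres-postgresql` follow the names visible in the CI log later in this thread. The script defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess
from typing import List

# Hypothetical values, for illustration only.
CLUSTER_NAME = "cosmos-ci"
DBT_IMAGE = "dbt-jaffle-shop:1.0.0"
# Assumption: the Bitnami PostgreSQL chart is pulled from this OCI reference.
POSTGRES_CHART = "oci://registry-1.docker.io/bitnamicharts/postgresql"


def build_setup_commands(
    cluster: str = CLUSTER_NAME,
    image: str = DBT_IMAGE,
    chart: str = POSTGRES_CHART,
) -> List[List[str]]:
    """Return the commands that prepare a KinD cluster for the example DAG."""
    return [
        # Create a local Kubernetes cluster inside Docker.
        ["kind", "create", "cluster", "--name", cluster],
        # Install Postgres under the release name "postgres".
        ["helm", "install", "postgres", chart],
        # Block until the Postgres StatefulSet is rolled out, or time out.
        ["kubectl", "rollout", "status", "statefulset/postgres-postgresql",
         "--timeout=120s"],
        # Make the locally built dbt image available to the cluster nodes.
        ["kind", "load", "docker-image", image, "--name", cluster],
    ]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in commands:
        print("+ " + shlex.join(cmd))
        if not dry_run:
            # check=True aborts on the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(build_setup_commands(), dry_run=True)
```

Keeping the steps in one script means the GitHub Actions job only has to call it, and the same script can be used to reproduce a CI failure on a laptop.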

@tatiana tatiana added the triage-needed label May 17, 2024
@tatiana tatiana added epic-assigned and removed triage-needed labels May 17, 2024
@tatiana tatiana removed this from the Cosmos 1.5.0 milestone Jun 6, 2024
@pankajastro
Contributor

I drafted a PR (#1127) for this. The automation script works fine locally, but the Postgres instance is not healthy when running in the GitHub Action.
CI Job: https://github.com/astronomer/astronomer-cosmos/actions/runs/10151541725/job/28071022045?pr=1127

Debug log:

+ helm list
helm
NAME    	NAMESPACE	REVISION	UPDATED                                	STATUS  	CHART             	APP VERSION
postgres	default  	1       	2024-07-29 20:43:06.773876127 +0000 UTC	deployed	postgresql-15.5.20	16.3.0     
+ echo pod service
pod service
+ kubectl get pods --namespace default
NAME                    READY   STATUS             RESTARTS   AGE
postgres-postgresql-0   0/1     CrashLoopBackOff   3          67s
+ kubectl get svc --namespace default
NAME                     TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)    AGE
kubernetes               ClusterIP   10.96.0.1     <none>        443/TCP    101s
postgres-postgresql      ClusterIP   10.96.88.53   <none>        5432/TCP   67s
postgres-postgresql-hl   ClusterIP   None          <none>        5432/TCP   67s
+ echo pg log
+ kubectl logs postgres-postgresql-0 -c postgresql
pg log
postgresql 20:43:56.14 INFO  ==> 
postgresql 20:43:56.15 INFO  ==> Welcome to the Bitnami postgresql container
postgresql 20:43:56.15 INFO  ==> Subscribe to project updates by watching https://github.com/bitnami/containers
postgresql 20:43:56.15 INFO  ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
postgresql 20:43:56.23 INFO  ==> Upgrade to Tanzu Application Catalog for production environments to access custom-configured and pre-packaged software components. Gain enhanced features, including Software Bill of Materials (SBOM), CVE scan result reports, and VEX documents. To learn more, visit https://bitnami.com/enterprise
postgresql 20:43:56.24 INFO  ==> 
postgresql 20:43:56.34 INFO  ==> ** Starting PostgreSQL setup **
postgresql 20:43:56.44 INFO  ==> Validating settings in POSTGRESQL_* env vars..
postgresql 20:43:56.45 INFO  ==> Loading custom pre-init scripts...
postgresql 20:43:56.54 INFO  ==> Initializing PostgreSQL database...
postgresql 20:43:56.64 INFO  ==> pg_hba.conf file not detected. Generating it...
postgresql 20:43:56.64 INFO  ==> Generating local authentication configuration
+ kubectl describe pod postgres-postgresql-0
Name:         postgres-postgresql-0
Namespace:    default
Priority:     0
Node:         kind-control-plane/172.18.0.3
Start Time:   Mon, 29 Jul 2024 20:43:12 +0000
Labels:       app.kubernetes.io/component=primary
              app.kubernetes.io/instance=postgres
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=postgresql
              app.kubernetes.io/version=16.3.0
              controller-revision-hash=postgres-postgresql-55b4df756d
              helm.sh/chart=postgresql-15.5.20
              statefulset.kubernetes.io/pod-name=postgres-postgresql-0
Annotations:  container.seccomp.security.alpha.kubernetes.io/postgresql: runtime/default
Status:       Running
IP:           10.244.0.6
IPs:
  IP:           10.244.0.6
Controlled By:  StatefulSet/postgres-postgresql
Containers:
  postgresql:
    Container ID:   containerd://a49f8658b13518211e3afc542a4e4c75f735c6b4322d29f1506cb6879286052c
    Image:          docker.io/bitnami/postgresql:16.3.0-debian-12-r23
    Image ID:       docker.io/bitnami/postgresql@sha256:865e341baf49006e32b3e72254a15a81c939178cb9c48fcd9faf1c0ac4b49664
    Port:           5432/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Mon, 29 Jul 2024 20:43:56 +0000
      Finished:     Mon, 29 Jul 2024 20:43:56 +0000
    Ready:          False
    Restart Count:  3
    Limits:
      cpu:                150m
      ephemeral-storage:  2Gi
      memory:             192Mi
    Requests:
      cpu:                100m
      ephemeral-storage:  50Mi
      memory:             128Mi
    Liveness:             exec [/bin/sh -c exec pg_isready -U "postgres" -h 127.0.0.1 -p 5432] delay=30s timeout=5s period=10s #success=1 #failure=6
    Readiness:            exec [/bin/sh -c -e exec pg_isready -U "postgres" -h 127.0.0.1 -p 5432
[ -f /opt/bitnami/postgresql/tmp/.initialized ] || [ -f /bitnami/postgresql/.initialized ]
] delay=5s timeout=5s period=10s #success=1 #failure=6
    Environment:
      BITNAMI_DEBUG:                        false
      POSTGRESQL_PORT_NUMBER:               5432
      POSTGRESQL_VOLUME_DIR:                /bitnami/postgresql
      PGDATA:                               /bitnami/postgresql/data
      POSTGRES_PASSWORD:                    <set to the key 'postgres-password' in secret 'postgres-postgresql'>  Optional: false
      POSTGRESQL_ENABLE_LDAP:               no
      POSTGRESQL_ENABLE_TLS:                no
      POSTGRESQL_LOG_HOSTNAME:              false
      POSTGRESQL_LOG_CONNECTIONS:           false
      POSTGRESQL_LOG_DISCONNECTIONS:        false
      POSTGRESQL_PGAUDIT_LOG_CATALOG:       off
      POSTGRESQL_CLIENT_MIN_MESSAGES:       error
      POSTGRESQL_SHARED_PRELOAD_LIBRARIES:  pgaudit
    Mounts:
      /bitnami/postgresql from data (rw)
      /dev/shm from dshm (rw)
      /opt/bitnami/postgresql/conf from empty-dir (rw,path="app-conf-dir")
      /opt/bitnami/postgresql/tmp from empty-dir (rw,path="app-tmp-dir")
      /tmp from empty-dir (rw,path="tmp-dir")
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-postgres-postgresql-0
    ReadOnly:   false
  empty-dir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  dshm:
    Type:        EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:      Memory
    SizeLimit:   <unset>
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  62s                default-scheduler  Successfully assigned default/postgres-postgresql-0 to kind-control-plane
  Normal   Pulling    62s                kubelet            Pulling image "docker.io/bitnami/postgresql:16.3.0-debian-12-r23"
  Normal   Pulled     57s                kubelet            Successfully pulled image "docker.io/bitnami/postgresql:16.3.0-debian-12-r23" in 5.400199843s
  Normal   Pulled     21s (x3 over 53s)  kubelet            Container image "docker.io/bitnami/postgresql:16.3.0-debian-12-r23" already present on machine
  Normal   Created    19s (x4 over 54s)  kubelet            Created container postgresql
  Normal   Started    18s (x4 over 54s)  kubelet            Started container postgresql
  Warning  BackOff    12s (x9 over 52s)  kubelet            Back-off restarting failed container
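
To make a failure like the one above surface early in CI, one option is to parse the table printed by `kubectl get pods` and fail the job when any pod is not ready. The sketch below is a minimal illustration of that idea, not part of the PR. `PodStatus`, `parse_get_pods` and `unhealthy_pods` are hypothetical helper names, and the parser assumes the default column order shown in the log (NAME, READY, STATUS, RESTARTS, AGE).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PodStatus:
    name: str
    ready: bool
    status: str
    restarts: int


def parse_get_pods(output: str) -> List[PodStatus]:
    """Parse the default table output of `kubectl get pods`."""
    lines = [line for line in output.splitlines() if line.strip()]
    pods = []
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        name, ready, status, restarts = fields[:4]
        # READY is printed as "<ready containers>/<total containers>".
        ready_count, total = ready.split("/")
        pods.append(PodStatus(name, ready_count == total, status, int(restarts)))
    return pods


def unhealthy_pods(pods: List[PodStatus]) -> List[PodStatus]:
    """Return pods that are not fully ready or not in the Running state.

    Note: this also flags pods that finished normally (status Completed),
    which is fine for long-running services such as Postgres.
    """
    return [p for p in pods if not p.ready or p.status != "Running"]


# Sample taken from the debug log in this comment.
SAMPLE = """\
NAME                    READY   STATUS             RESTARTS   AGE
postgres-postgresql-0   0/1     CrashLoopBackOff   3          67s
"""

if __name__ == "__main__":
    for pod in unhealthy_pods(parse_get_pods(SAMPLE)):
        print(f"{pod.name}: {pod.status} (restarts={pod.restarts})")
```

In a real job the input would come from running `kubectl get pods --namespace default` and the script would exit non-zero when the returned list is not empty.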
