Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable perioding resuming to fetch logs for KPOA #139

Merged
merged 17 commits into from
May 26, 2022
Merged

Conversation

dstandish
Copy link
Contributor

@dstandish dstandish commented Mar 21, 2022

This PR modifies Async KPO to periodicall resume the sync portion of the task to fetch and emit the latest logs before deferring again.

To control this behavior a new param is added: logging_interval. If None then Async KPO operates as it did before, never resuming the sync task until the pod is done running. If logging_interval is given a value (integer seconds) then the trigger will resume every logging_interval seconds to fetch latest logs and defer again (until pod is done).

Here's a task example you can test this with:

test_kubernetes_pod = KubernetesPodOperatorAsync(
    namespace='airflow',
    image='python:3.8-slim',
    cmds=[
        "bash",
        "-cx",
        (
            "i=0; "
            "while [ $i -ne 30 ]; "
            "do i=$(($i+1)); "
            "echo $i; "
            "sleep 1; "
            "done; "
            "mkdir -p /airflow/xcom/; "
            "echo '{\"message\": \"good afternoon!\"}' > /airflow/xcom/return.json"
        ),
    ],
    name="test_kubernetes_pod",
    in_cluster=False,
    task_id="test_kubernetes_pod",
    logging_interval=10,
    dag=dag,
    is_delete_operator_pod=False,
    # do_xcom_push=True,
)

@dstandish
Copy link
Contributor Author

@kaxil this is ready for review but not for merge because it needs to wait on another k8s provider release
tests passing locally now though

Copy link
Collaborator

@kaxil kaxil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great!! Have you tested in on Astro Cloud or locally via chart?

@dstandish
Copy link
Contributor Author

dstandish commented Mar 24, 2022

This looks great!! Have you tested in on Astro Cloud or locally via chart?

Only locally... it should be same now that we have gotten that service account issue resolved... but i can work on that if you think it's important to do.

Related note, are we gonna OSS the istio stuff?

@kaxil
Copy link
Collaborator

kaxil commented Mar 28, 2022

This looks great!! Have you tested in on Astro Cloud or locally via chart?

Only locally... it should be same now that we have gotten that service account issue resolved... but i can work on that if you think it's important to do.

@phanikumv Can you assign this to someone in the team to test too please?

Related note, are we gonna OSS the istio stuff?

Most-likely yes, in a way that is reusable for the OSS community.

@phanikumv
Copy link
Collaborator

@pankajastro please test this once merged to main as part of our integration test

@dstandish
Copy link
Contributor Author

this is on hold until next provider release

@kaxil kaxil added this to the 1.3.0 milestone Apr 5, 2022
@raphaelauv
Copy link
Contributor

@dstandish apache-airflow-providers-cncf-kubernetes 4.0.0 is out 👍

@kaxil
Copy link
Collaborator

kaxil commented Apr 13, 2022

apache-airflow-providers-cncf-kubernetes 4.0.0 depends on apache-airflow>=2.3.0

https://app.circleci.com/pipelines/github/astronomer/astronomer-providers/1165/workflows/dfda1e9e-426b-437e-8825-7c9e4bc4ccce/jobs/3125
😮

@kaxil
Copy link
Collaborator

kaxil commented Apr 13, 2022

https:/apache/airflow/blob/0cd8833df74f4b0498026c4103bab130e1fc1068/airflow/providers/cncf/kubernetes/provider.yaml#L45

I don't think this was intentional, we don't need to depend on Airflow 2.3.0 @dstandish - Let's fix this in OSS

Oh well, I forgot about this -> apache/airflow@6db30f3

@raphaelauv
Copy link
Contributor

raphaelauv commented Apr 13, 2022

@kaxil since astronomer-providers is not related to the KubernetesExecutor

it could be based on

kubernetes==21.7.0
apache-airflow-providers-cncf-kubernetes==4.0.0 --no-deps

apache/airflow#22836 (comment)

@kaxil
Copy link
Collaborator

kaxil commented Apr 13, 2022

@kaxil since astronomer-providers is not related to the KubernetesExecutor

it could be based on

kubernetes==21.7.0
apache-airflow-providers-cncf-kubernetes==4.0.0 --no-deps

apache/airflow#22836 (comment)

Even if we do that, all the users of this provider will have to do the same to install it :(

Let me think more about it and see if it is better to release a new version of Kubernetes Provider on Apache Airflow with just the logging related changes without the k8s client change.

@phanikumv phanikumv modified the milestones: 1.3.0, 1.4.0 May 9, 2022
@kaxil
Copy link
Collaborator

kaxil commented May 12, 2022

@dstandish Now that 2.3 is out and new provider version is out, can we resurrect this PR?

@phanikumv
Copy link
Collaborator

@pankajastro will fix the mypy issue and get this merged to main

@codecov
Copy link

codecov bot commented May 24, 2022

Codecov Report

Merging #139 (6bfd526) into main (0249916) will increase coverage by 0.23%.
The diff coverage is 97.77%.

@@            Coverage Diff             @@
##             main     #139      +/-   ##
==========================================
+ Coverage   96.79%   97.03%   +0.23%     
==========================================
  Files          56       56              
  Lines        2906     2938      +32     
==========================================
+ Hits         2813     2851      +38     
+ Misses         93       87       -6     
Impacted Files Coverage Δ
...viders/cncf/kubernetes/operators/kubernetes_pod.py 93.65% <96.66%> (+8.28%) ⬆️
...oviders/cncf/kubernetes/triggers/wait_container.py 98.63% <100.00%> (+6.56%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0249916...6bfd526. Read the comment docs.

@pankajastro
Copy link
Contributor

pankajastro commented May 24, 2022

@pankajastro will fix the mypy issue and get this merged to main

Mypy issue has been fixed. but test coverage is less than the threshold. I'll fix that too.

@pankajastro
Copy link
Contributor

pankajastro commented May 24, 2022

@pankajastro will fix the mypy issue and get this merged to main

The unit test coverage is passing now.

Copy link
Collaborator

@phanikumv phanikumv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job @pankajastro on getting the mypy fixes done quickly.

@pankajastro
Copy link
Contributor

Tested. LGTM.

@phanikumv phanikumv merged commit 2448e4f into main May 26, 2022
@phanikumv phanikumv deleted the kpoa-logging branch May 26, 2022 10:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants