To run this example you will need
- docker
- kubectl (to manage kubernetes)
- kind (Kubernetes in docker)
- helm (package manager for Kubernetes)
Make sure that docker works:
docker version
You will need the SYNQ_TOKEN to send dbt output to Synq.io
If you don't have a kind cluster running locally, create one:
kind create cluster
kubectl cluster-info --context kind-kind
In the next step we will install Airflow into our local Kubernetes cluster. We will do that with Helm, using the airflow-helm chart: https://github.com/airflow-helm/charts/tree/main/charts/airflow .
As part of the installation, 2 additional Python packages are installed:
- airflow-dbt, which provides the Dbt* operators
- dbt-postgres, which provides the dbt command and support for Postgres
Before you install it, you might want to edit the Helm values.yml file. The installation is configured so that it periodically pulls this repository via git from https://github.com/getsynq/synq-dbt-airflow.git and adds it to the dags folder.
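The relevant part of values.yml might look roughly like the sketch below. This assumes the airflow-helm chart's git-sync options; the branch name and sync interval here are illustrative, so check the chart's documentation for the exact keys and defaults.

```yaml
dags:
  gitSync:
    enabled: true
    repo: "https://github.com/getsynq/synq-dbt-airflow.git"
    branch: "main"      # assumed branch name
    revision: "HEAD"
    syncWait: 60        # seconds between syncs (assumed)
```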
# Add the airflow helm repository
helm repo add airflow-stable https://airflow-helm.github.io/charts
# Install/upgrade Airflow with helm using the values from values.yml
helm upgrade --install \
"$AIRFLOW_NAME" \
airflow-stable/airflow \
--namespace "airflow-dbt" \
--version "8.6.1" \
--values ./values.yml \
--create-namespace \
--wait
This can take a while (5 minutes).
To integrate with Synq, you have to install the synq-dbt wrapper from Synq: https://github.com/getsynq/synq-dbt
For production we recommend adding the synq-dbt binary to the Airflow image together with the other dependencies you need to run your DAGs. In the basic DbtPlugin example we use an Airflow image that has synq-dbt preinstalled.
In the advanced DbtPlugin example we have created a DAG that installs synq-dbt onto the worker, but the installation does not persist if the worker is restarted; it is triggered on every DAG run.
The synq-dbt wrapper needs the SYNQ_TOKEN environment variable to be set. The Airflow dbt plugin currently does not support passing environment variables via the Dbt* operators, so we have to set SYNQ_TOKEN in 2 places:
Firstly, set the variable SYNQ_TOKEN in Airflow. In the top navbar go to Admin -> Variables and add a new variable:
Secondly, set the token as an environment variable for the pods. This is only needed until the dbt Airflow plugin releases a new version; currently the DbtOperator does not pass through environment variables like SYNQ_TOKEN.
Edit the Helm values.yml file and update the token. Then upgrade the airflow release.
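The values.yml change might look roughly like this, assuming the airflow-helm chart's airflow.extraEnv list (key names per the chart's documentation; the token value is a placeholder):

```yaml
airflow:
  extraEnv:
    - name: SYNQ_TOKEN
      value: "<your-synq-token>"
```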
# Upgrade Airflow with helm using the values from values.yml
helm upgrade --install \
"$AIRFLOW_NAME" \
airflow-stable/airflow \
--namespace "airflow-dbt" \
--version "8.6.1" \
--values ./values.yml \
--create-namespace \
--wait
If you want to use synq-dbt with the Kubernetes operator, you have to add 2 things to the image/container that will be used in the Kubernetes Job/Pod:
- synq-dbt
- the dbt project
You also have to pass the SYNQ_TOKEN to the Kubernetes operator.
In the basic Kubernetes example we use a Docker image that contains both synq-dbt and the dbt project.
In the advanced example we install synq-dbt and git clone our dbt project in separate Kubernetes init containers.
To connect to Airflow in Kubernetes we have to port forward.
kubectl -n airflow-dbt port-forward service/airflow-web 8080
Open http://localhost:8080 in your browser.
The dbt project has 2 simple models that will create one table and one view in the dbt_example schema of the airflow database.
Click on the "play" button of the DAG you want to start and select Trigger DAG.
After a few seconds the DAG should complete successfully.
kubectl -n airflow-dbt port-forward service/airflow-postgresql 5432
You can now use your database client to inspect the database.
If you want to delete the whole setup, you just need to delete the kind cluster with:
kind delete cluster