add and incorporate macro for turning jsons (and only jsons) into strings #129

Open
wants to merge 2 commits into
base: releases/v0.4.latest
17 changes: 17 additions & 0 deletions README.md
@@ -58,6 +58,7 @@ dispatch:
- [Cross-database compatibility](#cross-database-compatibility)
- [array\_agg (source)](#array_agg-source)
- [ceiling (source)](#ceiling-source)
- [get\_json\_columns\_in\_relation (source)](#get_json_columns_in_relation-source)
- [first\_value (source)](#first_value-source)
- [json\_extract (source)](#json_extract-source)
- [json\_parse (source)](#json_parse-source)
@@ -81,6 +82,7 @@ dispatch:
- [remove\_prefix\_from\_columns (source)](#remove_prefix_from_columns-source)
- [source\_relation (source)](#source_relation-source)
- [union\_data (source)](#union_data-source)
- [Union Data Defined Sources Configuration](#union-data-defined-sources-configuration)
- [union\_relations (source)](#union_relations-source)
- [Variable Checks](#variable-checks)
- [empty\_variable\_warning (source)](#empty_variable_warning-source)
@@ -171,6 +173,19 @@ than, or equal to, the specified numeric expression. The ceiling macro is compat
**Args:**
* `num` (required): The integer field to which you wish to apply the ceiling function.

----
### get_json_columns_in_relation ([source](macros/get_json_columns_in_relation.sql))
In BigQuery warehouses, this macro returns the names of columns that are of type JSON (as opposed to a string), given a model or source's columns. For non-BigQuery destinations, it will always return an empty list, as JSON support has not yet been rolled out to other Fivetran destinations.

**Usage:**
```sql
{{ fivetran_utils.get_json_columns_in_relation(source_columns=adapter.get_columns_in_relation(ref('stg_fivetran_platform__connector_tmp'))) }}
```
**Args:**
* `source_columns` (required): The columns of the relation. This will likely be a call to `adapter.get_columns_in_relation`.

> In Fivetran modeling packages, the `get_json_columns_in_relation` macro is called within the [fivetran_utils.fill_staging_columns](macros/fill_staging_columns.sql) macro.
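
To check which columns the BigQuery branch of this macro would pick up, one option is to list the `dtype` that dbt reports for each column. The snippet below is a minimal sketch, not part of the package: it assumes a hypothetical analysis file compiled with `dbt compile` against a BigQuery target, and it reuses the model name from the usage example above.

```sql
{# Hypothetical analysis file, e.g. analyses/inspect_json_columns.sql (name is an assumption) #}
{%- set source_columns = adapter.get_columns_in_relation(ref('stg_fivetran_platform__connector_tmp')) -%}

{%- for col in source_columns %}
-- {{ col.name }}: {{ col.dtype }}{% if col.dtype|lower == 'json' %} (would be treated as JSON){% endif %}
{%- endfor %}
select 1 as placeholder
```

The compiled output is a list of SQL comments, one per column, followed by a placeholder query. The `dtype|lower == 'json'` check mirrors the comparison in `bigquery__get_json_columns_in_relation`.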

----
### first_value ([source](macros/first_value.sql))
This macro returns the value_expression for the first row in the current window frame with cross db functionality. This macro ignores null values. The default first_value calculation within the macro is the `first_value` function. The Redshift first_value calculation is the `first_value` function, with the inclusion of a frame_clause `{{ partition_field }} rows unbounded preceding`.
@@ -430,6 +445,8 @@ from source
* `source_columns` (required): Calls the [get_columns_in_relation](https://docs.getdbt.com/reference/dbt-jinja-functions/adapter/#get_columns_in_relation) macro and requires a `ref()` or `source()` argument for the staging models within the `_tmp` directory.
* `staging_columns` (required): Created as a result of running the [generate_columns_macro](https://github.com/fivetran/dbt_fivetran_utils#generate_columns_macro-source) for the respective table.

> This macro calls `fivetran_utils.get_json_columns_in_relation()`, which returns the source columns that are of JSON type (BigQuery only). Each JSON column is wrapped in [TO_JSON_STRING](https://cloud.google.com/bigquery/docs/reference/standard-sql/json_functions#to_json_string), which converts it to a string.
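
As a rough illustration of the effect on BigQuery, the query below sketches the kind of select list `fill_staging_columns` would render once JSON columns are wrapped. The column names (`connector_id`, `config`, `is_paused`) and the one-row `source` CTE are hypothetical, chosen only to make the example self-contained.

```sql
-- Hypothetical one-row source; `connector_id` and `config` are made-up names.
with source as (

    select
        'abc123' as connector_id,
        json '{"schema": "salesforce"}' as config

)

select
    connector_id as connector_id,        -- non-JSON column passes through unchanged
    to_json_string(config) as config,    -- JSON column is converted to a string
    cast(null as boolean) as is_paused   -- staging column absent from the source
from source
```

Downstream models can then treat `config` as an ordinary string column, which is the same type it has on destinations without a JSON type.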

----
### persist_pass_through_columns ([source](macros/persist_pass_through_columns.sql))
This macro is used to persist pass through columns from the staging model to the **transform** package. This is particularly helpful when a `select *` is not feasible.
2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -1,4 +1,4 @@
name: 'fivetran_utils'
- version: '0.4.8'
+ version: '0.4.9'
config-version: 2
require-dbt-version: [">=1.3.0", "<2.0.0"]
2 changes: 1 addition & 1 deletion integration_tests/dbt_project.yml
@@ -1,5 +1,5 @@
name: 'fivetran_utils_integration_tests'
- version: '0.4.8'
+ version: '0.4.9'
config-version: 2
profile: 'integration_tests'

13 changes: 11 additions & 2 deletions macros/fill_staging_columns.sql
@@ -2,10 +2,19 @@

{%- set source_column_names = source_columns|map(attribute='name')|map('lower')|list -%}

{%- set json_columns = [] -%}
{% if target.type == 'bigquery' %}
{%- set json_columns = fivetran_utils.get_json_columns_in_relation(source_columns) -%}
{% endif %}

{%- for column in staging_columns %}
{% if column.name|lower in source_column_names -%}
- {{ fivetran_utils.quote_column(column) }} as
- {%- if 'alias' in column %} {{ column.alias }} {% else %} {{ fivetran_utils.quote_column(column) }} {%- endif -%}
+ {%- if column.name|lower in json_columns|lower -%}
+ TO_JSON_STRING( {{ fivetran_utils.quote_column(column) }} )
+ {%- else -%}
+ {{ fivetran_utils.quote_column(column) }}
+ {%- endif %}
+ as {%- if 'alias' in column %} {{ column.alias }} {% else %} {{ fivetran_utils.quote_column(column) }} {%- endif -%}
{%- else -%}
cast(null as {{ column.datatype }})
{%- if 'alias' in column %} as {{ column.alias }} {% else %} as {{ fivetran_utils.quote_column(column) }} {% endif -%}
31 changes: 31 additions & 0 deletions macros/get_json_columns_in_relation.sql
@@ -0,0 +1,31 @@
{% macro get_json_columns_in_relation(source_columns) %}

{{ return(adapter.dispatch('get_json_columns_in_relation', 'fivetran_utils') (source_columns)) }}

{%- endmacro %}

-- currently only need this for bigquery, so for everything else do nothing and just return an empty list
{% macro default__get_json_columns_in_relation(source_columns) %}

{{ return([]) }}

{% endmacro %}

-- we will return the columns that are of JSON type
{% macro bigquery__get_json_columns_in_relation(source_columns) %}

{% set json_columns = [] %}

{% set sc = source_columns|list %}

{% for col_index in range(sc|length) %}

{% if sc[col_index].dtype|lower == 'json' %}
{% do json_columns.append(sc[col_index].name) %}

{% endif %}
{% endfor %}

{{ return(json_columns) }}

{% endmacro %}