
More possible macros to add #5

Open
emilieschario opened this issue Nov 13, 2020 · 8 comments

Comments

@emilieschario

For cross-db compatibility and making packages easier to use:

  • ceiling - ceiling in Spark but ceil in Snowflake
  • list_agg/string_agg - in Spark this is concat_ws({{ delimiter }}, collect_list({{ field_to_agg }}))
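
A cross-database shim for the first bullet could follow the same dispatch pattern the Fivetran Zendesk package uses elsewhere in this thread. This is only a sketch; the macro name and the _get_utils_namespaces() helper are assumed to match this package's existing conventions:

{# Hypothetical sketch: dispatch ceiling to the right built-in per adapter #}
{% macro ceiling(field) -%}
  {{ adapter.dispatch('ceiling', packages = _get_utils_namespaces()) (field) }}
{%- endmacro %}

{# Default (e.g. Snowflake): the function is spelled ceil #}
{% macro default__ceiling(field) %}
  ceil({{ field }})
{% endmacro %}

{# Spark spells it ceiling #}
{% macro spark__ceiling(field) %}
  ceiling({{ field }})
{% endmacro %}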
@emilieschario
Author

last_value(priority ignore nulls) over (partition by ticket_id order by date_day asc rows between unbounded preceding and current row) as priority

becomes

last_value(priority, true) over (partition by ticket_id order by date_day asc rows between unbounded preceding and current row) as priority

The ignore nulls clause is what we're handling here. It would be good for all/most window functions to handle this too. 😄
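
Wrapping this in a dispatched macro might look like the sketch below. The macro name is illustrative, and it only shims the ignore-nulls argument, leaving the over(...) clause to the caller:

{# Hypothetical sketch: shim `ignore nulls` syntax across adapters #}
{% macro last_value_ignore_nulls(field) -%}
  {{ adapter.dispatch('last_value_ignore_nulls', packages = _get_utils_namespaces()) (field) }}
{%- endmacro %}

{# Standard SQL spelling #}
{% macro default__last_value_ignore_nulls(field) %}
  last_value({{ field }} ignore nulls)
{% endmacro %}

{# Spark expresses ignore-nulls as a second boolean argument #}
{% macro spark__last_value_ignore_nulls(field) %}
  last_value({{ field }}, true)
{% endmacro %}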

@emilieschario
Author

The Zendesk package (Fivetran) has

{% macro json_extract(string, string_path) -%}
  {{ adapter.dispatch('json_extract', packages = _get_utils_namespaces()) (string, string_path) }}
{%- endmacro %}

{% macro default__json_extract(string, string_path) %}
  json_extract_path_text({{ string }}, {{ "'" ~ string_path ~ "'" }})
{% endmacro %}

{% macro bigquery__json_extract(string, string_path) %}
  json_extract({{ string }}, {{ "'$." ~ string_path ~ "'" }})
{% endmacro %}

I think the Spark SQL version of this would be https://spark.apache.org/docs/2.3.0/api/sql/#get_json_object
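
A Spark implementation along those lines might look like the following sketch. Note that get_json_object takes a JSONPath rooted at '$', the same shape the bigquery__ variant above builds:

{# Hypothetical sketch: Spark's get_json_object takes a '$.'-prefixed path #}
{% macro spark__json_extract(string, string_path) %}
  get_json_object({{ string }}, {{ "'$." ~ string_path ~ "'" }})
{% endmacro %}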

@jtcohen6
Collaborator

I had a chance to catch up with @fivetran-kristin last week. Fivetran has done a cool thing and moved most of their SQL syntax macros into a dbt_fivetran_utils package.

Kristin is thinking about whether to add spark__ implementations of those macros into that package directly, or into this package, nested under macros/dbt_fivetran_utils/.

@emilieschario
Author

@fivetran-kristin do you have more thoughts you want to share here? Would love to coordinate efforts.

@kristin-bagnall

@emilieschario and @jtcohen6 - I think the best solution here is to nest all spark implementations of our macros into this package.

I've outlined more details on Fivetran's approach to Spark compatibility here. @jtcohen6 I've shared with you. @emilieschario I don't have your email so wasn't able to share. Can you either request access or provide your email address?

Thanks!

@emilieschario
Author

Just requested access so that I don't have to post my email on the interwebs 😄

@jtcohen6
Collaborator

jtcohen6 commented Nov 23, 2020

@fivetran-kristin I like it! Thanks for developing a solid plan here.

I added a comment in support of fivetran_utils_dispatch_list. It might be overkill for each package to allow itself to be shimmed in slightly different ways, but I prefer being more flexible/explicit now in the event it proves necessary. If, in 6-12 months' time, it turns out to be entirely superfluous, we can just standardize all packages to take a single variable adapter_dispatch_list (with backward support for the old variable names), no harm done.

@emilieschario
Author

  • ceiling - ceiling in Spark but ceil in Snowflake
  • list_agg/string_agg - concat_ws({{ delimiter }}, collect_list({{ field_to_agg }}))
  • last_value
  • json_extract
