Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix quotation in split_part macro #338

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open

Conversation

3fonov
Copy link

@3fonov 3fonov commented Aug 14, 2024

It seems that there's extra quotes in split_part macro because if you run dbt_utils.get_url_paramter code for clickhouse results in extra quotes around field name.

Summary

Removed quotes around field_name.

It seems that there's extra quotes in split_part macro because if you
run dbt_utils.get_url_paramter code for clickhouse results in extra
quotes around field name.
@CLAassistant
Copy link

CLAassistant commented Aug 14, 2024

CLA assistant check
All committers have signed the CLA.

@3fonov
Copy link
Author

3fonov commented Aug 14, 2024

Link to issue: #337

Previous commit changed all sql to lowercase. Sorry. Fixed.
Changed from splitByChar to splitByStrig because sometimes you need more
than one char to split.
Copy link
Contributor

@BentsiLeviav BentsiLeviav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @3fonov
Thank you for your contribution!

I see that you indented many lines. There is a lint process configured in a Makefile. Please return back the old format or alternatively run make lint in the project root dir to see what you need to adjust back.

Thanks

@3fonov
Copy link
Author

3fonov commented Aug 21, 2024

@BentsiLeviav fixed. I've got sqlfmt running on save and it reformatted it.

Copy link
Contributor

@BentsiLeviav BentsiLeviav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @3fonov

I looked for the dbt original implementation of the split_part macro, and IMO, it is not safe to rely on the user to pass surrounding quotes, rather than taking care of it in the code.

As you can see in the get_url_parameter original implementation:

{%- set split = dbt.split_part(dbt.split_part(field, formatted_url_parameter, 2), "'&'", 1) -%}

The & is being double-quoted, and the original split_part implementation assumes the user is passing a valid, pre-quoted char/string:

split_part(
{{ string_text }},
{{ delimiter_text }},
{{ part_number }}
)

In order to not break previous API usage, I suggest adding a simple if that checks if the text/char is already quoted or needs to be quoted.

@3fonov
Copy link
Author

3fonov commented Aug 22, 2024

@BentsiLeviav not sure how i can check for quotation in SQL reliably.

Why not implement split_part macro same way that original does? Here is original, that you refered:

split_part(
{{ string_text }},
{{ delimiter_text }},
{{ part_number }}
)

and updated version for Clickhouse:

{% macro clickhouse__split_part(string_text, delimiter_text, part_number) %}
    splitByString({{delimiter_text}}, {{ string_text }})[{{ part_number }}]
{% endmacro %}

both of them expect to params to be quoted. And in dbt_utils they also quote it before using macro:

{%- set formatted_url_parameter = "'" + url_parameter + "='" -%}

{%- set split = dbt.split_part(dbt.split_part(field, formatted_url_parameter, 2), "'&'", 1) -%}

Don't think that we need to overcomplicated with checks and it's ok to make it similar to base implementation.

@BentsiLeviav
Copy link
Contributor

As I wrote before, it is not safe to assume the user is passing the char quoted, and although the dbt team implemented that way, it doesn't mean this is the right thing to do.

Don't think that we need to overcomplicated with checks and it's ok to make it similar to base implementation.

I'm afraid I have to disagree with you, If we want to break an API we better have a really good reason why we are doing it.

In addition, due to these changes the related test is failing, please run the tests and fix them where needed.

@3fonov
Copy link
Author

3fonov commented Aug 22, 2024

Checked documentation about split_part. It states that that make intentionally:

  • you can use column and pass text without quotes
  • you can use text and pass quoted text as argument
{{ dbt.split_part(string_text='column_to_split', delimiter_text='delimiter_column', part_number=1) }}
{{ dbt.split_part(string_text="'1|2|3'", delimiter_text="'|'", part_number=1) }}

I'll check tests later this week. Is there any documentation on how to run it locally?

@3fonov
Copy link
Author

3fonov commented Aug 23, 2024

@BentsiLeviav test fixed

========= 144 passed, 15 skipped, 44077 warnings in 63.43s (0:01:03) =========

Copy link
Author

@3fonov 3fonov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants