Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update BQ labels on tables as they change #21

Closed
darrenhaken opened this issue May 28, 2020 · 8 comments
Closed

Update BQ labels on tables as they change #21

darrenhaken opened this issue May 28, 2020 · 8 comments
Labels
enhancement New feature or request good_first_issue Good for newcomers Stale

Comments

@darrenhaken
Copy link
Contributor

Describe the feature

Currently, labels apply when a table is created. However, for incremental models, the labels are not updated if they change in the future. This is also a problem as the labels become out of sync with the DBT project.

Describe alternatives you've considered

An alternative is to write a post-hook which applies the labels to the relation.
I'd prefer to contribute this feature back into the DBT project so other people can benefit from it.

Additional context

This feature is specific to BigQuery

Who will this benefit?

It will benefit users that access tables in BigQuery, the labels offer great metadata.
We also plan to have this visible in our metadata/catalog service.

@drewbanin
Copy link
Contributor

right on @darrenhaken - this would be a great change. Let us know if you're interested in contributing a change for the feature! I think the change will be self-contained within the BigQuery materialization code

@darrenhaken
Copy link
Contributor Author

I'd be interested in working on it. Can you sign post where the change would need to go?

@drewbanin
Copy link
Contributor

Sure thing - I think we'll want to:

  1. Add a macro in plugins/bigquery/.../adapter.sql that runs something like
    alter table {{table}} set options (labels=...)
  2. Call that macro in the incremental materialization, but only if the model is building incrementally. That code should go somewhere around here.
    • I think we should leave the existing code that sets the labels option in create_table_as and create_view_as statements as-is. See here and here

I can dig deeper into this if you'd benefit from any more direction, but those should be the general places to check out!

@yu-iskw
Copy link
Contributor

yu-iskw commented Dec 4, 2020

I think the issue applies to snapshots in BigQuery.

@jtcohen6 jtcohen6 transferred this issue from dbt-labs/dbt-core Oct 12, 2021
@jtcohen6 jtcohen6 added enhancement New feature or request good_first_issue Good for newcomers labels Oct 12, 2021
@dgreen161
Copy link
Contributor

dgreen161 commented Feb 27, 2022

@drewbanin @jtcohen6 I've got a version of this working where the labels config replaces what is already on the table.

One difference to the other materialization options is the case when all the labels have been removed. What should the behaviour be in this case? Looking at the BigQuery documentation, you aren't able to remove all the labels with SQL, you have to have at least one for it to work. You can have at least one label but removing them all can't be done, only with Python in this case.

Doing it within SQL, I see two options.

  1. Update the labels with a blank i.e. no_labels = true. This would remove all the previous labels but would require a dummy label to be put back in it's place which would stay until a --full-refresh is triggered.

Screenshot 2022-02-27 at 14 20 55

  1. Not updating the labels at all, leaving them all intact. This is my least favourite of the two since it wouldn't be possible to ever remove all the labels (until of course a --full-refresh is triggered). Despite this being my least favourite, it wouldn't be a regression on the current capabilities.

If done in Python, I think it would be a case of getting fetching all the labels in the table and then subsequently setting them all to None like this. If this was the prefered option, I might require some assistance to get this working.

@github-actions
Copy link
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

@jeremygzva
Copy link

Hey there,
Has this issue been solved?

@jeremygzva
Copy link

jeremygzva commented Jun 27, 2024

For anyone landing on this issue, we've implement a hacky solution for this.

It consists of two macros

{% macro sync_labels() %}
    {%- set labels_dic = config.get('labels', {}) %}
    -- Get the labels from the config
    {%- set labels = [] %}
    {%- for key, value in labels_dic.items() %}
    -- Loop through the labels
        {%- set label_str = "('" ~ key.lower() ~ "','" ~ value.lower() ~ "')" %}
         -- CONCAT the key and value as a lowercase string with the correct format
        {%- do labels.append(label_str) %}
        -- Append the label to the list
    {%- endfor %}
    {%- set labels_string = labels | join(', ') %}
    -- Join the labels list into a string

    ALTER TABLE {{ this }}
    SET OPTIONS (
        labels = [{{ labels_string }}]
    );
    -- Apply the labels to the table
{% endmacro %}
{% macro sync_labels_for_incremental() %}
    {%- set model_materialization = config.get('materialized') %}
    {%- set labels_dic = config.get('labels', {}) %}
    {%- if model_materialization in('incremental') and is_incremental() %} -- For incremental models that run incrementally
        {%- if labels_dic %} -- If labels are provided, we apply them
            {{ sync_labels() }}
        {%- endif %}
    {%- endif %}
{% endmacro %}

Macro is then called as post_hook within the dbt_project.yml

models:
  my_dbt_project:
    +post-hook: ["{{ sync_labels_for_incremental() }}"]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good_first_issue Good for newcomers Stale
Projects
None yet
Development

No branches or pull requests

6 participants