Update BQ labels on tables as they change #21
Comments
Right on, @darrenhaken - this would be a great change. Let us know if you're interested in contributing a change for the feature! I think the change will be self-contained within the BigQuery materialization code. |
I'd be interested in working on it. Can you signpost where the change would need to go? |
Sure thing - I think we'll want to:
I can dig deeper into this if you'd benefit from any more direction, but those should be the general places to check out! |
I think the issue applies to snapshots in BigQuery. |
@drewbanin @jtcohen6 I've got a version of this working where the labels config replaces what is already on the table. One difference from the other materialization options is the case when all the labels have been removed. What should the behaviour be in this case? Looking at the BigQuery documentation, you can't remove all the labels with SQL: at least one label has to remain for the statement to work. Removing every label can only be done with Python. Doing it within SQL, I see two options.
If done in Python, I think it would be a case of fetching all the labels on the table and then subsequently setting them all to |
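The Python route mentioned above can be sketched roughly as follows. This is a minimal illustration, not the change that was contributed: `build_label_update` is a hypothetical helper name, and it assumes the behaviour of the `google-cloud-bigquery` client, where a label whose value is set to `None` is deleted when the table is updated. The client calls are left as comments so the helper can be run without credentials.

```python
from typing import Dict, Optional


def build_label_update(
    existing: Dict[str, str], desired: Dict[str, str]
) -> Dict[str, Optional[str]]:
    """Build the labels mapping to send when updating a table.

    Hypothetical helper: every label currently on the table but missing
    from the desired config is mapped to None (assumed to mean "delete"
    in the google-cloud-bigquery client); desired labels overwrite the rest.
    """
    update: Dict[str, Optional[str]] = {key: None for key in existing}
    update.update(desired)
    return update


if __name__ == "__main__":
    current = {"team": "data", "env": "dev"}
    # Config with no labels at all: every existing label is marked for removal.
    print(build_label_update(current, {}))
    # Config that keeps one label and changes its value.
    print(build_label_update(current, {"env": "prod"}))

    # Applying it with the client would look roughly like this
    # (table ID is a placeholder, not taken from the thread):
    #   from google.cloud import bigquery
    #   client = bigquery.Client()
    #   table = client.get_table("project.dataset.table")
    #   table.labels = build_label_update(table.labels, {"env": "prod"})
    #   client.update_table(table, ["labels"])
```

Computing the full mapping first keeps the "remove everything" case the same code path as a partial change, which is the case SQL cannot express.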
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days. |
Hey there, |
For anyone landing on this issue, we've implemented a hacky solution for this. It consists of two macros:
{% macro sync_labels() %}
  {%- set labels_dic = config.get('labels', {}) %}
  -- Get the labels from the config
  {%- set labels = [] %}
  {%- for key, value in labels_dic.items() %}
    -- Loop through the labels
    {%- set label_str = "('" ~ key.lower() ~ "','" ~ value.lower() ~ "')" %}
    -- CONCAT the key and value as a lowercase string with the correct format
    {%- do labels.append(label_str) %}
    -- Append the label to the list
  {%- endfor %}
  {%- set labels_string = labels | join(', ') %}
  -- Join the labels list into a string
  ALTER TABLE {{ this }}
  SET OPTIONS (
    labels = [{{ labels_string }}]
  );
  -- Apply the labels to the table
{% endmacro %}
{% macro sync_labels_for_incremental() %}
  {%- set model_materialization = config.get('materialized') %}
  {%- set labels_dic = config.get('labels', {}) %}
  {%- if model_materialization == 'incremental' and is_incremental() %} -- For incremental models that run incrementally
    {%- if labels_dic %} -- If labels are provided, we apply them
      {{ sync_labels() }}
    {%- endif %}
  {%- endif %}
{% endmacro %}
The macro is then called as a post-hook within the models:
my_dbt_project:
  +post-hook: ["{{ sync_labels_for_incremental() }}"] |
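To make it easier to see what statement the `sync_labels` macro above ends up sending to BigQuery, here is a rough Python equivalent of its string building. It is only an illustration under assumptions: `render_label_sync_sql` is a hypothetical name, the relation string stands in for whatever `{{ this }}` renders to, and, like the macro, it does no escaping of quotes inside label values.

```python
from typing import Mapping


def render_label_sync_sql(relation: str, labels: Mapping[str, str]) -> str:
    """Mirror the macro: lowercase each key/value and emit one ALTER TABLE.

    Hypothetical helper for illustration; `relation` is assumed to be an
    already-quoted table reference.
    """
    pairs = ", ".join(
        f"('{key.lower()}','{value.lower()}')" for key, value in labels.items()
    )
    return f"ALTER TABLE {relation} SET OPTIONS (labels = [{pairs}]);"


if __name__ == "__main__":
    # Placeholder relation and labels, not taken from the thread.
    sql = render_label_sync_sql(
        "`my-project`.`analytics`.`orders`",
        {"Team": "Data", "env": "PROD"},
    )
    print(sql)
```

With an empty labels mapping this renders `labels = []`, which is the case the macros avoid by checking `labels_dic` before calling `sync_labels`.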
Describe the feature
Currently, labels are applied when a table is created. However, for incremental models, the labels are not updated if they change later, so they drift out of sync with the dbt project.
Describe alternatives you've considered
An alternative is to write a post-hook which applies the labels to the relation.
I'd prefer to contribute this feature back into the dbt project so other people can benefit from it.
Additional context
This feature is specific to BigQuery
Who will this benefit?
It will benefit users who access tables in BigQuery, since labels offer useful metadata.
We also plan to have this visible in our metadata/catalog service.