#1031 Add macro to add table description from schema.yml for BQ #1285

Merged 17 commits into dbt-labs:dev/wilt-chamberlain on Jun 12, 2019

Conversation

darrenhaken
Contributor

This is a WIP. @drewbanin, please take a look.

@darrenhaken
Contributor Author

This relates to #1031

@drewbanin (Contributor) left a comment

Left some comments, happy to discuss

Two review threads on plugins/bigquery/dbt/include/bigquery/macros/adapters.sql (outdated, resolved)
@darrenhaken
Contributor Author

@drewbanin I've made the changes you asked for, and I'll start testing it.

Also make changes to table_options macro based on testing against a
real project.
@darrenhaken
Contributor Author

@drewbanin I got this working now. I also added the table_options to apply to views as well as table relations.

Can you take a look and tell me what you think?

@drewbanin
Contributor

@darrenhaken just gave this a look and tried it out locally -- it's really great!!

I have a couple of small recommendations, but overall, I really like the way this is taking shape.

  1. Let's add persist_docs as a "known" config so that it can be specified in the models: section of dbt_project.yml. I think you can do that by adding it to this list. By adding it to ExtendDictFields, users can set these configs hierarchically (see the sketch after this comment), e.g.:
models:
  persist_docs:
    relation: true
  specific_package:
    columns: true # future

Adding descriptions is effectively "free" on BQ, but it will require extra queries in databases like Snowflake and Redshift. This is a pretty powerful thing to be able to do with a one line code change!

  2. Let's add a little love to dbt's error handling. Right now, if you provide an invalid value like
{{ config(persist_docs=true) }}

dbt will fail with an error like: 'bool object' has no attribute 'get'

I think instead, we can add some code to the table_options macro to handle invalid input. Maybe:

  {%- set raw_persist_docs = config.get('persist_docs', {}) -%}

  {%- if raw_persist_docs is mapping -%}
    {%- set raw_relation = raw_persist_docs.get('relation', false) -%}
    OPTIONS(
      {%- if raw_relation -%}
      description={{ model.description | tojson }}
      {% endif %}
    )
  {%- else -%}
    {{ exceptions.raise_compiler_error("Invalid value provided for 'persist_docs'. Expected dict but got value: " ~ raw_persist_docs) }}
  {% endif %}
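
For reference, with persist_docs={'relation': true} on a model whose schema.yml description is "Customer orders", the snippet above would render roughly the DDL below (a sketch only; the project, dataset, and model names are placeholders, and the exact statement depends on dbt's BigQuery create_table_as template):

    create or replace table `my-project`.`my_dataset`.`my_model`
    OPTIONS(
      description="Customer orders"
    )
    as (
      select 1 as id
    );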

So, this is looking really close! Happy to discuss :)
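
On point 1, a minimal sketch of what registering persist_docs as a known config might look like, assuming the known-config lists live on the SourceConfig class in core/dbt/parser/source_config.py (the other list members are elided here rather than guessed):

    class SourceConfig:
        ...
        ExtendDictFields = [
            # ... the existing dict-typed configs ...
            'persist_docs',
        ]

As noted above, configs in ExtendDictFields are merged as dicts (see the current.update(value) branch quoted later in this thread), which is what allows them to be set hierarchically in the models: section.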

@darrenhaken
Contributor Author

@drewbanin I've pushed changes based on your comments and now you can set persist_docs at a project level.

However, I've encountered an issue when testing out the error handling logic.
When I put {{ config(persist_docs=true) }} I get the following error:

 File "/Users/darren.haken/workarea/github/dbt/core/dbt/main.py", line 78, in main
    results, succeeded = handle_and_check(args)
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/main.py", line 152, in handle_and_check
    task, res = run_from_args(parsed)
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/main.py", line 208, in run_from_args
    results = run_from_task(task, cfg, parsed)
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/main.py", line 216, in run_from_task
    result = task.run()
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/task/runnable.py", line 242, in run
    self._runtime_initialize()
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/task/runnable.py", line 50, in _runtime_initialize
    self.manifest = load_manifest(self.config)
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/task/runnable.py", line 29, in load_manifest
    internal_manifest=internal_manifest)
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/loader.py", line 173, in load_all
    internal_manifest)
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/loader.py", line 166, in _load_from_projects
    loader.load(internal_manifest=internal_manifest)
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/loader.py", line 139, in load
    self._load_nodes()
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/loader.py", line 78, in _load_nodes
    self._load_sql_nodes(ModelParser, NodeType.Model, 'source_paths')
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/loader.py", line 43, in _load_sql_nodes
    **kwargs
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/parser/base_sql.py", line 63, in load_and_parse
    return self.parse_sql_nodes(result, tags)
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/parser/base_sql.py", line 82, in parse_sql_nodes
    node_parsed = self.parse_node(node, node_path, project, tags=tags)
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/parser/base.py", line 224, in parse_node
    self._render_with_context(parsed_node, config)
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/parser/base.py", line 160, in _render_with_context
    capture_macros=True)
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/clients/jinja.py", line 279, in get_rendered
    return render_template(template, ctx, node)
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/clients/jinja.py", line 266, in render_template
    return template.render(ctx)
  File "/Users/darren.haken/.virtualenvs/dbt/lib/python3.7/site-packages/jinja2/asyncsupport.py", line 76, in render
    return original_render(self, *args, **kwargs)
  File "/Users/darren.haken/.virtualenvs/dbt/lib/python3.7/site-packages/jinja2/environment.py", line 1008, in render
    return self.environment.handle_exception(exc_info, True)
  File "/Users/darren.haken/.virtualenvs/dbt/lib/python3.7/site-packages/jinja2/environment.py", line 780, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/Users/darren.haken/.virtualenvs/dbt/lib/python3.7/site-packages/jinja2/_compat.py", line 37, in reraise
    raise value.with_traceback(tb)
  File "dbt-7eb4c5e97b7bfc0d0e7e2cab", line 2, in top-level template code
    from jinja2.runtime import LoopContext, TemplateReference, Macro, Markup, TemplateRuntimeError, missing, concat, escape, markup_join, unicode_join, to_string, identity, TemplateNotFound, Namespace
  File "/Users/darren.haken/.virtualenvs/dbt/lib/python3.7/site-packages/jinja2/sandbox.py", line 427, in call
    return __context.call(__obj, *args, **kwargs)
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/context/parser.py", line 86, in __call__
    self.source_config.update_in_model_config(opts)
  File "/Users/darren.haken/workarea/github/dbt/core/dbt/parser/source_config.py", line 102, in update_in_model_config
    current.update(value)
TypeError: 'bool' object is not iterable

It looks like the error is coming from https://github.com/fishtown-analytics/dbt/blob/61c345955ee6f6a80504d8e9abef9e7267e38b0e/core/dbt/parser/source_config.py#L97

@drewbanin
Contributor

@darrenhaken good catch! This change might be a tiny bit outside the scope of your PR, but it would be a good thing to fix for sure.

This code confirms that the config collector is a dict. We could add another assertion here which ensures that the value var is also a dict. That might look like:

                if not isinstance(current, dict) or not isinstance(value, dict):
                    dbt.exceptions.raise_compiler_error(
                        'Invalid config field: "{}" must be a dict'.format(key)
                    )

That would result in an error message like:

$ dbt run
Running with dbt=0.13.0-a2
Encountered an error:
Compilation Error
  Invalid config field: "persist_docs" must be a dict

This isn't the best error message in the world, but it's certainly better than TypeError: 'bool' object is not iterable :)

@beckjake
Contributor

beckjake commented Feb 19, 2019

Quick thought here, I agree totally on fixing this, but it should almost certainly be more like:

try:
    current.update(value)
except (ValueError, TypeError):
    dbt.exceptions.raise_compiler_error(
        'Invalid config field: "{}" must be a dict'.format(key)
    )

There are a lot of values we might end up wanting to insert there that are not dicts but are still valid and convenient for update() calls, such as iterators of key/value pairs.

edit: I do agree that the initial isinstance(config, dict) check is useful/important though. I don't want tricky types there.

@darrenhaken
Contributor Author

darrenhaken commented Feb 19, 2019

@beckjake like this?

    def update_in_model_config(self, config):
        config = self._translate_adapter_aliases(config)
        for key, value in config.items():
            if key in self.AppendListFields:
                current = self.in_model_config.get(key, [])
                if not isinstance(value, (list, tuple)):
                    value = [value]
                current.extend(value)
                self.in_model_config[key] = current
            elif key in self.ExtendDictFields:
                current = self.in_model_config.get(key, {})
                if not isinstance(current, dict):
                    dbt.exceptions.raise_compiler_error(
                        'Invalid config field: "{}" must be a dict'.format(key)
                    )
                try:
                    current.update(value)
                except (ValueError, TypeError):
                    dbt.exceptions.raise_compiler_error(
                        'Invalid config field: "{}" must be a dict'.format(key)
                    )
                self.in_model_config[key] = current
            else:  # key in self.ClobberFields or self.AdapterSpecificConfigs
                self.in_model_config[key] = value

or will the try replace the if statement?

@beckjake
Contributor

@darrenhaken Yup! Assuming I'm not missing anything and it works ok, of course

@darrenhaken
Contributor Author

@beckjake I think it makes sense to just do the try without the if

This resolves the issue that `{{ config(persist_docs=true) }}`
would not raise a useful exception.
@darrenhaken
Contributor Author

OK, I've pushed; can you take a look, @drewbanin / @beckjake? I did some testing and it seems to work.

@beckjake
Contributor

If you skip the if block, I think you'll also want to add an AttributeError to the except block, to handle the case of config=None.

@darrenhaken
Contributor Author

@beckjake so this?

try:
    current.update(value)
except (ValueError, TypeError, AttributeError):
    dbt.exceptions.raise_compiler_error(
        'Invalid config field: "{}" must be a dict'.format(key)
    )

@darrenhaken
Contributor Author

@beckjake pushed, please review when you can

@beckjake
Contributor

It looks like you'll have to update the docs generate tests. I think you'll need to change this bit:
https://github.com/fishtown-analytics/dbt/blob/dev/stephen-girard/test/integration/029_docs_generate_tests/test_docs_generate.py#L785-L794 to include 'persist_docs': {},

Otherwise, this PR looks fine to me.
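
As a sketch of the kind of change that comment describes (the surrounding keys below are illustrative, not copied from the real test file), the expected model config in test_docs_generate.py would gain an empty persist_docs entry:

    expected_config = {
        'enabled': True,          # illustrative existing keys
        'materialized': 'view',
        'persist_docs': {},       # newly expected now that persist_docs is a known config
    }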

@darrenhaken
Contributor Author

darrenhaken commented Feb 19, 2019 via email

@drewbanin
Contributor

@darrenhaken I'd recommend just updating some of the BQ test models to persist their docs. Some good candidates are:

You'll also need to provide documentation for these models in a schema.yml file to make sure the docs are persisted. As long as the models run without error, I'd be ok with merging this.
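
As an illustration of the shape of that change (the model and file names here are hypothetical, not the actual test models in the repo), a model opting into persist_docs plus a matching schema.yml entry might look like:

    -- models/my_documented_model.sql  (hypothetical example model)
    {{ config(persist_docs={"relation": true}) }}

    select 1 as id

    # models/schema.yml
    version: 2
    models:
      - name: my_documented_model
        description: "This description should end up on the BigQuery table."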

@darrenhaken
Contributor Author

darrenhaken commented Feb 28, 2019

@drewbanin I've made the changes, but I wasn't sure how to run the test suite locally. Will the CI build pick these up?

Also if you can review that would be great

@drewbanin
Contributor

@darrenhaken looks like these tests are failing in AppVeyor: https://ci.appveyor.com/project/DrewBanin/dbt/builds/22720326

I think the big problem is just that persist_docs is now being supplied as a model config, but the test suite isn't expecting it. This is a little tricky, as I think there are a bunch of different places we'll need to update the model config dicts.

Here's one example of a place where the integration test isn't expecting to see a persist_docs dict, but gets one anyway: https://github.com/fishtown-analytics/dbt/blob/dev/stephen-girard/test/integration/029_docs_generate_tests/test_docs_generate.py#L864

I think the best way to proceed is to get the test suite running on BQ for you locally. Let's chat on Slack, happy to help you get set up! This should also be a helpful resource: https://github.com/fishtown-analytics/dbt/blob/dev/stephen-girard/CONTRIBUTING.md#testing

So, this is super close! Let me know how i can help get it over the line :)

@darrenhaken
Contributor Author

@drewbanin to test locally do I need to fill in all the BigQuery fields:

BIGQUERY_TYPE=
BIGQUERY_PROJECT_ID=
BIGQUERY_PRIVATE_KEY_ID=
BIGQUERY_PRIVATE_KEY=
BIGQUERY_CLIENT_EMAIL=
BIGQUERY_CLIENT_ID=
BIGQUERY_AUTH_URI=
BIGQUERY_TOKEN_URI=
BIGQUERY_AUTH_PROVIDER_X509_CERT_URL=
BIGQUERY_CLIENT_X509_CERT_URL=

@drewbanin
Contributor

@darrenhaken yep! We should fix this, but it's a function of how we need to set up our CI tools. Circle/AppVeyor et al let you put in Environment Variables, but I don't think we can just paste in a whole Service Account JSON file (I can check on this).

Pasting in the service account file keys one-by-one is kind of annoying, but once it's done you never have to think about it again!
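
A sketch of one way to avoid the copy/paste, assuming you have the downloaded key file locally (the service-account.json path and the idea of sourcing the generated export lines are hypothetical, not part of dbt's documented setup; the JSON field names are the standard GCP service account key fields):

    import json
    import shlex

    # Map the GCP service account key file onto the BIGQUERY_* variables
    # listed above, printing shell `export` lines you can eval or redirect
    # into an env file.
    with open("service-account.json") as f:  # hypothetical path to your key file
        creds = json.load(f)

    fields = [
        "type", "project_id", "private_key_id", "private_key",
        "client_email", "client_id", "auth_uri", "token_uri",
        "auth_provider_x509_cert_url", "client_x509_cert_url",
    ]

    for field in fields:
        # e.g. project_id -> export BIGQUERY_PROJECT_ID='...'
        print("export BIGQUERY_{}={}".format(field.upper(), shlex.quote(creds[field])))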

@darrenhaken
Contributor Author

@drewbanin Alright I'm giving it a shot now, I'll keep you posted 👍

@drewbanin changed the base branch from dev/stephen-girard to dev/wilt-chamberlain on March 27, 2019 at 14:31
@drewbanin
Contributor

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@brandfocus

Can I do anything to help get this merged?

@darrenhaken
Contributor Author

@drewbanin?

@drewbanin
Contributor

drewbanin commented Jun 12, 2019

re-opened in #1532 to bring this up-to-date with recent dbt changes + test fixes

@beckjake merged commit 8270ef7 into dbt-labs:dev/wilt-chamberlain on Jun 12, 2019