Output run hooks (#696) #1440

beckjake · 2019-05-03T15:08:33Z

Fixes #696

Output information about the on-run-start and on-run-end hooks before and after running dbt.

One wrinkle, the output isn't exactly as described in the issue - it's a bit tricky to move that Concurrency line before hooks run. Instead you get this order.

Running with dbt=0.13.0
Found 163 macros, 0 analyses, 0 archives, 0 sources, 0 seed files, 0 tests, 2 operations, 1 models

07:58:48 |
07:58:48 | Running 1 on-run-start hook
07:58:48 | 1 of 1 START select * from information_schema.schemata
07:58:48 |
07:58:48 | Concurrency: 2 threads (target='default')
07:58:48 |
07:58:48 | 1 of 1 START view model dbt_postgres.x............................... [RUN]
07:58:48 | 1 of 1 OK created view model dbt_postgres.x.......................... [CREATE VIEW in 0.08s]
07:58:48 |
07:58:48 | Running 1 on-run-end hook
07:58:48 | 1 of 1 START select * from information_schema.schemata
07:58:48 |
07:58:48 |
07:58:48 | Finished running 2 hooks, 1 view models in 0.45s.

Completed successfully

Done. PASS=1 ERROR=0 SKIP=0 TOTAL=1

We definitely can move that Concurrency line up, and the output would probably look a bit nicer, but it involves moving that into the task level instead of the node runner level and rearranging some things.

drewbanin

Some comments about the hook output, let me know if you'd like to discuss!

I'm actually ok with the Concurrency .... line where it is -- the on-run-start|end hooks are run serially, so it actually doesn't make a ton of sense to show the Concurrency line before that.

Do we even need this line anymore? I think we added it when concurrency was new to dbt, and we wanted to make sure everyone knew about it :)

Let me think a little harder about what, if anything, we should be showing here... Maybe it's just Running 7 models or something like that instead?

drewbanin · 2019-05-06T20:33:19Z

core/dbt/task/run.py

 compiled = compile_node(adapter, self.config, hook,
 self.manifest, extra_context)
 statement = compiled.wrapped_sql
+ dbt.ui.printer.print_timestamped_line(
+ '{} of {} START {}'.format(idx, len(ordered_hooks), statement)


Can we format this like all of the other timestamped lines that dbt prints?

07:58:48 | 1 of 2 START hook: grant usage on schema......... [RUN] 07:58:48 | 1 of 2 OK hook: grant usage on schema......... [GRANT in 0.4s] 07:58:48 | 2 of 2 START hook: grant select on ......... [RUN] 07:58:48 | 2 of 2 OK hook: grant select on......... [GRANT in 2.8s]

I think we'll also want to reshape/truncate the hook SQL. It's not uncommon to do operational things (like create UDFs, or insert into an audit table in these hooks, and they currently take over the stdout!

I imagine we can do something like (pseudocode):

hook_text = hook.replace("\n", " ")[0:60] + "..."

This doesn't need to be the exact same SQL that dbt is actually running, it just needs to be a stand-in for a name! Maybe some day it will make sense to name hooks, but today is not that day :)

Let's also include a line for when the hooks completes, and a status if it's easy to grab. I think a lot of folks don't actually have insight into how long their hooks currently take, and I be it will be helpful to expose this info to them!

Here's an example hook for testing:

on-run-start: - "{{ load_udfs() }}"

-- macros/udfs.sql (imagine this did something useful....) CREATE OR REPLACE FUNCTION no_op() RETURNS void AS $$ $$ LANGUAGE SQL;

We can also be pretty aggressive with the truncation here -- the point is just to show that hooks are running, and to indicate how long they have taken. If folks want to dig further into that, they can find the full queries in the logs!

Truncate long run-hooks, replacing newlines with spaces Include run-hook status/timing in output Clean up run-hook execution a bit

beckjake · 2019-05-07T15:54:12Z

Do we even need this line anymore?

I find it useful sometimes, though I don't know if it needs to go to stdout.

drewbanin · 2019-05-09T15:00:47Z

I'm unsure why this happens, but it looks like the statuses in stdout aren't colored correctly (this is on-run-start):

I'd expect the SUCCESS 1 strings to be colored green here. Basically if an exception occurs, this should be red, otherwise it should be green.

drewbanin · 2019-05-09T15:40:02Z

In actually seeing this, I think the output is just too verbose to be really useful here. I knew it would be a lot of text, but I'm not thrilled with the way this actually looks. It's the -- Start a transaction. This is required to per... in particular line that's unappealing: If every hook was just grant usage on schema dbt_dbanin to role reporting then I'd be happy, but that's certainly not what 100% of hooks look like these days.

Let's plan to "name" hooks at some point in the future. That might look like:

on-run-end:
  - {name: "reporting grant", sql: "grant usage on schema {{ target.schema }} to role reporter}

For now though, I think it would be a good idea to just output the package, hook type, and hook index. So:

10:57:53 | Running 2 on-run-end hooks
10:57:53 | 1 of 2 START fishtown_internal_analytics.on_run_start.1.......................... [RUN]
10:57:53 | 1 of 2 OK fishtown_internal_analytics.on_run_start.1............................. [GRANT in 0.00s]
10:57:53 | 2 of 2 START fishtown_internal_analytics.on_run_start.2.......................... [RUN]
10:57:53 | 2 of 2 OK fishtown_internal_analytics.on_run_start.2............................. [GRANT in 0.00s]

The actual formatting of fishtown_internal_analytics.on_run_start.2 isn't so important -- if it's easier to do fishtown_internal_analytics.on_run_start-2 or something, that works just as well!

beckjake · 2019-05-09T16:46:45Z

I'd expect the SUCCESS 1 strings to be colored green here. Basically if an exception occurs, this should be red, otherwise it should be green.

I'm mostly amenable to this, but worth pointing out that it will never be red - failing a hook results in the run failing entirely and halting.

$ cat dbt_project.yml
name: 'minimal'
version: '1.0'
profile: 'local'

on-run-start:
 - 'select id from does_not_exist'
$ dbt run
Running with dbt=0.13.0
Found 1 models, 0 tests, 0 archives, 0 analyses, 101 macros, 1 operations, 0 seed files, 0 sources

10:45:49 |
10:45:49 | Running 1 on-run-start hook
10:45:49 | 1 of 1 START hook: select id from does_not_exist..................... [RUN]
Database error while running on-run-start
Encountered an error:
Database Error
  relation "does_not_exist" does not exist
  LINE 1: select id from does_not_exist
                         ^

drewbanin · 2019-05-09T17:13:49Z

I'm mostly amenable to this, but worth pointing out that it will never be red - failing a hook results in the run failing entirely and halting.

Hah, I was going to mention that, but wasn't sure if it was still the case. We should probably do something different there, but that can happen in a separate PR! If nothing else, this PR is a step in the right direction... users will be able to see which hook failed in the stdout now :)

core/dbt/task/run.py

Co-Authored-By: Drew Banin <[email protected]>

Order hooks in a deterministic way: - in the root project, hooks in index order - foreach dependency project in alphabetical order, hooks in index order Since these hooks always have a valid "index", no need to devise a default Add some tests

drewbanin

This was a little different than the original spec, but I'm super happy with where we ended up! Thanks for your help with iterating on this -- ship it!

Jacob Beck added 2 commits May 3, 2019 07:59

suppress some cache logger noise

c190858

add on-run-start/on-run-end hook logging

80482aa

drewbanin self-requested a review May 6, 2019 20:40

drewbanin requested changes May 6, 2019

View reviewed changes

PR feedback

9507669

Truncate long run-hooks, replacing newlines with spaces Include run-hook status/timing in output Clean up run-hook execution a bit

beckjake requested a review from drewbanin May 7, 2019 15:56

add color to hooks, give hooks names

ec8277b

drewbanin reviewed May 9, 2019

View reviewed changes

core/dbt/task/run.py Show resolved Hide resolved

core/dbt/task/run.py Outdated Show resolved Hide resolved

PR feedback

5d05bf0

beckjake requested a review from drewbanin May 9, 2019 19:20

drewbanin reviewed May 9, 2019

View reviewed changes

core/dbt/task/run.py Outdated Show resolved Hide resolved

beckjake and others added 2 commits May 9, 2019 14:30

Update core/dbt/task/run.py

58bf73d

Co-Authored-By: Drew Banin <[email protected]>

PR feedback

f6bf8d9

Order hooks in a deterministic way: - in the root project, hooks in index order - foreach dependency project in alphabetical order, hooks in index order Since these hooks always have a valid "index", no need to devise a default Add some tests

drewbanin approved these changes May 10, 2019

View reviewed changes

beckjake merged commit 210cf43 into dev/wilt-chamberlain May 10, 2019

beckjake deleted the feature/output-run-hooks branch May 10, 2019 03:06

jtcohen6 mentioned this pull request Feb 1, 2021

Handle errors more gracefully in on-run-start/end context #1266

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Output run hooks (#696) #1440

Output run hooks (#696) #1440

beckjake commented May 3, 2019

drewbanin left a comment

drewbanin May 6, 2019

drewbanin May 6, 2019

beckjake commented May 7, 2019 •

edited

Loading

drewbanin commented May 9, 2019 •

edited

Loading

drewbanin commented May 9, 2019

beckjake commented May 9, 2019

drewbanin commented May 9, 2019

drewbanin left a comment

Output run hooks (#696) #1440

Output run hooks (#696) #1440

Conversation

beckjake commented May 3, 2019

drewbanin left a comment

Choose a reason for hiding this comment

drewbanin May 6, 2019

Choose a reason for hiding this comment

drewbanin May 6, 2019

Choose a reason for hiding this comment

beckjake commented May 7, 2019 • edited Loading

drewbanin commented May 9, 2019 • edited Loading

drewbanin commented May 9, 2019

beckjake commented May 9, 2019

drewbanin commented May 9, 2019

drewbanin left a comment

Choose a reason for hiding this comment

beckjake commented May 7, 2019 •

edited

Loading

drewbanin commented May 9, 2019 •

edited

Loading