DBT v0.18 introduced data processed. Can DBT_ML do this too? #10

switzer · 2020-10-09T18:12:18Z

Now in DBT 0.18, when executing dbt run, you get a message stating the amount of data processed, as follows:

09:27:32 | 4 of 19 OK created incremental model dbt_prod.my_table [MERGE (16.6m rows, 354.1 GB processed) in 48.58s]

When running a DBT_ML model, the message is similar to the following:

14:01:16 | 1 of 1 OK created model model dbt_prod.mdl_my_model................... [OK in 496.66s]

Can you add the amount of data processed as well, as is done in DBT?

rbjerrum · 2020-10-12T14:28:47Z

Thank you for opening this issue @switzer! Unfortunately the bytes processed is retrieved from the google.cloud.bigquery.QueryJob object and is not available in the Jinja-context that packages have access to. In order to have this functionality, the BigQuery adapter would need to have a condition on the CREATE_MODEL-statement type when handling the response from BigQuery like it is done for other statement types.

@jtcohen6 What do you think about making a change like this to the BigQuery adapter?
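To make the idea concrete, here is a minimal sketch of the kind of condition described above. It is not the dbt-bigquery implementation. `FakeQueryJob`, `format_bytes`, and `build_status` are hypothetical names invented for illustration, and the stand-in class only mimics the `statement_type`, `total_bytes_processed`, and `num_dml_affected_rows` attributes of `google.cloud.bigquery.QueryJob`, so the example runs without BigQuery credentials.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeQueryJob:
    """Hypothetical stand-in for google.cloud.bigquery.QueryJob (illustration only)."""
    statement_type: str
    total_bytes_processed: Optional[int] = None
    num_dml_affected_rows: Optional[int] = None


def format_bytes(num_bytes: int) -> str:
    """Render a byte count using decimal units (an assumption of this sketch)."""
    value = float(num_bytes)
    for unit in ("Bytes", "KB", "MB", "GB"):
        if value < 1000:
            return f"{value:.1f} {unit}"
        value /= 1000
    return f"{value:.1f} TB"


def build_status(job: FakeQueryJob) -> str:
    """Build a status string, adding a branch for CREATE_MODEL statements."""
    if job.total_bytes_processed is None:
        # Nothing to report, fall back to the plain status.
        return "OK"
    processed = f"{format_bytes(job.total_bytes_processed)} processed"
    if job.statement_type == "CREATE_MODEL":
        # The branch this issue asks for.
        return f"CREATE MODEL ({processed})"
    if job.statement_type == "MERGE":
        return f"MERGE ({job.num_dml_affected_rows} rows, {processed})"
    return f"OK ({processed})"
```

With a branch like this in the adapter, a model-training job would report its bytes processed the same way a `MERGE` does, instead of falling through to a bare `OK`.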

rbjerrum · 2020-10-19T07:07:40Z

Related to dbt-labs/dbt-core#2747, currently planned for dbt v0.19.

jtcohen6 · 2020-10-19T13:03:26Z

> the BigQuery adapter would need to have a condition on the CREATE_MODEL-statement type when handling the response from BigQuery like it is done for other statement types.

@rbjerrum I'd welcome this change to the dbt-bigquery plugin!

As you noted, in v0.19 we're seeking to implement a more generalizable solution for storing adapter-specific structured data in run_results.json. Right now, bytes processed is just stored as part of the status string. We still need to figure out the exact mechanism through which that information will be shared both to the artifact and CLI output.
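While bytes processed lives only inside the status string, anything that wants the number has to parse text. The sketch below shows one way that could look, using the log lines from this thread as input. The function name `bytes_processed`, the regular expression, and the unit table are assumptions made for illustration and are not part of dbt.

```python
import re
from typing import Optional

# Multipliers for decimal and binary units (an assumption of this sketch).
_UNITS = {
    "Bytes": 1,
    "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12,
    "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40,
}

# Matches fragments such as "354.1 GB processed" or "2.2 GiB processed".
_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(Bytes|[KMGT]i?B) processed")


def bytes_processed(status: str) -> Optional[int]:
    """Return the byte count embedded in a status string, or None if absent."""
    match = _PATTERN.search(status)
    if match is None:
        return None
    number, unit = match.groups()
    return round(float(number) * _UNITS[unit])


print(bytes_processed("MERGE (16.6m rows, 354.1 GB processed)"))
print(bytes_processed("OK"))
```

Parsing like this is brittle, because the value is rounded for display and the wording can change between versions. That is the practical argument for storing the figure as structured data in run_results.json.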

switzer · 2024-09-08T10:54:32Z

Note - as you say, this has changed in the latest version of DBT. My model build process now responds as follows:

1 of 1 OK created sql model model dbt_dev_ml.my_ml_model .... [None (2.2 GiB processed) in 160.37s]

Closing this issue.

rbjerrum added the enhancement New feature or request label Oct 12, 2020

rbjerrum added question Further information is requested and removed enhancement New feature or request labels Oct 12, 2020

switzer closed this as completed Sep 8, 2024
