Use test materialization when executing ✨generic✨ tests #3192

kwigley · 2021-03-24T13:26:31Z

Current Behavior

Generic (aka schema) tests are currently executed using the Python API directly:
https://github.com/fishtown-analytics/dbt/blob/77c10713a325d2bee91d1822951ce5d91ccc3278/core/dbt/task/test.py#L84-L85

Desired Behavior

Schema tests should be executed and handled by the same test materialization used for data tests. The goal here is to maintain the existing behavior for handling results (simply interpreting the number returned by count(*) for the test).

Things to consider

Returning results for most schema tests is straightforward: we can simply remove count(*) and return the rows of the query that fulfills the test, and the materialization will take care of the rest. It is less straightforward for tests that are simple pass/fail and do not necessarily return rows.

Example:
https://github.com/fishtown-analytics/dbt/blob/749f87397ec1e0a270b2e09bd8dbeb71862fdb81/test/integration/005_simple_seed_test/macros/schema_test.sql

We will have to create a way to interpret different types of results as part of this.
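To make the difference concrete, here is a minimal sketch of the two result shapes. It uses Python's built-in sqlite3 rather than dbt itself; the `orders` table and the `count_failures` helper are hypothetical, and the wrapping `select count(*) from (...)` is only an assumption about how a materialization might count rows, not the actual dbt implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table orders (id integer, customer_id integer)")
conn.executemany(
    "insert into orders values (?, ?)",
    [(1, 10), (2, None), (3, 11), (4, None)],
)

# Old style: the test query itself returns a single count.
old_style_sql = "select count(*) from orders where customer_id is null"

# New style: the test query returns the failing rows themselves.
new_style_sql = "select * from orders where customer_id is null"


def count_failures(test_sql: str) -> int:
    """Wrap a row-returning test query and count its rows (hypothetical helper)."""
    wrapped = f"select count(*) from ({test_sql}) as test_results"
    return conn.execute(wrapped).fetchone()[0]


old_failures = conn.execute(old_style_sql).fetchone()[0]
new_failures = count_failures(new_style_sql)
failing_rows = conn.execute(new_style_sql).fetchall()
```

Both styles arrive at the same failure count here, but the new style also keeps the failing rows available for inspection. A test that only yields a pass/fail value has no such rows to return, which is the case that still needs its own interpretation.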


ezequielberto · 2021-07-22T15:20:25Z

Hello!
I think one of the issues with this change arises when a test is defined with a threshold. For example, I had the following custom test macro before the update:

{% macro test_relationships(model, to, field) %}

{% set column_name = kwargs.get('column_name', kwargs.get('from')) %}
{% set tolerate_errors = kwargs.get('tolerate_errors', 0) %}

with validation_errors as (
    select
        left_.id as left_id,
        right_.id as right_id
    from (select {{ column_name }} as id from {{ model }}) as left_
        left join (select {{ field }} as id from {{ to }}) as right_
            on left_.id = right_.id
    where left_.id is not null
        and right_.id is null
),
count_errors as (
    select count(*) as n_errors
    from validation_errors
)
select
    case when n_errors <= {{ tolerate_errors }}
      then 0
      else n_errors
    end as result
from count_errors

{% endmacro %}

I want it to fail if I have more than n errors, and to know how many errors there were. It was easy before, since we retrieved the value of count(*).

We have several tests written this way, so we'd actually rather stay with the old testing mode. Is there a way to choose the testing method somewhere (e.g. in dbt_project.yml)?

jtcohen6 · 2021-07-22T20:53:24Z

@ezequielberto I'm hopeful that a migration to the new testing mechanisms in v0.20 is smoother than it may first appear. In fact, I think you have some options.

Leverage new error_if + warn_if configs: The behavior you're after with tolerate_errors is now available via built-in test configs. This could be as simple as:

with validation_errors as (
    select
        left_.id as left_id,
        right_.id as right_id
    from (select {{ column_name }} as id from {{ model }}) as left_
        left join (select {{ field }} as id from {{ to }}) as right_
            on left_.id = right_.id
    where left_.id is not null
        and right_.id is null
)
select * from validation_errors

Then replace tolerate_errors with error_if and/or warn_if in the test definition:

relationships:
  to: ...
  field: ...
  error_if: ">10"
  warn_if: ">50"
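As a rough illustration of how such thresholds could turn a failure count into a status, here is a small Python sketch. It assumes conditions are written as comparison strings such as `>10`, and that the default condition is `!= 0`; the function names `condition_met` and `classify_result` are hypothetical, and the sketch ignores the separate severity config, so treat it as a simplification rather than dbt's actual logic.

```python
import operator
import re

# Comparison operators accepted in a condition string such as ">10" or "!= 0".
_OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    "!=": operator.ne,
}


def condition_met(failures: int, condition: str) -> bool:
    """Evaluate a condition like '>10' against a failure count."""
    match = re.fullmatch(r"\s*(>=|<=|!=|>|<|=)\s*(\d+)\s*", condition)
    if match is None:
        raise ValueError(f"unsupported condition: {condition!r}")
    op, threshold = match.groups()
    return _OPS[op](failures, int(threshold))


def classify_result(failures: int, error_if: str = "!= 0", warn_if: str = "!= 0") -> str:
    """Check the error condition first, then the warn condition."""
    if condition_met(failures, error_if):
        return "error"
    if condition_met(failures, warn_if):
        return "warn"
    return "pass"
```

With `error_if=">10"` and `warn_if=">3"`, a count of 5 would warn and a count of 11 would error. Because the error condition is checked first in this sketch, a warn threshold only has a visible effect when it is lower than the error threshold.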

Light rewrite just to get this working: The test defined above is perfectly workable up until the final select, where it returns either 0 or n_errors from a case when statement. Instead, you could move the case when comparison to a where filter:

select * from count_errors
where n_errors > {{ tolerate_errors }}
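To check that the where filter behaves as intended, here is a minimal sketch that runs the same final select against an in-memory SQLite database. The `rows_returned` helper and the one-column `validation_errors` table are hypothetical stand-ins for the CTE in the macro, and the Jinja variable is replaced by a plain Python integer.

```python
import sqlite3


def rows_returned(n_errors: int, tolerate_errors: int) -> list:
    """Run the filtered select and return whatever rows come back."""
    conn = sqlite3.connect(":memory:")
    # Stand-in for the validation_errors CTE: one row per validation error.
    conn.execute("create table validation_errors (left_id integer)")
    conn.executemany(
        "insert into validation_errors values (?)",
        [(i,) for i in range(n_errors)],
    )
    sql = f"""
        with count_errors as (
            select count(*) as n_errors
            from validation_errors
        )
        select * from count_errors
        where n_errors > {int(tolerate_errors)}
    """
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows
```

At or below the tolerance the query returns no rows, so a row-counting test would pass. Above it, the query returns a single row whose n_errors column holds the error count.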

kwigley added enhancement New feature or request dbt tests Issues related to built-in dbt testing functionality labels Mar 24, 2021

kwigley changed the title ~~Use test materialization when executing schema tests~~ Use test materialization when executing ✨generic✨ tests Mar 24, 2021

jtcohen6 mentioned this issue Mar 29, 2021

[Q1C2] More consistent, configurable tests #3066

Closed

kwigley mentioned this issue Apr 7, 2021

return results instead of a count for tests #3232

Closed

kwigley self-assigned this Apr 8, 2021

jtcohen6 mentioned this issue Apr 15, 2021

Show unexpected values when accepted_values test fails #3265

Closed

kwigley mentioned this issue Apr 22, 2021

use test materialization for schema/generic tests #3286

Merged

4 tasks

kwigley closed this as completed in #3286 Apr 27, 2021
