resolves #2172
Description
This PR implements the advanced node selection syntax, mostly as described in #2172. I've made some changes to accommodate yaml syntax things, but I tried to keep with the spirit of the thing.
Here's a selectors.yml example file (taken from tests):
It defines a selector that is all models materialized as views or tagged `foo`, except without any models tagged `bar`.
I didn't use hologram for parsing here - hologram (well, actually Python's type system) doesn't support recursive type definitions. Hologram also does a poor job of handling ambiguous specifications like this one: it's very hard to get hologram to support both the semi-arbitrary `{'method': 'tag', 'value': 'foo'}` spec and the fully-arbitrary-but-one-key `{'tag': 'foo'}` spec.
Selector definitions
There are 3 ways to define a simple selector:
The first form, a key-value pair such as `tag: foo`, parses to a list entry containing a dict like `{'tag': 'foo'}`. It's converted, but you can't use modifiers like `@` or `+` here. We could add support for modifiers by examining the keys and values and taking any prefixes/suffixes, or more likely doing `':'.join([key, value])` and passing that in to the string parsing logic. The more I think about this, the more I think it would be good, if only for consistency with the next form.
We could also add support for `exclude` in this syntax if we wanted, though I think it makes the subtle distinction between `tag: foo` and `tag:foo` much more confusing.
The second form, a plain string, parses to a string entry like `'tag:foo'`, which is then parsed like CLI arguments are. You can use modifiers like `@` or `+` here, though yaml will want you to quote them (`- "@tag:foo"`).
The third form, a full dictionary, parses to a dictionary entry like `{'method': 'tag', 'value': 'foo', 'childrens_parents': True}`. This is what the string form is converted into, and then it goes down the same conversion route.
Combinations
Internally, this code still uses the same basic ideas introduced in previous PRs: you can combine values as unions, differences, and intersections of sets of selector definitions. The `union` and `intersection` combinations are themselves selector definitions, and can be used anywhere. Set differences are discussed below, but basically `exclude` can exist anywhere a selector definition could, or within a simple selector definition (which makes it... not so simple).
Only the third form of simple selector definition can be used with `exclude`. For example, to do what `dbt run --exclude @tag:foo` does:
Of course, you can define a one-element union with exclusions if you prefer that syntax:
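To make the pieces so far concrete, here is a minimal Python sketch of how the three simple forms and the `union`/`intersection`/`exclude` combinations might resolve to sets of model names. It is not the code in this PR: `MODELS`, `normalize`, `select`, and `_excluded` are hypothetical names, the project data is made up, only the `tag` method and `fqn:*` are handled, and graph modifiers such as `@` are recorded by `normalize` but not expanded by `select`.

```python
from typing import Any, Dict, Set, Union

# Made-up project: model name -> tags. Not taken from the PR.
MODELS: Dict[str, Set[str]] = {
    "orders": {"foo"},
    "customers": {"foo", "bar"},
    "payments": set(),
}

Definition = Union[str, Dict[str, Any]]


def normalize(defn: Definition) -> Dict[str, Any]:
    """Turn any of the three simple forms into the method/value dict form."""
    if isinstance(defn, str):
        # String form, e.g. "@tag:foo": note the modifier, then split once.
        childrens_parents = defn.startswith("@")
        method, _, value = defn.lstrip("@").partition(":")
        return {"method": method, "value": value,
                "childrens_parents": childrens_parents}
    if "method" in defn and "value" in defn:
        # Third form: already spelled out (and may carry "exclude").
        return dict(defn)
    # First form: a one-key mapping such as {"tag": "foo"}.
    (method, value), = defn.items()
    return {"method": method, "value": value}


def select(defn: Definition) -> Set[str]:
    """Resolve a definition to model names (graph modifiers are ignored)."""
    if isinstance(defn, dict) and ("union" in defn or "intersection" in defn):
        key = "union" if "union" in defn else "intersection"
        include, exclude = [], []
        for part in defn[key]:
            # An {"exclude": [...]} element contributes exclusions, not members.
            if isinstance(part, dict) and set(part) == {"exclude"}:
                exclude.extend(part["exclude"])
            else:
                include.append(select(part))
        combine = set.union if key == "union" else set.intersection
        selected = combine(*include) if include else set()
        return selected - _excluded(exclude)
    spec = normalize(defn)
    if spec["method"] == "tag":
        selected = {m for m, tags in MODELS.items() if spec["value"] in tags}
    elif spec["method"] == "fqn" and spec["value"] == "*":
        selected = set(MODELS)
    else:
        raise ValueError(f"unsupported in this sketch: {spec}")
    return selected - _excluded(spec.get("exclude", []))


def _excluded(defs) -> Set[str]:
    # Exclusions are unioned together before being subtracted.
    return set().union(*(select(d) for d in defs))


# A one-element union with an exclusion as one of its elements.
print(sorted(select({"union": ["tag:foo", {"exclude": [{"tag": "bar"}]}]})))
# The third simple form carrying its own exclude list.
print(sorted(select({"method": "fqn", "value": "*", "exclude": ["tag:foo"]})))
```

Routing every form through one dictionary shape keeps the combination logic independent of how the user happened to write the definition.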
Set differences
The `exclude` key is the only way to specify set differences. It accepts a list of definitions that are then unioned together. I could definitely be convinced that that's wrong and the value should instead be just a definition. My reasoning rests on the assertion that (at least in Python!) `difference(a, union(b, c, d))` is the same as `difference(a, b, c, d)`, which I feel reasonably confident about.
I think it'd be reasonable to add a `difference` key that acts as an explicit set difference. It would be its own value, as opposed to `exclude`, which I think of as modifying its parents: `{method: 'fqn', value: '*', exclude: ['@tag:foo']}` actually becomes `difference("fqn:*", "@tag:foo")`.
Exclude syntax
The syntax isn't immensely satisfying to me, especially around `exclude`. Currently, a union with exclusions has `exclude:` as one of its elements. Does it make more sense for `exclude:` to live on the same level as `union`?:
I don't feel like this is really better at all (I think the lack of indentation makes it hard to parse mentally at a glance), but I don't have great taste on this kind of thing.
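Wherever `exclude` ends up living, its list-of-definitions semantics rest on the identity from the "Set differences" section. That identity can be checked directly with Python's built-in `set` type. The helper name `apply_exclude` and the model names below are hypothetical and only serve as an illustration.

```python
from typing import Iterable, Set


def apply_exclude(selected: Set[str], exclusions: Iterable[Set[str]]) -> Set[str]:
    """Subtract every exclusion set; same result as subtracting their union."""
    return selected.difference(*exclusions)


a = {"m1", "m2", "m3", "m4"}
b, c, d = {"m1"}, {"m2"}, {"m9"}

nested = a.difference(set().union(b, c, d))  # difference(a, union(b, c, d))
flat = a.difference(b, c, d)                 # difference(a, b, c, d)

assert nested == flat == apply_exclude(a, [b, c, d])
print(sorted(flat))
```

Because the two spellings agree, treating the `exclude` list as an implicit union does not change which models are removed.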
Checklist
I have updated `CHANGELOG.md` and added information about my change to the "dbt next" section.