Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Model intersection syntax #2417

Conversation

Raalsky
Copy link
Contributor

@Raalsky Raalsky commented May 7, 2020

resolves #2167

Description

Extends nodes selection syntax with the intersection operator (,). In ex. the intersection of multiple tags: --models tag:abc,tag:def,tag:ghi which stands for every model which has tags abc and def and ghi.

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

@cla-bot cla-bot bot added the cla:yes label May 7, 2020
@drewbanin
Copy link
Contributor

I think these unit tests are a great place to start! Let us know if there's anything we can help out with @Raalsky :D

@Raalsky
Copy link
Contributor Author

Raalsky commented May 8, 2020

@drewbanin I was thinking about which syntax do we prefer: --models <spec>,<spec> vs. --models [<spec>,<spec>] suggested by @clrcrl and what about corner cases like --models <spec>, or --models ,<spec>. Do we need to specify expected behavior or just leave it as undefined behavior? 🤔

@drewbanin
Copy link
Contributor

drewbanin commented May 8, 2020

I think we'll want to go with --models <spec>,<spec> for now. I can definitely imagine adding parens or square brackets in the future, but we would probably need to build a proper parser for these selectors, which i don't think we need to do right now. I imagine that if we choose to add grouping with parens or brackets in the future, that would fit right into this syntax :)

I do think it would be a good idea to validate these strings. I'd say:

  • selectors may be joined with an intersection operator (,)
  • the intersection operator is a binary operator, and it must have valid selectors on the LHS and RHS
  • intersection operators may be chained, and they should be left-associative (but it doesn't actually matter for intersections!)

For the last point, left-associativity, this means:

--models abc,def,ghi

This should be computed as:

(abc ⋂ def) ⋂ ghi

The existing union operator (a space) should have lower precedence that the intersection operator, so:

--models abc,def ghi

This should compute:

(abc ⋂ def) ⋃ ghi

Do you buy all of that? And do you think I missed anything?

@Raalsky
Copy link
Contributor Author

Raalsky commented May 8, 2020

Thanks, a lot! @drewbanin

(abc ⋂ def) | ghi

Do you mean:
(abc ⋂ def) ⋂ ghi
?

I think that's everything I should know for now. Thanks

@drewbanin
Copy link
Contributor

Do you mean (abc ⋂ def) ⋂ ghi

I sure do! I was using | for intersection, but i thought i'd be fancy and look up the symbol instead :)

Just updated my comment above for clarity!

@drewbanin
Copy link
Contributor

hey @Raalsky - this PR is looking really slick! Just a heads up that we've entered the Release Candidate period for 0.17.0, so I unfortunately don't think we'll be able to get this in for that release. We can definitely get it merged for our next feature release, dev/marian-anderson though!

Let us know when this is ready for review :)

@Raalsky
Copy link
Contributor Author

Raalsky commented May 13, 2020

@drewbanin Yeah, I got notification from pypi 😜 . That's okay. I just want to rebase onto dev/octavius-catto before dev/marian-anderson.

@Raalsky Raalsky changed the base branch from dev/octavius-catto to dev/marian-anderson May 13, 2020 15:48
@Raalsky Raalsky marked this pull request as ready for review May 13, 2020 15:53
@Raalsky
Copy link
Contributor Author

Raalsky commented May 13, 2020

Integration tests seem to fail due to hooks tests? I don't know the exact reason.

I've also added a few of (unit/integration) tests for concatenation (dbt run --models <spec1> <spec2>) because I couldn't find any suitable.

I should apply some refactors to selector but let me know what you think.

I've switched to Ready for review.

@drewbanin
Copy link
Contributor

@beckjake do you have any leads on what's up with this test failure?

@beckjake
Copy link
Contributor

beckjake commented May 14, 2020

Huh. So, this is one of those "I don't know how this code ever worked" situations... I think it's because of this:



class TestPrePostModelHooksInConfig(BaseTestPrePost):
    @property
    def project_config(self):
        return {
            'macro-paths': ['macros'],
        }

Which should have a 'config-version': 2. But why was this passing before? And what made it fail now?

@Raalsky
Copy link
Contributor Author

Raalsky commented May 15, 2020

@beckjake Switch to 'config-version': 2 seems like a solution 😑

@beckjake
Copy link
Contributor

That test passed with no errors on linux and I even checked the log. There are no warnings about the wrong config-version. But running it locally and alone, it fails with that message. Upon inspection, it's obvious that it should fail with the message about the wrong version - it's running in strict mode but the result of the clone will be a project without a config-version: 2.

I think what's going on here is that we should be calling dbt.deprecations.reset_deprecations() in every test setUp or tearDown(). We currently aren't outside the explicit deprecations test, so as long as at least one non---strict job runs on the thread and the deprecation is triggered.

I just can't wait to see what tests have been silently and incorrectly passing once I flip this flag.

@beckjake
Copy link
Contributor

@Raalsky I've merged that fix in to the dev/marian-anderson branch. If you pull that in, it should fix the test failures (you might have a conflict since you also changed the same test)

@Raalsky Raalsky force-pushed the feature/model-intersection-syntax branch from cf10881 to 069cecb Compare May 15, 2020 21:27
@Raalsky Raalsky force-pushed the feature/model-intersection-syntax branch from c8d8951 to bf12001 Compare May 15, 2020 22:08
Copy link
Contributor

@beckjake beckjake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great, let's get this merged for 0.18!

@beckjake beckjake merged commit 4e2ec6b into dbt-labs:dev/marian-anderson May 18, 2020
@Raalsky Raalsky deleted the feature/model-intersection-syntax branch May 18, 2020 14:22
@drewbanin
Copy link
Contributor

Thanks for your hard work on this one @Raalsky - this is an awesome feature!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Model intersection syntax
3 participants