Split up adapters to encourage adapter-specific installation #3361

Closed
jtcohen6 opened this issue May 16, 2021 · 8 comments
Labels
1.0.0 Issues related to the 1.0.0 release of dbt enhancement New feature or request Epic install

Comments

@jtcohen6
Contributor

Today, pip install dbt and brew install dbt each install four Python packages and their dependencies, including dbt-core:

  • dbt-postgres
  • dbt-redshift
  • dbt-snowflake
  • dbt-bigquery

In our docs, we now encourage users to install the specific adapter plugin they plan to use whenever possible, to avoid possible installation issues caused by another adapter plugin that they're not even using, but which is included among the four above.

That recommendation is backed up by each of those individual plugins being available on PyPI today. To further ease the path, we should consider creating and maintaining:

  • Separate Homebrew installations for specific plugins
  • Separate Docker images for specific plugins

At the same time, we should reinvest in a "just install dbt" experience, especially for the benefit of newer users who want to take it for a spin before knowing the exact adapter they plan to use. In my mind, the ideal would be an interactive experience that asks users which adapters they want to install (perhaps recommending Postgres by default), and which offers detailed tips for getting started, similar to the current dbt init command. We would then repurpose pip install dbt / brew install dbt to offer that experience, in place of the bundled installation that exists today.
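
To make the interactive idea above a bit more concrete, here's a minimal Python sketch of the selection step. It is not how dbt implements anything: the function names (`resolve_adapters`, `install_commands`) and the comma-separated answer format are assumptions made up for illustration. The adapter list is just the four plugins named in this issue, with Postgres as the suggested default.

```python
# Hypothetical sketch of an "ask which adapters to install" step.
# Nothing here is part of dbt; names and behavior are illustrative assumptions.

KNOWN_ADAPTERS = ["postgres", "redshift", "snowflake", "bigquery"]
DEFAULT_ADAPTER = "postgres"  # the suggested default from the proposal above


def resolve_adapters(answer: str) -> list[str]:
    """Turn a comma-separated answer into a validated list of adapter names.

    An empty answer falls back to the default adapter.
    """
    names = [part.strip().lower() for part in answer.split(",") if part.strip()]
    if not names:
        return [DEFAULT_ADAPTER]
    unknown = [name for name in names if name not in KNOWN_ADAPTERS]
    if unknown:
        raise ValueError(f"Unknown adapter(s): {', '.join(unknown)}")
    # Drop duplicates while keeping the order the user typed.
    return list(dict.fromkeys(names))


def install_commands(adapters: list[str]) -> list[str]:
    """Build one `pip install dbt-<adapter>` command per selected adapter."""
    return [f"pip install dbt-{name}" for name in adapters]
```

An installer could feed the user's typed answer into `resolve_adapters` and then print or run the resulting commands; pressing Enter without typing anything would resolve to dbt-postgres.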

@jtcohen6 jtcohen6 added enhancement New feature or request 1.0.0 Issues related to the 1.0.0 release of dbt install labels May 16, 2021
@jtcohen6
Contributor Author

Related question: Should we also move the four plugins included in this repository into four separate repositories, ahead of v1.0?

@gshank
Contributor

gshank commented Jun 4, 2021

I think it makes sense to spin off redshift, snowflake, and bigquery into their own repositories. I'm less certain about Postgres. The postgres plugin has a lot of the default behavior, so it feels like we'd end up having to prereq it for the other plugins. Also the dbt repository by itself wouldn't be functional, which feels kind of wrong.

Separate Docker images feels reasonable.

I'm not very familiar with how customizable the pip install and brew install processes are, but a more seamless installation process is certainly a worthwhile goal.

@leahwicz
Contributor

leahwicz commented Jun 4, 2021

Work we would need to do:

  • Create 3-4 new repos and pull the code into each

  • Split out the integration tests for those

  • Create docker images for each

  • Split out the Homebrew install

  • Hardest part - splitting up the integration tests

  • Not too hard - different Docker images

Open questions:

  • What about postgres?
  • Do we rename dbt to dbt core?
  • How would install work?

@leahwicz
Contributor

leahwicz commented Jun 4, 2021

Maybe we can split this up into smaller issues/goals

This was referenced Jun 16, 2021
@jtcohen6 jtcohen6 changed the title from "Rework standard installation experience" to "Split up adapters to encourage adapter-specific installation" Jun 28, 2021
@jtcohen6
Contributor Author

jtcohen6 commented Aug 30, 2021

We're going to give this a go in two weeks. Latest thinking:

  • Codebases:
    • Create new repos for dbt-redshift, dbt-snowflake, dbt-bigquery. (We'll want to transfer issues from this repo and update external links.)
    • Rename this repo to dbt-core. It will contain the code for dbt-core + dbt-postgres packages, as well as the better part of our testing suites (base components + unit tests + postgres integration tests, which have the greatest coverage).
  • Install: [potentially controversial!] Remove support for pip install dbt and brew install dbt entirely. Instead, users will pip install dbt-<adapter>, brew install dbt-<adapter>, or pull a Docker image for the desired adapter/version combo.
  • CI: We should have a workflow that installs and runs integration tests for all Labs-supported adapters with latest dbt-core changes. Rather than triggering this on every dbt-core PR, I think it makes sense to run these once per night and again during release, to determine if we've made an unexpected breaking change (similar to our performance regression tests).
  • Release: For minor version prereleases + final releases, we should cut a new version of every Labs-supported adapter plugin at the same time that we cut a new version of dbt-core. At the same time, our adapter plugins should pin minor versions only (e.g. dbt-core~=1.2), so that we don't need to cut a new plugin patch release for every new dbt-core patch release.
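
For anyone unsure what a `~=` pin admits, here's a simplified Python sketch of the PEP 440 "compatible release" rule. It's an illustration only, not pip's implementation: the helper names (`parse`, `is_compatible`) are hypothetical, and it assumes plain dotted-integer versions with no pre-release, post-release, or epoch segments. One detail worth keeping in mind: under PEP 440, `~=1.2` means `>=1.2, ==1.*`, so it also admits 1.3 and later 1.x releases, while `~=1.2.0` stays within 1.2.*.

```python
# Simplified sketch of PEP 440 "compatible release" matching (the ~= operator).
# Assumption: versions are plain dotted integers such as "1.2" or "1.2.3";
# pre-releases, post-releases, and epochs are deliberately not handled.


def parse(version: str) -> tuple[int, ...]:
    """Split "1.2.3" into (1, 2, 3)."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(candidate: str, spec: str) -> bool:
    """Return True if `candidate` satisfies `~=spec`.

    `~=1.2` means ">=1.2 and ==1.*"; `~=1.2.0` means ">=1.2.0 and ==1.2.*".
    """
    spec_parts = parse(spec)
    if len(spec_parts) < 2:
        raise ValueError("~= needs at least two release segments, e.g. 1.2")
    cand_parts = parse(candidate)

    # Pad with zeros so that "1.2" and "1.2.0" compare as equal.
    width = max(len(cand_parts), len(spec_parts))
    cand_padded = cand_parts + (0,) * (width - len(cand_parts))
    spec_padded = spec_parts + (0,) * (width - len(spec_parts))

    # Every spec segment except the last must match exactly.
    prefix = spec_parts[:-1]
    return cand_padded[: len(prefix)] == prefix and cand_padded >= spec_padded
```

With this sketch, `is_compatible("1.2.5", "1.2")` and `is_compatible("1.3.0", "1.2")` are both true, whereas `is_compatible("1.3.0", "1.2.0")` is false, which is the difference between pinning the major series and pinning the minor series.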

Things we still need to figure out:

  • Testing module: How should we define (+ distribute?) the base test suite, such that it can be inherited / built upon by each adapter plugin?
  • Homebrew: Can our homebrew-dbt repo support Homebrew distributions for every Labs-supported adapter? Would it be possible to brew install multiple adapters in the same environment? (I hope the answer is yes, I just don't know the answer for sure.)
  • Docker: We should create an image for each adapter-version combo. To which registry(ies) will we want to push these images? (We only push to DockerHub today, and we'll want to update the org name there from fishtown-analytics to dbt-labs. There are reasons to consider pushing to cloud-specific registries in addition.)
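
To get a feel for how many images "one per adapter-version combo, possibly across several registries" implies, here's a small Python sketch that enumerates the combinations. Everything specific in it is an assumption for illustration: the `image_refs` helper, the `<registry>/<org>/dbt-<adapter>:<version>` naming scheme, and whatever registry hosts and version numbers you pass in. It is not an actual publishing scheme.

```python
# Hypothetical sketch: enumerate one image reference per registry/adapter/version.
# The "<registry>/<org>/dbt-<adapter>:<version>" naming is an illustrative
# assumption, not an agreed convention.
from itertools import product

ADAPTERS = ["postgres", "redshift", "snowflake", "bigquery"]


def image_refs(registries, org, versions, adapters=ADAPTERS):
    """Return image references, ordered by registry, then adapter, then version."""
    return [
        f"{registry}/{org}/dbt-{adapter}:{version}"
        for registry, adapter, version in product(registries, adapters, versions)
    ]
```

The count grows multiplicatively: two registries, four adapters, and two versions already mean sixteen image references to build and push.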

@leahwicz @kwigley Let's start scratching out answers, so that we're ready to start writing code on September 15. I'm also on board with splitting up this issue into a bunch of smaller ones, with an epic to track them.

@drewbanin If you disagree strongly with any of the above, I'd be eager to hear your thoughts!

& at the end of all that, I'm going to take another pass at dbt-labs/docs.getdbt.com#638 :)

@leahwicz
Contributor

@jtcohen6 here are my thoughts on the points here:

  • Codebase: 👍 and 👍
  • Install: does this need to be decided and done before v1.0? I'm not against it, just wondering priority ordering here
  • CI: With the recent changes that @kwigley made, we don't run the integration tests on every PR anymore but we do on each commit to develop and release/*. I don't mind keeping it this way after the split out so it makes it easy to track down which change might have broken something pertaining to the adapters.
  • Release: The key here will be that our core release then triggers our adapter releases, b/c it will be too much work to go in and trigger each one. I don't think that should be hard though.

I'm not sure if we know this, but if there is an adapter that we would consider "simpler" to split out, let's prioritize that one first as a test and then move on to the ones we think will be harder

@jtcohen6
Contributor Author

@leahwicz Thanks for the thoughts!

  • The change to remove pip install dbt / brew install dbt isn't a technical requirement needed to unblock other steps. It is a big signal to the community, though, and the timing for it feels right: I want to reinforce both (a) the repo rename of dbt → dbt-core and (b) our updated branding/trademark guidelines.
  • dbt-redshift might be the simplest to split out, in the sense of having the fewest integration tests: ~45 for dbt-redshift vs. ~75 for dbt-snowflake and ~85 for dbt-bigquery. It does inherit from dbt-postgres, whereas the other two inherit directly from dbt-core; I don't think that adds much complexity, though. Ultimately, I trust your + @kwigley's judgment for how to order these.

@jtcohen6
Contributor Author

jtcohen6 commented Nov 7, 2021

There are a few outstanding pieces, but we've achieved the brunt of the work outlined in this epic. I'm going to close this for now. After we cut v1.0.0-rc1, let's take stock of what packaging/release pieces remain outstanding, and open new issues for them.

@jtcohen6 jtcohen6 closed this as completed Nov 7, 2021
5 participants