Add schema versions and dbt versions to json output #2670

Closed
beckjake opened this issue Jul 30, 2020 · 3 comments · Fixed by #2767
Labels
1.0.0 (Issues related to the 1.0.0 release of dbt), artifacts, enhancement (New feature or request)

Comments

@beckjake
Contributor

beckjake commented Jul 30, 2020

Describe the feature

Add schema versions to our hologram output. I imagine we'd version them with the $schema keyword, but I'm open to using an explicit schema-version field instead.

I've separated this from #2671 because I think using $schema means we should host the schemas somewhere, and that requires a bit of infrastructure-type work. Or, if we don't want to host the schemas, it at least requires some care when it comes to generating the URLs.

Describe alternatives you've considered

No versions! Anarchy!

Who will this benefit?

Consumers of dbt's json output.

@beckjake added the enhancement and triage labels on Jul 30, 2020
@jtcohen6 added the 1.0.0 label and removed the triage label on Jul 30, 2020
@jtcohen6 added this to the 0.19.0 milestone on Aug 5, 2020
@jtcohen6
Contributor

URL something like: schemas.getdbt.com/dbt/... We don't need to host anything there just yet.

@jtcohen6
Contributor

jtcohen6 commented Sep 10, 2020

In a world where we have schema versions for dbt artifacts, the --state flag should raise a nice error if trying to compare against an artifact of a different version from that which the current invocation would produce. It raises a much uglier error today (slack thread).

N.B. This isn't a prerequisite to resolving the issue. It feels most relevant here and I wanted to avoid losing the thought.

@beckjake
Contributor Author

beckjake commented Sep 16, 2020

I've created schema URLs of this form: https://schemas.getdbt.com/dbt/{name}/v{version}.json

So for example, https://schemas.getdbt.com/dbt/catalog/v1.json

We don't happen to host those files, but we could.
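
To make the URL pattern concrete, here is a minimal Python sketch. The helper name `schema_url` and the constant `SCHEMA_BASE_URL` are made up for illustration and are not taken from dbt's code; only the URL shape comes from the comment above.

```python
# Base of the schema URLs, as given in the comment above.
SCHEMA_BASE_URL = "https://schemas.getdbt.com/dbt"


def schema_url(name: str, version: int) -> str:
    """Build a URL of the form {base}/{name}/v{version}.json (hypothetical helper)."""
    return f"{SCHEMA_BASE_URL}/{name}/v{version}.json"


print(schema_url("catalog", 1))
```

Since nothing is hosted at these URLs yet, they act as plain identifiers rather than locations to fetch.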

For $schema, I misunderstood the spec - that refers to the json-schema schema itself. However, there is a $id field in the schema that is what I was thinking of.

I'm going to add two fields here for a certain type of object (RPC responses, catalog.json, manifest.json, run_results.json, and sources.json):

  • The schema itself is going to get a $id field that is a URI. Most users won't see this (unless we choose to host it!)
  • Those objects will also have a required property named dbt_schema_version. Its value must be a constant equal to the schema's $id. This is of course also reflected in the schema itself.
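
As a rough illustration of the two fields described above, the sketch below shows what a heavily trimmed schema and a matching object might look like. The schema contents, the `nodes` key, and the helper `has_expected_version` are assumptions for illustration; the real generated schemas will contain much more.

```python
import json

SCHEMA_ID = "https://schemas.getdbt.com/dbt/catalog/v1.json"

# Hypothetical, trimmed-down schema: "$id" identifies this schema, while
# "$schema" names the JSON Schema dialect it is written in.
catalog_schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "$id": SCHEMA_ID,
    "type": "object",
    "required": ["dbt_schema_version"],
    "properties": {
        # The property is pinned to a single allowed value: the schema's own $id.
        "dbt_schema_version": {"const": SCHEMA_ID},
    },
}

# An object that satisfies the version constraint (other keys are placeholders).
artifact = {"dbt_schema_version": SCHEMA_ID, "nodes": {}}


def has_expected_version(obj: dict, schema: dict) -> bool:
    """Hand-rolled check of the const constraint, standing in for a full validator."""
    expected = schema["properties"]["dbt_schema_version"]["const"]
    return obj.get("dbt_schema_version") == expected


print(json.dumps(catalog_schema, indent=2))
print(has_expected_version(artifact, catalog_schema))
```

A full JSON Schema validator would enforce the same `required` and `const` rules; the manual check only keeps the sketch dependency-free.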

Reading an object triggers a check on the dbt_schema_version field. On a mismatch, the check raises a special exception, which dbt can catch, decorate with a filename, and re-raise for better error messaging.
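
The check, catch, decorate, and re-raise flow could look something like the sketch below. Every name here (`IncompatibleSchemaError`, `check_version`, `read_artifact`) is hypothetical and not dbt's actual API, and the `v0` URL is an invented stale version used only for the demo.

```python
import json
import os
import tempfile

EXPECTED = "https://schemas.getdbt.com/dbt/manifest/v1.json"


class IncompatibleSchemaError(Exception):
    """Hypothetical exception raised when an object's schema version is unexpected."""

    def __init__(self, expected, found):
        super().__init__(expected, found)
        self.expected = expected
        self.found = found
        self.filename = None  # filled in later by whoever knows the file path

    def __str__(self):
        where = f" in {self.filename}" if self.filename else ""
        return f"expected schema version {self.expected!r}, found {self.found!r}{where}"


def check_version(obj):
    """Low-level check: knows nothing about files, only about the object."""
    found = obj.get("dbt_schema_version")
    if found != EXPECTED:
        raise IncompatibleSchemaError(EXPECTED, found)
    return obj


def read_artifact(path):
    """File-level reader: catches the error, adds the filename, and re-raises."""
    with open(path) as fp:
        obj = json.load(fp)
    try:
        return check_version(obj)
    except IncompatibleSchemaError as exc:
        exc.filename = path
        raise


# Demo: write an artifact with an invented older version and try to read it.
with tempfile.TemporaryDirectory() as tmp:
    stale_path = os.path.join(tmp, "manifest.json")
    with open(stale_path, "w") as fp:
        json.dump({"dbt_schema_version": "https://schemas.getdbt.com/dbt/manifest/v0.json"}, fp)
    try:
        read_artifact(stale_path)
    except IncompatibleSchemaError as exc:
        print(f"Could not load artifact: {exc}")
```

Keeping the filename out of the low-level check lets the same check serve objects that never touch disk, such as RPC responses.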

In the future, when we implement the metadata field (#2761) we'll push that field down into the metadata, but for now it lives at the root level with most of the other metadata.
