[CT-1987] [Feature] utility to show manifest differences #6819

Closed
3 tasks done
noel opened this issue Jan 31, 2023 · 9 comments
Labels
awaiting_response, enhancement (New feature or request), stale (Issues that have gone stale)

Comments

@noel
noel commented Jan 31, 2023

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

When Slim CI does not work as expected, it is difficult to troubleshoot exactly what differs between two manifests.
It would be great if there were a way to output the exact keys in the manifest that differ. This would make debugging simpler.

Describe alternatives you've considered

dbt ls -s state:modified.
This at least narrows down where the difference is, but you still need to find the differing keys between the two manifests yourself.

Who will this benefit?

Anyone using slim ci.

Are you interested in contributing this feature?

No response

Anything else?

No response

@noel noel added the enhancement and triage labels Jan 31, 2023
@github-actions github-actions bot changed the title [Feature] utility to show manifest differences [CT-1987] [Feature] utility to show manifest differences Jan 31, 2023
@dbeatty10
Contributor

Thanks for this suggestion @noel !

It makes sense to figure out ways to simplify debugging when two manifests differ in some unknown way.

I wonder if something like jq could help out here?

Example

Let's say that you have two manifests manifest-1.json and manifest-2.json and jq is installed. Then run the following:

diff -I "created_at" <(jq . manifest-1.json) <(jq . manifest-2.json) > my_diff

Here's my_diff for a project similar to this one. Can you tell what I changed between each invocation?

5,6c5,6
<     "generated_at": "2023-02-01T00:55:39.494863Z",
<     "invocation_id": "0e612d69-17aa-4331-aa77-559def7bf77b",
---
>     "generated_at": "2023-02-01T00:56:09.170195Z",
>     "invocation_id": "3649e233-054e-49f9-8f39-111212939828",
848c848,854
<         "post-hook": [],
---
>         "post-hook": [
>           {
>             "sql": "select 1",
>             "transaction": true,
>             "index": null
>           }
>         ],
862,863c868,873
<       "unrendered_config": {},
<       "created_at": 1675212940.54368,
---
>       "unrendered_config": {
>         "post-hook": [
>           "select 1"
>         ]
>       },
>       "created_at": 1675212970.255466,
908c918,924
<         "post-hook": [],
---
>         "post-hook": [
>           {
>             "sql": "select 1",
>             "transaction": true,
>             "index": null
>           }
>         ],
922,923c938,943
<       "unrendered_config": {},
<       "created_at": 1675212940.545138,
---
>       "unrendered_config": {
>         "post-hook": [
>           "select 1"
>         ]
>       },
>       "created_at": 1675212970.258373,
968c988,994
<         "post-hook": [],
---
>         "post-hook": [
>           {
>             "sql": "select 1",
>             "transaction": true,
>             "index": null
>           }
>         ],
982,983c1008,1013
<       "unrendered_config": {},
<       "created_at": 1675212940.546604,
---
>       "unrendered_config": {
>         "post-hook": [
>           "select 1"
>         ]
>       },
>       "created_at": 1675212970.261195,
1028c1058,1064
<         "post-hook": [],
---
>         "post-hook": [
>           {
>             "sql": "select 1",
>             "transaction": true,
>             "index": null
>           }
>         ],
1042,1043c1078,1083
<       "unrendered_config": {},
<       "created_at": 1675212940.547963,
---
>       "unrendered_config": {
>         "post-hook": [
>           "select 1"
>         ]
>       },
>       "created_at": 1675212970.263995,

Explanation of the command

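  • This gist demonstrates the diff: jq . pretty-prints each manifest so the two files can be compared line by line
  • This post explains how to ignore all the differing created_at timestamps: diff -I skips a hunk when every changed line in it matches the given regular expression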
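
The command can be taken one step further by normalizing each manifest before diffing. What follows is only a sketch, assuming jq 1.5 or later and mktemp are available; normalize_manifest and diffman_normalized are made-up names, not part of dbt. It sorts object keys with jq -S and deletes the volatile created_at, generated_at, and invocation_id keys wherever they appear. Because diff -I only drops a hunk when every changed line matches, those keys would otherwise still show up in mixed hunks like the ones in the diff above.

```shell
# Hypothetical helper: pretty-print a manifest with sorted keys and with the
# volatile keys removed at every nesting level.
normalize_manifest() {
    jq -S 'del(.. | objects | (.created_at, .generated_at, .invocation_id))' "$1"
}

# Hypothetical helper: line diff of two normalized manifests.
# Temp files are used instead of <(...) so this also works in plain sh.
diffman_normalized() {
    dn_left=$(mktemp)
    dn_right=$(mktemp)
    normalize_manifest "$1" > "$dn_left"
    normalize_manifest "$2" > "$dn_right"
    diff "$dn_left" "$dn_right"
    dn_status=$?          # 0 = no differences, 1 = differences found
    rm -f "$dn_left" "$dn_right"
    return "$dn_status"
}
```

Usage would look like diffman_normalized manifest-1.json manifest-2.json. Deleting keys by name everywhere is a blunt choice: if a project stores a meaningful value under one of those names, it would be hidden too.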

More ergonomic, using a shell alias

The command above is a bit of a bear. There's no way my fingers could possibly remember it.

Different shells have different ways of creating parametrized aliases, but I'll demonstrate using zsh (as described here):

  1. Open ~/.zshrc
  2. Add something like this (replacing diffman with your preferred alias):
    diffman() {
        # Quote the arguments so paths containing spaces still work
        diff -I "created_at" <(jq . "$1") <(jq . "$2")
    }
    
  3. Save and close
  4. Run source ~/.zshrc to reload the shell configuration
  5. Run diffman manifest-1.json manifest-2.json

Admittedly, I haven't done a ton of manifest diff'ing myself, so I'll be interested to hear your feedback on how useful (or not!) an approach like this might be.

@github-actions
Contributor

github-actions bot commented May 2, 2023

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the stale label May 2, 2023
@noel
Author

noel commented May 2, 2023

Still seems valuable and maybe even more so with the new features in 1.5+

@dbeatty10
Contributor

@noel Interested to hear more what you mean by:

even more so with the new features in 1.5+

Did you try out these ideas for manifest diff'ing, by any chance?

@noel
Author

noel commented May 2, 2023

I meant that things like model versions and cross-project dependencies may make this even harder. I still have to play around with 1.5, but I suspect there are scenarios where a model should or shouldn't be selected during a state:modified run, and when that happens it could be difficult to pinpoint why the model did or didn't run.

Regarding the manifest diffing, I forgot to report back when I initially looked at this: the output was too verbose and difficult to interpret. I can show you if you are interested. I just reran it on a branch that has a single change, yet I got a lot of output from diff.
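
One way to get shorter output than a line diff is to compare flattened key paths, which also comes closer to showing the exact key that changed. This is a rough sketch, assuming jq 1.5 or later (for tostream) and mktemp; flatten_manifest and manifest_key_diff are made-up names, not dbt commands. Each leaf value becomes one line of the form [path, value], and the volatile created_at, generated_at, and invocation_id keys are skipped.

```shell
# Hypothetical helper: print one compact [path, value] line per leaf,
# skipping keys that change on every invocation, sorted for stable diffing.
flatten_manifest() {
    jq -c 'tostream
           | select(length == 2)
           | select(.[0][-1] != "created_at"
                    and .[0][-1] != "generated_at"
                    and .[0][-1] != "invocation_id")' "$1" | LC_ALL=C sort
}

# Hypothetical helper: show only the leaf paths whose values differ.
manifest_key_diff() {
    kd_left=$(mktemp)
    kd_right=$(mktemp)
    flatten_manifest "$1" > "$kd_left"
    flatten_manifest "$2" > "$kd_right"
    diff "$kd_left" "$kd_right"
    kd_status=$?          # 0 = no differences, 1 = differences found
    rm -f "$kd_left" "$kd_right"
    return "$kd_status"
}
```

Running manifest_key_diff manifest-1.json manifest-2.json prints each differing path with its old value after < and its new value after >. It does not explain why state:modified selected a model; it only narrows down which keys to look at.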

@dbeatty10
Contributor

@noel Providing that example of how to generate a diff between manifests is probably the best we can offer here.

You're right that those diffs can be pretty verbose! Performing some kind of diff between manifests is a decent place to start troubleshooting, but it will be more art than science to sort the signal from the noise and interpret the differences.

Troubleshooting / debugging problems with state:modified will always have some inherent complexity to it that we can't make go away -- it's necessarily hard.

For those reasons, this seems like something we'd choose not to take on. What do you think @jtcohen6 ?

@github-actions github-actions bot removed the stale label May 3, 2023
@github-actions
Contributor

github-actions bot commented Aug 1, 2023

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the stale label Aug 1, 2023
@github-actions
Contributor

github-actions bot commented Aug 9, 2023

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

@github-actions github-actions bot closed this as not planned Aug 9, 2023