Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Text vs JSON and File vs CLI logs #4354

Closed
1 task done
nathaniel-may opened this issue Nov 29, 2021 · 2 comments · Fixed by #4379
Closed
1 task done

Text vs JSON and File vs CLI logs #4354

nathaniel-may opened this issue Nov 29, 2021 · 2 comments · Fixed by #4379
Assignees
Labels
enhancement New feature or request logging tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality

Comments

@nathaniel-may
Copy link
Contributor

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

re this comment, text log lines might want to be different per destination. We may want text log lines to filter from json log lines, but I'm not sure about that. Just something to consider.

Describe alternatives you've considered

No response

Who will this benefit?

No response

Are you interested in contributing this feature?

No response

Anything else?

No response

@nathaniel-may nathaniel-may added enhancement New feature or request triage tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality labels Nov 29, 2021
@jtcohen6 jtcohen6 removed the triage label Nov 29, 2021
@jtcohen6
Copy link
Contributor

jtcohen6 commented Nov 29, 2021

To summarize myself over in #4293:

Broadly, I find that I want:

  • color, concision, clean spacing on the CLI
  • colorless + compact verbosity in the File + JSON-formatted

Edit since then: Let's let users determine whether to use color or not based on the colorizing configs. There's good reasons to preserve color in JSON-formatted log messages, if they're going to be eventually displayed in a CLI-like setting.

@jtcohen6
Copy link
Contributor

jtcohen6 commented Dec 1, 2021

@gshank This is the issue we can use for the last bit of cosmetic work.

Let's keep JSON-formatted logs as they are. They include everything, as they should. The intended consumer of these is a machine, and users can spin up their own scripts (or even just jq) to modify/filter down what's shown.

From my perspective, the big idea is differentiating between standard text formatted logs for the CLI (info-level) and File (debug-level). These are both intended for use by human beings, and for fundamentally different purposes:

  • CLI logs are interactive. They should show what's happening, while it's happening, and not too much more. (Conversely, we should avoid extended periods where we show nothing—this adds to frustration when things are slow, such as full re-parses of big projects.) CLI logs should also surface warnings and errors in a way that makes them easy to spot, and easy to take further action.
  • Sometimes, that further action should look like diving into more detailed logs, stored in logs/dbt.log. These should have everything that a human being might need to debug what went wrong with a dbt project. If we've failed to include useful information, it's lost to the sands of time—or until they re-run / reproduce the error with JSON-formatted logging.

So my inclination is that, relative to how this works presently (in main):

  • CLI: Text-formatted log lines should include short timestamps and messages — that's all
  • File: Text-formatted messages should include longer timestamps, log level (info/warn/error), thread name, and a way to visually see where one run stopped and the next one started. (Would it make sense to include invocation_id too, or would that add too much clutter?)
2021-12-01 15:55:36.159022 | info  | MainThread | Running with dbt=1.0.0-rc3
2021-12-01 15:55:36.159022 | debug | MainThread |

======================================

Running dbt with arguments:
- ...

======================================
2021-12-01 15:55:37.432664 | debug  | Thread-3  | Using postgres connection "model.testy.my_view"

Or something like that.

In practice, I think this will look like splitting up create_text_log_line into two functions (one for CLI and one for File), adding an arg to create_log_line for CLI/File, and passing that arg from these two sites accordingly.

Tricky things:

  • I think we're going to struggle to get MainReportVersion + MainReportArgs into the File, because they're currently fired before we've parsed dbt_project.yml, i.e. before we know the log_path (in case it's user-configured).
  • We do have the ability to construct different messages whether an event is firing to the CLI or the File, by using/overriding the file_msg and cli_msg methods. (The base types are implemented to use the same message for both by default.)

More things to think about:

  • Some logs are so verbose that we shouldn't include them by default, e.g. adapter cache events. Today, we turn those off by default, and allow users to turn them on with the --log-cache-events config. (In practice, we only do this to debug very specific errors.) In the future, we could think about encoding these as "trace"-level, and allowing users to opt in.
  • We should enable users to opt into / relevel logs from dependencies. This is especially important to see "trace-level" details of connection errors for specific database clients. We support this in a pretty janky way today (hard-coded in dbt-core, ugh).

Open to hearing thoughts from everyone!

@gshank gshank self-assigned this Dec 1, 2021
gshank added a commit that referenced this issue Dec 2, 2021
* [#4354] Different output for console and file logs

* Tweak some log formats

* Change loging of thread names
leahwicz pushed a commit that referenced this issue Dec 2, 2021
* [#4354] Different output for console and file logs

* Tweak some log formats

* Change loging of thread names
leahwicz added a commit that referenced this issue Dec 2, 2021
* [#4354] Different output for console and file logs

* Tweak some log formats

* Change loging of thread names

Co-authored-by: Gerda Shank <[email protected]>
iknox-fa pushed a commit that referenced this issue Feb 8, 2022
* [#4354] Different output for console and file logs

* Tweak some log formats

* Change loging of thread names

automatic commit by git-black, original commits:
  c220b1e
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request logging tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants