Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Include exception type and reason in structured logging json #10209

Closed
anoadragon453 opened this issue Jun 18, 2021 · 6 comments · Fixed by #11028
Closed

Include exception type and reason in structured logging json #10209

anoadragon453 opened this issue Jun 18, 2021 · 6 comments · Fixed by #11028
Labels
T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.

Comments

@anoadragon453
Copy link
Member

Currently when structured logging is enabled (using either the JsonFormatter or TerseJsonFormatter), logger.exception lines don't include a traceback nor any exception information at all (other than what's presented in the string passed to logger.exception.

This is due to us ignoring the exc_text and exc_info attributes of the log record when during formatting:

# The properties of a standard LogRecord that should be ignored when generating
# JSON logs.
_IGNORED_LOG_RECORD_ATTRIBUTES = {
"args",
"asctime",
"created",
"exc_info",
# exc_text isn't a public attribute, but is used to cache the result of formatException.
"exc_text",

While it's not quite feasible to include full tracebacks in structured logging (one should use a tool like Sentry.io for that), it may be useful to include the exception type ((i.e psycopg2.OperationalError) and text (i.e FATAL: sorry, too many clients already) as structured fields. This will give sysadmins a clue as to what might be going wrong, whereas Sentry is useful for pointing out the line where the issue occurred (and thus where developers may be able to go in and fix the issue).

@anoadragon453 anoadragon453 added the T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. label Jun 18, 2021
@clokep
Copy link
Member

clokep commented Jun 18, 2021

I think the text should already be included? exc_text is the full stack trace, I think, while something like FATAL: sorry, too many clients already is just the message?

@anoadragon453
Copy link
Member Author

The logger call in question was this one:

logger.exception(
"Background process '%s' threw an exception",
desc,
)

Which resulted in:

{
  "log": "Background process 'start_pushers' threw an exception",
  "namespace": "synapse.metrics.background_process_metrics",
  "level": "ERROR",
  "time": 1623655604.36,
  "request": "start_pushers-0",
  "server_name": "example.com"
}

Which doesn't seem to include any useful exception information... unless it's getting lost in this specific instance?

@clokep
Copy link
Member

clokep commented Jun 18, 2021

Yeah that's not great. Thanks for clarifying!

@clokep
Copy link
Member

clokep commented Jun 18, 2021

It might also be useful to include the file / line where the exception happened, if that info exists...but again I don't think we want to re-implement Sentry here. 😄

@RyanSquared
Copy link

Knowing at least the exception type would significantly help us at Beeper and is more than enough. For more complex debugging, I would definitely prefer using Sentry instead. The reason we (Beeper) would like this feature was mostly so we can look at logs on startup, identify an error based on the exception type, and go quickly fix that.

@bradtgmurray
Copy link
Contributor

Also an issue with request handling failures:

This code block here:

logger.error(
"Failed handle request via %r: %r",
request.request_metrics.name,
request,
exc_info=(f.type, f.value, f.getTracebackObject()), # type: ignore
)

precise output:

2021-07-09 15:20:37,478 - synapse.http.server - 93 - ERROR - PUT-6 - Failed handle request via 'RoomTypingRestServlet': <SynapseRequest at 0x7f8b33668640 method='PUT' uri='<redacted by brad for github>' clientproto='HTTP/1.1' site=8083>
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/synapse/http/server.py", line 258, in _async_render_wrapper
    callback_return = await self._async_render(request)
  File "/usr/local/lib/python3.8/site-packages/synapse/http/server.py", line 446, in _async_render
    callback_return = await raw_callback_return
  File "/usr/local/lib/python3.8/site-packages/synapse/rest/client/v1/room.py", line 1161, in on_PUT
    raise Exception("Got /typing request on instance that is not typing writer")
Exception: Got /typing request on instance that is not typing writer

structured output:

{"log":"Failed handle request via 'RoomTypingRestServlet': <SynapseRequest at 0x7f257b020280 method='PUT' uri='<redacted by brad for github>' clientproto='HTTP/1.1' site=8083>","namespace":"synapse.http.server","level":"ERROR","time":1625844440.35,"request":"PUT-15","ip_address":"::ffff:192.168.39.92","site_tag":8083,"requester":"<redacted by brad for github>","authenticated_entity":"@bradtest:beeper-brad.com","method":"PUT","url":"<redacted by brad for github>","protocol":"HTTP/1.1","user_agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36","server_name":"<redacted by brad for github>"}

without a backtrace

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants