-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Display BigQuery error stream when a load fails during dbt seed. #1079
Display BigQuery error stream when a load fails during dbt seed. #1079
Conversation
@joshtemple nice! I just tried to kick off tests for this PR, but I think GitHub is still working it's way through webhooks. This provisionally looks good to me :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the general idea, I do have concerns about the type(e)(...)
pattern.
dbt/adapters/bigquery/impl.py
Outdated
@@ -278,7 +278,8 @@ def poll_until_job_completes(cls, job, timeout): | |||
raise dbt.exceptions.RuntimeException("BigQuery Timeout Exceeded") | |||
|
|||
elif job.error_result: | |||
raise job.exception() | |||
e = job.exception() | |||
raise type(e)(message=e.message, errors=job.errors) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about calling type
to get a class object and just assuming that it works to call as a constructor. I mean, I know it's ok here, but job
and job.exception()
come from google, not us.
Is this interface (in particular, the fact that the __init__
of the exception class returned by job.exception()
accepts an errors
keyword argument) considered stable in any way?
I think I would prefer something like:
msg = '{}\n{}'.format(e.message, '\n'.join(str(e) for e in job.errors)).strip()
raise dbt.exceptions.RuntimeException(msg)
I haven't tested it, and I'm not 100% sure on the type of job.errors
, but I assume something like that would work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, totally fair, I went back and forth on that myself. In the end I decided not to hardcode a dbt exception since I wasn't sure about the implications of that downstream for logging. If you're more comfortable with raising a RuntimeException
as you outlined, I'll change it.
Google API Errors inherit from a base class (GoogleAPICallError) that accepts errors
and message
as keyword arguments, so it should be safe to assume we can pass those args. Alternatively, we could hardcode a generic GoogleAPICallError
exception (see here) or BadRequest
(which is what is actually raised in this case) which would ensure we can pass those args, rather than using type
.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless it has a negative downstream impact (triggering the exception handler in the wrong way, comes to mind) I would prefer to raise a dbt-native exception. At some point we'll convert it anyway for display, might as well get it done early.
…loud API exception.
Made the change. Only slight difference now is that the error message displays
|
woop woop! Nice work @joshtemple :) I'm going to let the tests run, and then will merge this in. This will go out in out 0.12.0 release! |
Creates and raises a new exception, augmenting the
errors
attribute of the exception with the detailed error stream from thejob
object. This errors attribute is unpacked downstream by thehandle_error
method.I tested this out with a toy CSV file, adding a leading comma in the header row to induce a BigQuery load API error.
Before this change, the error is displayed as follows:
After this change, the full error details are included:
Fixes #1076