Add data processed info into dbt run logs for all statement types #2530
Sure thing! Have you taken a look at the contributing guide, specifically the section about testing? It looks like the failure in CircleCI is related to pep8 style. You can start by testing for this locally with a combination of
As far as the code itself: Did you also want to add bytes processed to the
@jtcohen6 Thank you for the response.
@jtcohen6 I just realised that when I wrote "I am not sure what to write and where." I did not specify the fundamental part: I was talking about the CHANGELOG.
This worked for me locally! I just kicked off the rest of the integration tests. My only hesitation on this is purely cosmetic: It's a lot more CLI text than we're used to printing to info. I wonder if we should try to summarize the row count, e.g.
@drewbanin Could you lend an aesthetic eye?
@jtcohen6
This is groovy! I do agree - this is a lot of characters if we're working in an ~80 character budget. My vote would be for something like:
I don't feel super strongly about that though - happy to discuss if anyone thinks differently. This is really cool @alepuccetti - nice work so far!
@drewbanin this looks great to me.
I can write a However I have a couple of questions on the implementation.
Alternative for rows number formatting (I am not a fan but could be an option): Thoughts?
I'm all for:
Big fan of scientific notation in general, but agree it doesn't feel right here :) I'm also all for you coding up a
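For comparison, here is a minimal sketch of what a row count looks like in scientific notation using Python's standard format spec. The helper name `rows_scientific` is hypothetical and is not part of this PR; it only illustrates why the output reads awkwardly in a log line.

```python
def rows_scientific(rows_number):
    # Hypothetical helper (not from the PR): render a row count with
    # one decimal place using the 'e' presentation type.
    return f"{rows_number:.1e}"


# Every value costs seven characters and still needs mental decoding,
# whereas a suffix form such as "1.5k" is shorter and reads at a glance.
print(rows_scientific(1500))
print(rows_scientific(1234567))
```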
@jtcohen6 Queries/Scripts I will make these changes in the next few days.
@jtcohen6, @drewbanin: I finally got the time to finish this.
This is looking great! Let's reduce the print width a smidge further by cutting the word "processed". I think it's fine to leave that word for script output, though, because:
- it implies "total processed" / "processed overall"
- there's no row count to show, so the widths end up about the same
I'm not sure why the py38 integration test failed on Postgres. Much more important is that the BigQuery integration tests passed.
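To make the width trade-off concrete, here is a rough sketch of the two variants being discussed. The function `status_line` and the exact message layout are assumptions made for illustration; they are not the format dbt actually emits.

```python
def status_line(statement, size, rows=None, with_processed=True):
    # Hypothetical layout: "<STATEMENT> (<rows> rows, <size> processed)".
    parts = []
    if rows is not None:
        parts.append(f"{rows} rows")
    # Dropping the word "processed" saves ten characters per line.
    parts.append(f"{size} processed" if with_processed else size)
    return f"{statement} ({', '.join(parts)})"


# Model output: row count shown, "processed" dropped to save width.
model = status_line("CREATE TABLE", "2.3 GB", rows="1.5k", with_processed=False)
# Script output: no row count, so keeping "processed" costs little.
script = status_line("SCRIPT", "2.3 GB")
print(model)
print(script)
```

Under these assumed inputs the script line is still shorter than the model line even with the extra word, which is the point made in the second bullet.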
@jtcohen6 I think that not having "processed" can be misleading. It can be interpreted as the size of the results. I squashed the proposed changes for the
Good point! Ok, I'm happy with how you've set this up, and I'm glad we have a future vision of how to make this more configurable for users (#2580).
In the meantime, let's get these tests running. Can you merge or rebase the changes from dev/marian-anderson? Doing so should kick off integration tests automatically once the unit tests are passing.
core/dbt/utils.py
Outdated
def format_rows_number(rows_number):
    for unit in ['', 'k', 'm', 'b', 't']:
        if abs(rows_number) < 1000.0:
            return f"{rows_number:3.1f} {unit}".strip()
Missed this one! I think this is the cause of the failing unit test:
-            return f"{rows_number:3.1f} {unit}".strip()
+            return f"{rows_number:3.1f}{unit}".strip()
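For readers following along, here is a self-contained sketch of the helper with the suggested change applied. Only the loop header, the threshold check, and the `return` line appear in the excerpt above; the division step and the final fallback are assumptions added so the function runs on its own, and may differ from what was merged.

```python
def format_rows_number(rows_number):
    """Abbreviate a row count with k/m/b/t suffixes, e.g. 1500 -> '1.5k'."""
    units = ['', 'k', 'm', 'b', 't']
    for unit in units[:-1]:
        if abs(rows_number) < 1000.0:
            # No space between the number and the unit, per the suggestion.
            return f"{rows_number:3.1f}{unit}".strip()
        # Assumed step (not visible in the excerpt): move to the next unit.
        rows_number /= 1000.0
    # Assumed fallback: anything this large is reported in trillions.
    return f"{rows_number:3.1f}{units[-1]}"


print(format_rows_number(999))
print(format_rows_number(1500))
print(format_rows_number(2_500_000_000))
```

With the original spacing the second call would yield "1.5 k" rather than "1.5k", which is consistent with the failing unit test mentioned in the review comment.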
This is as green as it gets. Very excited for my first contribution.
@alepuccetti Looks great! I appreciate your patience on this. I left one last comment about consolidating the changelog notes, once that's set this is good to merge.
Co-authored-by: Jeremy Cohen <[email protected]>
@jtcohen6 Done ✅
@beckjake Postgres integration test failed with this error:
Otherwise, this is good to merge from my point of view.
resolves #2526
Description
Change the log output of BigQuery CREATE_TABLE_AS_SELECT statements to include bytes processed.
Checklist
I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.
Sorry, but at the moment I cannot install all the requirements to run the test suite.
I am not sure what to write and where. @jtcohen6 can you offer some advice?