Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for inconsistent Bigint/Long datatype that makes dbt think there was a schema change #358

Open
wants to merge 27 commits into
base: main
Choose a base branch
from

Conversation

francescomucio
Copy link

@francescomucio francescomucio commented May 16, 2022

resolves #642

Description

This PR fixes the inconsistent datatype returned for the Bigint columns (bigint vs long) by Spark sql.

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt-spark next" section.

@cla-bot
Copy link

cla-bot bot commented May 16, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Francesco Mucio.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https:/settings/emails

@jtcohen6
Copy link
Contributor

@cla-bot check

@cla-bot
Copy link

cla-bot bot commented May 16, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Francesco Mucio.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https:/settings/emails

@cla-bot
Copy link

cla-bot bot commented May 16, 2022

The cla-bot has been summoned, and re-checked this pull request!

@jtcohen6
Copy link
Contributor

jtcohen6 commented May 16, 2022

@francescomucio Thanks for the PR!

Could you rebase/squash your commits and force push? It looks like the first one doesn't have a git identity set, which is why the CLA bot isn't happy

@cla-bot
Copy link

cla-bot bot commented May 16, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Francesco Mucio.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https:/settings/emails

* More consistent results from get_columns_in_relation

* Not dispatched, full name

* Add changelog entry
@McKnight-42
Copy link
Contributor

@francescomucio sorry for the delay this is looking very good, do you think you will have time to update the branch to clear up the conflicts? would love to keep your local and remote up to date with one another.

@francescomucio
Copy link
Author

@McKnight-42 sorry for the late reply, I will try to work on this today. BTW I reached out to the Databricks support and they told me that the problem is also in opensource Apache Spark. Should I prepare a PR also for that adapter?

CHANGELOG.md Outdated
Comment on lines 6 to 39
### Under the hood
- Add `DBT_INVOCATION_ENV` environment variable to ODBC user agent string ([#366](https:/dbt-labs/dbt-spark/pull/366))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how this changelog entry disappeared — mind adding it back, and adding one of your own? :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ups, sure, I will do it

@jtcohen6 jtcohen6 added the ready_for_review Externally contributed PR has functional approval, ready for code review from Core engineering label Jun 15, 2022
@francescomucio
Copy link
Author

@McKnight-42 @jtcohen6 sorry for being late, I finally found the time to fix this PR. I guess this will go in 1.2.0, not in 1.1.1

@McKnight-42
Copy link
Contributor

@francescomucio This is looking fantastic seems once you did some pushing up the pre-commit test started failing for end-of-line spacing. if you could rerun the pre-commit command locally again and add those eof changes should fix it.

@francescomucio
Copy link
Author

@McKnight-42 I think I fixed it

@McKnight-42
Copy link
Contributor

@francescomucio thank you so much, i'm sorry about this we just released some new versions for dbt-core and the adapters this morning, if you could pull down the latest changes should fix the version mismatch issue and get the ci running again.

jtcohen6 and others added 5 commits July 26, 2022 21:58
)

* Not dropping table for incremental full refresh with delta

* Updated changelog

* Simplified conditional logic according to suggestion

* Updated changelog

* Only drop table if not delta table

Co-authored-by: Jeremy Cohen <[email protected]>

* Update changelog, trigger CircleCI tests

Co-authored-by: Jeremy Cohen <[email protected]>
Co-authored-by: Jeremy Cohen <[email protected]>
* Run tests for data type macros. Fine tune numeric_type

* Hard code seed loading types for float + int

* Repoint, fixup, changelog entry
@cla-bot
Copy link

cla-bot bot commented Sep 23, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Francesco Mucio.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https:/settings/emails

@cla-bot
Copy link

cla-bot bot commented Sep 23, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Francesco Mucio.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https:/settings/emails

2 similar comments
@cla-bot
Copy link

cla-bot bot commented Sep 23, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Francesco Mucio.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https:/settings/emails

@cla-bot
Copy link

cla-bot bot commented Sep 23, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Francesco Mucio.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https:/settings/emails

@cla-bot
Copy link

cla-bot bot commented Sep 23, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Francesco Mucio.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https:/settings/emails

1 similar comment
@cla-bot
Copy link

cla-bot bot commented Sep 23, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Francesco Mucio.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https:/settings/emails

@cla-bot
Copy link

cla-bot bot commented Sep 23, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Francesco Mucio.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https:/settings/emails

@McKnight-42
Copy link
Contributor

@jtcohen6 would this be by setting the on_schema_change to "sync_all_columns" so that it does the type checking?

https://docs.getdbt.com/docs/building-a-dbt-project/building-models/configuring-incremental-models#what-if-the-columns-of-my-incremental-model-change

@francescomucio
Copy link
Author

@McKnight-42 @jtcohen6 any update on this?

We are trying to upgrade to dbt version 1.3 and I still need to apply this fix.

Copy link
Contributor

@ueshin ueshin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the status of this change?

@@ -46,7 +50,7 @@ def numeric_type(cls, dtype: str, precision: Any, scale: Any) -> str:
return "{}({},{})".format("decimal", precision, scale)

def __repr__(self) -> str:
return "<SparkColumn {} ({})>".format(self.name, self.data_type)
return "<SparkColumn {} ({})>".format(self.name, self.translate_type(self.data_type))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.data_type is already translated?

@github-actions
Copy link
Contributor

This PR has been marked as Stale because it has been open with no activity as of late. If you would like the PR to remain open, please comment on the PR or else it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Sep 14, 2023
@github-actions github-actions bot removed the Stale label Sep 15, 2023
Copy link
Contributor

github-actions bot commented Aug 6, 2024

This PR has been marked as Stale because it has been open with no activity as of late. If you would like the PR to remain open, please comment on the PR or else it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Aug 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla:yes ready_for_review Externally contributed PR has functional approval, ready for code review from Core engineering Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-642] Schema change detected for bigint columns (expected long)