[BEAM-12641] Use google-auth instead of oauth2client for GCP auth #15004
Codecov Report
@@ Coverage Diff @@
## master #15004 +/- ##
==========================================
+ Coverage 73.94% 83.63% +9.68%
==========================================
Files 667 453 -214
Lines 88071 63264 -24807
==========================================
- Hits 65128 52912 -12216
+ Misses 21832 10352 -11480
+ Partials 1111 0 -1111
Force-pushed from 61afa23 to 609999d.
For reference, I was pointed to https://github.com/googleapis/google-auth-library-python-httplib2 as an example of how some other libraries went through a similar transition.
Thanks! I wasn't aware that library existed; I'll see if I can incorporate it.
:sdks:python:test-suites:tox:py38:testPy38CloudCoverage fails with
```
ImportError: cannot import name 'enquote_executable' from 'distlib.scripts' (/home/chuck.yang/beam/build/gradleenv/-1227304282/lib/python3.8/site-packages/distlib/scripts.py)
```
when I don't bump this version. It's odd because 0.3.1 should be enough to solve the problem...
Use google-auth-httplib2 library instead of implementing our own shims.
Force-pushed from 1642834 to 485c035.
R: @tvalentyn - Could you review this? Do you see any issues with Dataflow internals related to this?
What's the status of this PR? @tvalentyn @chunyang
@pabloem I'm waiting for a review. I heard that this change may not be compatible with the internal repo at Google, but I hope @tvalentyn can confirm/deny. On my side I need to look at the Python PreCommit failure. It seems unrelated to me at first glance.
Thanks @chunyang for working on this and for your patience with this change. Looking closer at the change, adding a dependency on google-auth-httplib2 should be possible.
@@ -115,29 +131,32 @@ def get_service_credentials(cls):

  @staticmethod
  def _get_service_credentials():
    if is_running_in_gce:
      # We are currently running as a GCE taskrunner worker.
      return _GceAssertionCredentials(user_agent='beam-python-sdk/1.0')
I think this path was used to authenticate requests originating from GCE VMs: when the Beam SDK runs on Dataflow workers, its requests to GCP were authenticated simply because the SDK was running on a GCE VM.
I wonder how this will work with the new dependency.
Thanks for the review @tvalentyn. My understanding is that the `google.auth.default()` call in line 151 will attempt to find credentials on GCE VMs using the instance metadata server, so we don't need a special case within the Beam code. Is this something we can check via the existing integration tests?
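As a rough illustration of the behavior described above (a sketch under stated assumptions, not the code in this PR): `google.auth.default()` returns a `(credentials, project_id)` tuple and, on a GCE VM with no other credentials configured, falls back to the instance metadata server. The names `CLIENT_SCOPES`, `load_adc`, and the injectable `loader` parameter are hypothetical and exist only so the fallback logic can be exercised without network access.

```python
import logging
from typing import Any, Callable, Optional, Sequence, Tuple

# Hypothetical scope list, for illustration only.
CLIENT_SCOPES = ("https://www.googleapis.com/auth/cloud-platform",)


def load_adc(scopes: Sequence[str]) -> Tuple[Any, Optional[str]]:
    """Load Application Default Credentials (needs the google-auth package)."""
    import google.auth  # imported lazily so the sketch loads without it

    # Returns (credentials, project_id). On a GCE VM with no other credentials
    # configured, google-auth falls back to the instance metadata server.
    return google.auth.default(scopes=list(scopes))


def get_service_credentials(
    loader: Callable[[Sequence[str]], Tuple[Any, Optional[str]]] = load_adc,
    scopes: Sequence[str] = CLIENT_SCOPES,
) -> Optional[Any]:
    """Return credentials, or None so callers can fall back to anonymous access."""
    try:
        credentials, _project_id = loader(scopes)
        return credentials
    except Exception as exc:  # e.g. google.auth.exceptions.DefaultCredentialsError
        logging.warning("Unable to find default credentials: %s", exc)
        return None
```

With this shape there is no GCE-specific branch: the same call covers a developer machine with a key file and a worker VM, which is the point made in the comment above.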
Run Python 3.7 PostCommit
Yes, we should be able to check the behavior in postcommit integration tests.
Run Python 3.7 PostCommit
Run Python 3.7 PostCommit
Oops sorry, I broke the postcommit. Fix pending in #15584
Run Python 3.7 PostCommit
Hi @yeandy, I merged master and added you as a collaborator on my fork. Is that enough permissions?
Thanks, that should work.
I've done the internal verification, and everything seems to work fine.
Before final approval/merge, I have a few questions @tvalentyn.
- Do we also want to remove oauth2client from the four base_image_requirements.txt files under the sdks/python/container/py3* directories?
- Do we still need oauth2client in deps_urls_py.yaml?
sdks/python/build-requirements.txt
@@ -24,4 +24,4 @@ grpcio-tools==1.37.0
mypy-protobuf==1.18

# Avoid https://github.com/pypa/virtualenv/issues/2006
distlib==0.3.1
Why has this been bumped?
I ran into an error running tests locally when I didn't bump the version (see commit 0bd2c24). However, I don't think Jenkins had the same problem. We can probably revert this bump to minimize the change set?
Yes, I think we can revert this back.
Yes, we should regenerate the container dependencies. This can be done in a separate commit/PR, or you can follow https://s.apache.org/beam-python-requirements-generate and push more commits to this branch. We can remove the entry in deps_urls_py.yaml in the same change.
Running s.apache.org/beam-python-requirements-generate currently requires a Linux machine; some deps are not installable on macOS.
Thanks for the info! I'll start regenerating them.
R: @tvalentyn PTAL. Regenerated requirements, removed |
Happy to merge when tests pass, thanks all. |
Was just reviewing again, and not sure why, but |
Is it a transitive dep of some other dep? You can install pipdeptree to check.
Seems that it's a dependency of
|
@tvalentyn If I'm understanding this correctly, this doesn't require any additional regeneration, right?
Run Portable_Python PreCommit
Run PythonDocker PreCommit
Run Python PreCommit
Right.
We need to revert the change to dep_urls_py.yaml. |
retest this please
Hmm, not seeing test signals for some reason; might be a GitHub issue.
OK, triggered now.
Congratulations! And thank you both! |
This change switches the GCP auth library from oauth2client to google-auth. The google-auth-httplib2 library is used to provide authorized HTTP clients that work with the existing vendored libraries for BigQuery, GCS, etc.
I'm interested in this migration because of the need to use custom token URIs for issuing service account tokens; this is supported by google-auth but not by oauth2client.
dev list discussion thread: https://lists.apache.org/x/thread.html/r4345315bfa80c6181138d556317a45d7692cd2b37d1d341238e5440a@%3Cdev.beam.apache.org%3E
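To make the motivation concrete, here is a hedged sketch of how a custom token URI and an authorized httplib2 client might fit together. It is not taken from this PR: `with_token_uri` and `build_authorized_http` are hypothetical helpers, and the approach assumes the service account key is available as a dict whose `token_uri` field google-auth reads when building credentials.

```python
from typing import Any, Dict, Sequence


def with_token_uri(service_account_info: Dict[str, Any],
                   token_uri: str) -> Dict[str, Any]:
    """Return a copy of the key-file dict with its token endpoint replaced."""
    info = dict(service_account_info)
    info["token_uri"] = token_uri
    return info


def build_authorized_http(service_account_info: Dict[str, Any],
                          scopes: Sequence[str]) -> Any:
    """Build an httplib2-compatible client that signs requests.

    Needs the google-auth and google-auth-httplib2 packages, imported lazily
    so the sketch can be loaded without them.
    """
    import google_auth_httplib2
    from google.oauth2 import service_account

    credentials = service_account.Credentials.from_service_account_info(
        service_account_info, scopes=list(scopes))
    # AuthorizedHttp wraps an httplib2.Http and adds auth headers per request.
    return google_auth_httplib2.AuthorizedHttp(credentials)
```

The returned object exposes the same `request()` interface as `httplib2.Http`, which is why clients written against httplib2 can accept it unchanged.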