pip install [git repo] hangs on clone step on Windows with large repositories #8876

Closed
Mikuana opened this issue Sep 14, 2020 · 15 comments · Fixed by #9331
Labels
C: vcs pip's interaction with version control systems like git, svn and bzr type: bug A confirmed bug or unintended behavior
Milestone

Comments

@Mikuana

Mikuana commented Sep 14, 2020

Environment

  • pip version: 20.2
  • Python version: 3.7 and 3.8
  • OS: Windows

Python packages are available as git repositories, hosted in AWS CodeCommit, with authentication handled via AWS CLI integration with git credential helper.

Description

Pip version 20.2 causes VCS pip install from AWS CodeCommit to freeze during the clone step. This does not occur with all our repos, but occurs consistently for the ones with the problem.

My team has a suite of python packages that we develop for our internal use. These packages are available as git repositories, and we install them using the pip VCS integration feature.

pip install git+https://git-codecommit.us-east-2.amazonaws.com/v1/repos/MyDemoRepo

This solution has worked for us, without issue, until we upgraded to pip version 20.2. I have tested that all minor versions between 19.1 and 20.1 work as expected.

On version 20.2, with specific repositories, the install stalls at the clone step and never completes. The temporary folder is initialized, and the hidden .git directory appears to be populated and intact, but the rest of the repo is empty.

Adding the verbose switch does not provide any detail as to why the process is hanging. Examination of logs on the remote server suggests that the git command is being processed correctly.

Expected behavior
Using pip install git on a CodeCommit repo will complete or raise an error for all repositories.
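
Until the hang itself is fixed, one way to get the "complete or raise an error" behavior from the outside is to run the install under a timeout. The following is a minimal sketch, not part of pip; the helper name `run_with_timeout` and the timeout value are assumptions made for illustration.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd and return its exit code, or None if it exceeded the timeout.

    Hypothetical helper for illustration; it is not part of pip.
    """
    try:
        completed = subprocess.run(cmd, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before re-raising.
        # Processes started by that child (for example git) may outlive it.
        return None
    return completed.returncode
```

It could be called as `run_with_timeout([sys.executable, "-m", "pip", "install", "git+https://git-codecommit.us-east-2.amazonaws.com/v1/repos/MyDemoRepo"], 600)`, treating a `None` result as a stalled clone.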

How to Reproduce
Reproducing may be difficult, since the issue only occurs with some of our packages, but we don't know why. Our code is not public, so I can't expose it directly. However, the steps are as follows:

  1. Install python 3.7 and pip 20.2 on Windows
  2. Install AWS CLI
  3. Get an AWS API Token
  4. Configure .gitconfig credential helper to use AWS CLI for CodeCommit URL
  5. Start installing Python packages with pip install git+https://; we're not sure why some work and some don't, so this last step is not very informative.

Output

pip install git+https://git-codecommit.us-east-2.amazonaws.com/v1/repos/MyDemoRepo

Collecting git+https://git-codecommit.us-east-2.amazonaws.com/v1/repos/MyDemoRepo
  Cloning https://git-codecommit.us-east-2.amazonaws.com/v1/repos/MyDemoRepo to /tmp/pip-req-build-rpozcsdw
@uranusjr
Member

The only significant change to VCS recently is #8817 but that doesn’t seem that related since you’re just fetching the default branch. Do the stuck repos have submodules or something?

@Mikuana
Author

Mikuana commented Sep 15, 2020

No, there's no submodule, or anything exotic in terms of the git structure. It's a fairly vanilla setup, with a .gitignore. The file structure is like this:

repo/
|-- module/
|-- .gitignore
|-- LICENSE
|-- README.md
|-- setup.py

@zhangk1551

Same problem for python 3.6 & pip 20.2.4.

@pradyunsg
Member

Can folks confirm if this occurs with pip 20.3?

@pradyunsg pradyunsg added C: vcs pip's interaction with version control systems like git, svn and bzr S: awaiting response Waiting for a response/more information labels Dec 1, 2020
@Mikuana
Author

Mikuana commented Dec 2, 2020

Confirmed that the same issue occurs with pip 20.3

@no-response no-response bot removed the S: awaiting response Waiting for a response/more information label Dec 2, 2020
@pradyunsg pradyunsg changed the title Pip 20.2 git install hangs on clone step for certain CodeCommit repositories pip install [git repo] hangs on clone step for certain CodeCommit repositories Dec 2, 2020
@pradyunsg pradyunsg added the state: needs reproducer Need to reproduce issue label Dec 2, 2020
@pradyunsg
Member

Thanks for the confirmation!

Someone needs to provide us with a reproducer here, or contribute the fix, because pip's contributors currently have no way to test this.

@Mikuana
Author

Mikuana commented Dec 3, 2020

I'll see if I can further isolate the issue. I've done some testing with a few different variables. So far the only one that has made a difference is running the command from Ubuntu instead of Windows. It's possible this is OS-dependent, but whether that's a purely OS issue, an OS-pip issue, or an OS-git issue, I don't know yet.

@Mikuana
Author

Mikuana commented Dec 5, 2020

I believe I've isolated the issue down to the way that Git for Windows behaves when working with large repos. For some reason, Git on Windows outputs to stderr instead of stdout when reporting on the download of large repos. In combination with the changes made to subprocess calls in v20.2, for some reason this results in the entire subprocess hanging. Essentially, when the VCS install subprocess runs, it's looking for a response on stdout, but if the repo is large enough everything gets output to stderr instead, and the whole thing hangs. It doesn't seem like this should happen (there may be an underlying bug in Git or in the subprocess module), but everything stops at the stdout read.
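
The general failure mode described above can be sketched without git or pip. In this illustration a small Python child process stands in for a chatty `git clone`: it writes about 1 MiB to stderr, far more than a typical OS pipe buffer holds, and then one line to stdout. The names `CHILD` and `run_draining_both` are hypothetical, and this is not pip's actual subprocess code.

```python
import subprocess
import sys

# Stand-in for a chatty VCS command: a large amount of progress text on
# stderr, followed by a single line on stdout.
CHILD = (
    "import sys\n"
    "sys.stderr.write('x' * (1 << 20))\n"
    "sys.stderr.flush()\n"
    "print('done')\n"
)


def run_draining_both(cmd):
    """Run cmd with separate stdout/stderr pipes and drain both safely."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    # Calling proc.stdout.read() alone at this point risks a deadlock:
    # the child blocks writing to a full stderr pipe that nobody reads,
    # so it never reaches its stdout write and the parent waits forever.
    # communicate() reads both pipes concurrently and avoids that.
    out, err = proc.communicate()
    return proc.returncode, out, err
```

Running `run_draining_both([sys.executable, "-c", CHILD])` returns promptly, whereas replacing the `communicate()` call with a plain read of stdout would be expected to hang with this child.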

I've tested a fix based on v20.2.4 that first checks stderr, then checks stdout. I can't open a pull though, because there's no branch for that version, just a tag.
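
When stdout needs to be consumed line by line as it arrives, a common alternative is to drain stderr on a background thread. This is a different technique from the "check stderr first, then stdout" patch mentioned above, and it is not claimed to be what was eventually merged; the function name `stream_stdout_collect_stderr` is an assumption made for this sketch.

```python
import subprocess
import threading


def stream_stdout_collect_stderr(cmd):
    """Read stdout line by line while a thread keeps the stderr pipe empty."""
    stderr_chunks = []
    with subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    ) as proc:

        def drain_stderr():
            # Runs until the child closes stderr, so the pipe never fills up.
            for line in proc.stderr:
                stderr_chunks.append(line)

        drainer = threading.Thread(target=drain_stderr, daemon=True)
        drainer.start()
        # Safe to block on stdout here because stderr is drained in parallel.
        stdout_lines = [line.rstrip("\n") for line in proc.stdout]
        drainer.join()
        returncode = proc.wait()
    return returncode, stdout_lines, "".join(stderr_chunks)
```

The trade-off compared with `communicate()` is a little more code in exchange for being able to act on each stdout line, for example to log it, while the command is still running.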

I've got a link to the comparison below. What's the process to fully test this and get it merged as a patch on 20.2 and 20.3?

https://github.com/pypa/pip/compare/pypa:20.2.4...Mikuana:v20.2-vcs-bugfix?expand=1

@Mikuana
Author

Mikuana commented Dec 5, 2020

@pradyunsg would it be possible to get a branch created from the 20.2.4 tag so that I can open a pull request directly against it? Then we can discuss the rest of the details in the PR instead of in this issue.

@Mikuana Mikuana changed the title pip install [git repo] hangs on clone step for certain CodeCommit repositories pip install [git repo] hangs on clone step on Windows with large repositories Dec 5, 2020
@uranusjr
Member

uranusjr commented Dec 5, 2020

But you won’t be able to get the patch into a new pip version if you don’t create the patch against 20.3 (or master). What are the problems preventing a branch from 20.3 being possible?

@Mikuana
Author

Mikuana commented Dec 5, 2020

None that I know of. I was planning to do that next. I was just thinking it would be best to patch the minor version where the issue first appeared, then work the change into the rest of the history from there.

That way, if someone is doing a large VCS install, it will work in 20.1, 20.2, and 20.3.

@pradyunsg
Member

Feel free to make a PR against the current master. We usually don't backport fixes, as we follow CalVer, and we don't usually cut bugfix releases for older versions.

I don't see why this issue needs to be treated differently anyway, so I reckon a PR to master is sufficient. ^>^

@Mikuana
Author

Mikuana commented Dec 6, 2020

@pradyunsg sounds good. I think there's a case to be made for patching 20.2, but I'm certainly not going to argue against your normal process. I'll set up a pull request against master and reference it here.

@Mikuana
Author

Mikuana commented Dec 6, 2020

PR to master opened in #9234

@sbidoul sbidoul added this to the 20.3.4 milestone Dec 20, 2020
@sbidoul sbidoul added type: bug A confirmed bug or unintended behavior and removed state: needs reproducer Need to reproduce issue labels Dec 20, 2020
@sbidoul
Member

sbidoul commented Jan 10, 2021

@pradyunsg Should we reopen this to track the backport to 20.3.4? Or is there a better way to track it?

bors bot referenced this issue in duckinator/emanate Jan 31, 2021
215: Update pip to 21.0.1 r=duckinator a=pyup-bot


This PR updates [pip](https://pypi.org/project/pip) from **20.3.3** to **21.0.1**.



<details>
  <summary>Changelog</summary>
  
  
   ### 21.0.1
   ```
   ===================

Bug Fixes
---------

- commands: debug: Use packaging.version.parse to compare between versions. (`9461 <https://github.com/pypa/pip/issues/9461>`_)
- New resolver: Download and prepare a distribution only at the last possible
  moment to avoid unnecessary network access when the same version is already
  installed locally. (`9516 <https://github.com/pypa/pip/issues/9516>`_)

Vendored Libraries
------------------

- Upgrade packaging to 20.9
   ```
   
  
  
   ### 21.0
   ```
   =================

Deprecations and Removals
-------------------------

- Drop support for Python 2. (`6148 <https://github.com/pypa/pip/issues/6148>`_)
- Remove support for legacy wheel cache entries that were created with pip
  versions older than 20.0. (`7502 <https://github.com/pypa/pip/issues/7502>`_)
- Remove support for VCS pseudo URLs editable requirements. It was emitting
  deprecation warning since version 20.0. (`7554 <https://github.com/pypa/pip/issues/7554>`_)
- Modernise the codebase after Python 2. (`8802 <https://github.com/pypa/pip/issues/8802>`_)
- Drop support for Python 3.5. (`9189 <https://github.com/pypa/pip/issues/9189>`_)
- Remove the VCS export feature that was used only with editable VCS
  requirements and had correctness issues. (`9338 <https://github.com/pypa/pip/issues/9338>`_)

Features
--------

- Add ``--ignore-requires-python`` support to pip download. (`1884 <https://github.com/pypa/pip/issues/1884>`_)
- New resolver: Error message shown when a wheel contains inconsistent metadata
  is made more helpful by including both values from the file name and internal
  metadata. (`9186 <https://github.com/pypa/pip/issues/9186>`_)

Bug Fixes
---------

- Fix a regression that made ``pip wheel`` do a VCS export instead of a VCS clone
  for editable requirements. This broke VCS requirements that need the VCS
  information to build correctly. (`9273 <https://github.com/pypa/pip/issues/9273>`_)
- Fix ``pip download`` of editable VCS requirements that need VCS information
  to build correctly. (`9337 <https://github.com/pypa/pip/issues/9337>`_)

Vendored Libraries
------------------

- Upgrade msgpack to 1.0.2.
- Upgrade requests to 2.25.1.

Improved Documentation
----------------------

- Render the unreleased pip version change notes on the news page in docs. (`9172 <https://github.com/pypa/pip/issues/9172>`_)
- Fix broken email link in docs feedback banners. (`9343 <https://github.com/pypa/pip/issues/9343>`_)


.. note

    You should *NOT* be adding new change log entries to this file, this
    file is managed by towncrier. You *may* edit previous change logs to
    fix problems like typo corrections or such.

    To add a new change log entry, please see
        https://pip.pypa.io/en/latest/development/contributing/#news-entries

.. towncrier release notes start
   ```
   
  
  
   ### 20.3.4
   ```
   ===================

Features
--------

- ``pip wheel`` now verifies the built wheel contains valid metadata, and can be
  installed by a subsequent ``pip install``. This can be disabled with
  ``--no-verify``. (`9206 <https://github.com/pypa/pip/issues/9206>`_)
- Improve presentation of XMLRPC errors in pip search. (`9315 <https://github.com/pypa/pip/issues/9315>`_)

Bug Fixes
---------

- Fixed hanging VCS subprocess calls when the VCS outputs a large amount of data
  on stderr. Restored logging of VCS errors that was inadvertently removed in pip
  20.2. (`8876 <https://github.com/pypa/pip/issues/8876>`_)
- Fix error when an existing incompatibility is unable to be applied to a backtracked state. (`9180 <https://github.com/pypa/pip/issues/9180>`_)
- New resolver: Discard a faulty distribution, instead of quitting outright.
  This implementation is taken from 20.2.2, with a fix that always makes the
  resolver iterate through candidates from indexes lazily, to avoid downloading
  candidates we do not need. (`9203 <https://github.com/pypa/pip/issues/9203>`_)
- New resolver: Discard a source distribution if it fails to generate metadata,
  instead of quitting outright. This implementation is taken from 20.2.2, with a
  fix that always makes the resolver iterate through candidates from indexes
  lazily, to avoid downloading candidates we do not need. (`9246 <https://github.com/pypa/pip/issues/9246>`_)

Vendored Libraries
------------------

- Upgrade resolvelib to 0.5.4.
   ```
   
  
</details>


 

<details>
  <summary>Links</summary>
  
  - PyPI: https://pypi.org/project/pip
  - Changelog: https://pyup.io/changelogs/pip/
  - Homepage: https://pip.pypa.io/
</details>



Co-authored-by: pyup-bot <[email protected]>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 4, 2021