Fix the URL quoting in _clean_link() for IPv6 addresses #6245

Merged (1 commit) on Apr 8, 2019

nicolasbock
Contributor

@nicolasbock nicolasbock commented Feb 7, 2019

In cases where the base_url is a [] protected IPv6 address, the
_clean_link() function converts [ to %5B and ] to %5D, which
renders the base_url invalid. For example:

	Starting new HTTP connection (1): fd00:0:0:236::100:8181
	http://fd00:0:0:236::100:8181 "GET /os-releases/19.0.0.0b1/opensuse_leap-42.3-x86_64/requirements_absolute_requirements.txt HTTP/1.1" 200 None
	Setting setuptools==40.6.3 (from -c http://[fd00:0:0:236::100]:8181/os-releases/19.0.0.0b1/opensuse_leap-42.3-x86_64/requirements_absolute_requirements.txt (line 204)) extras to: ()
	Looking in indexes: http://[fd00:0:0:236::100]:8181/simple
	Collecting setuptools==40.6.3 (from -c http://[fd00:0:0:236::100]:8181/os-releases/19.0.0.0b1/opensuse_leap-42.3-x86_64/requirements_absolute_requirements.txt (line 204))
	  1 location(s) to search for versions of setuptools:
	  * http://[fd00:0:0:236::100]:8181/simple/setuptools/
	  Getting page http://[fd00:0:0:236::100]:8181/simple/setuptools/
	  http://fd00:0:0:236::100:8181 "GET /simple/setuptools/ HTTP/1.1" 200 376
	  Analyzing links from page http://[fd00:0:0:236::100]:8181/simple/setuptools/
	    _package_versions: link = http://%5bfd00:0:0:236::100%5d:8181/packages/opensuse_leap-42.3-x86_64/setuptools/setuptools-40.6.3-py2.py3-none-any.whl#md5=389d3cd088d7afec3a1133b1d8e15df0 (from http://[fd00:0:0:236::100]:8181/simple/setuptools/)
	    _link_package_versions: link = http://%5bfd00:0:0:236::100%5d:8181/packages/opensuse_leap-42.3-x86_64/setuptools/setuptools-40.6.3-py2.py3-none-any.whl#md5=389d3cd088d7afec3a1133b1d8e15df0 (from http://[fd00:0:0:236::100]:8181/simple/setuptools/)
	    Found link http://%5bfd00:0:0:236::100%5d:8181/packages/opensuse_leap-42.3-x86_64/setuptools/setuptools-40.6.3-py2.py3-none-any.whl#md5=389d3cd088d7afec3a1133b1d8e15df0 (from http://[fd00:0:0:236::100]:8181/simple/setuptools/), version: 40.6.3
	  Using version 40.6.3 (newest of versions: 40.6.3)
        Could not install packages due to an EnvironmentError.
        InvalidURL: Failed to parse: %5bfd00:0:0:236::100%5d:8181

This change moves the base_url outside the _clean_link call,
leaving it unaltered.

Signed-off-by: Nicolas Bock [email protected]

Fixes: #6285

@nicolasbock nicolasbock force-pushed the ipv6_uri branch 5 times, most recently from 2fe0cec to 15f17db, February 12, 2019 13:09
@uranusjr
Member

Replacing hand-crafted URL escaping with stdlib calls is almost always the right thing to do. +1 to this, with one small request on the implementation.

@nicolasbock
Contributor Author

Do you want me to do anything else for this PR @uranusjr ?

@uranusjr
Member

Do you know why CI is failing? It would be ideal if we could get a green tick. Also, I think this change needs to appear in the changelog instead of being classified as trivial.

@nicolasbock nicolasbock force-pushed the ipv6_uri branch 2 times, most recently from cda40bf to 910e3e9, February 21, 2019 18:25
@nicolasbock
Contributor Author

I updated the news item. I will have a look at the CI failure. I don't have Windows so hopefully I can get something out of the console output 😃

@pradyunsg pradyunsg added the "type: bugfix" and "S: needs triage" (Issues/PRs that need to be triaged) labels Mar 11, 2019
@pradyunsg
Member

Hey @nicolasbock! Did you manage to take a look at the CI failure?

@nicolasbock
Contributor Author

@pradyunsg Yes, but I haven't been able to make a lot of progress on it. Since I don't have access to Windows I can't reproduce the issue to troubleshoot. Do you have a suggestion as to how to go about this?

Member

@cjerdonek cjerdonek left a comment

Some comments.

tests/unit/test_url.py (outdated, resolved)
_clean_path)


def test_ipv4_clean_link():
Member

Rather than making a separate test for each test case, it would be better to use @pytest.mark.parametrize(). Then each test case can simply be an additional argument to parametrize().
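
As a rough illustration of that suggestion, several single-case tests could collapse into one parametrized test. This is only a sketch: `quote_path_sketch()` is a hypothetical stand-in for the function under test, and the cases are made up for illustration rather than taken from the PR.

```python
import pytest
from urllib.parse import quote, unquote


def quote_path_sketch(path):
    """Hypothetical stand-in for the code under test: quote a URL path once."""
    # Unquote first so that already-quoted input is not quoted twice.
    return quote(unquote(path), safe="/")


@pytest.mark.parametrize(
    ("path", "expected"),
    [
        ("/simple/setuptools/", "/simple/setuptools/"),
        ("/in dex/simple/", "/in%20dex/simple/"),
        ("/in%20dex/simple/", "/in%20dex/simple/"),
    ],
)
def test_quote_path_sketch(path, expected):
    # Each tuple above is reported by pytest as its own test case.
    assert quote_path_sketch(path) == expected
```

Adding a new case then means adding one more tuple to the list, with no new test function.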

Contributor Author

Done. I am still using two extra variables, _path and _clean_path. Is that appropriate? Or should I fold that into the arguments for the decorator?

src/pip/_internal/index.py (outdated, resolved)
@cjerdonek cjerdonek removed the "S: needs triage" (Issues/PRs that need to be triaged) label Mar 23, 2019
@cjerdonek
Member

cjerdonek commented Mar 23, 2019

Also, I'm not completely sure, but maybe the Windows failure is in part because the PR causes the : after the drive letter to be quoted. From the CI output--

Collecting simple
  1 location(s) to search for versions of simple:
  * file:///T:/pytest-1/popen-gw2/test_file_index_url_quoting0/data/indexes/in%20dex/simple/
   file: URL is directory, getting file:///T:/pytest-1/popen-gw2/test_file_index_url_quoting0/data/indexes/in%20dex/simple/index.html
  Getting page file:///T:/pytest-1/popen-gw2/test_file_index_url_quoting0/data/indexes/in%20dex/simple/index.html
  Analyzing links from page file:///T:/pytest-1/popen-gw2/test_file_index_url_quoting0/data/indexes/in%20dex/simple/index.html
    Found link file:///T%3A/pytest-1/popen-gw2/test_file_index_url_quoting0/data/packages/simple-1.0.tar.gz#md5=4bdf78ebb7911f215c1972cf71b378f0 (from file:///T:/pytest-1/popen-gw2/test_file_index_url_quoting0/data/indexes/in%20dex/simple/index.html), version: 1.0
  Using version 1.0 (newest of versions: 1.0)
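
The effect described above can be reproduced with the standard library alone. The snippet below is a minimal sketch, not pip's code: it uses Python 3's `urllib.parse` and a shortened, hypothetical version of the path from the CI log to show that quoting the path with the default `safe="/"` also percent-encodes the colon after the drive letter.

```python
from urllib.parse import quote, urlparse

# Shortened, hypothetical variant of the file URL from the CI output.
url = "file:///T:/pytest-1/data/packages/simple-1.0.tar.gz"
path = urlparse(url).path

# With the default safe="/" the drive-letter colon is percent-encoded.
quoted_default = quote(path)

# Marking ":" as safe leaves the drive letter untouched.
quoted_colon_safe = quote(path, safe="/:")

print(path)               # /T:/pytest-1/data/packages/simple-1.0.tar.gz
print(quoted_default)     # /T%3A/pytest-1/data/packages/simple-1.0.tar.gz
print(quoted_colon_safe)  # /T:/pytest-1/data/packages/simple-1.0.tar.gz
```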

@nicolasbock nicolasbock force-pushed the ipv6_uri branch 2 times, most recently from a310119 to 9520258, April 1, 2019 22:37
@nicolasbock
Contributor Author

That's a good point. I will have a look into how to parse a Windows path. Presumably / is equivalent to \?

@nicolasbock
Contributor Author

I added a Windowsish path to the test. Note that at this point this is expected to fail.

@nicolasbock
Contributor Author

nicolasbock commented Apr 2, 2019

@cjerdonek I think I got it now. The test is passing for me.

However, there is no check whether pip is running on a Windows host; I simply assume that anything that looks like a drive letter is a drive letter. There might therefore be cases where it is desirable to keep the : in the "drive letter" quoted. For example, as far as I know, on Linux the path /C:/a/b/c is legal, and there C: is obviously not a drive letter. I don't know whether leaving the : unquoted can cause problems.
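
To make the trade-off concrete, here is a minimal sketch of such a platform-agnostic heuristic. It is an assumption about the general approach, not the code in this PR: `clean_link_sketch()` and `_DRIVE_RE` are hypothetical names, and the standard library's `urllib.parse` stands in for pip's vendored copy.

```python
import re
from urllib.parse import quote, unquote, urlparse, urlunparse

# Assumption: a path starting with "/<letter>:" is treated as a drive letter,
# regardless of the platform the code runs on.
_DRIVE_RE = re.compile(r"^/[A-Za-z]:")


def clean_link_sketch(url):
    """Quote only the path of a URL, leaving a leading drive letter alone."""
    parts = urlparse(url)
    # Unquote first so that nothing ends up double quoted.
    path = unquote(parts.path)
    if parts.netloc == "" and _DRIVE_RE.match(path):
        # Keep "/C:" verbatim and quote only what follows it.
        path = path[:3] + quote(path[3:], safe="/")
    else:
        path = quote(path, safe="/")
    return urlunparse(parts._replace(path=path))


print(clean_link_sketch("file:///T:/data/in dex/simple-1.0.tar.gz"))
print(clean_link_sketch("file:///srv/a:b/c"))
```

The first call keeps `T:` intact while still encoding the space; the second still encodes the colon, because it does not sit in the drive-letter position. A local path such as `/C:/a/b/c` on Linux would keep its colon unquoted under this heuristic, which is exactly the ambiguity raised above.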

src/pip/_internal/index.py (outdated, resolved)
src/pip/_internal/index.py (outdated, resolved)
tests/unit/test_index.py (outdated, resolved)
@cjerdonek
Member

@nicolasbock That looks good. That's what I meant.

For the few test cases where the test expectation will be platform-dependent, the way I would do it is break those few cases off into test_clean_link_windows() and test_clean_link_non_windows(), and then use an appropriate pytest condition so that the test will be skipped if the platform is windows (or not), etc. I'm sure you can find some existing examples in the test suite.

@nicolasbock
Contributor Author

Thanks for the hint @cjerdonek . I have split the one test into two as you suggested. Let's see if I got the test conditions for Windows right this time 😄.

@cjerdonek
Member

Looking through the code base, the pattern I see being used is this--

@pytest.mark.skipif("sys.platform != 'win32'")

(or == in place of !=)

@nicolasbock
Contributor Author

Isn't there a win64 also?

@uranusjr
Member

uranusjr commented Apr 4, 2019

No, 64-bit Windows also identifies itself as win32.
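
A minimal sketch of how a platform-dependent expectation might be split into two tests that skip themselves on the wrong platform. The test names and the `os.path.splitdrive()` example are hypothetical and chosen only because the result differs between Windows and other systems; the boolean form of `skipif` with a `reason` is used here as an alternative to the string condition quoted earlier in the thread.

```python
import os.path
import sys

import pytest

# sys.platform is "win32" on both 32-bit and 64-bit Windows.
IS_WINDOWS = sys.platform == "win32"


@pytest.mark.skipif(not IS_WINDOWS, reason="needs Windows path semantics")
def test_drive_letter_windows():
    # On Windows, "C:" is recognised as a drive.
    assert os.path.splitdrive("C:/a/b")[0] == "C:"


@pytest.mark.skipif(IS_WINDOWS, reason="needs non-Windows path semantics")
def test_drive_letter_non_windows():
    # Elsewhere there are no drives, so the drive part is empty.
    assert os.path.splitdrive("C:/a/b")[0] == ""
```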

@nicolasbock
Contributor Author

Thanks @uranusjr . I adjusted the skip expressions @cjerdonek .

@nicolasbock
Contributor Author

@uranusjr , @pradyunsg , @cjerdonek , Any chance for another review?

@pfmoore Could you check this PR on Windows again to make sure I got this right?

Thanks all!

Member

@cjerdonek cjerdonek left a comment

Some wording suggestions on the comments. Otherwise looks good!

news/6285.bugfix (outdated, resolved)
src/pip/_internal/index.py (outdated, resolved)
src/pip/_internal/index.py (outdated, resolved)
src/pip/_internal/index.py (outdated, resolved)
src/pip/_internal/index.py (outdated, resolved)
tests/unit/test_index.py (outdated, resolved)
When the `base_url` is a `[]` protected IPv6 address, the
`_clean_link()` function converts `[` to `%5B` and `]` to `%5D`, which
renders the `base_url` invalid. For example:

```
	Starting new HTTP connection (1): fd00:0:0:236::100:8181
	http://fd00:0:0:236::100:8181 "GET /os-releases/19.0.0.0b1/opensuse_leap-42.3-x86_64/requirements_absolute_requirements.txt HTTP/1.1" 200 None
	Setting setuptools==40.6.3 (from -c http://[fd00:0:0:236::100]:8181/os-releases/19.0.0.0b1/opensuse_leap-42.3-x86_64/requirements_absolute_requirements.txt (line 204)) extras to: ()
	Looking in indexes: http://[fd00:0:0:236::100]:8181/simple
	Collecting setuptools==40.6.3 (from -c http://[fd00:0:0:236::100]:8181/os-releases/19.0.0.0b1/opensuse_leap-42.3-x86_64/requirements_absolute_requirements.txt (line 204))
	  1 location(s) to search for versions of setuptools:
	  * http://[fd00:0:0:236::100]:8181/simple/setuptools/
	  Getting page http://[fd00:0:0:236::100]:8181/simple/setuptools/
	  http://fd00:0:0:236::100:8181 "GET /simple/setuptools/ HTTP/1.1" 200 376
	  Analyzing links from page http://[fd00:0:0:236::100]:8181/simple/setuptools/
	    _package_versions: link = http://%5bfd00:0:0:236::100%5d:8181/packages/opensuse_leap-42.3-x86_64/setuptools/setuptools-40.6.3-py2.py3-none-any.whl#md5=389d3cd088d7afec3a1133b1d8e15df0 (from http://[fd00:0:0:236::100]:8181/simple/setuptools/)
	    _link_package_versions: link = http://%5bfd00:0:0:236::100%5d:8181/packages/opensuse_leap-42.3-x86_64/setuptools/setuptools-40.6.3-py2.py3-none-any.whl#md5=389d3cd088d7afec3a1133b1d8e15df0 (from http://[fd00:0:0:236::100]:8181/simple/setuptools/)
	    Found link http://%5bfd00:0:0:236::100%5d:8181/packages/opensuse_leap-42.3-x86_64/setuptools/setuptools-40.6.3-py2.py3-none-any.whl#md5=389d3cd088d7afec3a1133b1d8e15df0 (from http://[fd00:0:0:236::100]:8181/simple/setuptools/), version: 40.6.3
	  Using version 40.6.3 (newest of versions: 40.6.3)
        Could not install packages due to an EnvironmentError.
        InvalidURL: Failed to parse: %5bfd00:0:0:236::100%5d:8181
```

This change uses the vendored `urllib` library to split the host part
off of the url before URL quoting only the path part.

Fixes: pypa#6285
Signed-off-by: Nicolas Bock <[email protected]>
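
The approach described in the commit message can be sketched as follows. This is an illustration under stated assumptions rather than the merged code: it uses Python 3's standard `urllib.parse` instead of pip's vendored module, `quote_path_only()` is a hypothetical name, and the URL is a shortened, made-up variant of the one in the log.

```python
from urllib.parse import quote, unquote, urlparse, urlunparse


def quote_path_only(url):
    """Split the URL, quote just the path, and put the parts back together."""
    parts = urlparse(url)
    # Unquote before quoting so an already-quoted path is not quoted twice.
    path = quote(unquote(parts.path), safe="/")
    return urlunparse(parts._replace(path=path))


url = "http://[fd00:0:0:236::100]:8181/packages/some dir/pkg-1.0.whl#md5=abc"

# Quoting the URL as one string also escapes the brackets around the host.
naive = quote(url, safe="/:#=")

# Quoting only the path leaves the bracketed IPv6 host intact.
fixed = quote_path_only(url)

print(naive)  # http://%5Bfd00:0:0:236::100%5D:8181/packages/some%20dir/pkg-1.0.whl#md5=abc
print(fixed)  # http://[fd00:0:0:236::100]:8181/packages/some%20dir/pkg-1.0.whl#md5=abc
```

Because the host never passes through `quote()`, the brackets survive, and running the function on its own output leaves the URL unchanged.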
@nicolasbock
Contributor Author

Thanks for the comments @cjerdonek ! I pushed another iteration incorporating your suggestions.

Member

@cjerdonek cjerdonek left a comment

Looks great! Thanks for all your work and persistence on this. :) Before merging this I'll give @pfmoore another few days or so to take a look if he wants.

@cjerdonek cjerdonek changed the title from "Do not clean base_url" to "Fix the URL quoting in _clean_link() for IPv6 addresses" Apr 8, 2019
@cjerdonek cjerdonek added the "C: finder" (PackageFinder and index related code) label Apr 8, 2019
@pfmoore
Member

pfmoore commented Apr 8, 2019

I'm OK with it - I'm not going to have time to have another look in the next few days, so go for it. And thanks @nicolasbock for all the work on this!

@cjerdonek
Member

Okay, thanks, @pfmoore!

@cjerdonek cjerdonek merged commit 54b6a91 into pypa:master Apr 8, 2019
@cjerdonek
Member

Thanks again, @nicolasbock!

@nicolasbock
Contributor Author

Thanks to you all for your helpful reviews!

@nicolasbock nicolasbock deleted the ipv6_uri branch April 8, 2019 10:39
@pradyunsg
Member

pradyunsg commented Apr 11, 2019

This is nice! Thanks @nicolasbock for doing this and @cjerdonek, @pfmoore and @uranusjr for the reviews and inputs here. :D

@lock

lock bot commented May 28, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot added the "auto-locked" (Outdated issues that have been locked by automation) label May 28, 2019
@lock lock bot locked as resolved and limited conversation to collaborators May 28, 2019
Linked issue: A [] protected IPv6 address is incorrectly URL quoted