Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Created new module: unpacking #6866

Closed
wants to merge 14 commits into from
Closed

Created new module: unpacking #6866

wants to merge 14 commits into from

Conversation

ivenk
Copy link

@ivenk ivenk commented Aug 12, 2019

Fixes #6861

Only code refactoring was done.

Moved download::unpack_file_url() and misc::unzip_file(), misc::untar_file(), misc::unpack_file() to a new module. The new module is named unpacking and is located in pip/_internal/utils/

I ran all the tests and locally everything works.

Copy link
Member

@cjerdonek cjerdonek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Some initial comments.

@@ -21,23 +21,19 @@
from pip._vendor.requests.models import CONTENT_CHUNK_SIZE, Response
from pip._vendor.requests.structures import CaseInsensitiveDict
from pip._vendor.requests.utils import get_netrc_auth
# NOTE: XMLRPC Client is not annotated in typeshed as on 2017-07-17, which is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment shouldn't be removed.

from pip._vendor.six.moves import xmlrpc_client # type: ignore
from pip._vendor.six.moves.urllib import parse as urllib_parse
from pip._vendor.six.moves.urllib import request as urllib_request

import pip
from pip._internal.exceptions import HashMismatch, InstallationError
from pip._internal.models.index import PyPI
# Import ssl from compat so the initial import occurs in only one place.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment also shouldn't be removed (don't remove comments unrelated to the PR).

from collections import deque

from pip._vendor import pkg_resources
# NOTE: retrying is not annotated in typeshed as on 2017-07-17, which is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

current_umask,
ensure_dir,
file_contents,
has_leading_dir,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_svn_page(), has_leading_dir() and split_leading_dir() should also be moved to the new module, as the only place they're used is in the unpack functions.

logger.debug('lzma module is not available')


def unpack_file_url(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be best if the functions in the module are ordered so that functions only depend on function that occur above them in the module, so ordered low-level to high-level. That way the module can be read top-to-bottom, and it becomes easier to locate things.

Also, the tests of the functions moved to this module should be moved to e.g. test_unpacking.py, following the pattern of other test modules.

is_dir_url,
url_to_path,
_check_download_dir,
_copy_file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this module shouldn't have dependencies on download.py (avoid introducing new nested imports, by the way). Rather, the functions in download.py that unpack_file_url() depends on should also be moved into this module. We don't want to have dependencies from utils to higher-level modules outside of it like download.py.

Copy link
Member

@chrahunt chrahunt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few small comments, but in general I'm concerned about the placement of unpack_file_url. It embeds a lot of really specific business logic whereas the others are closer to what I'd call utilities. Maybe it would be worth considering keeping it outside utils.

Any, AnyStr, Container, Iterable, List, Mapping, Match, Optional, Text,
Tuple, Union, cast,
Any, Container, Iterable, List, Mapping, Optional, Text,
Tuple, cast, AnyStr, Union, Match
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would alphabetize these.

from typing import Optional
from pip._internal.models.link import Link
from pip._internal.utils.hashes import Hashes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put two lines between the end of imports and body.

@@ -0,0 +1,286 @@
# mypy: strict-optional=False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elsewhere we pair this with # The following comment should be removed at some point in the future., we should do the same here.

@cjerdonek
Copy link
Member

Maybe it would be worth considering keeping it outside utils.

What location are you suggesting for unpack_file_url()? A stand-alone module alongside wheel.py or in wheel.py itself? I don't think it should be inside download.py. I wouldn't object to putting it in wheel.py. One of the main things IMO is removing the dependency of wheel.py on download.py. I do think that putting it in unpacking.py is better than the status quo from a dependency perspective. And we could always continue with further improvements afterwards.

@cjerdonek
Copy link
Member

I wouldn't object to putting it in wheel.py.

Actually, now that I look at it, wheel.py is also a file with ~1200 lines, so I don't think we'd want to be complicating that file further. So a stand-alone file outside of utils? A stand-alone file inside operations? We can always revisit this further after the PR. One thing consistent with putting it in utils is that it doesn't add additional third-party dependencies to utils (like putting network-accessing code would). These are functions "without significant third-party dependencies."

@chrahunt
Copy link
Member

chrahunt commented Aug 13, 2019

I agree, it shouldn't be in download.py. A standalone module would be better than wheel.py IMO since we can iterate with less tedious import management (and like you mention, it's already pretty big). Maybe local_project.py? If a few minutes of thought doesn't result in a clear winner then I'm OK with it going into unpacking.py, just keeping in mind that we'll need to drag a few other things into it as a result of #6844.

@cjerdonek
Copy link
Member

cjerdonek commented Aug 13, 2019

What about something like unpacker.py, to give it a file name different from unpacking to make grepping easier? The idea is that this module supports unpacking for a specific use case, as it contains business logic as you say.

@chrahunt
Copy link
Member

Sounds fine to me. In _internal?

@cjerdonek
Copy link
Member

cjerdonek commented Aug 13, 2019

Yes, sibling to download.py and wheel.py.

@cjerdonek
Copy link
Member

cjerdonek commented Aug 13, 2019

Okay, great! @ivenk Did you get that? You can continue with your unpacking.py changes and review comments, but in addition put unpack_file_url() and the download.py functions it depends on in a second new module called unpacker.py next to download.py.

@ivenk
Copy link
Author

ivenk commented Aug 13, 2019

@cjerdonek i got it, thanks ! I will fix everything mentioned here and create a new file for unpack_file_url() and required functions.

@BrownTruck
Copy link
Contributor

Hello!

I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the master branch into this pull request or rebase this pull request against master then it will be eligible for code review and hopefully merging!

@BrownTruck BrownTruck added the needs rebase or merge PR has conflicts with current master label Aug 16, 2019
@cjerdonek
Copy link
Member

@ivenk This needs a rebase. Also, in your latest PR, it looks like all of the from pip._vendor imports were moved to be after the from pip._internal imports (e.g. in download.py). You should keep the import statements in their same relative location, i.e. don't do any reordering of import statements.

@ivenk
Copy link
Author

ivenk commented Aug 19, 2019

@cjerdonek thanks for the hint i think that happened during some refactoring from the ide.
Unfortunately i do not have time to look at this until the weekend so if anyone wants to pick this up, feel free to do so.

@cjerdonek
Copy link
Member

Okay, thanks for letting us know. Maybe we'll switch back to proceeding on some things that we were holding off on for this, or else take this (or parts of this) up to help ease the burden.

@chrahunt
Copy link
Member

Hi @ivenk. We were able to take care of this in #7028, #7037, and #7044, which were definitely easier having seen the changes and comments here. Thanks a lot for your help. :)

I will close this now, but please don't hesitate to pick up any other issues you find interesting.

@chrahunt chrahunt closed this Sep 21, 2019
@lock lock bot added the auto-locked Outdated issues that have been locked by automation label Oct 21, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 21, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
auto-locked Outdated issues that have been locked by automation needs rebase or merge PR has conflicts with current master
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move unpack_file_url() out of download.py
4 participants