Refactor InstallRequirement #5051

Closed
pradyunsg opened this issue Mar 2, 2018 · 8 comments
Labels: auto-locked (Outdated issues that have been locked by automation), state: needs discussion (This needs some more discussion), type: refactor (Refactoring code)

Comments

@pradyunsg
Member

pradyunsg commented Mar 2, 2018

I've had this cooking in my head for a bit now. This isn't on the critical path for any feature, but it is definitely something I want to do to make pip more approachable for new contributors and to make the installation flow easier to reason about. I've hinted at this before in #4713 (comment).

The idea is that, eventually, the resolver should consume Requirements, get Candidates during resolution and spew out Distributions. The Distributions would then be operated upon during installation, uninstallation, etc.; downloading and caching would deal with Candidates; listing, freezing and checking would deal with Distributions. This 3-class model works cleanly for each of these use cases.
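A rough sketch of that flow, not pip's actual code: the three class names come from the paragraph above, while the fields, the `resolve` function, and the `index` mapping are hypothetical stand-ins chosen only to show which stage touches which type.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Requirement:
    """What the user asked for (hypothetical minimal shape)."""
    name: str


@dataclass(frozen=True)
class Candidate:
    """Something found on an index that could satisfy a Requirement."""
    name: str
    version: str
    url: str


@dataclass(frozen=True)
class Distribution:
    """What install/uninstall/list/freeze/check would operate on."""
    name: str
    version: str


def resolve(
    requirements: List[Requirement],
    index: Dict[str, List[Candidate]],
) -> List[Distribution]:
    """Toy resolver: consume Requirements, look at Candidates, emit Distributions.

    Assumption: the last candidate listed for a name is the preferred one.
    A real resolver would compare versions and handle conflicts.
    """
    chosen = []
    for req in requirements:
        best = index[req.name][-1]
        chosen.append(Distribution(best.name, best.version))
    return chosen
```

The point is only the type boundaries: user input stays in `Requirement`, anything carrying a URL stays in `Candidate`, and nothing downstream of resolution needs either.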

Currently, all three "jobs" are served by InstallRequirement (and InstallationCandidate :P), and it's sometimes hard to keep track of what's happening since there's a lot of stuff that's happening in that one class.

This would result in:

  • sanity while actually keeping track of the installation flow
  • an easier path later on to improving unit test coverage, with scope for improving test speed
  • better decoupling of UI elements from actual code, which makes #4649 (Redesigning pip's output for install, wheel and download commands) easier
  • ease when modifying behaviours for certain kinds of Distributions, which makes PEP 517/518 easier to implement

I'm curious what @pypa/pip-committers and others think about this since, all in all, this work would result in changes in about a quarter of pip._internal, in terms of LOC.


What follows is basically the mind-dump for this (from about 4 months back, I think):

  • Refactor Refactor Refactor! {confirm it looks fine, figure out path}
    • Do away with all the respect for InstallRequirement and start chopping it.
    • Rename InstallationCandidate to CandidateDistribution
    • Move all "link" related information into CandidateDistribution
    • Create pip._internal.models.requirement.Requirement - Holds information that is given by the user
      • Essentially, a helpful wrapper around packaging.requirements.Requirement
    • pip._internal.models.candidate.CandidateDistribution - Holds information fetched from the index
      • package-version visible to the resolver
      • url-hash visible to download logic
      • TODO: check how this is different from the current InstallationCandidate
      • TODO: Figure out how VCS looks with this?
    • pip._internal.models.distribution.Distribution - Describing the build, install, uninstall and other related behaviours for different kinds of distributions.
      • might wrap a pkg_resources.Distribution?
      • LegacySourceDistribution (non PEP 518/517)
      • ModernSourceDistribution (PEP 518/517)
      • WheelDistribution
      • Maybe do this early?
    • An intermediator to hold the logic of conversion from CandidateDistribution to a Distribution object
      • need to "build" to get dependencies and Distribution should hold that behaviour, not CandidateDistribution.
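The Distribution part of the list above could be sketched roughly as follows. This is an illustration under assumptions, not pip's implementation: the class names are taken from the list, but the `has_pyproject` flag, the `needs_build` method, and the `make_distribution` helper (standing in for the "intermediator") are hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateDistribution:
    """Information fetched from the index (fields are illustrative)."""
    name: str
    version: str
    url: str
    has_pyproject: bool = False  # hypothetical; real code would inspect the source tree


class Distribution(ABC):
    """Holds build/install behaviour for one kind of distribution."""

    def __init__(self, candidate: CandidateDistribution) -> None:
        self.candidate = candidate

    @abstractmethod
    def needs_build(self) -> bool:
        """Whether a build step is required before dependencies are known."""


class WheelDistribution(Distribution):
    def needs_build(self) -> bool:
        return False


class LegacySourceDistribution(Distribution):
    """Source distribution without PEP 518/517 support."""

    def needs_build(self) -> bool:
        return True


class ModernSourceDistribution(Distribution):
    """Source distribution using PEP 518/517."""

    def needs_build(self) -> bool:
        return True


def make_distribution(candidate: CandidateDistribution) -> Distribution:
    """Hypothetical intermediary: pick the Distribution kind for a candidate."""
    if candidate.url.endswith(".whl"):
        return WheelDistribution(candidate)
    if candidate.has_pyproject:
        return ModernSourceDistribution(candidate)
    return LegacySourceDistribution(candidate)
```

Keeping the conversion in one function means the candidate never needs to know how it gets built, which matches the last bullet: build behaviour lives on the Distribution.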

The reason I want to use pip._internal.models is to convey that these are essentially objects representing data stored elsewhere that would be operated upon. Isn't that what models are, in some sense? :P


PS: This comes after I am done with zazo and have brought it in. zazo is the project where I'm making the resolver.

@pradyunsg pradyunsg added type: refactor Refactoring code state: needs discussion This needs some more discussion labels Mar 2, 2018
@pradyunsg pradyunsg self-assigned this Mar 2, 2018
@pfmoore
Member

pfmoore commented Mar 3, 2018

In principle I'm +1 on this. Our internals are private precisely so we have the freedom to do stuff like this.

It sounds like a pretty major piece of work, so timing will be important. We'll need to juggle this, PEP 517/518, and the resolver. Maybe have a release for each of them, so that's pip 11, 12, and 13 allocated (in some order or other)?

@pradyunsg
Member Author

Great! ^>^

Somewhere around pip 11 or 12, depending on when PEP 517 work happens. If I have the choice, I'd do that after this refactor, but I don't want the refactor to hold it off: if someone makes a reviewable change for PEP 517, I'm happy to go ahead with that instead.

@pradyunsg
Member Author

@pfmoore is tackling PEP 517, I'll work on this refactor after that and then proceed to the resolver work.

The resolver is something that the pipenv maintainers are looking into so the refactoring would serve as a good chance to introduce the needed abstractions. :)

@pradyunsg
Member Author

I'm picking this up now. Wish me luck! 🌈

@cjerdonek
Member

cjerdonek commented May 25, 2019

My one piece of advice is to do this incrementally in reviewable PRs so you don't wind up with a giant PR that sits for a long time. This has many, many benefits, like letting people see where you're going, avoiding merge conflicts that are hard to handle, letting people review and comment, etc.

@pradyunsg
Member Author

Yep yep! Definitely will be doing this through a lot of small PRs.

@pradyunsg
Member Author

Current next steps:

  • Add 'distributions' subpackage.
  • Change the return type of "RequirementPreparer" to be the underlying Distribution object

The above is mostly just moving code around.

Broader "have to look at" things after this:

  • Move out some logic for "preparing" from InstallRequirement to SourceDistribution.
  • Based on how Distributions look, determine if there's a better model for the task at hand.
  • Look for a better way to model _get_abstract_dist_for so that it can be moved out of the resolver.

@pradyunsg
Member Author

Closing this in favor of #6607. :)

@lock lock bot added the auto-locked Outdated issues that have been locked by automation label Jul 14, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Jul 14, 2019