Use a set for TargetPython.get_tags for performance #12204
Merged
This makes a representative pip-compile about 1.6x faster.
This makes the set.isdisjoint operation done here much cheaper: https:/pypa/pip/blob/593b85f4a/src/pip/_internal/models/wheel.py#L92
The reason is that a Python interpreter usually has a lot of supported tags. The Python I used above has 2000 supported tags, whereas a wheel usually has only one or two file tags.
Unfortunately, when the argument is not a set, CPython iterates over all 2000 tags to check for a match. Only if the other collection is also a set does CPython swap the operands and iterate over the shorter collection: https:/python/cpython/blob/35963da40f/Objects/setobject.c#L1352
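
To illustrate the difference, here is a minimal sketch, not pip's actual code. The tag strings and the `CountingList` helper are hypothetical stand-ins: the helper only exists to count how many items `set.isdisjoint` pulls from a non-set argument. The worst case is shown, where the wheel matches none of the supported tags.

```python
class CountingList(list):
    """Hypothetical helper: a list that counts how many items are iterated."""

    def __init__(self, *args):
        super().__init__(*args)
        self.yielded = 0

    def __iter__(self):
        for item in super().__iter__():
            self.yielded += 1
            yield item


# Synthetic stand-ins for the interpreter's supported tags (assumed count: 2000).
tags = [f"cp312-cp312-platform_{i}" for i in range(2000)]
supported_list = CountingList(tags)
supported_set = set(tags)

# A wheel usually carries only one or two file tags; this one matches nothing.
file_tags = {"py3-none-any"}

# Non-set argument: every supported tag is iterated and looked up in file_tags.
list_result = file_tags.isdisjoint(supported_list)

# Set argument: CPython can iterate the smaller set and probe the larger one.
set_result = file_tags.isdisjoint(supported_set)

print(list_result, set_result, supported_list.yielded)
```

Both calls give the same answer, so swapping the list for a set changes only the cost of the check. Which operand CPython iterates in the set-versus-set case is an implementation detail, as the linked setobject.c source shows.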