Fix venv creation in Python environments #297628

Merged 2 commits on Mar 22, 2024

Contributor

@cwp cwp commented Mar 21, 2024

The problem

The way we build python environments is subtly broken. A python environment should be semantically
identical to a vanilla Python installation in, say, /usr/local. The current implementation,
however, differs in two important ways.

The first is that it's impossible to use python packages from the environment in python virtual
environments. Here's a demonstration:

# build using nixpkgs master branch
> nix-build \
  -E '{pkgs ? import <nixpkgs> {}}: pkgs.python3.withPackages (ps: [ps.requests])' \
  -o classic
/nix/store/g8r3c4fwrl5gzz2qybj3i7s5119q8qs2-python3-3.10.12-env

# we can import a package installed in the environment
> classic/bin/python -c 'import requests; print(requests)'
<module 'requests' from '/nix/store/g8r3c4fwrl5gzz2qybj3i7s5119q8qs2-python3-3.10.12-env/lib/python3.10/site-packages/requests/__init__.py'>

# but can't import that package from a venv
> classic/bin/python -m venv --system-site-packages classic-venv
> classic-venv/bin/python -c 'import requests; print(requests)'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'requests'

Another problem is that the nix installation of python appears to python code to be a virtual
environment. The canonical way to detect a Python venv is to compare sys.prefix to
sys.base_prefix. In the base python installation, they will have the same value, but a venv will
update sys.prefix to point to the venv.

# the nix environment appears to be a venv
> classic/bin/python -c 'import sys; print(sys.base_prefix); print(sys.prefix)'
/nix/store/2rrfpkq6cr8ppip9szl0z1qfdlskdinq-python3-3.10.12
/nix/store/g8r3c4fwrl5gzz2qybj3i7s5119q8qs2-python3-3.10.12-env

# the venv has sys.base_prefix set to the bare python interpreter package
> classic-venv/bin/python -c 'import sys; print(sys.base_prefix); print(sys.prefix)'
/nix/store/2rrfpkq6cr8ppip9szl0z1qfdlskdinq-python3-3.10.12
/Users/cwp/dev/nixpkgs/classic-venv

The nix-generated environment appears to be a venv, but it's not. Most notably, it lacks a
pyvenv.cfg file in the root directory, but the structure of the file tree and the symlinks it
contains are also a bit different. Sophisticated python code that manipulates virtualenvs fails when
run in a nix environment.
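
To make the distinction concrete, here is a minimal sketch of the two checks described above. The function names are hypothetical, and the pyvenv.cfg check assumes the file sits in the root of sys.prefix, as it does for venvs created with `python -m venv`.

```python
import sys
from pathlib import Path


def looks_like_venv(prefix=sys.prefix, base_prefix=sys.base_prefix):
    """The canonical check: in a venv, sys.prefix differs from sys.base_prefix."""
    return prefix != base_prefix


def has_pyvenv_cfg(prefix=sys.prefix):
    """Assumption: a real venv has a pyvenv.cfg file in its root directory."""
    return (Path(prefix) / "pyvenv.cfg").is_file()


def describe(prefix=sys.prefix, base_prefix=sys.base_prefix):
    """Classify an installation using both checks."""
    if not looks_like_venv(prefix, base_prefix):
        return "base installation"
    if has_pyvenv_cfg(prefix):
        return "venv"
    return "looks like a venv, but has no pyvenv.cfg"


if __name__ == "__main__":
    # Report on the interpreter that runs this script.
    print(describe())
```

Run with classic/bin/python, the two prefixes shown above differ and the environment root has no pyvenv.cfg, so this sketch would be expected to take the last branch.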

The cause

The build machinery for Python packages does some clever manipulation of the Python runtime to make
the python interpreter and python packages that are individually defined as nix derivations
available to python code without copying them all into a single directory the way a vanilla Python
installation does.

The python interpreter packages use the sitecustomize
hook to read environment variables and manipulate the python runtime:

  • it sets sys.executable based on NIX_PYTHONEXECUTABLE
  • it sets sys.prefix based on NIX_PYTHONPREFIX
  • it adds directories to sys.path based on NIX_PYTHONPATH
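
As a rough illustration of the kind of hook described in this list, the sketch below computes the adjustments from a mapping of environment variables. It is not the actual sitecustomize.py shipped in nixpkgs: the name `plan_runtime_tweaks`, the `RuntimeTweaks` type, and the colon-separated format of NIX_PYTHONPATH are assumptions made for the example, and it returns the new values instead of mutating `sys` so that it is safe to run.

```python
import os
import sys
from typing import NamedTuple


class RuntimeTweaks(NamedTuple):
    """Hypothetical container for the values a hook would write back to sys."""
    executable: str
    prefix: str
    path: list


def plan_runtime_tweaks(environ, executable, prefix, path):
    """Compute the new executable, prefix and path from NIX_* variables."""
    new_executable = environ.get("NIX_PYTHONEXECUTABLE", executable)
    new_prefix = environ.get("NIX_PYTHONPREFIX", prefix)
    new_path = list(path)
    # Assumption: NIX_PYTHONPATH is a colon-separated list of directories.
    for entry in environ.get("NIX_PYTHONPATH", "").split(":"):
        if entry and entry not in new_path:
            new_path.append(entry)
    return RuntimeTweaks(new_executable, new_prefix, new_path)


if __name__ == "__main__":
    # Show what would change for the current interpreter, without changing it.
    print(plan_runtime_tweaks(os.environ, sys.executable, sys.prefix, sys.path))
```

A real hook would assign these values to sys.executable, sys.prefix and sys.path at interpreter startup. That happens after the interpreter has already initialized itself, which is the limitation discussed next.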

When a python environment with additional packages is built, a wrapper is created
that invokes the python interpreter with the correct environment to supply the sitecustomize module
with the information it needs. This wrapper is what causes the issues mentioned above. The python
interpreter relies very heavily on the path with which it is invoked to initialize its runtime. The
wrapper is quite transparent; it invokes the python interpreter with the "real" path of the
interpreter rather than that of the wrapper. This leads the python runtime to get built in the
context of the bare interpreter, rather than the full environment. The sitecustomize module then
makes some tweaks, but it can't completely compensate for the "incorrect" initialization of the
runtime.

The solution

The fix is pretty simple. Rather than invoking python with the "correct" path to the interpreter,
we invoke it with the path to the wrapper. This causes python to initialize its runtime correctly,
and makes all the sitecustomize machinery unnecessary. Rather than manipulating the
python runtime, we rely on symlinks to make the various nix packages available in the environment.

# build using a patched nixpkgs
> nix-build \
  -I nixpkgs="$HOME/dev/nixpkgs" \
  -E '{pkgs ? import <nixpkgs> {}}: pkgs.python3.withPackages (ps: [ps.requests])' \
  -o fixed
/nix/store/bm1s987flajl504nk3k7bkidx4zf1pj0-python3-3.10.12-env

# we can import packages from the nix environment and the venv
> /nix/store/bm1s987flajl504nk3k7bkidx4zf1pj0-python3-3.10.12-env/bin/python -m venv --system-site-packages fixed-venv
> fixed-venv/bin/pip install six
Collecting six
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: six
Successfully installed six-1.16.0
> fixed-venv/bin/python -c 'import six; import requests; print(six); print(requests)'
<module 'six' from '/Users/cwp/dev/eg-nix-venv/fixed-venv/lib/python3.10/site-packages/six.py'>
<module 'requests' from '/nix/store/bm1s987flajl504nk3k7bkidx4zf1pj0-python3-3.10.12-env/lib/python3.10/site-packages/requests/__init__.py'>

# sys.prefix matches sys.base_prefix, so it doesn't look like a venv
> /nix/store/bm1s987flajl504nk3k7bkidx4zf1pj0-python3-3.10.12-env/bin/python -c 'import sys; print(sys.base_prefix); print(sys.prefix)'
/nix/store/bm1s987flajl504nk3k7bkidx4zf1pj0-python3-3.10.12-env
/nix/store/bm1s987flajl504nk3k7bkidx4zf1pj0-python3-3.10.12-env

# this also works when python is invoked through a symlink
> fixed/bin/python -c 'import sys; print(sys.base_prefix); print(sys.prefix)'
/Users/cwp/dev/eg-nix-venv/fixed
/Users/cwp/dev/eg-nix-venv/fixed

# the venv has sys.base_prefix set to the environment
> fixed-venv/bin/python -c 'import sys; print(sys.base_prefix); print(sys.prefix)'
/nix/store/bm1s987flajl504nk3k7bkidx4zf1pj0-python3-3.10.12-env
/Users/cwp/dev/eg-nix-venv/fixed-venv

This approach has several benefits:

  • virtualenvs can use packages from the python environment
  • the nix environment no longer looks like a virtualenv to Python code
  • the Python interpreter packages become simpler
  • nix packages can use sitecustomize for their own purposes

Testing

This change includes tweaks to the existing python environment tests, and several new tests.

@cwp
Contributor Author

cwp commented Mar 21, 2024

@domenkozar @3541 New PR

@domenkozar
Member

I suggest we merge this to staging and give it a go.

for path in ${lib.concatStringsSep " " paths}; do
  if [ -d "$path/bin" ]; then
    cd "$path/bin"
    for prg in *; do
      if [ -f "$prg" ]; then
        rm -f "$out/bin/$prg"
        if [ -x "$prg" ]; then
          makeWrapper "$path/bin/$prg" "$out/bin/$prg" --set NIX_PYTHONPREFIX "$out" --set NIX_PYTHONEXECUTABLE ${pythonExecutable} --set NIX_PYTHONPATH ${pythonPath} ${lib.optionalString (!permitUserSite) ''--set PYTHONNOUSERSITE "true"''} ${lib.concatStringsSep " " makeWrapperArgs}
          if [ -f ".$prg-wrapped" ]; then
            echo "#!${pythonExecutable}" > "$out/bin/$prg"
Member

@FRidh FRidh Mar 21, 2024

Correct me if I am wrong, but does this assume that all executables in linked Python packages are Python scripts? This is an assumption that cannot be made; it's not uncommon to have shell scripts in bin/ and binaries can occur as well. Replacing the shebang can be done, but it needs to be checked that it is a Python interpreter, and preferably also exactly the same interpreter. The latter should actually always be the case, if not, we have another problem (e.g. with overriding somewhere).

In time we should aim to not wrap packages at build time.

Contributor Author

No, it only assumes that if files named bin/foo and bin/.foo-wrapped exist, then bin/foo is a wrapper and bin/.foo-wrapped is a python script.
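
A sketch of that check, with an optional shebang test along the lines suggested above, might look like the following. Both helper names are hypothetical, and the shebang test is a crude heuristic that does not verify it is exactly the same interpreter.

```python
from pathlib import Path


def wrapped_pairs(bin_dir):
    """Yield (wrapper, wrapped) for each bin/foo with a bin/.foo-wrapped sibling."""
    for wrapper in sorted(Path(bin_dir).iterdir()):
        wrapped = wrapper.with_name(f".{wrapper.name}-wrapped")
        if wrapper.is_file() and wrapped.is_file():
            yield wrapper, wrapped


def has_python_shebang(path):
    """Crude heuristic: the first line starts with #! and mentions python."""
    with open(path, "rb") as handle:
        first_line = handle.readline()
    return first_line.startswith(b"#!") and b"python" in first_line
```

Files in bin/ without a matching .foo-wrapped sibling, such as shell scripts or binaries, are never yielded, so they would be left to the normal wrapper path.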

Contributor Author

"In time we should aim to not wrap packages at build time."

Yes! I have an idea for how to do that.

Member

@FRidh FRidh left a comment

Thank you for extending the tests; this gives a lot more confidence that this approach works.

Please split the make-wrapper changes into a separate commit, as they are a separate unit.

cwp added 2 commits March 21, 2024 19:26
The way we build python environments is subtly broken. A python
environment should be semantically identical to a vanilla Python
installation in, say, /usr/local. The current implementation, however,
differs in two important ways. The first is that it's impossible to use
python packages from the environment in python virtual environments. The
second is that the nix-generated environment appears to be a venv, but
it's not.

This commit changes the way python environments are built:

  * When generating wrappers for python executables, we inherit argv[0]
    from the wrapper. This causes python to initialize its configuration
    in the environment with all the correct paths.
  * We remove the sitecustomize.py file from the base python package.
    This file was used to tweak the python configuration after it was
    incorrectly initialized. That's no longer necessary.

The end result is that python environments no longer appear to be venvs,
and behave more like a vanilla python installation. In addition it's
possible to create a venv using an environment and use packages from
both the environment and the venv.

When building a python environment's bin directory, we now detect
wrapped python scripts from installed packages, and generate unwrapped
copies with the environment's python executable as the interpreter.
@cwp
Contributor Author

cwp commented Mar 22, 2024

@FRidh Done!

@ofborg ofborg bot requested a review from FRidh March 22, 2024 03:30
@3541

3541 commented Mar 22, 2024

Did one last manual check, and everything still works as expected.

@domenkozar domenkozar merged commit fb88417 into NixOS:staging Mar 22, 2024
19 checks passed
@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nixpkgs-news-a-weekly-recap-for-the-nix-community/42137/5

@SuperSandro2000
Member

For some wrappers, this is causing lines of Python code to be deleted; see #302315 or #301449.

makeWrapper "$path/bin/$prg" "$out/bin/$prg" --set NIX_PYTHONPREFIX "$out" --set NIX_PYTHONEXECUTABLE ${pythonExecutable} --set NIX_PYTHONPATH ${pythonPath} ${lib.optionalString (!permitUserSite) ''--set PYTHONNOUSERSITE "true"''} ${lib.concatStringsSep " " makeWrapperArgs}
if [ -f ".$prg-wrapped" ]; then
  echo "#!${pythonExecutable}" > "$out/bin/$prg"
  sed -e '1d' -e '3d' ".$prg-wrapped" >> "$out/bin/$prg"
Member

Assuming that the line to delete is always line 3 is wrong. In normal wrappers it is line 2, and when the wrapper is injected it could be any line, since the injection skips comments, IIRC.
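
To illustrate the failure mode, here is a small Python sketch that mimics `sed -e '1d' -e '3d'` on two made-up scripts. The script contents and the helper name `delete_lines` are hypothetical; they only show that deleting fixed line numbers removes real code whenever the injected line is not on line 3.

```python
def delete_lines(text, line_numbers):
    """Drop the given 1-indexed lines, like one `sed -e 'Nd'` per number."""
    kept = [
        line
        for number, line in enumerate(text.splitlines(), start=1)
        if number not in line_numbers
    ]
    return "\n".join(kept)


# Made-up wrapped script where the injected line happens to be on line 3.
expected_layout = "\n".join([
    "#!/usr/bin/python",
    "# a comment",
    "import sys; sys.argv[0] = 'tool'  # stand-in for the injected line",
    "print('hello')",
])

# Made-up wrapped script where the injected line is on line 2 instead.
other_layout = "\n".join([
    "#!/usr/bin/python",
    "import sys; sys.argv[0] = 'tool'  # stand-in for the injected line",
    "import os",
    "print(os.getcwd())",
])

if __name__ == "__main__":
    print(delete_lines(expected_layout, {1, 3}))
    print("---")
    print(delete_lines(other_layout, {1, 3}))
```

In the second case the injected line survives and `import os` is dropped, so the resulting script would fail when it reaches `os.getcwd()`.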

Member

revert in #302385

@RuRo
Contributor

RuRo commented Apr 11, 2024

@cwp this PR got reverted because of the broken sed unwrapping, can you clarify if you are planning on trying to fix this again? No pressure/expectations, just asking so that this doesn't get lost in limbo, and we don't step on each other's toes.

@imincik
Contributor

imincik commented Apr 11, 2024

cc @domenkozar

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/cuda-tensorflow-my-setup-is-really-hacky-would-appreciate-help-unhackying-it/43912/9

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/how-to-build-python-virtualenv-with-packages-provided-by-python3-withpackages/24766/16

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/how-to-build-python-virtualenv-with-packages-provided-by-python3-withpackages/24766/20

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/how-to-build-python-virtualenv-with-packages-provided-by-python3-withpackages/24766/22
