Feature: allow to generate requirements.lock for each modules in a workspace #615

Open
lambda-science opened this issue Feb 8, 2024 · 5 comments · May be fixed by #1094
Labels
enhancement New feature or request

Comments

@lambda-science

lambda-science commented Feb 8, 2024

When using a workspace with multiple modules, all packages are installed in a top-level .venv, with a single requirements.lock that combines all packages. That works well for local development.

But when you build a Docker image for each module, you want to keep the image small, so you only want to install the packages that module needs. Currently you can use Rye with Docker very easily:

RUN sed '/^-e/d' requirements.lock > requirements.txt
RUN pip install -r requirements.txt
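
The same filtering step can be sketched in Python. This is a minimal illustration, not part of Rye; the function name strip_editables is hypothetical. It only drops lines that start with "-e", because a pattern that matches "-e" anywhere in the line would also drop pins such as typing-extensions.

```python
def strip_editables(lock_text: str) -> str:
    """Drop editable-install lines ("-e ...") from lockfile text.

    Only lines that *start* with "-e " are removed, so package names that
    merely contain "-e" (for example typing-extensions) are kept.
    """
    kept = [
        line
        for line in lock_text.splitlines()
        if not line.lstrip().startswith("-e ")
    ]
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Hypothetical lockfile content, for illustration only.
    example = "-e file:.\ntyping-extensions==4.9.0\n    # via my-module\n"
    print(strip_editables(example), end="")
```

The "# via" comments that followed a removed editable line stay in the output; pip ignores comment lines, so they are harmless.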

The issue is that a workspace only has a top-level requirements.lock.
Feature suggestion: when using a workspace, add an option to generate a per-module requirements.lock, like:

my-workspace/
├─ module1/
│  ├─ src/
│  ├─ pyproject.toml # Module/Package pyproject.toml for build/publish
│  ├─ requirements.lock
│  ├─ requirements-dev.lock
├─ module2/
│  ├─ src/
│  ├─ pyproject.toml # Module/Package pyproject.toml for build/publish
│  ├─ requirements.lock
│  ├─ requirements-dev.lock
├─ pyproject.toml # Workspace pyproject.toml for the IDE
├─ requirements.lock
├─ requirements-dev.lock
@mitsuhiko
Collaborator

I quite like the idea but that could be quite slow. Might be a good idea to brainstorm how this would be configured and how it would behave.

@mitsuhiko mitsuhiko added the enhancement New feature or request label Feb 9, 2024
@davfsa
Contributor

davfsa commented May 13, 2024

I quite like the idea but that could be quite slow.

I might be missing something, but I don't think it would be much slower. It would just mean writing one extra file for each workspace project.


I would like to propose my idea and maybe even work on it (my Rust knowledge is really basic, but I would love to give it a shot):

Configuration:

Under [tool.rye.workspace], there can be a new configuration key: per_member_lock: bool (for example) and that will toggle the functionality for whether to write the lockfile per member or not.

Behavior:

The per-member locks would be purely symbolic. They won't be used to install any requirements; that would all still be handled by the top-level lock.

Found this issue, as it happens to be the exact use case I would like to have in rye and found missing (usecase is a monorepo), and would love to aid and implement it if I can!

@davfsa davfsa linked a pull request May 19, 2024 that will close this issue
@Taytay

Taytay commented May 26, 2024

Rye does appear to know which sub-projects need which modules, since it documents the references in the lockfile itself:

# generated by rye
# use `rye lock` or `rye sync` to update this lockfile
#
# last locked with the following flags:
#   pre: false
#   features: ["all"]
#   all-features: false
#   with-sources: false
#   generate-hashes: false

-e file:src/project-a
    # via project-b
-e file:src/project-b
accelerate==0.26.1
    # via project-a
    # via transformers
aiohttp==3.9.5
    # via project-b
    # via fsspec
    # via instructor
    # via litellm
...

Although it would require the sub-projects to correctly declare their dependencies, rather than relying on the universe of installed packages being sufficient, this does seem like the sort of thing that could be determined by pruning the dependency tree. (Honestly, you might not need to get THAT sophisticated if you parsed the lockfile comments and built the tree from that...)
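
A rough sketch of that idea in Python, assuming the lockfile layout shown above (one requirement per line, followed by indented "# via <parent>" comments). The function names parse_lock and prune are hypothetical, and the sketch assumes an editable entry's project name equals the last component of its path, which holds for the excerpt above but is not guaranteed in general.

```python
import re
from collections import defaultdict


def normalize(name: str) -> str:
    """Normalize a package name so "Project_A" and "project-a" compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_lock(text: str):
    """Return (pins, required_by) parsed from lockfile text.

    pins maps a package name to its requirement line; required_by maps a
    package name to the set of parents listed in its "# via" comments.
    """
    pins: dict[str, str] = {}
    required_by: dict[str, set[str]] = defaultdict(set)
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# via"):
            parent = line[len("# via"):].strip()
            if current is not None and parent:
                required_by[current].add(normalize(parent))
            continue
        if line.startswith("#"):
            continue  # header comments
        if line.startswith("-e "):
            # Assumption: the directory name is the project name.
            name = line.rstrip("/").rsplit("/", 1)[-1]
        else:
            name = line.split("==", 1)[0]
        current = normalize(name)
        pins[current] = line
    return pins, required_by


def prune(pins, required_by, root: str) -> list[str]:
    """Return the requirement lines reachable from `root` via "# via" edges."""
    children = defaultdict(set)
    for package, parents in required_by.items():
        for parent in parents:
            children[parent].add(package)

    seen: set[str] = set()
    stack = [normalize(root)]
    while stack:
        node = stack.pop()
        for child in children[node]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return [pins[name] for name in sorted(seen) if name in pins]
```

Calling prune(pins, required_by, "project-b") on the parsed top-level lockfile would give the subset of lines that project-b needs, including the editable line for project-a, since project-a is listed as "via project-b".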

@Taytay

Taytay commented May 28, 2024

Update:

First, I see that there is a PR open to fix this, and thanks @davfsa for taking a stab at it!

In the meantime, I have managed to hack this together as follows:

Imagine I have a workspace with project_a, project_b, and project_c.
Each has its own separate dependencies from PyPI. Rye does a good job of ensuring that the versions they resolve to match the ones in the global lockfile.

Now imagine that project_b depends on project_a, and project_c depends on project_b

In each sub-project's pyproject.toml, I list the dependencies for that project, of course, as well as dependencies on other locally installed packages in my workspace. I just reference them by name though! I don't reference them by any sort of relative path. (This works fine because whenever that package needs to be resolved at install time, it has already been installed for us by rye or another mechanism anyway.)
We do it this way because dependency entries in pyproject.toml files don't allow relative path references. requirements.txt files do though.

So this is where it starts to get a bit hacky.

As a sibling of each pyproject.toml, I've got a requirements_local.in file that looks like this:

# project_b/requirements_local.in

# this line is here because in my case I want my lockfile to record the fact that I'm installing my local dir in editable mode
# -e file:./

# this line is here because I need to ensure that project_a is installed when compiling the lockfile. That will automatically install all of project_a's deps referenced in pyproject.toml
-e file:../project_a

# and this says: Make sure that you also include any other local references to projects that project_a might have.
-r ../project_a/requirements_local.in
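
To see which local projects such a chain of files ends up pulling in, here is a small Python sketch that follows the "-r" includes and collects the "-e file:" targets. It is only an illustration of how the files reference each other; the function name collect_editables is hypothetical, and it only understands the two line forms used above.

```python
from pathlib import Path


def collect_editables(requirements_file: Path, seen=None) -> list[Path]:
    """Collect local editable targets, following nested "-r" includes.

    Commented-out lines are ignored. `seen` guards against include cycles.
    """
    seen = set() if seen is None else seen
    requirements_file = requirements_file.resolve()
    if requirements_file in seen:
        return []
    seen.add(requirements_file)

    found: list[Path] = []
    base = requirements_file.parent
    for raw in requirements_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if line.startswith("-e file:"):
            targets = [(base / line[len("-e file:"):]).resolve()]
        elif line.startswith("-r "):
            targets = collect_editables(base / line[3:].strip(), seen)
        else:
            continue  # blank lines, comments, anything else
        for target in targets:
            if target not in found:
                found.append(target)
    return found
```

For project_c, which depends on project_b, this would return project_b followed by project_a, mirroring what the compile step installs.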

Now I can go into project_b and run:

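WORKSPACE_DIR=$(git rev-parse --show-toplevel)
# Absolute path, so this works from any directory inside the repository
FOLDER="$WORKSPACE_DIR/src/project_b"

RELATIVE_PATH_FROM_FOLDER_TO_WORKSPACE=$(realpath --relative-to="$FOLDER" "$WORKSPACE_DIR")
# The paths passed to uv below are relative to the sub-project
cd "$FOLDER"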
rye run uv pip compile \
  --constraint "$RELATIVE_PATH_FROM_FOLDER_TO_WORKSPACE/requirements.lock" \
  --strip-extras \
  ./pyproject.toml \
  ./requirements_local.in \
  -o requirements.lock

(You can do a standard pip compile instead, but uv pip compile is about 100x faster.)

This tells pip compile:
1: Use requirements from both pyproject.toml AND requirements_local.in
2: When you are generating a lockfile for this project, use the root requirements.lock as a "constraints" file: https://pip.pypa.io/en/stable/user_guide/#constraints-files
That way, we respect the resolved versions of packages from our root lockfile.
3: Put the output into a local requirements.lock
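
For anyone who prefers to drive this from a script, the three points above can be sketched in Python. This only assembles the same command line as the shell snippet; the function name build_compile_command is hypothetical and the file names are the ones used in this comment.

```python
import os
from pathlib import Path


def build_compile_command(workspace_dir: str, project_dir: str) -> list[str]:
    """Build the `uv pip compile` invocation for one sub-project.

    The root requirements.lock is passed as a constraints file, using a
    path relative to the sub-project, as in the shell version.
    """
    relative_root = os.path.relpath(workspace_dir, start=project_dir)
    constraint = Path(relative_root, "requirements.lock").as_posix()
    return [
        "rye", "run", "uv", "pip", "compile",
        "--constraint", constraint,   # 2: respect versions from the root lock
        "--strip-extras",
        "./pyproject.toml",           # 1: requirements from both inputs
        "./requirements_local.in",
        "-o", "requirements.lock",    # 3: write the local lockfile
    ]


if __name__ == "__main__":
    import subprocess

    # Hypothetical layout: run from the workspace root.
    workspace = os.getcwd()
    project = os.path.join(workspace, "src", "project_b")
    command = build_compile_command(workspace, project)
    # Run from inside the sub-project so the relative paths resolve.
    subprocess.run(command, cwd=project, check=True)
```

Running the command with cwd set to the sub-project matters, because every path in it is relative to that directory.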

It will also include the references to the editable installs of the local packages, which you might not want:

# requirements.lock:

# This file was autogenerated by uv via the following command:
#    uv pip compile --constraint ../../requirements.lock --strip-extras ./pyproject.toml ./requirements_local.in -o requirements.lock
-e file:../project_a
    # via -r ./requirements_local.in

If you don't want that sort of thing, you can add another option to the uv pip compile command, such as:
--no-emit-package project-a

That way, the reference to project-a itself isn't in the lockfile, but its deps still are. This is equivalent to pip-compile's --unsafe-package flag, which has a bad name for historical reasons.

@Taytay

Taytay commented May 28, 2024

If you want to (ab)use this lockfile a bit more and make sure that your project-specific lockfile is sufficient, you can create a sub-venv with that lockfile you made:

[tool.rye.scripts]
venv_create = "uv venv ./.venv"
venv_sync = "uv pip sync --python ./.venv/bin/python ./requirements.lock"
venv_create_sync = {chain = ["venv_create", "venv_sync"]}
check_deps = "basedpyright --project ./pyproject.toml"

[tool.basedpyright]
include = ["src/"]
# Ensure that it uses the .venv that we created for this project with the lockfile
venvPath = "./"
venv = ".venv"
# We really only care about some import issues, so we disable everything and report on missing imports:
typeCheckingMode = "off"
reportMissingImports = true

Now you can run in your sub-project:

cd src/project_b && rye run venv_create_sync && rye run check_deps
