Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for --scie-busybox. #2468

Merged
merged 16 commits into from
Aug 1, 2024
Merged

Add support for --scie-busybox. #2468

merged 16 commits into from
Aug 1, 2024

Conversation

jsirois
Copy link
Member

@jsirois jsirois commented Jul 17, 2024

This allows you to turn your PEX into a native BusyBox scie by
specifying which entry points to expose as BusyBox commands.

This allows you to turn your PEX into a native BusyBox scie by
specifying which entry points to expose as BusyBox commands.
@jsirois
Copy link
Member Author

jsirois commented Jul 17, 2024

Reviewers, a quickie before I hit the road. I'll be back on 7/25 to add tests.

For example though:

# Build a BusyBox PEX scie with 3 entry-points: 1 console script and 2 module eps from the stdlib:
:; python -mpex cowsay --scie lazy --scie-busybox cowsay,venv:venv,http-server:http.server -obusy-box

# Since the exe name does not match any ep, you get BusyBox help:
:; ./busy-box
Error: Could not determine which command to run.

Please select from the following boot commands:

cowsay
http-server
venv

You can select a boot command by setting the SCIE_BOOT environment variable or else by passing it as the 1st argument.

# Now use the BusyBox:
:; ./busy-box http-server -h
usage: server.py [-h] [--cgi] [--bind ADDRESS] [--directory DIRECTORY] [port]

positional arguments:
  port                  specify alternate port (default: 8000)

options:
  -h, --help            show this help message and exit
  --cgi                 run as CGI server
  --bind ADDRESS, -b ADDRESS
                        specify alternate bind address (default: all interfaces)
  --directory DIRECTORY, -d DIRECTORY
                        specify alternate directory (default: current directory)

# Now install a link farm for more convenient use (if the install directory is on your PATH anyhow):
:; SCIE=install ./busy-box -s /tmp/bin
:; ls -l /tmp/bin/
total 0
lrwxrwxrwx 1 jsirois jsirois 39 Jul 17 15:15 cowsay -> /home/jsirois/dev/pex-tool/pex/busy-box
lrwxrwxrwx 1 jsirois jsirois 39 Jul 17 15:15 http-server -> /home/jsirois/dev/pex-tool/pex/busy-box
lrwxrwxrwx 1 jsirois jsirois 39 Jul 17 15:15 venv -> /home/jsirois/dev/pex-tool/pex/busy-box

:; /tmp/bin/venv -h
usage: venv [-h] [--system-site-packages] [--symlinks | --copies] [--clear] [--upgrade] [--without-pip] [--prompt PROMPT] [--upgrade-deps] ENV_DIR [ENV_DIR ...]

Creates virtual Python environments in one or more target directories.

positional arguments:
  ENV_DIR               A directory to create the environment in.

options:
  -h, --help            show this help message and exit
  --system-site-packages
                        Give the virtual environment access to the system site-packages dir.
  --symlinks            Try to use symlinks rather than copies, when symlinks are not the default for the platform.
  --copies              Try to use copies rather than symlinks, even when symlinks are the default for the platform.
  --clear               Delete the contents of the environment directory if it already exists, before environment creation.
  --upgrade             Upgrade the environment directory to use this version of Python, assuming Python has been upgraded in-place.
  --without-pip         Skips installing or upgrading pip in the virtual environment (pip is bootstrapped by default)
  --prompt PROMPT       Provides an alternative prompt prefix for this environment.
  --upgrade-deps        Upgrade core dependencies: pip setuptools to the latest version in PyPI

Once an environment has been created, you may wish to activate it, e.g. by sourcing an activate script in its bin directory.

# And always end on a Moo!:
:; mv busy-box cowsay
./cowsay -t Moo!
  ____
| Moo! |
  ====
    \
     \
       ^__^
       (oo)\_______
       (__)\       )\/\
           ||----w |
           ||     ||

Copy link
Collaborator

@benjyw benjyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll wait for the tests, but this LGTM!

@@ -30,6 +33,88 @@ class Value(Enum.Value):
EAGER = Value("eager")


@attr.s(frozen=True)
class ConsoleScriptsManifest(object):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sureshjoshi I have not updated the --scie-busybox help string yet to reflect this idea - I wanted to run it by you 1st:

You can now, for example, say:

python -m pex ansible --scie eager --scie-busybox *:* -o ansible

That nets you a binary that executes as ansible, but allows SCIE=install ./ansible ... to get the BusyBox of all ansible scripts. Now ansible is clean in terms of transitive deps not having unwanted scripts. But for cases where that is so, you can restrict to all scripts in a dist:

python -m pex ansible --scie eager --scie-busybox ansible:*,ansible-core:* -o ansible

Or use some combination of the above subtracting off individual scripts or all scripts from a distribution by prefixing with -.

Good, bad or ugly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, switched from - to ! so the CLI use is not a total PITA, but otherwise filled out help docs and tests. I think this makes sense.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey sorry, just a heads up that I just made it down to the states for a vacation, so I wouldn't be able to review anything until I'm back in the cold north

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd defer to @sureshjoshi but this seems like reasonable syntax to me.

Copy link
Member Author

@jsirois jsirois Jul 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the <project-name>:* and *:* are excessive, (the *'s are not needed at all for parsing disambiguation) but my thought was just <project-name>: and : respectively would confuse people (they did enough in Pants-land). I think * as the wildcard is more expected and neither <project-name>:* nor *:* are likely to match in the current directory; so they are not shell-onerous.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding flags, this un-overloads syntax. : used to mean many things, now it means just index a function in a module, which is pre-existing Python-ecosystem syntax.

Good approach, I like it!

With that change, I'm happy where this landed. Either of *@* or :all:@:all: are fine by me, definitely agree that :all: doesn't look particularly nice... tolerable (to me), but not nice.

I'm unsure where we landed with validation of the @-specified module scripts, but definitely 100% happy to go with * if there is validation. And like 80% happy to go with * without the validation. I.e. happy with * either way 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually lean towards dropping the wildcard with the switch from : -> (=, :, @). There is no need for a wildcard (just as there was none for :), but I think @ansible-core, !@ansible-core, and (less so) @ all make sense on their own side-stepping the shell potentially turning * into invalid entry points. As I said above though, I'm writing up a documentation page to go with this PR; so we'll see how the explanation reads there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hoisted on my own petard. Can't use @ since argparse is configured with @fromfile support via a user contribution in 2021. I'll noodle a bit more...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignoring the pre-compile bindings, this looks like a use case for the adhoc=module.deeper:func style, am I correct?

Aha, yes - for the "seed" command, not the default command. There were 2 things going on in that second example and I didn't read closely enough. So, you'd run:

pex uvicorn -c uvicorn --inject-args "apigateway.main:app --port 7999" --scie eager --scie-busybox seed=seeder:main,uvicorn -o uvicorn

I've just finished the --inject-args / --inject-python-args / --inject-env support, which was a bug to be missing under the "behaves just like the PEX" mantra.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. @sureshjoshi, @huonw and @benjyw probably worth another look. I've switched to @ syntax, made the needed change in the Pex argparse setup to still support @fromfiles, fixed --inject* propagation to the matching BusyBox command, and added a new doc page rendered here: https://jsirois.github.io/pex/scie.html.

@jsirois
Copy link
Member Author

jsirois commented Jul 21, 2024

Alright reviewers, I got to this earlier than expected and I think its good to go for a final review.

Copy link
Collaborator

@huonw huonw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice feature. NB. I haven't reviewed the functionality in detail.

@@ -30,6 +33,88 @@ class Value(Enum.Value):
EAGER = Value("eager")


@attr.s(frozen=True)
class ConsoleScriptsManifest(object):
Copy link
Collaborator

@huonw huonw Jul 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@huonw ... Any thoughts on this?

If you're asking about * and escaping, I'll think out loud:

  • There is a risk of people writing a script that doesn't escape properly and it seeming to work until they add a file with an unlucky name.
  • If one added foo:bar, the immediate consequence would be expanding *:* to --scie-busybox foo:bar.
  • It looks like this would be silently accepted and result in a pex with an unexpected configuration (at least, I don't see validation that the project name foo is understood, but maybe I missed it).
  • Having validation that all input project names are known distributions would reduce that consequence; is that possible?
  • Alternatively, use some other shell-safe signifier. I don't have great ideas, only ..., e.g. ansible:... or ...:... but that's pretty ugly. * matches my expectations more.

Summary: I can't think of a better * I like, but maybe we could validate the foo:bar values are actually known entrypoints (at least the project name) to reduce the consequence of problems here?

type=str,
default=[],
action="append",
help=(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor feedback: these docs look good, but they're a bit of a wall of text that'll usually be read in the context of a terminal without any formatting aids, and thus I'd expect most people (even well-intentioned diligent ones) to fail to synthesise the information out of it.

Is it possible to make it easier for users by breaking it up into paragraphs and/or lists?

Copy link
Member Author

@jsirois jsirois Jul 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not - Argparse doesn't support formatting I know of. I've both avoided spending time on any bespoke argument system to remedy this or turning towards documentation, which needs major work going on 10 years now. One or both are coming, but I'm prioritizing robust users so far in favor of righting the Pex ship going on 8 years. I do finally feel close though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. Thanks for the background. 👍

jsirois added a commit to jsirois/pex that referenced this pull request Jul 29, 2024
Comment on lines -301 to -303
if qualname_separator:
for attr in qualname.split('.'):
entry_point = getattr(entry_point, attr)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The re-factoring of {EntryPoint,CallableEntryPoint} -> NamedEntryPoint <<containing>> {ModuleEntryPoint,CallableEntryPoint} highlighted that this generated console script was wrong for modules (i.e.: when this if branch is not taken) since a module is not a callable! It turns out that in practice, console script entrypoints are always functions (callables) and not modules and so no bug reports. That said, I switched Pex's own pex entrypoint from a function to the pex module in its own pyproject.toml and verified that the pex wheel still builds fine and then a Pex PEX venv built from that wheel had a pex console script that then failed. Thus, the fix.

Copy link
Collaborator

@huonw huonw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it, only minor feedback

docs/scie.md Outdated Show resolved Hide resolved
pex/scie/model.py Outdated Show resolved Hide resolved
@jsirois
Copy link
Member Author

jsirois commented Aug 1, 2024

@sureshjoshi I'd like to proceed with this by tomorrow (August 1st) evening. Let me know if you need more time to review; otherwise, I'll assume you're fine with the @ syntax and no explicit wildcards.

@sureshjoshi
Copy link
Collaborator

@jsirois I'm unavailable for any in-depth review until probably Tuesday/Wednesday - but I think we've converged on something and the docs make sense to me

@jsirois jsirois merged commit 9be1789 into pex-tool:main Aug 1, 2024
26 checks passed
@jsirois jsirois deleted the scie/BusyBox branch August 1, 2024 19:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants