CondaEnvironment decospecs ignores base environment #654

Closed
bishax opened this issue Aug 23, 2021 · 2 comments

bishax commented Aug 23, 2021

Looking at some of the implementation details of Metaflow last week, I discovered the undocumented(?) metaflow_custom extension mechanism.

I implemented a different default MetaflowEnvironment that adds the current project to the job package and, via MetaflowEnvironment.decospecs, adds a StepDecorator that installs this package.

The problem comes when I want to add a conda environment on top of this default environment.
An instance of CondaEnvironment does not look at self.base_env.decospecs and append those to its own, so the base environment's decorators are silently dropped.

A change along the lines of the following allows the composition of CondaEnvironment and a custom base environment.

- return ('conda')
+ return ('conda', *self.base_env.decospecs())
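
To illustrate the composition the diff is aiming for, here is a minimal sketch using stand-in classes rather than Metaflow itself. BaseEnvironment, CondaLikeEnvironment and the "install_project" decorator name are hypothetical; only decospecs and base_env come from the discussion above. Note that in Python ('conda') without a trailing comma is a plain string, so the sketch uses an explicit tuple.

```python
class BaseEnvironment:
    """Hypothetical stand-in for a custom default environment."""

    def decospecs(self):
        # Decorator specs this environment wants applied to every step.
        # "install_project" is an invented name for illustration only.
        return ("install_project",)


class CondaLikeEnvironment:
    """Hypothetical stand-in for CondaEnvironment; only decospecs is modelled."""

    def __init__(self, base_env):
        self.base_env = base_env

    def decospecs(self):
        # Put the conda spec first, then whatever the base environment asks for.
        return ("conda", *self.base_env.decospecs())


env = CondaLikeEnvironment(BaseEnvironment())
print(env.decospecs())  # ('conda', 'install_project')
```

With this shape, wrapping any base environment keeps its decorators instead of replacing them.
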
@romain-intel

Good catch. Feel free to propose a PR for this along the lines of what you suggest. I had changed some of the other functions to delegate more to the base environment (#502 and #493 for example) but missed that one. get_client_info may be another one where we need to also delegate but that one may be a little more complicated and isn't as useful.

As for the metaflow_custom mechanism, yes, it is currently not documented and we do not guarantee that the APIs will stay stable. We are determining the best way to make it something more officially supported (if there is an appetite for it) so would appreciate any indication on how you are using it and if there are things you would like it to have that it doesn't.

bishax commented Aug 24, 2021

Thanks @romain-intel, I've submitted a PR #660 with my suggestion above.
I agree that get_client_info is more complicated and less useful, so haven't tackled that here. My suggestion would be to add a key base (or similar) to the returned dict containing the value of self.base_env.get_client_info(...) (but I'm not familiar with the details of where all this information gets used).
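
A rough sketch of the "base" key idea, again with stand-in classes rather than Metaflow's real ones. The (flow_name, metadata) arguments and the contents of the returned dicts are assumptions made for illustration; only get_client_info, base_env and the suggested base key come from this thread.

```python
class BaseEnvironment:
    """Hypothetical stand-in for a custom base environment."""

    def get_client_info(self, flow_name, metadata):
        # Invented payload; the real contents depend on the environment.
        return {"type": "custom-default", "flow": flow_name}


class CondaLikeEnvironment:
    """Hypothetical stand-in for CondaEnvironment."""

    def __init__(self, base_env):
        self.base_env = base_env

    def get_client_info(self, flow_name, metadata):
        # Invented payload describing the wrapping environment itself.
        info = {"type": "conda", "flow": flow_name}
        # Nest the base environment's info under a "base" key so nothing is lost.
        info["base"] = self.base_env.get_client_info(flow_name, metadata)
        return info


info = CondaLikeEnvironment(BaseEnvironment()).get_client_info("MyFlow", None)
print(info["base"]["type"])  # custom-default
```

Nesting keeps the outer dict's existing keys unchanged, so consumers that only read the conda information would not need to change.
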

From my end there's definitely an appetite to make metaflow_custom something more officially supported.
I'll share code and any associated reflections when I have something a little more refined, but here is some of the functionality I've prototyped:

  1. A different default environment paired with a step decorator that together adds a "project"* to the job package (performed by the environment), and installs it (performed by the step decorator).
  2. Extend the CLI so that run can be parametrised from a YAML config file.
  3. A very simple flow decorator that raises an error if the flow is run on batch or in conda. Some flows will fail if run on batch or in an isolated env because certain files/dependencies won't be present, so this ensures that a flow fails early if someone tries to run --with batch.

Other things I may try in future:

  • Run a flow in docker on a local machine
  • pipenv (or similar) alternative to the Conda environment and decorator

*Currently crudely defined as walking up from the flow directory until you find a setup.py (or doing nothing if it reaches /)
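
The "project" lookup described in the footnote could be sketched roughly as follows. The function name find_project_root is hypothetical, and returning None stands in for "doing nothing" when no setup.py is found.

```python
from pathlib import Path


def find_project_root(flow_dir):
    """Walk up from flow_dir and return the first directory containing setup.py.

    Returns None if the filesystem root is reached without finding one.
    """
    current = Path(flow_dir).resolve()
    # current.parents lists each ancestor in turn, ending at the root.
    for candidate in (current, *current.parents):
        if (candidate / "setup.py").is_file():
            return candidate
    return None


# Example: look for a project starting from the current working directory.
print(find_project_root("."))
```
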
