# metaflow-ray

`metaflow-ray` is an extension for Metaflow that enables seamless integration with Ray, allowing users to easily leverage Ray's powerful distributed computing capabilities within their Metaflow flows.
With `metaflow-ray`, you can spin up transient Ray clusters on AWS Batch directly from your Metaflow steps using the `@metaflow_ray` decorator. This lets you run Ray applications that leverage Ray Core, Ray Train, Ray Tune, and Ray Data effortlessly within your Metaflow flow.
## Features

- **Effortless Ray Integration:** This extension provides a simple and intuitive way to incorporate Ray into your Metaflow workflows using the `@metaflow_ray` decorator
- **Transient Ray Clusters:** Let Metaflow orchestrate the creation of transient Ray clusters on top of AWS Batch multi-node parallel jobs
- **Seamless Ray Initialization:** The `@metaflow_ray` decorator handles the initialization of the Ray cluster for you, so you can focus on writing your Ray code without worrying about cluster setup
- **Wide Range of Applications:** Run a wide variety of Ray applications, including hyperparameter tuning, distributed data processing, and distributed training
## Installation

You can install `metaflow-ray` via `pip`, alongside your existing Metaflow installation:

```shell
pip install metaflow-ray
```
## Getting Started

1. Import the `@metaflow_ray` decorator to enable the integration:

```python
from metaflow import metaflow_ray
```
2. Decorate your step with `@metaflow_ray`:

```python
@step
def start(self):
    self.next(self.train, num_parallel=NUM_NODES)

@metaflow_ray
@batch(**RESOURCES)  # you can use @kubernetes instead
@step
def train(self):
    # your step's training code here
    ...
```
3. Initialize Ray within your step:

```python
@metaflow_ray
@batch(**RESOURCES)  # you can use @kubernetes instead
@step
def train(self):
    import ray

    ray.init()
    # your Ray application code here
```
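Putting the steps above together, a complete flow might look like the sketch below. It is a minimal skeleton, not a runnable example on its own: the flow name, the `RESOURCES` values, and the `square` task are hypothetical, and actually executing it requires `metaflow-ray`, Ray, and a configured AWS Batch multi-node queue.

```python
from metaflow import FlowSpec, step, batch, metaflow_ray

NUM_NODES = 2
RESOURCES = dict(cpu=4, memory=16000)  # hypothetical values

class RayHelloFlow(FlowSpec):

    @step
    def start(self):
        # Fan out into a multi-node parallel step; the nodes
        # form the transient Ray cluster.
        self.next(self.train, num_parallel=NUM_NODES)

    @metaflow_ray
    @batch(**RESOURCES)  # you can use @kubernetes instead
    @step
    def train(self):
        import ray

        ray.init()  # connects to the cluster @metaflow_ray formed

        @ray.remote
        def square(x):
            return x * x

        # distribute work across the cluster
        self.results = ray.get([square.remote(i) for i in range(4)])
        self.next(self.join)

    @step
    def join(self, inputs):
        self.results = inputs[0].results
        self.next(self.end)

    @step
    def end(self):
        print(self.results)

if __name__ == "__main__":
    RayHelloFlow()
```

Note that the parallel fan-out ends in a join step, as with any Metaflow `num_parallel` split.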
## Examples

Check out the `examples` directory for sample Metaflow flows that demonstrate how to use the `metaflow-ray` extension with various Ray applications.
| Directory | Description |
| --- | --- |
| Hello! | Run a Ray program in Python, then do it inside a Metaflow task! |
| Train XGBoost | Use Ray Train to build XGBoost models on one or more nodes, including CPU and GPU examples. |
| Tune PyTorch | Use Ray Tune to build PyTorch models on one or more nodes, including CPU and GPU examples. |
| End-to-end Batch Workflow | Train models, evaluate them, and serve them. See how to use Metaflow workflows and various Ray abstractions together in a complete workflow. |
| PyTorch Lightning | Get started by running a PyTorch Lightning job on the Ray cluster formed in a `@metaflow_ray` step. |
| GPT-J Fine Tuning | Fine-tune the 6B-parameter GPT-J model on a Ray cluster. |
## License

`metaflow-ray` is distributed under the Apache License.