This repository has been archived by the owner on Feb 5, 2024. It is now read-only.

Ray command execution on Della #10

Open
vineetbansal opened this issue Nov 17, 2023 · 0 comments


vineetbansal commented Nov 17, 2023

EDIT: Handle after tackling all other issues.

We have a ray.jinja template, similar to hello.jinja, that starts a Ray cluster remotely. We need to test it on Della. Once the Ray cluster has started, we should be able to run remote Ray commands like these:

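    import ray
    import mymodule  # local package; shipped to the cluster via py_modules

    # Connect through the forwarded port and upload 'mymodule' to the cluster
    ray.init("ray://localhost:10001", runtime_env={"py_modules": [mymodule]})

    # At this point 'mymodule' is available on the remote node
    from mymodule.api import hello_ray, hello_gpu

    # Schedule the remote task, then block until its result is ready
    future = hello_ray.remote()
    result = ray.get(future)
    print(result)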

Note that mymodule is local code that only sits on your machine (and is not part of the wbi module).

The ray://... part should point to the head node of the Ray cluster. Normally this is the compute node running the Ray head process (not the head node of Della itself). To find out which compute node was assigned as the Ray head node, look at the output of the Slurm job (the ray template writes all this information to stdout). Let's say it's della-l07g3. Then you can set up local port forwarding like so:

    ssh -J della -N -L "10001:localhost:10001" "della-l07g3"

i.e. forward port 10001 on localhost to port 10001 on della-l07g3 through the jump server della. The address passed to ray.init can then simply be ray://localhost:10001.

The current template activates an environment:

conda activate ray

For this to work for all users, the environment has to live in a location that is readable by everyone in the group, so that the template can do something like conda activate /tigress/LEIFER/path/to/conda/env.
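
A quick way to sanity-check the "readable by the group" requirement might look like the sketch below. is_group_readable is a hypothetical helper, and it only inspects the permission bits of the path itself; it does not check parent directories, the files inside the environment, or any ACLs the filesystem may apply.

```python
import os
import stat


def is_group_readable(path):
    """Return True if the group permission bits allow reading `path`.

    For directories the group also needs the execute (search) bit,
    otherwise members cannot enter the directory.
    """
    mode = os.stat(path).st_mode
    if not mode & stat.S_IRGRP:
        return False
    if stat.S_ISDIR(mode) and not mode & stat.S_IXGRP:
        return False
    return True


# Possible usage, with the shared environment path substituted in:
#   is_group_readable("/tigress/LEIFER/path/to/conda/env")
```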
