Ray command execution on Della #10

vineetbansal · 2023-11-17T21:29:05Z

EDIT: Handle after tackling all other issues.

We have a ray.jinja similar to hello.jinja that starts a ray cluster remotely. We need to test it out on Della. Once the ray cluster has started, we should be able to run remote ray commands like these:

    import ray
    import mymodule  # local package; shipped to the cluster via py_modules

    ray.init("ray://localhost:10001", runtime_env={"py_modules": [mymodule]})

    # At this point 'mymodule' is available on the remote node
    from mymodule.api import hello_ray, hello_gpu

    future = hello_ray.remote()
    result = ray.get(future)
    print(result)

Note that mymodule is local code that only sits on your machine (and is not part of the wbi module).

The ray://... address should point to the head node of the Ray cluster, i.e. the compute node running the Ray head process (not the head node of Della). To find out which compute node was assigned as the Ray head node, look at the output of the Slurm job (the ray template prints all this information to stdout). Let's say it's della-l07g3. Then you can set up local port forwarding like so:

    ssh -J della -N -L "10001:localhost:10001" "della-l07g3"

i.e. forward port 10001 on localhost to della-l07g3 through the jump server della. The address passed to ray.init can then simply be ray://localhost:10001.
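If the tunnel needs to be opened from a script rather than by hand, the same command can be assembled in Python. This is only a sketch: the helper names build_tunnel_command and ray_client_address are made up for illustration and are not part of the existing templates, and the node name is the example from above.

```python
import shlex


def build_tunnel_command(compute_node, jump_host="della", port=10001):
    """Return the ssh argv that forwards local `port` to `compute_node`.

    Hypothetical helper: it mirrors the manual ssh command, nothing more.
    """
    return [
        "ssh",
        "-J", jump_host,                    # hop through the jump server
        "-N",                               # no remote command, forwarding only
        "-L", f"{port}:localhost:{port}",   # local port -> same port on the node
        compute_node,
    ]


def ray_client_address(port=10001):
    """Address to hand to ray.init once the tunnel is up."""
    return f"ray://localhost:{port}"


cmd = build_tunnel_command("della-l07g3")
print(shlex.join(cmd))
print(ray_client_address())
```

The returned list can be passed to subprocess.Popen to keep the tunnel open in the background while the Ray client is connected.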

The current template activates an environment:

    conda activate ray

For this to work for all users, the environment has to live in a location that everyone in the group can read, so that the template can do something like conda activate /tigress/LEIFER/path/to/conda/env.
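A quick way to sanity-check a candidate location is to look at its group permission bits. The sketch below is an assumption about how one might do that; the function name group_can_use is hypothetical, and it only inspects the path it is given, not its parent directories or the files inside the environment.

```python
import os
import stat


def group_can_use(path):
    """Return True if the group permission bits allow reading `path`.

    For a directory, the group also needs the execute bit to enter it.
    Only the given path is checked; parents and contents are not.
    """
    mode = os.stat(path).st_mode
    readable = bool(mode & stat.S_IRGRP)
    if stat.S_ISDIR(mode):
        return readable and bool(mode & stat.S_IXGRP)
    return readable
```

For a shared conda environment this would be called on the environment directory itself, for example the path passed to conda activate.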


vineetbansal mentioned this issue Nov 17, 2023

Create hello world python template #11

Closed

vineetbansal assigned anushka255 Nov 19, 2023

vineetbansal unassigned anushka255 Dec 5, 2023
