
Reproducing BC baseline results on soft body envs #5

Open
etaoxing opened this issue Jan 20, 2023 · 6 comments

@etaoxing

I'm having trouble reproducing results on Pinch-v0. I was able to get Write-v0 and Hang-v0 working though.

Here are the commands I'm running: demo conversion with general_soft_body_envs.txt, then training via scripts/example_training/bc_soft_body_pointcloud.sh:

python maniskill2_learn/apis/run_rl.py configs/brl/bc/rgbd_soft_body.py \
--work-dir workdir/ --gpu-ids 0 \
--cfg-options "env_cfg.env_name=Pinch-v0" "env_cfg.obs_mode=rgbd" "env_cfg.n_points=1200" \
"env_cfg.reward_mode=dense" \
"env_cfg.control_mode=pd_ee_delta_pose" \
"replay_cfg.buffer_filenames=../ManiSkill2/demos/soft_body_envs/Pinch-v0/trajectory.none.pd_ee_delta_pose_rgbd.h5" \
"replay_cfg.num_samples=50" "replay_cfg.cache_size=1024" \
"eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=True" \
"train_cfg.n_eval=50000" "train_cfg.total_steps=50000" "train_cfg.n_checkpoint=50000" "train_cfg.n_updates=500"

I've also tried with env_cfg.control_mode=pd_ee_target_delta_pose

How much memory would be needed to run with replay_cfg.num_samples=-1? Or is there a better way of training with all 1500+ demos using replay_cfg.dynamic_loading=True?

@xuanlinli17
Collaborator

xuanlinli17 commented Jan 22, 2023

Pinch-v0 BC demos contain target images, so they consume a lot of memory.

I recommend modifying the demo replay buffer config in this case:

demo_replay_cfg=dict(
    type="ReplayMemory",
    capacity=int(2e4),
    num_samples=-1,
    cache_size=int(2e4),
    dynamic_loading=True,
    synchronized=False,
    keys=["obs", "actions", "dones", "episode_dones"],
    buffer_filenames=[
        "PATH_TO_DEMO.h5",
    ],
),

i.e., pass demo_replay_cfg.dynamic_loading=True demo_replay_cfg.capacity=20000 demo_replay_cfg.cache_size=20000 demo_replay_cfg.num_samples=-1 through --cfg-options; this will load all demo data dynamically.

For BC, there is only 1 replay buffer, so replace the above demo_replay_cfg with replay_cfg.
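
For example, a BC run with dynamic loading could look roughly like this (a sketch only; the demo path and the capacity/cache sizes are placeholders to adjust for your demo file and available RAM):

python maniskill2_learn/apis/run_rl.py configs/brl/bc/pointnet_soft_body.py \
--work-dir workdir/ --gpu-ids 0 \
--cfg-options "env_cfg.env_name=Pinch-v0" "env_cfg.obs_mode=pointcloud" \
"replay_cfg.buffer_filenames=PATH_TO_DEMO.h5" \
"replay_cfg.dynamic_loading=True" "replay_cfg.synchronized=False" \
"replay_cfg.capacity=20000" "replay_cfg.cache_size=20000" "replay_cfg.num_samples=-1"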

Note that for non-BC algorithms, demo_replay_cfg is not the same as replay_cfg; the demo replay buffer is a separate buffer from the (online) replay buffer used to collect online environment trajectories.

@etaoxing
Author

etaoxing commented Jan 24, 2023

I'm running on a machine with a 3090 and 64GB RAM, so I lowered to replay_cfg.capacity=5000 and replay_cfg.cache_size=5000. Pinch-v0/trajectory.none.pd_ee_delta_pose_pointcloud.h5 is 40GB.

python maniskill2_learn/apis/run_rl.py configs/brl/bc/pointnet_soft_body.py \
--work-dir workdir/ --gpu-ids 0 \
--cfg-options "env_cfg.env_name=Pinch-v0" "env_cfg.obs_mode=pointcloud" "env_cfg.n_points=1200" "env_cfg.obs_frame=ee" \
"env_cfg.reward_mode=dense" \
"env_cfg.control_mode=pd_ee_delta_pose" \
"replay_cfg.buffer_filenames=../ManiSkill2/demos/soft_body_envs/Pinch-v0/trajectory.none.pd_ee_delta_pose_pointcloud.h5" \
"replay_cfg.capacity=5000" "replay_cfg.num_samples=-1" "replay_cfg.cache_size=5000" \
"replay_cfg.dynamic_loading=True" "replay_cfg.synchronized=False" \
"eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=True" \
"train_cfg.n_eval=50000" "train_cfg.total_steps=50000" "train_cfg.n_checkpoint=50000" "train_cfg.n_updates=500"

I'm still unable to train the point cloud BC baseline. GPU utilization shows 0%, occasionally increasing to 3-8%. The log is attached.

20230124_115441-train.log

@xuanlinli17
Collaborator

Does it report anything if you set train_cfg.n_updates=5?

If it does, then it means it's training; it's just really slow due to file I/O.

BTW, is the demo stored on an SSD?

@xuanlinli17
Collaborator

Also, you can do some custom processing in env wrappers and implement new architectures if you implement your own approach, since Pinch-v0 indeed has the largest observation space among all envs (in the default wrapper, we only downsample the observation point cloud, not target_rgb, target_points, or target_depth).
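
A minimal sketch of such a wrapper (assuming a gym-style env whose observation dict exposes a "target_points" array; the actual observation layout and wrapper conventions in maniskill2_learn may differ):

import gym
import numpy as np

class DownsampleTargetPoints(gym.ObservationWrapper):
    """Hypothetical wrapper that subsamples the Pinch-v0 target point cloud."""

    def __init__(self, env, n_target_points=1200):
        super().__init__(env)
        self.n_target_points = n_target_points

    def observation(self, obs):
        pts = obs.get("target_points")  # assumed key; check your obs dict
        if pts is not None and len(pts) > self.n_target_points:
            # Uniform random subsampling; farthest point sampling may preserve
            # the target shape better but is slower.
            idx = np.random.choice(len(pts), self.n_target_points, replace=False)
            obs["target_points"] = pts[idx]
        # target_rgb / target_depth are images and could be resized separately.
        return obs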


@etaoxing
Author

Yes, the demos are on the root SSD.

It seems to start training, but grad_norm becomes 0 pretty quickly. This is true for both env_cfg.control_mode=pd_ee_target_delta_pose and env_cfg.control_mode=pd_ee_delta_pose.

python maniskill2_learn/apis/run_rl.py configs/brl/bc/pointnet_soft_body.py --work-dir workdir/ \
--gpu-ids 0 --cfg-options "env_cfg.env_name=Pinch-v0" "env_cfg.obs_mode=pointcloud" \
"env_cfg.n_points=1200" "env_cfg.obs_frame=ee" \
"env_cfg.reward_mode=dense" "env_cfg.control_mode=pd_ee_target_delta_pose" \
"replay_cfg.buffer_filenames=../ManiSkill2/demos/soft_body_envs/Pinch-v0/trajectory.none.pd_ee_target_delta_pose_pointcloud.h5" \
"replay_cfg.capacity=2000" "replay_cfg.num_samples=-1" "replay_cfg.cache_size=2000" \
"replay_cfg.dynamic_loading=True" "replay_cfg.synchronized=False" \
"eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=True" "train_cfg.n_eval=50000" \
"train_cfg.total_steps=50000" "train_cfg.n_checkpoint=50000" "train_cfg.n_updates=10"

20230119_102001-train.log

@xuanlinli17
Collaborator

xuanlinli17 commented Jan 27, 2023

I was able to reproduce it for point cloud BC. For RGB-D BC, though, the gradient does not fall to zero (RGB-D BC also requires more memory).
