
Need testing? #9

Closed · 0xsamgreen opened this issue May 11, 2019 · 26 comments
@0xsamgreen

Hi, thank you for making this port :)

In your conversation with Danijar, I read that you're limited in your ability to test because of GPU availability. I'm interested in building on your code, and I'd be happy to help run tests for you. I have four Titan Xps I could dedicate to it for a bit. My limitation is that I don't have a MuJoCo license (I'm working on it), so testing would be limited to Gym environments.

@Kaixhin
Owner

Kaixhin commented May 11, 2019

That would be much appreciated! Unfortunately I'm still waiting for the latest results from Danijar (fixing the bug in the RNN would improve upon the original results), and in order to check that this code is fine we'd want to compare results, which means the DeepMind Control Suite.

If you're interested in getting baseline results for your own work, then perhaps it would be worth having some results on Pendulum-v0 and MountainCarContinuous-v0 (both symbolic and visual observations) anyway? We currently have no idea what the appropriate hyperparameters are, and as someone who also doesn't have easy access to a MuJoCo license, I know it'd be nice to have some completely open-source reference results.

@0xsamgreen
Author

Hi @Kaixhin,

I'm running headless. I'm able to run symbolic mode fine, but I can't run in non-symbolic mode. I have render set to False, but I get the following error:

                          Options
                          seed: 1
                          disable_cuda: False
                          env: Pendulum-v0
                          symbolic_env: False
                          max_episode_length: 1000
                          experience_size: 1000000
                          activation_function: relu
                          embedding_size: 1024
                          hidden_size: 200
                          belief_size: 200
                          state_size: 30
                          action_repeat: 2
                          action_noise: 0.3
                          episodes: 2000
                          seed_episodes: 5
                          collect_interval: 100
                          batch_size: 50
                          chunk_size: 50
                          overshooting_distance: 50
                          overshooting_kl_beta: 1
                          overshooting_reward_scale: 1
                          global_kl_beta: 0.1
                          free_nats: 2
                          learning_rate: 0.001
                          grad_clip_norm: 1000
                          planning_horizon: 12
                          optimisation_iters: 10
                          candidates: 1000
                          top_candidates: 100
                          test_interval: 25
                          test_episodes: 10
                          checkpoint_interval: 25
                          checkpoint_experience: False
                          load_experience: False
                          load_checkpoint: 0
                          render: False
Traceback (most recent call last):
  File "main.py", line 85, in <module>
    observation, done, t = env.reset(), False, 0
  File "/home/sgreen/working/planet.pt/env.py", line 87, in reset
    return torch.tensor(cv2.resize(self._env.render(mode='rgb_array'), (64, 64), interpolation=cv2.INTER_LINEAR).transpose(2, 0, 1), dtype=torch.float32).div_(255).unsqueeze(dim=0)
  File "/ascldap/users/sgreen/anaconda3/envs/planet.pt/lib/python3.7/site-packages/gym/core.py", line 249, in render
    return self.env.render(mode, **kwargs)
  File "/ascldap/users/sgreen/anaconda3/envs/planet.pt/lib/python3.7/site-packages/gym/envs/classic_control/pendulum.py", line 61, in render
    from gym.envs.classic_control import rendering
  File "/ascldap/users/sgreen/anaconda3/envs/planet.pt/lib/python3.7/site-packages/gym/envs/classic_control/rendering.py", line 27, in <module>
    from pyglet.gl import *
  File "/ascldap/users/sgreen/anaconda3/envs/planet.pt/lib/python3.7/site-packages/pyglet/gl/__init__.py", line 239, in <module>
    import pyglet.window
  File "/ascldap/users/sgreen/anaconda3/envs/planet.pt/lib/python3.7/site-packages/pyglet/window/__init__.py", line 1896, in <module>
    gl._create_shadow_window()
  File "/ascldap/users/sgreen/anaconda3/envs/planet.pt/lib/python3.7/site-packages/pyglet/gl/__init__.py", line 208, in _create_shadow_window
    _shadow_window = Window(width=1, height=1, visible=False)
  File "/ascldap/users/sgreen/anaconda3/envs/planet.pt/lib/python3.7/site-packages/pyglet/window/xlib/__init__.py", line 166, in __init__
    super(XlibWindow, self).__init__(*args, **kwargs)
  File "/ascldap/users/sgreen/anaconda3/envs/planet.pt/lib/python3.7/site-packages/pyglet/window/__init__.py", line 501, in __init__
    display = get_platform().get_default_display()
  File "/ascldap/users/sgreen/anaconda3/envs/planet.pt/lib/python3.7/site-packages/pyglet/window/__init__.py", line 1845, in get_default_display
    return pyglet.canvas.get_display()
  File "/ascldap/users/sgreen/anaconda3/envs/planet.pt/lib/python3.7/site-packages/pyglet/canvas/__init__.py", line 82, in get_display
    return Display()
  File "/ascldap/users/sgreen/anaconda3/envs/planet.pt/lib/python3.7/site-packages/pyglet/canvas/xlib.py", line 86, in __init__
    raise NoSuchDisplayException('Cannot connect to "%s"' % name)
pyglet.canvas.xlib.NoSuchDisplayException: Cannot connect to "None"

Are you also running headless?

@Kaixhin
Owner

Kaixhin commented May 13, 2019

I am not running headless, as I'm using gym's render functionality to get image-based observations. I'm afraid you'll need to run with either a real or fake display.
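
For anyone hitting the same pyglet error: one workaround on a headless machine is a virtual framebuffer, either by launching the script under xvfb-run or by starting a display from Python. A minimal sketch using the pyvirtualdisplay package (an extra dependency, not part of this repo):

```python
# Minimal sketch: start a fake X display before gym/pyglet try to connect to one.
# Assumes `pip install pyvirtualdisplay` plus the system package `xvfb`.
from pyvirtualdisplay import Display

virtual_display = Display(visible=0, size=(1024, 768))
virtual_display.start()  # exports DISPLAY so pyglet can create its shadow window

# ... now constructing the env and calling render(mode='rgb_array') should work
```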

@0xsamgreen
Author

I have a MuJoCo license now! Is there a sweep of MuJoCo environment tests you would like run?

@Kaixhin
Owner

Kaixhin commented May 17, 2019

Great news! I've started a run on walker-walk, so feel free to take any of the others.

@0xsamgreen
Author

Will do. Other than the environment, should I use the default parser arguments of your latest commit?

@Kaixhin
Owner

Kaixhin commented May 17, 2019

The latest commit should have the same settings as the PlaNet camera-ready, so yes. The only other change is the recommended action repeat per environment, which you can see in env.py.
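
For reference, a sketch of those per-environment action repeats, using the values reported in the PlaNet paper (the dict name below is illustrative; env.py in this repo is the authoritative source):

```python
# Action repeats per DeepMind Control Suite domain, as reported in the PlaNet paper.
# Illustrative only -- check env.py for the repo's actual definition.
ACTION_REPEATS = {
    'cartpole': 8,
    'reacher': 4,
    'finger': 2,
    'cheetah': 4,
    'ball_in_cup': 6,
    'walker': 2,
}
```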

@0xsamgreen
Author

Thanks, I've started runs on cartpole-swingup, finger-spin, cheetah-run, and ball_in_cup-catch.

@0xsamgreen
Author

Hi @Kaixhin, things are looking good! I will continue to train for another day and then update, but it seems that scores are meeting or approaching Danijar's. Thanks again for making this port.

[test reward plots: cartpole-swingup, cheetah-run, cup-catch, finger-spin]

@Kaixhin
Owner

Kaixhin commented May 18, 2019

Awesome! If possible, do you mind finding a way to send me all the data once done (checkpoints and results)? Perhaps via a file-sharing service; I'll let you know once I've downloaded it all, because it will take up a lot of space. I'll also take the rewards and final models and make them available as a release. I've got results for walker-walk, and have just kicked off a run for reacher-easy.

@0xsamgreen
Author

Here are the final test result plots! I'm looking into sharing the checkpoints and result logs.

[final test reward plots: cartpole-swingup, cheetah-run, cup-catch, finger-spin]

@Kaixhin
Owner

Kaixhin commented May 20, 2019

Awesome! Do let me know if you do something else with PlaNet, but I think the results are good. I'll close this once you get me all the data.

@Kaixhin
Owner

Kaixhin commented May 29, 2019

@SG2 if you still have some capacity, would you be able to run the same environments again with the latest commit? Among various improvements, I've made changes to the image processing to match what was actually done in the original (I missed some of this initially). It should only take half the time now, since I've set the default number of episodes to 1000, as in the camera-ready.

@0xsamgreen
Author

Hi @Kaixhin, I'm on it!

@Kaixhin
Owner

Kaixhin commented May 30, 2019

Would you also be able to get results for walker-walk and reacher-easy? I've got just about enough space on Google Drive to get the results for all 6 tasks, so just email me and I'll share a folder with you that you can put everything into.

@0xsamgreen
Author

Here are my test results for commit ee9b996.

[test reward plots for commit ee9b996: cartpole-swingup, cartpole-balance, cheetah-run, cup-catch, finger-spin, reacher-easy, walker-walk]

@Kaixhin
Owner

Kaixhin commented Jun 5, 2019

Thanks a lot! I've uploaded all figures for releases v1.0 and v1.1. Unfortunately walker-walk doesn't look that good either. I've added notes on the discrepancies to v1.0 - it would be good if you could pass on the data from both sets of experiments; I'll upload the final trained models for both.

@0xsamgreen
Author

No problem, thanks again for the port! Yes, I'll work on getting all the results from v1.0 and v1.1 to you. (I also trained all six agents on v1.0, before doing v1.1.)

@maximecb
Contributor

maximecb commented Jun 6, 2019

Out of curiosity: are these results all comparable to the original PlaNet implementation? Can you explain why the cup-catch performance collapses during the middle of training and then recovers?

@Kaixhin
Owner

Kaixhin commented Jun 6, 2019

Apart from the high variance in cup-catch, which makes it hard to tell without more seeds whether it's the same or a bit worse, results with tag 1.0 seem to be comparable. 1.1, which adds the 5-bit quantisation, noise and observation normalisation/centering, and is hence closer to the original, unfortunately seems to be worse on walker-walk and cup-catch. I've now noted this with the releases. I'm not sure about the cup-catch collapse, but one thing about the task is that the agent either gets the ball in the cup and receives reward, or it doesn't, so the score can vary a lot depending on success at this precise task.
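
For anyone curious, a rough sketch of that preprocessing as described in the PlaNet paper: reduce each 8-bit image to 5 bits, rescale and centre it around zero, and add uniform dequantisation noise. The function name is illustrative, not the repo's actual API:

```python
import torch

def preprocess_observation(images, bit_depth=5):
    # Quantise [0, 255] images down to 2 ** bit_depth levels, rescale to
    # [-0.5, 0.5), then add uniform noise to dequantise, per the PlaNet paper.
    images = images.div(2 ** (8 - bit_depth)).floor()  # drop the low bits
    images = images.div(2 ** bit_depth).sub(0.5)       # rescale and centre
    images = images.add(torch.rand_like(images).div(2 ** bit_depth))  # dequantise
    return images
```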

@longfeizhang617

@SG2 I'm very glad to know that you have successfully run the entire code. However, when I run in non-symbolic mode, training always collapses at around episode 300 or 750 (out of 1000 episodes in total). I don't know the reason. Have you run into this problem? Thank you very much.

@0xsamgreen
Author

@longfeizhang617 I'm sorry to hear that. What environment are you training on? Also, did you let it continue running? You can see in my last result plots that it collapses for cup-catch and then recovers; I never saw it collapse and then stay collapsed forever.

@longfeizhang617

@SG2 Thanks for your attention. The default training environment is Pendulum-v0. I suspect that the planner caused a memory overflow and hence the error, but I'm not sure. I have decreased experience_size to 100000 (the original is 1000000) and changed batch_size/chunk_size/overshooting_distance from 50 to 30, but the issue remains. Could my hardware be the limiting factor? My machine has 16 GB of memory and a GeForce GTX 1080 Ti GPU.
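
A quick back-of-the-envelope check on that memory hypothesis, using the default sizes from the options dump above (whether the buffer stores observations as uint8 or float32 depends on the implementation, so both are shown):

```python
# Rough footprint of the image replay buffer with the default settings:
# 1,000,000 transitions, each holding a 3 x 64 x 64 observation.
experience_size = 1_000_000
obs_elements = 3 * 64 * 64                       # 12,288 values per frame
uint8_gb = experience_size * obs_elements / 1e9  # ~12.3 GB stored as uint8
float32_gb = uint8_gb * 4                        # ~49 GB stored as float32
print(f"uint8: ~{uint8_gb:.1f} GB, float32: ~{float32_gb:.1f} GB")
```

Either way that is close to, or well beyond, 16 GB of RAM, so reducing experience_size is a sensible mitigation.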

@Kaixhin
Owner

Kaixhin commented Jun 18, 2019

@longfeizhang617 I added support for Gym environments so that people can try PlaNet without needing MuJoCo. However, the original paper only includes experiments on the DeepMind Control Suite, so you would have to tune hyperparameters for any other environment. I'll make a note in the README.

@longfeizhang617

@Kaixhin Thank you sincerely. I really haven't got a MuJoCo license, so I'm just trying PlaNet in Gym environments. You've inspired me: perhaps it collapsed because of mismatched hyperparameters, so I will try tuning them. These days I've also been talking with others; at first I suspected something was wrong in the code's iteration, but @SG2 has run it successfully, so I have to keep trying. It's really puzzling.

@Kaixhin Kaixhin closed this as completed Jul 3, 2019
@vballoli

@Kaixhin @0xsamgreen Do you have approximate training time stats for a single/multi-GPU setup on any of the symbolic environments? It'd be really helpful to have training stats for a few environments (symbolic or otherwise) in the README. By the way, thank you for the code and experiments, they're really helpful!
