Questions about infinite bootstrap #22

Open
geekyutao opened this issue Sep 18, 2021 · 2 comments
Comments

@geekyutao

Hi, thank you for your code. I'm a little confused about the infinite bootstrap in

curl/train.py

Line 269 in 8416d6e

done_bool = 0 if episode_step + 1 == env._max_episode_steps else float(
.
Won't this be wrong when sampling a transition from the end of an episode, where next_obs would be the start observation of the next episode? It seems this case is simply ignored.

@yueyang130

It seems that in DMControl there is no true terminal state; episodes only end at the step limit. That is why the code allows infinite bootstrapping.
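To illustrate what "infinite bootstrap" means here, below is a minimal sketch of a one-step TD target, not the repository's actual update code. The function name `td_target`, its parameters, and the discount value are assumptions made for this example. When `done_bool` is 0 the target keeps bootstrapping from the next state's value, even if the episode was cut off by the step limit.

```python
def td_target(reward, next_value, done_bool, discount=0.99):
    """One-step TD target (hypothetical helper for illustration).

    The bootstrap term is kept when done_bool == 0 and dropped when done_bool == 1.
    """
    not_done = 1.0 - done_bool
    return reward + not_done * discount * next_value


# Step-limit cutoff stored with done_bool = 0: the target still bootstraps.
print(td_target(reward=1.0, next_value=10.0, done_bool=0.0))

# A true terminal state (done_bool = 1) would remove the bootstrap term.
print(td_target(reward=1.0, next_value=10.0, done_bool=1.0))
```

If the step-limit transition were stored with `done_bool = 1` instead, the value of the last state would be treated as zero, even though the task itself has not ended.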

@yueyang130

yueyang130 commented Aug 6, 2022

Regarding @geekyutao's question: next_obs will never be the start observation of the next episode. At the last timestep of an episode, next_obs is the terminal state and done is true (note that done_bool is always false, whereas done is true at the max step). Only after that transition is stored is the env reset and obs set to the start observation.
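The ordering described above can be checked with a small simulation. This is a sketch under stated assumptions, not the code from train.py: `ToyEnv` and `collect` are hypothetical stand-ins, observations are `(episode, step)` tuples so that episode boundaries are visible, and the `float(done)` fallback is an assumed completion of the truncated line quoted in the issue.

```python
class ToyEnv:
    """Hypothetical time-limited env; observations are (episode, step) tuples."""

    def __init__(self, max_episode_steps=3):
        self._max_episode_steps = max_episode_steps
        self._episode = -1
        self._t = 0

    def reset(self):
        self._episode += 1
        self._t = 0
        return (self._episode, self._t)

    def step(self, action):
        self._t += 1
        done = self._t >= self._max_episode_steps
        return (self._episode, self._t), 0.0, done, {}


def collect(env, num_steps):
    """Store (obs, next_obs, done_bool); reset only at the start of the next iteration."""
    buffer = []
    done = True  # forces a reset on the first iteration
    obs, episode_step = None, 0
    for _ in range(num_steps):
        if done:
            obs = env.reset()
            done = False
            episode_step = 0
        next_obs, reward, done, _ = env.step(action=0)
        # Assumed form: 0 at the step limit, otherwise the env's own done flag.
        done_bool = 0.0 if episode_step + 1 == env._max_episode_steps else float(done)
        buffer.append((obs, next_obs, done_bool))
        obs = next_obs
        episode_step += 1
    return buffer


transitions = collect(ToyEnv(max_episode_steps=3), num_steps=7)
for obs, next_obs, done_bool in transitions:
    print(obs, "->", next_obs, "done_bool =", done_bool)
```

In this sketch every stored pair has the same episode index in `obs` and `next_obs`, because the reset happens after the last transition of an episode has already been added to the buffer.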
