This repository has been archived by the owner on Sep 1, 2024. It is now read-only.
In the for loop that increments the steps of the current epoch, the steps_epoch iteration variable is not reset after we observe a termination (done=True).
As a result, the next epoch starts at steps_epoch + 1, the step at which the previous one ended,
and it is therefore shorter than the actual epoch length (i.e. epoch_length - steps_epoch).
    for steps_epoch in range(cfg.overrides.epoch_length):
        if steps_epoch == 0 or done:
            obs, done = env.reset(), False
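To make the observation concrete, here is a minimal sketch that replays the same loop shape against a stub environment. `StubEnv`, its `horizon` parameter, and `run_epoch` are hypothetical names made up for this illustration and are not part of the library. It only records at which values of `steps_epoch` a reset happens.

```python
class StubEnv:
    """Hypothetical environment whose episodes terminate after `horizon` steps."""

    def __init__(self, horizon):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0  # dummy observation

    def step(self):
        self.t += 1
        done = self.t >= self.horizon
        return self.t, done


def run_epoch(env, epoch_length):
    """Same loop shape as the snippet above; returns the steps at which resets occur."""
    reset_steps = []
    obs, done = None, False
    for steps_epoch in range(epoch_length):
        if steps_epoch == 0 or done:
            obs, done = env.reset(), False
            reset_steps.append(steps_epoch)
        obs, done = env.step()
    return reset_steps


resets = run_epoch(StubEnv(horizon=4), epoch_length=10)
print(resets)  # [0, 4, 8]
```

In this toy run the counter keeps going after each termination, so the rollout that starts at step 8 only gets the remaining 2 steps before the loop ends.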
Is this behaviour desired?
Expected
If it is not desired, we would propose something like this, which sets steps_epoch = 0:
    for steps_epoch in range(cfg.overrides.epoch_length):
        if steps_epoch == 0 or done:
            steps_epoch = 0
            obs, done = env.reset(), False
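One caveat on the proposal above, as a hedged side note: in Python, assigning to the loop variable inside a `for ... in range(...)` loop only rebinds the name for the rest of that iteration. The next iteration still receives the next value from `range`. The sketch below illustrates this with two hypothetical helpers (`visited_with_for` and `visited_with_while` are made up for this example). The `while` version shows one way a counter could actually be restarted, assuming that is the desired behaviour.

```python
def visited_with_for(n, reset_at):
    """Reassigning the loop variable does not restart a range-based for loop."""
    visited = []
    for i in range(n):
        visited.append(i)
        if i == reset_at:
            i = 0  # rebinding only; range() still supplies the next value
    return visited


def visited_with_while(n, reset_at, max_resets=1):
    """A while loop owns its counter, so resetting it really restarts the count."""
    visited = []
    i = 0
    resets = 0
    while i < n:
        visited.append(i)
        if i == reset_at and resets < max_resets:
            i = 0
            resets += 1
            continue
        i += 1
    return visited


print(visited_with_for(5, reset_at=2))    # [0, 1, 2, 3, 4]
print(visited_with_while(5, reset_at=2))  # [0, 1, 2, 0, 1, 2, 3, 4]
```

The `max_resets` guard is only there to keep the toy loop from restarting forever. In a training loop, the environment's own termination signal would play that role.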
Thanks!:)
Observed in: mbpo.py, line 199