This repository has been archived by the owner on Oct 31, 2023. It is now read-only.
In core/environment.py, done defaults to torch.ones, instead of torch.zeros. This means that in monobeast's act(), the first replay entry each actor creates has a done value of 1. Then when episode returns are reported, those episodes have rewards of 0, though the episodes never really happened at all.
(By the way, excellent repo! Very useful.)
Thanks for your interest in TorchBeast and for your kind words.
You are correct. The reason done is True at t=0 is that done == True iff "episode just started", which is the case for the first episode, too.
If this is a problem in your case, I'm happy to accept a patch that turns the torch.ones into torch.zeros, as I don't believe this matters currently. (I suppose the LSTM/agent state needs to be reset in the same way it is initialized for it to not matter at all, but all of this affects the first episode only.)
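For anyone who wants to see the effect described in this thread in isolation, here is a minimal sketch in plain Python. It is not TorchBeast's code: `collect_episode_returns` and its arguments are hypothetical names, and it assumes (as a simplification) that episode returns are reported by selecting every rollout entry whose done flag is set.

```python
def collect_episode_returns(initial_done, rewards, env_dones):
    """Simulate one actor's rollout and report returns at entries flagged done.

    Hypothetical simplification of the behaviour discussed above; the real
    environment wrapper and act() loop work on tensors, not Python lists.
    """
    # Entry 0 is the initial environment output, written before any step.
    dones = [initial_done]
    episode_returns = [0.0]
    running_return = 0.0
    for reward, done in zip(rewards, env_dones):
        running_return += reward
        dones.append(done)
        episode_returns.append(running_return)
        if done:
            # The next episode starts from a return of zero.
            running_return = 0.0
    # Reporting step: keep the return at every entry whose done flag is set.
    return [r for r, d in zip(episode_returns, dones) if d]


rewards = [1.0, 1.0, 1.0, 2.0, 2.0]
env_dones = [False, False, True, False, True]

# Initial done = True (the torch.ones default): one extra zero-return "episode".
with_ones = collect_episode_returns(True, rewards, env_dones)
# Initial done = False (the proposed torch.zeros): only the two real episodes.
with_zeros = collect_episode_returns(False, rewards, env_dones)

print(with_ones)   # [0.0, 3.0, 4.0]
print(with_zeros)  # [3.0, 4.0]
```

Under these assumptions, the only difference is the single leading 0.0 per actor, which matches the point that this affects the first episode only.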