Hi MetaDrive team,

I believe I have discovered a bug in resetting MetaDriveEnv that results in nondeterminism.

The MetaDrive simulation is supposed to be deterministic, but even when I reuse the environment and reset it with the same seed, the resulting traces are not identical. Consider the following code, adapted from the examples:

    try:
        env = MetaDriveEnv(config={"map": "C",
                                   "num_scenarios": n_scenarios})
        for rep in range(n_scenarios):
            obs, step_info = env.reset(seed)
            while True:
                # get action from expert driving, or a dummy action
                action = expert(env.agent, deterministic=True) if expert_driving else [0, 0.33]
                obs, reward, tm, tr, step_info = env.step(action)
                traces.append(step_info)
                if tm or tr:
                    break
    finally:
        env.close()

When analyzing the traces (the step info for each timestep) from different repetitions, I found slight differences, probably coming from floating-point arithmetic. The differences (errors) between traces grow the longer the episode runs.
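
For anyone who wants to check this kind of divergence, here is a minimal sketch of a trace comparison. It assumes each trace is a list of step-info dicts, one per timestep, and compares only their plain numeric fields. The helper name `first_divergence` and the `tol` parameter are hypothetical, not part of MetaDrive.

```python
import math
from numbers import Real


def first_divergence(trace_a, trace_b, tol=0.0):
    """Return (step, key, value_a, value_b) for the first numeric field
    that differs between two traces, or None if they match.

    Hypothetical helper: a trace is assumed to be a list of dicts,
    one dict of step info per timestep.
    """
    for step, (info_a, info_b) in enumerate(zip(trace_a, trace_b)):
        # Compare only keys present in both dicts, in a stable order.
        for key in sorted(info_a.keys() & info_b.keys()):
            a, b = info_a[key], info_b[key]
            # Skip booleans and non-numeric values (strings, arrays, ...).
            if isinstance(a, bool) or isinstance(b, bool):
                continue
            if isinstance(a, Real) and isinstance(b, Real):
                # With tol=0.0 this is an exact equality check.
                if not math.isclose(a, b, rel_tol=0.0, abs_tol=tol):
                    return step, key, a, b
    # Identical up to the shorter trace: report a length mismatch, if any.
    if len(trace_a) != len(trace_b):
        shorter = min(len(trace_a), len(trace_b))
        return shorter, "<episode length>", len(trace_a), len(trace_b)
    return None
```

Calling it with `tol=0.0` tests for bit-identical numeric fields, which is the property a deterministic simulation should satisfy. A small positive `tol` instead shows at which step the drift first exceeds a chosen threshold.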
Suspecting that the `.reset()` function doesn't clear the state properly, I started initializing the environment for each repetition and closing it at the end.

    try:
        for rep in range(n_scenarios):
            env = MetaDriveEnv(config={"map": "C",
                                       "num_scenarios": n_scenarios})
            obs, step_info = env.reset(seed)
            while True:
                # get action from expert driving, or a dummy action
                action = expert(env.agent, deterministic=True) if expert_driving else [0, 0.33]
                obs, reward, tm, tr, step_info = env.step(action)
                if tm or tr:
                    break
            env.close()
    finally:
        pass

This solved the issue: the traces produced are now exactly the same (fully deterministic).
Please see my notebook reproducing the bug.
Conda env