v0.1.1
What's Changed
- [Feature] Stacking specs by @vmoens in #892
- [Feature] Multicollector interruptor by @albertbou92 in #963
- [BugFix] VMAS api fix by @matteobettini in #978
- [CI] Fix D4RL tests in CI by @vmoens in #976
- [CI] Fix CI by @vmoens in #982
- [Refactor] Binary spec inherits from discrete spec by @matteobettini in #984
- [Feature] _DataCollector -> DataCollectorBase by @vmoens in #985
- [Feature] Discrete SAC by @BY571 in #882
- [Refactor, Doc] Refactor refs to SafeModule to TensorDictModule unless necessary by @vmoens in #986
- [BugFix] Quickfix by @vmoens in #991
- [Feature] Add Dropout to MLP module by @BY571 in #988
- [Feature] Warn when collectors collect more frames than requested by @matteobettini in #989
- [BugFix] make "_reset", "step_count", and other done_based keys follow done_spec by @matteobettini in #981
- [Feature] Bandit datasets by @vmoens in #912
- [BugFix] Fix sampling in PPO tutorial by @vmoens in #996
- [Refactor] Refactor losses (value function, doc, input batch size) by @vmoens in #987
- [BugFix,Feature,Doc] Fix replay buffers sampling info, docstrings and iteration by @vmoens in #1003
- [Feature] Replace ValueError by warning in collectors when total_frames is not an exact multiple of frames_per_batch by @albertbou92 in #999
- [BugFix] Only call replay buffer transforms when there are by @vmoens in #1008
- [BugFix] Patch tests in 1008 by @vmoens in #1009
- [Feature] Multidim value functions by @vmoens in #1007
- [BugFix] Fix exploration (OU and Gaussian) by @vmoens in #1006
- [CI] Fix python version in habitat by @vmoens in #1010
- Advantages pass time_dim and docfix by @matteobettini in #1014
- [Refactor] Faster transformed distributions by @vmoens in #1017
- [WIP, CI] Upgrade cuda channel by @vmoens in #1019
- [BugFix] Fix collector reset with truncation by @vmoens in #1021
- [Refactor] Improve collector performance by @matteobettini in #1020
- [BugFix] Fix params and buffer casting for policies by @vmoens in #1022
- [Feature] PPO allow entropy logging when entropy_coeff is 0 by @matteobettini in #1025
- [Feature] Distributed data collector (ray) by @albertbou92 in #930
- [Refactor] Minor changes in tensordict construction by @vmoens in #1029
- [CI] Fix Brax 0.9.0 by @vmoens in #1011
- [Feature] Multiagent API in vmas by @matteobettini in #983
- [Feature] Benchmarking workflow by @vmoens in #1028
- [Benchmark] Fix adv benchmark by @vmoens in #1030
- [Doc] Refactor DDPG and DQN tutos to narrow the scope by @vmoens in #979
- Revert "[Doc] Refactor DDPG and DQN tutos to narrow the scope" by @vmoens in #1032
- [BugFix] Advantage normalisation in ClipPPOLoss is done after computing gain1 by @albertbou92 in #1033
- [BugFix] Codecov SHA error by @vmoens in #1035
- [Doc] DDPG and DQN refactoring -- Doc cleaning by @vmoens in #1036
- [BugFix,CI] Fix macos codecov install by @vmoens in #1039
- [BugFix] kwargs update in distributed collectors by @vmoens in #1040
- [Feature] make_composite_from_td by @vmoens in #1042
- [Refactor] Import envpool locally to avoid importing gym at root level by @vmoens in #1041
- [Minor] Fix a typo by @FrankTianTT in #1046
- [BugFix] Fix param tying in loss modules by @vmoens in #1037
- [Refactor] less ad-hoc disable_env_checker check by @vmoens in #1047
- [Refactor] Improve distributed collectors by @vmoens in #1044
- [Doc] Document tensordict modules by @vmoens in #1053
- [Doc] Minor changes to contributing.md by @vmoens in #1054
- [Doc] A bit more doc on modules by @vmoens in #1056
- [Refactor] Import enum and interaction_type utils by @Goldspear in #1055
- [Feature] Deduplicate calls to common layers in PPO by @vmoens in #1057
- [BugFix] CompositeSpec nested key deletion by @btx0424 in #1059
- [Feature] Add MaskedCategorical distribution by @xiaomengy in #1012
- [Refactor] resetting envs in collectors always passes the _reset entry by @vmoens in #1061
- [Refactor] Better integration of QValue tools by @vmoens in #1063
- MUJOCO_INSTALLATION.md: Fix typo by @traversaro in #1064
- [Refactor] Removes "reward" from root tensordicts by @vmoens in #1065
- [Test] Fix tests for older pytorch versions by @vmoens in #1066
- [Feature] Reward2go Transform by @BY571 in #1038
- [CI] Reduce tests by @vmoens in #1071
- [Feature] Skip existing for advantage modules by @vmoens in #1070
- [BugFix] Fix parallel env data passing on cuda by @vmoens in #1024
- [Refactor] Deprecate interaction_mode by @vmoens in #1067
- [Doc] Update KB: cannot find -lGL by @vmoens in #1073
- [Doc] fix figures display issues in documentation of actors.py by @DamienAllonsius in #1074
- [Example] PPO simplified example by @albertbou92 in #1004
- [Feature] Update td in step (not overwrite) by @vmoens in #1075
- [CI] Remove migrated CircleCI macOS jobs by @seemethere in #1069
- [Feature] Target Return Transform by @BY571 in #1045
- [Test] Fix tensorboard tests with ImageIO 2.26 by @vmoens in #1083
- [Feature] LSTMModule by @vmoens in #1084
- [BugFix] Change default of skip_existing to None by @tcbegley in #1082
- [Example] A2C simplified example by @albertbou92 in #1076
- [BugFix] Fix output_spec transform calls by @vmoens in #1091
- [Feature] Indexing Discrete and OneHot specs by @remidomingues in #1081
- [Refactor] Refactor DQN by @vmoens in #1085
- [Feature] Auto-init updaters and raise a warning if not present by @vmoens in #1092
- [BugFix] Remove false warnings in losses by @vmoens in #1096
- [CI, BugFix] Fix CI warnings and errors by @vmoens in #1100
- [Refactor] Update vmap imports to torch by @vmoens in #1102
- [Refactor] Make advantages non-differentiable by default (except in losses) by @vmoens in #1104
- [Feature] Indexing specs by @remidomingues in #1105
- [BugFix] Fix EnvPool by @vmoens in #1106
- [Feature,Doc] QValue refactoring and QNet + RNN tuto by @vmoens in #1060
- [BugFix] Fix Gym imports by @vmoens in #1023
- [CI] pytest should not skip tests for dependencies by @rohitnig in #1048
- [BugFix, Doc] Fix tutos by @vmoens in #1107
- [CI] Fix tutos (2) by @vmoens in #1109
- [Doc] Fix doc rendering by @vmoens in #1112
- Added the entry for skip-tests in the environment.yml by @rohitnig in #1113
- [CI] Upgrade ubuntu version in GHA by @vmoens in #1116
- Fix in windows unit test by @mischab in #1099
- Revert "Fix in windows unit test" by @mischab in #1117
- [Nova] Lint job on GHA by @osalpekar in #1114
- [Nova] Remove CircleCI Wheels Builds by @osalpekar in #1121
- [BugFix] Set exploration mode to MODE in all losses by default by @vmoens in #1123
- [BugFix] Instruct the value key to PPOLoss by @vmoens in #1124
- [Feature] CatFrames for offline data by @vmoens in #1122
- [CI] Fix windows CI by @vmoens in #1128
- [Refactor] Buffers tensorclass compat and tutorial by @vmoens in #1101
- [Feature] Marking the time dimension by @vmoens in #1095
- [Doc] Add tuto and time dim info in docs by @vmoens in #1130
- [Doc] Fix locked samples from RBs and ccl of tuto by @vmoens in #1132
- [BugFix] Fix unlock in RB by @vmoens in #1135
- [BugFix] extract the info dict from a list by @xmaples in #1131
- [Feature] Added support for vector-based rewards from environments in MO-Gymnasium by @dennismalmgren in #992
- [Versioning] v0.1.1 by @vmoens in #1137
New Contributors
- @FrankTianTT made their first contribution in #1046
- @Goldspear made their first contribution in #1055
- @btx0424 made their first contribution in #1059
- @traversaro made their first contribution in #1064
- @DamienAllonsius made their first contribution in #1074
- @seemethere made their first contribution in #1069
- @remidomingues made their first contribution in #1081
- @rohitnig made their first contribution in #1048
- @mischab made their first contribution in #1099
- @osalpekar made their first contribution in #1114
- @xmaples made their first contribution in #1131
- @dennismalmgren made their first contribution in #992
Full Changelog: v0.1.0...v0.1.1