Recent Releases of sheeprl
sheeprl - v0.5.6
v0.5.6 Release Notes
- Fixed the buffer checkpoint and added the possibility to specify the pre-fill steps upon resuming. Updated the how-tos accordingly in #280
- Updated how-tos in #281
- Fix division by zero when computing sps-train in #283
- Better code naming in #284
- Fix Minedojo actions stacking (and more generally multi-discrete actions) and missing keys in #286
- Fix computation of prefill steps as policy steps in #287
- Fix the Dreamer-V3 imagination notebook in #290
- Added the `ActionsAsObservationWrapper` to let the user add the played actions as observations in #291 (a sketch of the idea follows this list)
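The wrapper name and PR come from the release notes above; the snippet below is only a minimal, hypothetical sketch of the idea (a dict-observation wrapper that exposes the previously played action), not the actual sheeprl implementation.

```python
import gymnasium as gym
import numpy as np


class LastActionToObs(gym.Wrapper):
    """Hypothetical sketch of an actions-as-observations wrapper."""

    def __init__(self, env: gym.Env, key: str = "last_action"):
        super().__init__(env)
        assert isinstance(env.observation_space, gym.spaces.Dict)
        self._key = key
        self.observation_space = gym.spaces.Dict(
            {**env.observation_space.spaces, key: env.action_space}
        )

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        # No previous action at reset: use a zeroed action as placeholder
        obs[self._key] = np.zeros_like(self.env.action_space.sample())
        return obs, info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        obs[self._key] = action
        return obs, reward, terminated, truncated, info
```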
Published by belerico over 1 year ago
sheeprl - v0.5.5
v0.5.5 Release Notes
- Added parallel stochastic in dv3: #225
- Update dependencies and python version: #230, #262, #263
- Added dv3 notebook for imagination and obs reconstruction: #232
- Created citation.cff: #233
- Added replay ratio for off-policy algorithms: #247
- Single strategy for the player (now it is instantiated in the `build_agent()` function): #244, #250, #258
- Proper `terminated` and `truncated` signals management: #251, #252, #253
- Added the possibility to choose whether or not to learn the initial recurrent state: #256
- Added A2C benchmarks: #266
- Added a `prepare_obs()` function to all the algorithms: #267
- Improved code readability: #248, #265
- Bug fixes: #220, #222, #224, #231, #243, #255, #257
Published by michele-milesi almost 2 years ago
sheeprl - v0.5.4
v0.5.4 Release Notes
- Added Dreamer V3 different sizes configs (#208).
- Update torch version: 2.2.1, or any release in 2.0.*/2.1.*.
- Fix observation normalization in dreamer v3 and p2e_dv3 (#214).
- Update README (#215).
- Fix installation and agent evaluation: new commands are made available for agent evaluation, model registration, and for the available agents (#216).
Published by michele-milesi almost 2 years ago
sheeprl - v0.5.3
v0.5.3 Release Notes
- Added benchmarks (#185)
- Added possibility to use a user-defined evaluation file (#199)
- Let the user choose `num_threads` and matmul precision (#203)
- Added Super Mario Bros Environment (#204)
- Fix bugs (#183, #186, #193, #195, #200, #201, #202, #205)
Published by michele-milesi about 2 years ago
sheeprl - v0.5.2
v0.5.2 Release Notes
- Added A2C algorithm (#33).
- Added a new how-to on how to add an external algorithm (no need to clone sheeprl locally) (#175).
- Added optimizations (#177):
  - Metrics are instantiated only when needed.
  - Removed the `torch.cat()` operation between empty and dense tensors in the `MultiEncoder` class.
  - Added the possibility not to test the agent after training.
- Fixed GitHub actions workflow (#180).
- Fixed bugs (#181, #183).
- Added benchmarks with respect to StableBaselines3 (#185).
- Added the `BernoulliSafeMode` distribution, a Bernoulli distribution where the mode is computed safely, i.e. it returns `self.probs > 0.5` without setting any NaN (#186); a minimal sketch follows.
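The release only describes the behaviour; assuming the distribution subclasses `torch.distributions.Bernoulli`, a minimal sketch could look like this (not the actual sheeprl code):

```python
import torch
from torch.distributions import Bernoulli


class BernoulliSafeMode(Bernoulli):
    """Sketch: a Bernoulli whose mode is always well defined."""

    @property
    def mode(self) -> torch.Tensor:
        # Plain thresholding: no NaN when probs == 0.5, unlike the stock Bernoulli.mode
        return (self.probs > 0.5).to(self.probs)
```

The stock `Bernoulli.mode` in PyTorch places NaN wherever `probs == 0.5`, which is exactly what the safe variant avoids.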
Published by michele-milesi about 2 years ago
sheeprl - v0.5.0
v0.5.0 Release Notes
- Added Numpy buffers (#169):
  - The user can now decide whether to use the `torch.as_tensor` function or the `torch.from_numpy` one to convert the Numpy buffer into tensors when sampling (#172); a sketch of the two options follows this list.
- Added optimizations to reduce training time (#168).
- Added the possibility to keep only the last `n` checkpoints in an experiment to avoid filling up the disk (#171).
- Fixed bugs (#167).
- Update documentation.
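Purely illustrative comparison of the two conversion options mentioned above, using plain PyTorch calls (nothing here is sheeprl code):

```python
import numpy as np
import torch

batch = np.random.rand(32, 4).astype(np.float32)  # a sampled Numpy batch

# torch.from_numpy always shares memory with the Numpy array (zero-copy view)
t1 = torch.from_numpy(batch)

# torch.as_tensor also avoids a copy when possible, and accepts dtype/device targets
t2 = torch.as_tensor(batch, dtype=torch.float32)
```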
Published by michele-milesi about 2 years ago
sheeprl - v0.4.9
v0.4.9 Release Notes
- Added `torch>=2.0` as a dependency in #161
- Let `mlflow` be an optional package, i.e. the user can directly install it with `pip install sheeprl[mlflow]`, in #164
- Fixed the `resume_from_checkpoint` in #163. In particular:
  - Added a `save_configs` function to save the configs of the experiment in the `<log_dir>/config.yaml` file.
  - Fixed the resume from checkpoint of all the algorithms (restart from the correct policy step + fix decoupled).
  - Given more flexibility to the p2e finetuning scripts regarding the fabric configs.
  - MineDojo wrapper: avoid modifying the kwargs (to always save consistent configs in the `<log_dir>/config.yaml` file).
  - TensorBoard logger creation: update the logger configs to always save consistent configs in the `<log_dir>/config.yaml` file.
  - Added an `as_dict()` method to the `dotdict` class to get a primitive Python dictionary from a `dotdict` object (a sketch follows this list).
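For illustration only, a minimal `dotdict` with an `as_dict()` method might look like the sketch below; the actual sheeprl class may differ.

```python
class dotdict(dict):
    """Sketch: a dict whose keys are also reachable as attributes."""

    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__

    def as_dict(self) -> dict:
        # Recursively convert nested dotdicts back into plain Python dicts
        return {
            k: v.as_dict() if isinstance(v, dotdict) else v
            for k, v in self.items()
        }


cfg = dotdict({"algo": dotdict({"lr": 1e-4})})
assert cfg.algo.lr == 1e-4
assert isinstance(cfg.as_dict()["algo"], dict)
```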
Published by belerico about 2 years ago
sheeprl - v0.4.8
v0.4.8 Release Notes
- The following config keys have been moved to the specific `algo` config in #158: `cnn_keys`, `mlp_keys`, `per_rank_batch_size`, `per_rank_sequence_length`, `per_rank_num_batches` and `total_steps`
- We have added the integration of the MLflowLogger in #159. This comes with new documentation and notebooks under the `example` folder on how to use it.
Published by belerico about 2 years ago
sheeprl - v0.4.7
v0.4.7 Release Notes
- SheepRL is now on PyPI: every time a release is published, the new version of SheepRL is also published on PyPI (#155)
- Torchmetrics is no longer installed from the GitHub main branch (#155).
- Moviepy is no longer installed from the GitHub main branch (#155).
- box2d-py is not a mandatory dependency anymore; it is possible to install `gymnasium[box2d]` with the `pip install sheeprl[box2d]` command (#156)
- The `moviepy.decorators.use_clip_fps_by_default` function is replaced (in the `./sheeprl/__init__.py` file) with the method in the moviepy main branch (#156).
Published by michele-milesi about 2 years ago
sheeprl - v0.4.6
v0.4.6 Release Notes
- The exploration amount of the Dreamer's player has been moved to the Actor in #150
- All the P2E scripts have been split into `exploration` and `finetuning` in #151
- The hydra version has been pinned to `1.3` in #152
- SheepRL is now published on PyPI in #155
Published by belerico about 2 years ago
sheeprl - v0.4.5post0
v0.4.5post0 Release Notes
- Fixes MineDojo and Dreamer's player in #148
Published by belerico over 2 years ago
sheeprl - v0.4.5
v0.4.5 Release Notes
- Added new how-to explaining how to add a new custom environment in #128
- Added the possibility to completely disable logging metrics and decide what and how to log metrics in every algorithm in #129
- Fixed the model creation of the Dreamer-V3 agent: the bias has been removed from every linear layer followed by a LayerNorm and an activation function
- Added the possibility for the users to specify their own custom configs, possibly inheriting from the already defined sheeprl configs in #132
- Added the support to Lightning 2.1 in #136
- Added the possibility to evaluate every agent given a checkpoint in #139 #141
- Various minor fixes in #125 #133 #134 #135 #137 #140 #143 #144 #145 #146
Published by belerico over 2 years ago
sheeprl - v0.4.4
v0.4.4 Release Notes
- Fixes the activation in the recurrent model in DV1 in #110
- Updated the Diambra wrapper to support the new Diambra package in #111
- Added `dotdict` to speed up accessing the loaded config in #112
- Better naming when hydra creates the output dirs in #114
- Added the `validate_args` flag to decide whether `torch.distributions` must check the arguments passed to the `__init__` function; disable it to have a huge speedup in #116 (see the sketch after this list)
- Updated the Diambra wrapper to support `AsyncVectorEnv` in #119
- Minor fixes in #120
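A hedged illustration of the speedup knob mentioned above, using plain PyTorch rather than sheeprl code: argument validation can be disabled per distribution or globally.

```python
import torch
from torch.distributions import Distribution, Normal

# Per-instance: skip the (costly) argument checks for this distribution only
dist = Normal(torch.zeros(8), torch.ones(8), validate_args=False)

# Global: disable argument validation for every distribution created afterwards
Distribution.set_default_validate_args(False)
```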
Published by belerico over 2 years ago
sheeprl - v0.4.2
v0.4.2 Release Notes
In this release we have:
- refactored the recurrent PPO implementation. In particular:
  - A single LSTM model is used, taking as input the current observation, the previously played action and the previous recurrent state, i.e. `LSTM([o_t, a_t-1], h_t-1)`. The LSTM has an optional pre-MLP and post-MLP: those can be controlled in the relative `algo/ppo_recurrent.yaml` config
  - A feature extractor is used to extract features from the observations, be they vectors or images
- Every PPO algorithm now computes the bootstrapped value, summing it to the current reward, whenever an environment has been truncated (a sketch of both points follows)
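Purely illustrative sketch of the two ideas above (the input concatenation fed to the LSTM, and the bootstrap on truncation); names, shapes and the use of a discount factor here are assumptions, not sheeprl's actual code.

```python
import torch
import torch.nn as nn

obs_dim, act_dim, hidden = 16, 4, 64
lstm = nn.LSTM(input_size=obs_dim + act_dim, hidden_size=hidden)

# LSTM([o_t, a_t-1], h_t-1): concatenate current obs and previously played action
o_t = torch.randn(1, 1, obs_dim)    # (seq_len, batch, obs_dim)
a_tm1 = torch.zeros(1, 1, act_dim)  # previous action (zeros at episode start)
h_tm1 = (torch.zeros(1, 1, hidden), torch.zeros(1, 1, hidden))
out, h_t = lstm(torch.cat([o_t, a_tm1], dim=-1), h_tm1)

# Bootstrap on truncation: add the (assumed discounted) value of the final obs
gamma, reward, truncated = 0.99, 1.0, True
value_of_final_obs = 0.5  # would come from the critic
if truncated:
    reward = reward + gamma * value_of_final_obs
```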
Published by belerico over 2 years ago
sheeprl - v0.4.0
v0.4.0 Release Notes
In this release we have:
- made the whole framework single-entry-point, i.e. now one can run an experiment just with `python sheeprl.py exp=... env=...`, removing the need to prepend `lightning run model ... sheeprl.py` every time. The Fabric-related configs can be found and changed under the `sheeprl/configs/fabric/` folder. (#97)
- unified the `make_env` vs `make_dict_env` methods, so there is no more distinction between the two. We now assume that the environment has an observation space that is a `gymnasium.spaces.Dict`; if it is not, an exception is raised. (#96)
- implemented the `resume_from_checkpoint` for every algorithm. (#95)
- added the Crafter environment. (#103)
- Fixed some environments, in particular Diambra and DMC:
  - Diambra: renamed the wrapper implementation file; the done flag now checks whether the `info["env_done"]` flag is True. (#98)
  - DMC: removed an `env.frame_skip=0` for mujoco envs and removed the action repeat from the DMC wrapper. (#99)
Published by belerico over 2 years ago
sheeprl - v0.3.2
v0.3.2 Release Notes
In this release we have fixed the logging time of every algorithm. In particular:
- The `Time/sps_env_interaction` metric measures the steps-per-second of the environment interaction of the agent, namely the forward pass to obtain the new action given the observation and the execution of the `step` method of the environment. This value is local to rank-0 and takes into consideration the `action_repeat` that one sets through hydra/CLI
- The `Time/sps_train` metric measures the steps-per-second of the train function that runs in a distributed manner, considering all the ranks calling the train function (a back-of-the-envelope sketch follows)
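A hedged, back-of-the-envelope sketch of how such steps-per-second values could be computed; the exact formula is an assumption, not taken from sheeprl's code.

```python
import time

# Hypothetical counters accumulated during the loops on this rank
policy_steps = 10_000   # actions sampled by the policy (rank-0)
action_repeat = 4       # set through hydra/CLI
train_steps = 2_000     # calls to the train function on this rank
world_size = 2          # number of ranks taking part in training

start = time.perf_counter()
# ... interaction / training loops would run here ...
elapsed = time.perf_counter() - start + 1e-8  # guard against division by zero

# Env-interaction SPS (local to rank-0): environment frames per second
sps_env_interaction = policy_steps * action_repeat / elapsed

# Train SPS (distributed): train steps per second across all ranks
sps_train = train_steps * world_size / elapsed
```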
Published by belerico over 2 years ago
sheeprl - v0.3.1
v0.3.1 Release Notes
In this release we have refactored some names inside every algorithm, in particular:
- we have introduced the concept of `policy_step`, which is the number of (distributed) policy steps per environment step, where the environment step does not take into consideration the action repeat, i.e. it is the number of times the policy is called to collect an action given an observation. If one has `n` ranks and `m` environments per rank, then the number of policy steps per environment step is `policy_steps = n * m`
We have also refactored the hydra configs, in particular:
- we have introduced the `metric`, `checkpoint` and `buffer` configs, containing the shared hyperparameters for those objects in every algorithm
- the `metric` config has the `metric.log_every` parameter, which controls the logging frequency. Since it's hard for the `policy_step` variable to be exactly divisible by the `metric.log_every` value, the logging happens as soon as `policy_step - last_log >= cfg.metric.log_every`, where `last_log = policy_step` is updated every time something is logged (see the sketch after this list)
- the `checkpoint` config has the `every` and `resume_from` parameters. The `every` parameter works as the `metric.log_every` one, while `resume_from` specifies the experiment folder, which must contain the `.hydra` folder, to resume the training from. This is currently only supported by the Dreamer algorithms
- `num_envs` and `clip_reward` have been moved to the `env` config
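A minimal sketch of the logging-frequency rule described above; the variable names mirror the release text and are not necessarily the actual code.

```python
log_every = 5_000  # cfg.metric.log_every
last_log = 0

# e.g. the policy step grows by n * m at every iteration (here 16)
for policy_step in range(0, 100_000, 16):
    if policy_step - last_log >= log_every:
        # ... log metrics here ...
        last_log = policy_step
```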
Published by belerico over 2 years ago
sheeprl - v0.3.0
v0.3.0 Release Notes
This new release introduces hydra as the default configuration manager. In particular it fixes #74 and, automatically, #75, since now the `cnn_keys` and `mlp_keys` can be specified separately for both the encoder and the decoder.
The changes are mainly the following:
- Dreamer-V3 initialization follows directly Hafner's implementation (adapted from https://github.com/NM512/dreamerv3-torch/blob/main/tools.py)
- all the `args.py` files and the `HFArgumentParser` have been removed. Configs are now specified under the `sheeprl/configs` folder and hydra is the default configuration manager
- Every environment wrapper is directly instantiated through `hydra.utils.instantiate` inside the `make_env` or `make_dict_env` method: in this way one can easily modify the wrapper, passing whatever parameters are needed to customize the env. Every wrapper must take as input the `id` parameter, which must be specified in the relative config
- Every optimizer is directly instantiated through `hydra.utils.instantiate` and can be modified through the CLI on the experiment run (see the sketch after this list)
- `howto/configs.md` has been added, which explains how the configs are organized inside the repo
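For readers unfamiliar with the pattern, here is a hedged, generic example of building an optimizer from a config with `hydra.utils.instantiate`; the config keys are illustrative, not sheeprl's actual schema.

```python
import torch.nn as nn
from hydra.utils import instantiate
from omegaconf import OmegaConf

# Illustrative config: the `_target_` key tells hydra which class to build
cfg = OmegaConf.create(
    {"optimizer": {"_target_": "torch.optim.Adam", "lr": 3e-4, "eps": 1e-5}}
)

model = nn.Linear(4, 2)
# Extra keyword arguments are merged with the config at call time
optimizer = instantiate(cfg.optimizer, params=model.parameters())
```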
Published by belerico over 2 years ago
sheeprl - v0.2.1
v0.2.1 Release Notes
- Added Dreamer-V3 algorithm from https://arxiv.org/abs/2301.04104
- Added the `RestartOnException` wrapper, which recreates and restarts the environment whenever something bad has happened during the `step` or `reset`. This has been added only to the Dreamer-V3 algorithm (a sketch of the idea follows this list)
- Renamed classes and functions (in particular the `Player` classes for both Dreamer-V1/V2)
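A minimal, hypothetical sketch of a restart-on-exception wrapper; the real sheeprl implementation may differ in details such as logging, retry limits, or how the crashed transition is reported.

```python
import gymnasium as gym


class RestartOnException(gym.Wrapper):
    """Sketch: rebuild the underlying env if step() or reset() raises."""

    def __init__(self, env_fn):
        super().__init__(env_fn())
        self._env_fn = env_fn  # factory used to recreate the env

    def _recreate(self):
        try:
            self.env.close()
        except Exception:
            pass
        self.env = self._env_fn()

    def reset(self, **kwargs):
        try:
            return self.env.reset(**kwargs)
        except Exception:
            self._recreate()
            return self.env.reset(**kwargs)

    def step(self, action):
        try:
            return self.env.step(action)
        except Exception:
            self._recreate()
            # After a crash there is no valid transition: reset and signal truncation
            obs, info = self.env.reset()
            return obs, 0.0, False, True, info
```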
Published by belerico over 2 years ago
sheeprl - v0.2
v0.2 Release notes
- Added DiambraWrapper
- Added Multi-encoder/decoder to all the algorithms except DroQ, SAC and PPO Recurrent
- Added Multi-discrete support to PPO, DreamerV1 and P2E-DV1
- Modified the make_env function to be able to train the agents on environments that return both pixel-like and vector-like observations
- Modified the ReplayBuffer class to handle multiple observations
- Updated howtos
- Fixed #66
- Logger creation is moved to `sheeprl.utils.logger`
- Env creation is moved to `sheeprl.utils.env`
- PPO algo is now a single-folder algorithm (removed the `ppo_pixel` and `ppo_continuous` folders)
- `sac_pixel` has been renamed to `sac_ae`
- Added support for `gymnasium==0.29.0`, `mujoco>=2.3.3` and `dm_control>=1.0.12`
Published by belerico over 2 years ago
sheeprl - v0.1
v0.1 Release notes
Algorithms implemented:
- Dreamer-V1 (https://arxiv.org/abs/1912.01603)
- Dreamer-V2 (https://arxiv.org/abs/2010.02193)
- Plan2Explore Dreamer-V1-based (https://arxiv.org/abs/2005.05960)
- Plan2Explore Dreamer-V2-based (https://arxiv.org/abs/2005.05960)
- DroQ (https://arxiv.org/abs/2110.02034)
- PPO (https://arxiv.org/abs/1707.06347)
- PPO Recurrent (https://arxiv.org/abs/2205.11104)
- SAC (https://arxiv.org/abs/1812.05905)
- SAC-AE (https://arxiv.org/abs/1910.01741)
Published by belerico over 2 years ago