Releases | Open Source Science

sbx-rl - v0.22.0: n-step returns support for off-policy algorithms via the `n_steps` argument

What's Changed

Add n-step returns support via the n_steps argument by @araffin in https://github.com/araffin/sbx/pull/74
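
As a rough illustration of what an n-step return is, the sketch below computes the bootstrapped target from `n` consecutive rewards. It is a plain-Python sketch under stated assumptions, not the SBX implementation: the function name `n_step_target` and its parameters are hypothetical, and only the `n_steps` argument itself comes from the release note.

```python
def n_step_target(rewards, dones, bootstrap_value, gamma=0.99):
    """Hypothetical helper: discounted sum of up to n rewards plus a bootstrap term.

    rewards, dones: sequences of length n (one entry per step).
    bootstrap_value: value estimate for the state reached after the n-th step.
    """
    target = 0.0
    discount = 1.0
    for reward, done in zip(rewards, dones):
        target += discount * reward
        discount *= gamma
        if done:
            # The episode ended inside the window: do not bootstrap past it.
            return target
    # No termination within n steps: add the discounted value estimate.
    return target + discount * bootstrap_value


# With n = 3 and gamma = 0.5: 1 + 0.5 + 0.25 + 0.125 * 10
full_window = n_step_target([1.0, 1.0, 1.0], [False, False, False], 10.0, gamma=0.5)
# Episode terminates at the second step: 1 + 0.5, no bootstrap
truncated = n_step_target([1.0, 1.0, 1.0], [False, True, False], 10.0, gamma=0.5)
```

With `n = 1` this reduces to the usual one-step TD target; larger windows propagate reward information faster at the cost of more off-policy bias.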

Full Changelog: https://github.com/araffin/sbx/compare/v0.21.0...v0.22.0

- Python
Published by araffin 11 months ago

sbx-rl - v0.21.0: KL Adaptive LR for PPO and learning rate schedule for SAC/TQC

What's Changed

KL Adaptive LR for PPO and LR schedule for SAC/TQC by @araffin in https://github.com/araffin/sbx/pull/72
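
A KL-adaptive learning rate typically shrinks the step size when the policy moved too far from the old one and grows it when the update was too timid. The sketch below shows one common form of that rule. The function name, the 2x margin, the 1.5 factor, and the clipping bounds are assumptions chosen for illustration and may not match the values used in SBX.

```python
def adapt_learning_rate(lr, approx_kl, target_kl,
                        margin=2.0, factor=1.5, min_lr=1e-5, max_lr=1e-2):
    """Hypothetical KL-based rule; all default values are illustrative assumptions."""
    if approx_kl > margin * target_kl:
        # Policy changed too much: take smaller steps.
        lr = lr / factor
    elif approx_kl < target_kl / margin:
        # Policy barely changed: take larger steps.
        lr = lr * factor
    # Keep the learning rate within a fixed range.
    return min(max(lr, min_lr), max_lr)


lr_after_large_kl = adapt_learning_rate(3e-4, approx_kl=0.05, target_kl=0.01)
lr_after_small_kl = adapt_learning_rate(3e-4, approx_kl=0.001, target_kl=0.01)
lr_unchanged = adapt_learning_rate(3e-4, approx_kl=0.01, target_kl=0.01)
```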

Full Changelog: https://github.com/araffin/sbx/compare/v0.20.0...v0.21.0

- Python
Published by araffin about 1 year ago

sbx-rl - v0.20.0: Hotfix for PPO with un-normalized env, `net_arch` support for PPO, additional fixes

What's Changed

Update PPO to support net_arch, and additional fixes by @araffin in https://github.com/araffin/sbx/pull/65
Fixed the entropy coefficient being logged incorrectly for SAC and derived algorithms.
Fixed PPO predict() for environments that are not normalized (action spaces with limits != [-1, 1]).
PPO now logs the standard deviation.

Full Changelog: https://github.com/araffin/sbx/compare/v0.19.0...v0.20.0

- Python
Published by araffin over 1 year ago

sbx-rl - v0.19.0: SimBa Policy: Simplicity Bias for Scaling Up Parameters in DRL

What's Changed

Add SimBa Policy: Simplicity Bias for Scaling Up Parameters in DRL by @araffin in https://github.com/araffin/sbx/pull/59
Cleanups and update of the minimum supported version to Python 3.9

Full Changelog: https://github.com/araffin/sbx/compare/v0.18.0...v0.19.0

- Python
Published by araffin over 1 year ago

sbx-rl - SBX v0.18.0: Bug fix for SAC, optimize log of ent coeff to be consistent with SB3

What's Changed

Optimize the log of the entropy coeff instead of the entropy coeff by @jamesheald in https://github.com/araffin/sbx/pull/56
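
To illustrate what optimizing the log of the entropy coefficient means, the sketch below applies one plain gradient step to `log_ent_coef` using an SB3-style objective, `log_ent_coef * (entropy - target_entropy)`. It is a simplified scalar sketch, not the SBX code: the function name, the fixed learning rate, and the use of plain gradient descent instead of an optimizer are assumptions.

```python
import math


def update_log_ent_coef(log_ent_coef, entropy, target_entropy, lr=0.1):
    """Hypothetical single gradient step on the log of the entropy coefficient.

    Loss (assumed, SB3-style): log_ent_coef * (entropy - target_entropy),
    so the gradient w.r.t. log_ent_coef is simply (entropy - target_entropy).
    """
    grad = entropy - target_entropy
    return log_ent_coef - lr * grad


# Policy entropy above the target: the coefficient should decrease.
new_log_ent_coef = update_log_ent_coef(0.0, entropy=2.0, target_entropy=1.0)
# The coefficient itself is recovered with exp(), so it is always positive.
new_ent_coef = math.exp(new_log_ent_coef)
```

Parameterizing through the log keeps the coefficient positive without clipping, and the gradient in this objective does not scale with the coefficient's current magnitude.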

New Contributors

@jamesheald made their first contribution in https://github.com/araffin/sbx/pull/56

Full Changelog: https://github.com/araffin/sbx/compare/v0.17.0...v0.18.0

- Python
Published by araffin over 1 year ago

sbx-rl - SBX v0.17.0: CNN support for DQN

What's Changed

Fix warning and remove DroQ class in favor of SAC config by @araffin in https://github.com/araffin/sbx/pull/47
Add CNN support for DQN by @araffin in https://github.com/araffin/sbx/pull/49

Full Changelog: https://github.com/araffin/sbx/compare/v0.15.0...v0.17.0

- Python
Published by araffin almost 2 years ago

sbx-rl - SBX v0.15.0: Hotfix for offpolicy algorithms, the pseudo random key was not updated

[!NOTE] No performance difference should be expected (see the report in https://github.com/araffin/sbx/pull/46). This bug was introduced in v0.11.0.

What's Changed

Support for setting the target entropy by @jan1854 in https://github.com/araffin/sbx/pull/43
Hotfix - Return the new updated key in function _train by @theovincent in https://github.com/araffin/sbx/pull/46

New Contributors

@theovincent made their first contribution in https://github.com/araffin/sbx/pull/46

Full Changelog: https://github.com/araffin/sbx/compare/v0.13.0...v0.15.0

- Python
Published by araffin about 2 years ago

sbx-rl - SBX v0.13.0: Added CrossQ algorithm and support for custom activations

[!WARNING] Using DroQ class directly is deprecated and will be removed in SBX v0.14.0. Please use SAC/TQC/CrossQ directly instead with the DroQ configuration, see https://github.com/araffin/sbx?tab=readme-ov-file#note-about-droq

To upgrade: pip install sbx-rl --upgrade

CrossQ: https://openreview.net/forum?id=PczQtTsTIX (SAC with batch norm and no target network)

What's Changed

Fix for new tensorflow probability version by @araffin in https://github.com/araffin/sbx/pull/39
Allow to pass custom activation function in policy_kwargs by @paolodelia99 in https://github.com/araffin/sbx/pull/41
Add CrossQ by @araffin, @danielpalen and @jan1854 in https://github.com/araffin/sbx/pull/28

New Contributors

@paolodelia99 made their first contribution in https://github.com/araffin/sbx/pull/41
@danielpalen made their first contribution in https://github.com/araffin/sbx/pull/28

Full Changelog: https://github.com/araffin/sbx/compare/v0.12.0...v0.13.0

- Python
Published by araffin about 2 years ago

sbx-rl - SBX v0.12.0: Added support for MultiDiscrete and MultiBinary action spaces to PPO

What's Changed

Support for MultiDiscrete and MultiBinary action spaces in PPO by @jan1854 in https://github.com/araffin/sbx/pull/30

Full Changelog: https://github.com/araffin/sbx/compare/v0.11.0...v0.12.0

- Python
Published by araffin over 2 years ago

sbx-rl - SBX v0.11.0: Added support for large values for gradient_steps to SAC, TD3, and TQC

What's Changed

Added support for large values for gradient_steps to SAC, TD3, and TQC by @jan1854 in https://github.com/araffin/sbx/pull/21

New Contributors

@jan1854 made their first contribution in https://github.com/araffin/sbx/pull/21

Full Changelog: https://github.com/araffin/sbx/compare/v0.10.0...v0.11.0

- Python
Published by araffin over 2 years ago

sbx-rl - SBX v0.10.0: Fix `train()` signature and update type hints

What's Changed

Fix train signature and update type hints by @araffin in https://github.com/araffin/sbx/pull/24

Full Changelog: https://github.com/araffin/sbx/compare/v0.9.1...v0.10.0

- Python
Published by araffin over 2 years ago

sbx-rl - SBX v0.9.1: Fix replay buffer device at load time

What's Changed

Fix replay buffer device at load time by @araffin in https://github.com/araffin/sbx/pull/20

This issue was introduced with SB3 v2.2.1.

Full Changelog: https://github.com/araffin/sbx/compare/v0.9.0...v0.9.1

- Python
Published by araffin over 2 years ago

sbx-rl - SBX v0.9.0: Add flatten layer

What's Changed

Add flatten layer and update dependencies by @araffin in https://github.com/araffin/sbx/pull/18

Full Changelog: https://github.com/araffin/sbx/compare/v0.8.0...v0.9.0

- Python
Published by araffin over 2 years ago

sbx-rl - SBX v0.8.0: Added DDPG and TD3

What's Changed

Add DDPG and TD3 by @araffin in https://github.com/araffin/sbx/pull/16

Full Changelog: https://github.com/araffin/sbx/compare/v0.7.0...v0.8.0

- Python
Published by araffin almost 3 years ago

sbx-rl - SBX v0.7.0: Gymnasium and HerReplayBuffer support

Also adds a flexible MLP for off-policy algorithms and better type annotations.

- Python
Published by araffin about 3 years ago
