Recent Releases of rlberry
rlberry - v0.7.3
Version 0.7.3
PR #454
- Remove unused libraries
PR #451
- Moving UCBVI to rlberry_scool
PR #438
- Move long tests to rlberry-research
PR #436 #444 #445 #447 #448 #455 #456
- Update the user guide
- Add tests for the user guide examples
- Remove rlberry-research references as much as possible (doc and code)
Published by JulienT01 almost 2 years ago
rlberry - rlberry-v0.7.0
Release of version 0.7.0 of rlberry.
This is the first rlberry release since the major restructuring of rlberry into three repositories (PR #379):
- rlberry (this repo): everything for RL that is not an agent or an environment, e.g. experiment management, parallelization, statistical tools, plotting...
- rlberry-scool: repository for teaching materials, e.g. simplified algorithms for teaching, notebooks and tutorials for learning RL...
- rlberry-research: repository of agents and environments used inside the Inria Scool team
Changes since last version.
PR #397
- Automatic save after `fit()` in `ExperimentManager`
PR #396
- Improve coverage and fix version workflow
PR #385 to #390
- Switch from ReadTheDocs to GitHub Pages
PR #382
- Switch to Poetry
PR #376
- New `plot_writer_data` function that does not depend on seaborn and that can plot smoothed curves and confidence bands if scikit-fda is installed.
Published by JulienT01 about 2 years ago
rlberry - rlberry-v0.6.0
Release of version 0.6.0 of rlberry.
This is the last rlberry release before a major restructuring of rlberry into three repositories:
- rlberry: everything for RL that is not an agent or an environment, e.g. experiment management, parallelization, statistical tools, plotting...
- rlberry-scool: repository for teaching materials, e.g. simplified algorithms for teaching, notebooks and tutorials for learning RL...
- rlberry-research: repository of agents and environments used inside the Inria Scool team
Changes since last version.
PR #276
- Non-adaptive multiple tests for agent comparison.
PR #365
- Fix Sphinx version to <7.
PR #350
- Rename AgentManager to ExperimentManager.
PR #326
- Moved SAC from experimental to torch agents. Tested and benchmarked.
PR #335
- Upgrade from Python 3.9 to Python 3.10
Published by TimotheeMathieu over 2 years ago
rlberry - rlberry-v0.5.0
Release of version 0.5.0 of rlberry.
With this release, rlberry switches to gymnasium!
New in version 0.5.0:
PR #281, #323
- Merge gymnasium branch into main, make gymnasium the default library for environments in rlberry.
Remark: for now, Stable-Baselines3 has no stable release with gymnasium support. To use Stable-Baselines3 with gymnasium, install the main branch from GitHub:
pip install git+https://github.com/DLR-RM/stable-baselines3
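For reference, the API change behind the gymnasium switch can be sketched with a dummy environment (the class below is illustrative, not rlberry code): gymnasium's `reset()` returns `(observation, info)`, and `step()` returns a 5-tuple that separates `terminated` from `truncated`, unlike old gym's 4-tuple with a single `done` flag.

```python
class DummyEnv:
    """Tiny stand-in environment following the gymnasium calling convention."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self, seed=None):
        self.t = 0
        return 0, {}  # (observation, info)

    def step(self, action):
        self.t += 1
        terminated = False                   # task-defined termination
        truncated = self.t >= self.horizon   # time-limit truncation
        return self.t, 1.0, terminated, truncated, {}

env = DummyEnv()
obs, info = env.reset()
done = False
total_reward = 0.0
while not done:
    obs, reward, terminated, truncated, info = env.step(0)
    total_reward += reward
    done = terminated or truncated

print(total_reward)  # 3.0
```

Any agent written against the old gym loop needs exactly these two call-site changes to run under gymnasium.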
Published by TimotheeMathieu over 2 years ago
rlberry - rlberry-v0.4.1
Release of version 0.4.1 of rlberry.
:warning: WARNING :warning: :
Before installing rlberry, please install the fork of gym 0.21: "gym[accept-rom-license] @ git+https://github.com/rlberry-py/gymfix021"
New in 0.4.1
PR #307
- Create a fork of gym 0.21 to work around backward-incompatible setuptools changes.
PR #306
- Add a Q-learning agent in `rlberry.agents.QLAgent` and a SARSA agent in `rlberry.agents.SARSAAgent`.
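As a rough illustration of what such a tabular agent computes, here is the classic Q-learning update (a generic sketch, not rlberry's actual implementation; SARSA differs only in using the action actually taken at the next state instead of the max):

```python
def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a').

    SARSA would use Q[s_next][a_next] (the action actually taken)
    in place of max(Q[s_next]).
    """
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])

# Two states, two actions, zero-initialized table.
Q = [[0.0, 0.0], [0.0, 0.0]]
q_learning_update(Q, s=0, a=1, r=1.0, s_next=1)
print(Q[0][1])  # 0.1, i.e. alpha * r, since Q[s_next] is all zeros
```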
PR #298
- Move old scripts (jax agents, attention networks, old examples...) that we won't maintain from the main branch to an archive branch.
PR #277
- Add and update code to support Atari game environments
Published by JulienT01 over 2 years ago
rlberry - rlberry-v0.4.0
Release of version 0.4.0 of rlberry.
New in 0.4.0
PR #273
- Change the default behavior of `plot_writer_data` so that if seaborn has version >= 0.12.0, a 90% percentile interval is used instead of the standard deviation.
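A 90% percentile interval keeps the band between the 5th and 95th percentiles of the values observed across seeds. A minimal, dependency-free sketch (the helper below is illustrative, not the seaborn/rlberry implementation):

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a list of numbers."""
    xs = sorted(values)
    k = (len(xs) - 1) * q / 100
    lo, hi = int(k), min(int(k) + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)

# One metric's values across 5 seeds at a single timestep.
seed_values = [1.0, 2.0, 3.0, 4.0, 5.0]
band = (percentile(seed_values, 5), percentile(seed_values, 95))
print(band)  # approximately (1.2, 4.8)
```

Unlike a mean +/- sd band, this interval never extends beyond the observed values and is robust to outlier seeds.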
PR #269
- Add `rlberry.envs.PipelineEnv`, a simple way to define a pipeline of environment wrappers.
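The idea of a wrapper pipeline can be sketched by folding a list of wrapper constructors over a base environment (all names below are hypothetical stand-ins, not rlberry's actual classes):

```python
from functools import reduce

class BaseEnv:
    def step(self, a):
        return a  # dummy dynamics: echo the action back

class DoubleReward:
    """Wrapper that doubles whatever the inner env returns."""
    def __init__(self, env):
        self.env = env
    def step(self, a):
        return 2 * self.env.step(a)

class AddOne:
    """Wrapper that adds 1 to whatever the inner env returns."""
    def __init__(self, env):
        self.env = env
    def step(self, a):
        return self.env.step(a) + 1

def make_pipeline(env_ctor, wrappers):
    """Apply wrappers in order: the last wrapper in the list is outermost."""
    return reduce(lambda env, w: w(env), wrappers, env_ctor())

env = make_pipeline(BaseEnv, [DoubleReward, AddOne])
print(env.step(3))  # 7 -> AddOne(DoubleReward(BaseEnv)): 2*3 + 1
```

Listing the pipeline as data (constructors in a list) makes it easy to serialize an experiment's environment configuration.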
PR #262
- PPO can now handle continuous actions.
PR #261, #264
- Implementation of Munchausen DQN in `rlberry.agents.torch.MDQNAgent`.
- Comparison of MDQN with the DQN agent in the long tests.
PR #244, #250, #253
- Compress the pickles used to save the trained agents.
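Compressing pickles needs only the standard library; a minimal sketch of the save/load round trip (the file name and state layout are illustrative, not rlberry's format):

```python
import gzip
import os
import pickle
import tempfile

# Stand-in for a trained agent's state.
agent_state = {"q_table": [[0.0] * 4 for _ in range(100)], "episodes": 500}

path = os.path.join(tempfile.gettempdir(), "agent.pickle.gz")

# Save: pickle, then gzip-compress in one pass.
with gzip.open(path, "wb") as f:
    pickle.dump(agent_state, f)

# Load: gzip transparently decompresses before unpickling.
with gzip.open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == agent_state)  # True
```

Because gzip streams, the uncompressed pickle never needs to exist on disk, which matters for large trained agents.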
PR #235
- Implementation of rlberry.envs.SpringCartPole environment, an RL environment featuring two cartpoles linked by a spring.
PR #226, #227
- Improve logging: the logging level can now be changed with `rlberry.utils.logging.set_level()`.
- Introduce smoothing in curves produced by `plot_writer_data` when only one seed is used.
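Smoothing a single-seed curve typically means replacing each point with an average over a trailing window; a generic sketch (not rlberry's exact smoother):

```python
def moving_average(xs, window=3):
    """Smooth a curve by averaging over a trailing window (shorter at the start)."""
    out = []
    for i in range(len(xs)):
        lo = max(0, i - window + 1)
        out.append(sum(xs[lo:i + 1]) / (i + 1 - lo))
    return out

# A noisy single-seed reward curve.
episode_rewards = [0.0, 1.0, 0.0, 1.0, 1.0]
smoothed = moving_average(episode_rewards)
print(smoothed[0], smoothed[1])  # 0.0 0.5
```

With several seeds the spread across seeds already conveys variability, which is why smoothing matters most in the single-seed case.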
PR #223
- Moved PPO from experimental to torch agents. Tested and benchmarked.
Published by TimotheeMathieu about 3 years ago
rlberry - rlberry-v0.3.0
Release of version 0.3.0 of rlberry.
New in 0.3.0
PR #206
- Creation of a Deep RL tutorial, in the user guide.
PR #132
- New tracker class `rlberry.agents.bandit.tools.BanditTracker` to track statistics to be used in bandit algorithms.
PR #191
- Possibility to generate a profile with `rlberry.agents.manager.AgentManager`.
PR #148, #161, #180
- Misc improvements on A2C.
- New StableBaselines3 wrapper `rlberry.agents.stable_baselines.StableBaselinesAgent` to import StableBaselines3 agents.
PR #119
- Improve documentation for `agents.torch.utils`.
- New replay buffer `rlberry.agents.utils.replay.ReplayBuffer`, aiming to replace the code in `utils/memories.py`.
- New DQN implementation, aiming to fix reproducibility and compatibility issues.
- Implement Q(lambda) in the DQN agent.
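The core of a FIFO replay buffer fits in a few lines; a minimal sketch (illustrative, not the `ReplayBuffer` class itself):

```python
import random
from collections import deque

class SimpleReplayBuffer:
    """Minimal FIFO replay buffer: bounded storage plus uniform sampling."""

    def __init__(self, capacity, seed=0):
        self.memory = deque(maxlen=capacity)  # oldest entries evicted first
        self.rng = random.Random(seed)        # seeded for reproducibility

    def push(self, transition):
        self.memory.append(transition)

    def sample(self, batch_size):
        return self.rng.sample(list(self.memory), batch_size)

buf = SimpleReplayBuffer(capacity=2)
for transition in [("s0", 0, 0.0), ("s1", 1, 1.0), ("s2", 0, 0.5)]:
    buf.push(transition)

print(len(buf.memory))  # 2 -- the capacity bound evicted ("s0", 0, 0.0)
batch = buf.sample(1)
```

Seeding the sampler is what ties replay to reproducibility: two runs with the same seed draw the same minibatches.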
Feb 22, 2022 (PR #126)
- Set up `rlberry.__version__` (currently 0.3.0dev0).
- Record the rlberry version in an `AgentManager` attribute.
- Override the `__eq__` method of the `AgentManager` class to define equality of `AgentManager`s.
Feb 14-15, 2022 (PR #97, #118)
- (feat) Add basic bandit environments and agents. See `rlberry.agents.bandits.IndexAgent` and `rlberry.envs.bandits.Bandit`.
- Thompson Sampling bandit algorithm with Gaussian or Beta prior.
- Base class for bandit algorithms with custom save & load functions (called `rlberry.agents.bandits.BanditWithSimplePolicy`).
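Thompson Sampling with a Beta prior, as used for Bernoulli bandits, can be sketched with the standard library alone (a generic sketch, not rlberry's `IndexAgent`; arm counts and probabilities below are illustrative):

```python
import random

def thompson_pick(successes, failures, rng):
    """Sample each arm's Beta posterior and pick the arm with the largest draw."""
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)

rng = random.Random(0)
true_probs = [0.2, 0.8]   # hidden Bernoulli success probabilities
successes = [0, 0]
failures = [0, 0]
for _ in range(500):
    arm = thompson_pick(successes, failures, rng)
    if rng.random() < true_probs[arm]:
        successes[arm] += 1
    else:
        failures[arm] += 1

pulls = [s + f for s, f in zip(successes, failures)]
# After enough rounds, the better arm (index 1) receives most of the pulls.
print(pulls[1] > pulls[0])
```

A Gaussian prior follows the same pattern with `rng.gauss` on a normal posterior instead of `rng.betavariate`.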
Feb 11, 2022 (#83, #95)
- (fix) Fixed bug in `FiniteMDP.sample()`: the terminal state was being checked with `self.state` instead of the given `state`.
- (feat) Option to use 'fork' or 'spawn' in `rlberry.manager.AgentManager`.
- (feat) `AgentManager` `output_dir` now has a timestamp and a short ID by default.
- (feat) Gridworld can be constructed from a string layout.
- (feat) `max_workers` argument for `rlberry.manager.AgentManager` to control the maximum number of processes/threads created by the `fit` method.
Feb 04, 2022
- Add `rlberry.manager.read_writer_data` to load an agent's writer data from pickle files and make it simpler to customize in `rlberry.manager.plot_writer_data`.
- Fix bug: DQN should take a tuple as environment.
- Add a quickstart tutorial in the docs (`quick_start`).
- Add the RLSVI algorithm (tabular): `rlberry.agents.RLSVIAgent`.
- Add the Posterior Sampling for Reinforcement Learning (PSRL) agent for tabular MDPs: `rlberry.agents.PSRLAgent`.
- Add a page in the docs to help contributors (`contributing`).
Published by TimotheeMathieu almost 4 years ago
rlberry - rlberry-v0.2.1
New in v0.2
Improving interface and tools for parallel execution (#50)
- `AgentStats` renamed to `AgentManager`.
- `AgentManager` can handle agents that cannot be pickled.
- The `Agent` interface requires an `eval()` method instead of `policy()` to handle more general agents (e.g. reward-free, POMDPs, etc.).
- Multi-processing and multi-threading are now done with `ProcessPoolExecutor` and `ThreadPoolExecutor` (allowing nested processes, for example). Processes are created with `spawn` (jax does not work with `fork`, see #51).
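The executor-based pattern looks like this with the standard library (shown with `ThreadPoolExecutor`; `ProcessPoolExecutor` exposes the same interface, and the workload below is an illustrative stand-in for fitting an agent):

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate(seed):
    """Stand-in for fitting/evaluating one agent instance."""
    return seed * seed

# ProcessPoolExecutor shares this interface; a 'spawn' start method can be
# requested with ProcessPoolExecutor(mp_context=multiprocessing.get_context("spawn")).
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(evaluate, range(5)))

print(results)  # [0, 1, 4, 9, 16]
```

`executor.map` preserves input order regardless of which worker finishes first, which keeps per-seed results aligned.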
New experimental features (see #51, #62)
- JAX implementation of DQN and replay buffer using reverb.
- `rlberry.network`: server and client interfaces to exchange messages via sockets.
- `RemoteAgentManager` to train agents on a remote server and gather the results locally (using `rlberry.network`).
Logging and rendering:
- Data logging with a new `DefaultWriter`, and improved evaluation and plot methods in `rlberry.manager.evaluation`.
- Fix rendering bug with OpenGL (bf606b44aaba1b918daf3dcc02be96a8ef5436b4).
Bug fixes.
New in v0.2.1 (#65)
Features:
- `Agent` and `AgentManager` both have a `unique_id` attribute (useful for creating unique output files/directories).
- `DefaultWriter` is now initialized in the base class `Agent` and (optionally) wraps a tensorboard `SummaryWriter`.
- `AgentManager` has an option `enable_tensorboard` that activates tensorboard logging in each of its `Agent`s (with their `writer` attribute). The `log_dir`s of tensorboard are automatically assigned by `AgentManager`.
- `RemoteAgentManager` receives tensorboard data created in the server when the method `get_writer_data()` is called. This is done by a zip file transfer with `rlberry.network`.
- `BaseWrapper` and `gym_make` now have an option `wrap_spaces`. If set to `True`, this option converts `gym.spaces` to `rlberry.spaces`, which provides classes with better seeding (using numpy's `default_rng` instead of `RandomState`).
- `AgentManager`: new method `get_agent_instances()` that returns trained instances.
- `plot_writer_data`: possibility to set `xtag` (tag used for the x-axis).
Bug fixes:
- Fixed agent initialization bug in `AgentHandler` (`eval_env` missing in kwargs for `agent_class`).
Published by omardrwch over 4 years ago
rlberry - rlberry-v0.2
Improving interface and tools for parallel execution (#50)
- `AgentStats` renamed to `AgentManager`.
- `AgentManager` can handle agents that cannot be pickled.
- The `Agent` interface requires an `eval()` method instead of `policy()` to handle more general agents (e.g. reward-free, POMDPs, etc.).
- Multi-processing and multi-threading are now done with `ProcessPoolExecutor` and `ThreadPoolExecutor` (allowing nested processes, for example). Processes are created with `spawn` (jax does not work with `fork`, see #51).
New experimental features (see #51, #62)
- JAX implementation of DQN and replay buffer using reverb.
- `rlberry.network`: server and client interfaces to exchange messages via sockets.
- `RemoteAgentManager` to train agents on a remote server and gather the results locally (using `rlberry.network`).
Logging and rendering:
- Data logging with a new `DefaultWriter`, and improved evaluation and plot methods in `rlberry.manager.evaluation`.
- Fix rendering bug with OpenGL (bf606b44aaba1b918daf3dcc02be96a8ef5436b4).
Bug fixes.
Published by omardrwch over 4 years ago