learning-safety-in-mpc-based-rl
Safety-aware MPC-based RL framework
https://github.com/filippoairaldi/learning-safety-in-mpc-based-rl
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 2 DOI reference(s) in README
- ✓ Academic publication links: links to arxiv.org, sciencedirect.com
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (15.8%) to scientific vocabulary
Keywords
Repository
Safety-aware MPC-based RL framework
Basic Info
- Host: GitHub
- Owner: FilippoAiraldi
- License: gpl-3.0
- Language: Python
- Default Branch: release
- Homepage: https://www.sciencedirect.com/science/article/pii/S2405896323009308
- Size: 1.65 GB
Statistics
- Stars: 65
- Watchers: 1
- Forks: 9
- Open Issues: 0
- Releases: 3
Topics
Metadata Files
README.md
Learning safety in model-based Reinforcement Learning using MPC and Gaussian Processes

This repository contains the source code used to produce the results obtained in our 2023 IFAC WC submission (extended version here).
In this work, we propose a straightforward yet effective algorithm for enabling safety in the context of Safe Reinforcement Learning (RL) using Model Predictive Control (MPC) as function approximation. The unknown constraints encoding safety are learnt from observed MPC trajectories via Gaussian Process (GP) regression, and are then enforced onto the RL agent to guarantee that the MPC controller is safe with high probability.
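As a rough, hypothetical sketch of the mechanism described above (not this repository's actual implementation), the idea can be illustrated with scikit-learn, which appears among the listed dependencies: fit a GP to constraint values observed along trajectories, then enforce a pessimistic (mean plus scaled standard deviation) bound so that the constraint holds with high probability. All names, the placeholder constraint, and the back-off factor `beta` below are illustrative assumptions.

```python
# Hypothetical illustration of GP-based constraint learning (not the code in
# this repository): observe values of an unknown constraint h(x) <= 0 along
# trajectories, fit a GP, and use an upper confidence bound as a safe surrogate.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X_obs = rng.uniform(-1.0, 1.0, size=(50, 2))         # placeholder state samples
h_obs = X_obs[:, 0] ** 2 + 0.5 * X_obs[:, 1] - 0.8   # placeholder constraint values

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X_obs, h_obs)

def constraint_upper_bound(x, beta=2.0):
    """Pessimistic estimate of h(x); requiring it to be <= 0 enforces the
    (unknown) constraint with high probability, with confidence set by beta."""
    mean, std = gp.predict(np.atleast_2d(x), return_std=True)
    return mean + beta * std

# a candidate state is treated as safe only if the pessimistic bound is <= 0
print(constraint_upper_bound([0.1, -0.2]) <= 0.0)
```

In the paper's setting such a bound is imposed as an additional constraint inside the MPC problem, rather than checked pointwise as in the toy snippet above.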
If you find the paper or this repository helpful in your publications, please consider citing it.
```bibtex
@article{airaldi20235759,
  title = {Learning safety in model-based Reinforcement Learning using MPC and Gaussian Processes},
  journal = {IFAC-PapersOnLine},
  volume = {56},
  number = {2},
  pages = {5759-5764},
  year = {2023},
  note = {22nd IFAC World Congress},
  doi = {https://doi.org/10.1016/j.ifacol.2023.10.563},
  author = {Filippo Airaldi and Bart De Schutter and Azita Dabiri},
}
```
Installation
The code was created with Python 3.9.5. To access it, clone the repository

```bash
git clone https://github.com/FilippoAiraldi/learning-safety-in-mpc-based-rl.git
cd learning-safety-in-mpc-based-rl
```

and then install the required packages by, e.g., running

```bash
pip install -r requirements.txt
```
Structure
The repository code is structured in the following way:
- `agents` contains the RL algorithms used within the paper:
  - the Perfect-Knowledge agent, a non-learning agent with exact information on the quadrotor drone dynamics
  - the LSTD Q-learning agent, in both its safe and unsafe variants, i.e., with and without our proposed algorithm, respectively
- `envs` contains the quadrotor environment (in OpenAI's `gym` style) used in the numerical experiment
- `mpc` contains the implementation (based on CasADi) of the MPC optimization scheme
- `resouces` contains media and other miscellaneous resources
- `sim` contains pickle-serialized simulation results of the different agents
- `util` contains utility classes and functions for, e.g., plotting, I/O, exceptions, etc.
- `train.py` launches simulations for the different agents
- `visualization.py` visualizes the simulation results
Training
Training simulations can easily be launched from the command line. The default arguments are already set to yield the results found in the paper. To reproduce the simulation results, run the following command with one of the 3 available agents:
```bash
python train.py (--pk | --lstdq | --safe_lstqd)
```
Note that only one agent can be simulated at a time. Results will be saved under the filename ${runname}.pkl.
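Since `joblib` is among the listed dependencies and results are serialized to `.pkl` files, a saved run can presumably be inspected along the following lines; this is only a sketch that assumes the file is a joblib-serialized Python object, as the exact layout of the stored data is defined by `train.py` and not documented here.

```python
# Hypothetical sketch for peeking into a saved run; assumes the .pkl file was
# written with joblib (listed in the dependencies). The actual structure of the
# stored object is determined by train.py in this repository.
import joblib

results = joblib.load("my_run.pkl")    # replace with your ${runname}.pkl
print(type(results))                   # discover the top-level container
if isinstance(results, dict):
    print(list(results.keys()))        # e.g. per-agent data, trajectories, ...
```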
Visualization
To visualize simulation results, simply run
```bash
python visualization.py ${runname1}.pkl ... ${runnameN}.pkl
```
You can additionally pass --papermode, which will cause the paper figures to be created (in this case, the simulation results filepaths are hardcoded).
License
The repository is provided under the GNU General Public License. See the LICENSE file included with this repository.
Author
Filippo Airaldi, PhD Candidate [f.airaldi@tudelft.nl | filippoairaldi@gmail.com]
Delft Center for Systems and Control in Delft University of Technology
This research is part of a project that has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (Grant agreement No. 101018826 - CLariNet).
Copyright (c) 2023 Filippo Airaldi.
Copyright notice: Technische Universiteit Delft hereby disclaims all copyright interest in the program learning-safety-in-mpc-based-rl (Learning safety in model-based Reinforcement Learning using MPC and Gaussian Processes) written by the Author(s). Prof. Dr. Ir. Fred van Keulen, Dean of 3mE.
Owner
- Name: Filippo Airaldi
- Login: FilippoAiraldi
- Kind: user
- Location: Netherlands
- Company: Delft University of Technology
- Website: https://www.tudelft.nl/en/staff/f.airaldi/?cHash=179a303f7470cc84be9db32be39efb78
- Repositories: 6
- Profile: https://github.com/FilippoAiraldi
PhD researcher at TU Delft.
GitHub Events
Total
- Watch event: 20
- Fork event: 2
Last Year
- Watch event: 20
- Fork event: 2
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 0
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: about 1 year
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 1.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 1
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- dependabot[bot] (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- casadi ==3.5.5
- gym ==0.26.0
- joblib ==1.1.0
- matplotlib ==3.5.2
- numpy ==1.23.1
- scikit-learn ==1.1.2
- scipy ==1.8.1
- tqdm ==4.64.0