dynamic-shielding

An implementation of dynamic shielding

https://github.com/eratommsd/dynamic-shielding

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.4%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: ERATOMMSD
  • License: mit
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 2.55 MB
Statistics
  • Stars: 5
  • Watchers: 4
  • Forks: 1
  • Open Issues: 0
  • Releases: 1
Created over 3 years ago · Last pushed over 3 years ago

README.md

Dynamic Shielding

This is the source code repository for an implementation of dynamic shielding for reinforcement learning.

Setup

To ensure reproducibility of the results, we suggest using two separate virtual environments: one for the benchmark taken from safe-rl-shielding and one for the other benchmarks.

Virtual environment

In this repository, there are two "requirements" files.

  • requirements.txt: original requirements of the safe-rl-shielding benchmarks in ./envs/ + pyeda
  • requirements-sb.txt: requirements for stable-baselines benchmarks in ./python/benchmarks/

All benchmarks inside ./python/benchmarks/ use requirements-sb.txt except for selfdrivingcar, which requires some libraries that are used in the original benchmarks.

Since some of the benchmarks require an old version of OpenCV, the latest Python does not work. We can set up a suitable Python environment using pyenv-virtualenv and install the required packages as follows.

    pyenv virtualenv 3.6.10 safe-rl-shielding
    pyenv local safe-rl-shielding
    pyenv exec pip install -r requirements.txt

As for the stable-baselines benchmarks, Python 3.6.8+ is required.

    pyenv virtualenv 3.6.8 shielded-automata-learning
    pyenv local shielded-automata-learning
    pyenv exec pip install -r requirements-sb.txt

We note that the use of pyenv itself is optional; you can use venv to create separate environments instead.
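For the stable-baselines environment, the stated lower bound on the Python version can be checked from inside the interpreter before installing anything. The following is a minimal sketch: the helper `meets_minimum` and the constant `MIN_SB_PYTHON` are hypothetical names that are not part of this repository, and only the lower bound (3.6.8) is checked.

```python
import sys

# Minimum interpreter version stated above for requirements-sb.txt.
# (Hypothetical constant, introduced only for this sketch.)
MIN_SB_PYTHON = (3, 6, 8)


def meets_minimum(version=None, minimum=MIN_SB_PYTHON):
    """Return True if `version` (default: the running interpreter) is at least `minimum`."""
    if version is None:
        version = sys.version_info[:3]
    # Tuples compare element by element, so (3, 6, 10) >= (3, 6, 8) holds.
    return tuple(version) >= tuple(minimum)


# Example: report whether the active interpreter satisfies the lower bound.
# print(meets_minimum())
```

Running the check inside each activated environment is a quick way to confirm that pyenv or venv selected the intended interpreter.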

Usage

  1. Compile the Java part of our code with cd java && mvn package
  2. Set up the environment, for example, with python3.6 -m venv .venv
  3. Activate the environment, for example, with . .venv/bin/activate
  4. Update pip with pip install --upgrade pip
  5. Install the dependencies with pip install -r requirements-sb.txt
    • You may need some extra care if you want to use GPUs
  6. Run a script to train a controller, for example, cd python/benchmarks/grid_world/ && python ./scripts/run.py
    • You can disable GPUs. See this Stack Overflow post for details.
    • Because training is randomized, you should repeat the experiment sufficiently many times.
  7. You can view the results using TensorBoard
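Step 6 and its two remarks (disabling GPUs, repeating the experiment) can be combined in a small driver. This is only a sketch under stated assumptions: `cpu_only_env` and `run_repeatedly` are hypothetical helpers, not part of this repository. It assumes that `scripts/run.py` needs no command-line arguments and that repeated runs do not overwrite each other's logs, neither of which is confirmed here. GPUs are hidden by setting the `CUDA_VISIBLE_DEVICES` environment variable to an empty value for the child process.

```python
import os
import subprocess
import sys


def cpu_only_env(base_env=None):
    """Copy the environment and hide all CUDA devices from child processes."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ""  # empty value: no GPU is visible to CUDA
    return env


def run_repeatedly(n_runs, script="./scripts/run.py",
                   cwd="python/benchmarks/grid_world", runner=subprocess.run):
    """Run the training script `n_runs` times and return the exit codes.

    Assumption: the script takes no required arguments.
    """
    exit_codes = []
    for _ in range(n_runs):
        # sys.executable is the interpreter of the currently active environment.
        result = runner([sys.executable, script], cwd=cwd, env=cpu_only_env())
        exit_codes.append(result.returncode)
    return exit_codes


# Example (from the repository root, inside the activated environment);
# the number of runs here is illustrative only:
# run_repeatedly(n_runs=5)
```

The `runner` parameter exists only so that the loop can be exercised without launching real training runs.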

Note

How to install Spot for pyenv virtualenv or venv

By default, the Python library for Spot is installed under /usr/local/. In order to use Spot in a pyenv virtualenv, the Python library must be installed under $HOME/.pyenv/versions/shielded-learning/. Modifying Spot's configure script as follows installs the Python library in the appropriate directory.

    sed -i 's:PYTHON_PREFIX=.*:PYTHON_PREFIX="$HOME/.pyenv/versions/shielded-learning/":;s:PYTHON_EXEC_PREFIX=.*:PYTHON_EXEC_PREFIX="$HOME/.pyenv/versions/shielded-learning/":;' configure &&
      ./configure --prefix ~/.pyenv/versions/shielded-learning/

If you use venv, the command should be as follows.

    sed -i 's:PYTHON_PREFIX=.*:PYTHON_PREFIX="$HOME/dynamic-shielding/venv/":;s:PYTHON_EXEC_PREFIX=.*:PYTHON_EXEC_PREFIX="$HOME/dynamic-shielding/venv/":;' configure &&
      ./configure --prefix ~/dynamic-shielding/venv/
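Once Spot has been built and installed with either prefix, one way to confirm that the active environment can see the bindings is to probe for the module without importing it. This is a minimal sketch: `module_available` is a hypothetical helper, and it assumes the Python bindings are exposed under the module name `spot`.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module called `name` can be found on sys.path."""
    return importlib.util.find_spec(name) is not None


# Inside the activated environment, after installing Spot:
# module_available("spot")
```

A result of False suggests that the library was installed under a prefix the environment does not search, such as the default /usr/local/.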

Alternatively, the Spot location can be added to the virtual environment by adding the following line to ./venv/lib/python3.6/site-packages/distutils-precedence.pth.

    import sys; sys.path.append('/usr/local/lib/python3.6/site-packages/');
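The line above works because Python's `site` module executes any line in a .pth file that begins with `import`. The sketch below reproduces this mechanism in a scratch directory so that the real environment stays untouched; `demo_pth` and the file name `demo.pth` are hypothetical and serve only as illustration.

```python
import os
import site
import sys
import tempfile


def demo_pth(extra_path):
    """Write a one-line .pth file into a scratch directory and let `site` process it."""
    with tempfile.TemporaryDirectory() as scratch:
        pth_file = os.path.join(scratch, "demo.pth")
        with open(pth_file, "w") as f:
            # Lines starting with "import" are executed; other lines are treated as paths.
            f.write("import sys; sys.path.append({!r})\n".format(extra_path))
        # Process the directory the way site-packages is processed at startup.
        site.addsitedir(scratch)
        # addsitedir also puts the scratch directory itself on sys.path; drop it again.
        if scratch in sys.path:
            sys.path.remove(scratch)
    return extra_path in sys.path


# Example: the path from the line above ends up on sys.path.
# demo_pth('/usr/local/lib/python3.6/site-packages/')
```

The appended directory does not have to exist for the entry to be added, so a mistyped path fails silently; checking that `import spot` succeeds afterwards is the more reliable test.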

Contributors (to the source code)

Citation

If you want to cite our paper, please use the following .bib file.

    @inproceedings{WCPKTH22,
      author    = {Masaki Waga and Ezequiel Castellano and Sasinee Pruekprasert and Stefan Klikovits and Toru Takisaka and Ichiro Hasuo},
      editor    = {Ahmed Bouajjani and Luk{\'{a}}s Hol{\'{\i}}k and Zhilin Wu},
      title     = {Dynamic Shielding for Reinforcement Learning in Black-Box Environments},
      booktitle = {Automated Technology for Verification and Analysis - 20th International Symposium, {ATVA} 2022, Virtual Event, October 25-28, 2022, Proceedings},
      series    = {Lecture Notes in Computer Science},
      volume    = {13505},
      pages     = {25--41},
      publisher = {Springer},
      year      = {2022},
      url       = {https://doi.org/10.1007/978-3-031-19992-9\_2},
      doi       = {10.1007/978-3-031-19992-9\_2}
    }

Acknowledgments

The source code under envs/ is originally from https://github.com/safe-rl/safe-rl-shielding/, which is distributed under the MIT license. Some of the source code under java/ is originally from https://github.com/mtf90/learnlib-py4j-example, which is distributed under the Apache-2.0 license.

Reference

  • Dynamic Shielding for Reinforcement Learning in Black-Box Environments. Masaki Waga, Ezequiel Castellano, Sasinee Pruekprasert, Stefan Klikovits, Toru Takisaka, and Ichiro Hasuo

Owner

  • Name: ERATO MMSD
  • Login: ERATOMMSD
  • Kind: organization
  • Location: Tokyo, Japan

ERATO Metamathematics for Systems Design Project

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Waga"
  given-names: "Masaki"
  orcid: "https://orcid.org/0000-0001-9360-7490"
- family-names: "Castellano"
  given-names: "Ezequiel"
  orcid: "https://orcid.org/0000-0002-9604-9997"
- family-names: "Pruekprasert"
  given-names: "Sasinee"
  orcid: "https://orcid.org/0000-0002-5929-9014"
- family-names: "Klikovits"
  given-names: "Stefan"
  orcid: "https://orcid.org/0000-0003-4212-7029"
title: "Dynamic Shielding"
version: ATVA2022
doi: 10.5281/zenodo.6906673
date-released: 2022-07-26
url: "https://github.com/ERATOMMSD/dynamic-shielding"
preferred-citation:
  type: conference-paper
  authors:
  - family-names: "Waga"
    given-names: "Masaki"
    orcid: "https://orcid.org/0000-0001-9360-7490"
  - family-names: "Castellano"
    given-names: "Ezequiel"
    orcid: "https://orcid.org/0000-0002-9604-9997"
  - family-names: "Pruekprasert"
    given-names: "Sasinee"
    orcid: "https://orcid.org/0000-0002-5929-9014"
  - family-names: "Klikovits"
    given-names: "Stefan"
    orcid: "https://orcid.org/0000-0003-4212-7029"
  - family-names: "Takisaka"
    given-names: "Toru"
    orcid: "https://orcid.org/0000-0002-5046-7480"
  - family-names: "Hasuo"
    given-names: "Ichiro"
    orcid: "https://orcid.org/0000-0002-8300-4650"
  doi: "10.1007/978-3-031-19992-9_2"
  booktitle: "Automated Technology for Verification and Analysis - 20th International Symposium, {ATVA} 2022, Virtual Event, October 25-28, 2022, Proceedings"
  start: 25 # First page number
  end: 41 # Last page number
  publisher: "Springer"
  title: "Dynamic Shielding for Reinforcement Learning in Black-Box Environments"
  series: "Lecture Notes in Computer Science"
  volume: 13505
  year: 2022

Dependencies

java/pom.xml maven
  • de.learnlib:learnlib-parent 0.14.0 import
  • org.projectlombok:lombok 1.18.20 provided
  • ch.qos.logback:logback-classic 1.2.6
  • de.learnlib.distribution:learnlib-distribution
  • net.sf.py4j:py4j 0.10.8.1
  • com.pholser:junit-quickcheck-core 0.9.5 test
  • com.pholser:junit-quickcheck-generators 0.9.5 test
  • org.assertj:assertj-core 3.21.0 test
  • org.junit.jupiter:junit-jupiter RELEASE test
  • org.junit.jupiter:junit-jupiter-api 5.8.1 test
  • org.junit.jupiter:junit-jupiter-engine 5.8.1 test
  • org.junit.platform:junit-platform-launcher 1.8.1 test
  • org.junit.vintage:junit-vintage-engine 5.8.1 test
envs/self_driving_car/setup.py pypi
  • keras >=1.0.7
python/benchmarks/grid_world/requirements.txt pypi
  • Box2D-kengz ==2.3.3
  • Django ==1.11.4
  • Jinja2 ==2.11.2
  • Keras ==2.0.0
  • Keras-Applications ==1.0.8
  • Keras-Preprocessing ==1.1.2
  • Markdown ==2.6.9
  • MarkupSafe ==1.1.1
  • Pillow ==4.2.1
  • PyBrain3 ==3.0.4
  • PyOpenGL ==3.1.0
  • PyYAML ==3.12
  • Pygments ==2.7.2
  • QtPy ==1.9.0
  • Send2Trash ==1.5.0
  • Theano ==0.9.0
  • Werkzeug ==0.12.2
  • absl-py ==0.11.0
  • appnope ==0.1.0
  • argon2-cffi ==20.1.0
  • astor ==0.8.1
  • astunparse ==1.6.3
  • async-generator ==1.10
  • atari-py ==0.1.1
  • attrs ==20.3.0
  • backcall ==0.2.0
  • baselines ==0.1.6
  • bleach ==1.5.0
  • cachetools ==4.1.1
  • certifi ==2017.7.27.1
  • cffi ==1.14.3
  • chardet ==3.0.4
  • click ==7.1.2
  • cloudpickle ==1.6.0
  • decorator ==4.4.2
  • defusedxml ==0.6.0
  • dill ==0.3.3
  • entrypoints ==0.3
  • future ==0.18.2
  • gast ==0.3.3
  • google-auth ==1.23.0
  • google-auth-oauthlib ==0.4.2
  • google-pasta ==0.2.0
  • grpcio ==1.33.2
  • gym ==0.15.7
  • h5py ==2.10.0
  • html5lib ==0.9999999
  • idna ==2.6
  • image ==1.5.13
  • imageio ==2.2.0
  • importlib-metadata ==2.0.0
  • ipykernel ==5.3.4
  • ipython ==7.16.1
  • ipython-genutils ==0.2.0
  • ipywidgets ==7.5.1
  • jedi ==0.17.2
  • joblib ==0.17.0
  • jsonschema ==3.2.0
  • jupyter ==1.0.0
  • jupyter-client ==6.1.7
  • jupyter-console ==6.2.0
  • jupyter-core ==4.7.0
  • jupyterlab-pygments ==0.1.2
  • mistune ==0.8.4
  • mpi4py ==3.0.3
  • mujoco-py ==0.5.7
  • mypy ==0.782
  • mypy-extensions ==0.4.3
  • nbclient ==0.5.1
  • nbconvert ==6.0.7
  • nbformat ==5.0.8
  • nest-asyncio ==1.4.3
  • notebook ==6.1.5
  • numpy ==1.19.4
  • oauthlib ==3.1.0
  • olefile ==0.44
  • opencv-python ==3.3.0.9
  • opt-einsum ==3.3.0
  • pandocfilters ==1.4.3
  • parso ==0.7.1
  • pexpect ==4.8.0
  • pickleshare ==0.7.5
  • progressbar2 ==3.53.1
  • prometheus-client ==0.9.0
  • prompt-toolkit ==3.0.8
  • protobuf ==3.14.0
  • ptyprocess ==0.6.0
  • py4j ==0.10.8.1
  • pyasn1 ==0.4.8
  • pyasn1-modules ==0.2.8
  • pycparser ==2.20
  • pyeda ==0.28.0
  • pygame ==1.9.3
  • pyglet ==1.5.0
  • pyrsistent ==0.17.3
  • python-dateutil ==2.8.1
  • python-utils ==2.4.0
  • pytz ==2017.2
  • pyzmq ==20.0.0
  • qtconsole ==4.7.7
  • requests ==2.18.4
  • requests-oauthlib ==1.3.0
  • rsa ==4.6
  • scipy ==1.5.4
  • six ==1.15.0
  • tensorboard ==1.14.0
  • tensorboard-plugin-wit ==1.7.0
  • tensorflow ==1.14.0
  • tensorflow-estimator ==1.14.0
  • tensorflow-tensorboard ==0.1.5
  • termcolor ==1.1.0
  • terminado ==0.9.1
  • testpath ==0.4.4
  • tornado ==6.1
  • tqdm ==4.52.0
  • traitlets ==4.3.3
  • typed-ast ==1.4.1
  • typing-extensions ==3.7.4.3
  • urllib3 ==1.22
  • wcwidth ==0.2.5
  • widgetsnbextension ==3.5.1
  • wrapt ==1.12.1
  • zipp ==3.4.0
  • zmq ==0.0.0
python/requirements.txt pypi
  • pyeda *
requirements-sb.txt pypi
  • EasyProcess ==0.3
  • Markdown ==3.3.4
  • Pillow ==8.3.2
  • PyVirtualDisplay ==2.2
  • Werkzeug ==2.0.2
  • absl-py ==0.14.1
  • atari-py ==0.2.6
  • box2d-py ==2.3.8
  • cachetools ==4.2.4
  • certifi ==2021.5.30
  • charset-normalizer ==2.0.6
  • cloudpickle ==1.6.0
  • cycler ==0.10.0
  • dataclasses ==0.8
  • google-auth ==1.35.0
  • google-auth-oauthlib ==0.4.6
  • grpcio ==1.41.0
  • gym ==0.19.0
  • gym-miniworld ==2020.1.9
  • highway-env ==1.2
  • idna ==3.2
  • importlib-metadata ==4.8.1
  • kiwisolver ==1.3.1
  • matplotlib ==3.3.4
  • numpy ==1.19.5
  • oauthlib ==3.1.1
  • opencv-python ==4.6.0.66
  • pandas ==1.1.5
  • protobuf ==3.18.1
  • psutil ==5.8.0
  • py4j ==0.10.9.2
  • pyasn1 ==0.4.8
  • pyasn1-modules ==0.2.8
  • pyeda ==0.28.0
  • pygame ==2.0.1
  • pyglet ==1.5.21
  • pyparsing ==2.4.7
  • python-dateutil ==2.8.2
  • pytz ==2021.3
  • requests ==2.26.0
  • requests-oauthlib ==1.3.0
  • rsa ==4.7.2
  • scipy ==1.5.4
  • setuptools >=41.0.0
  • six ==1.16.0
  • stable-baselines3 ==1.2.0
  • tensorboard ==2.6.0
  • tensorboard-data-server ==0.6.1
  • tensorboard-plugin-wit ==1.8.0
  • torch ==1.10.2
  • torchvision *
  • tryalgo ==1.3.0
  • typing-extensions ==3.10.0.2
  • urllib3 ==1.26.7
  • zipp ==3.6.0
requirements.txt pypi
  • Box2D-kengz ==2.3.3
  • Django ==1.11.4
  • Keras ==2.0.0
  • Markdown ==2.6.9
  • Pillow ==4.2.1
  • PyBrain3 ==3.0.4
  • PyOpenGL ==3.1.0
  • PyYAML ==3.12
  • Theano ==0.9.0
  • Werkzeug ==0.12.2
  • atari-py ==0.1.1
  • bleach ==1.5.0
  • certifi ==2017.7.27.1
  • chardet ==3.0.4
  • gym ==0.18.0
  • html5lib ==0.9999999
  • idna ==2.6
  • image ==1.5.13
  • imageio ==2.2.0
  • mujoco-py ==0.5.7
  • numpy ==1.11.3
  • olefile ==0.44
  • opencv-python ==3.3.0.9
  • protobuf ==3.4.0
  • py4j ==0.10.8.1
  • pyeda *
  • pygame ==1.9.3
  • pyglet *
  • pytz ==2017.2
  • requests ==2.18.4
  • scipy ==0.19.1
  • six ==1.10.0
  • tensorflow ==1.3.0
  • tensorflow-tensorboard ==0.1.5
  • tqdm ==4.15.0
  • tryalgo *
  • urllib3 ==1.22
.github/workflows/unittest.yml actions
  • actions/cache v2 composite
  • actions/checkout v1 composite