vagram

[ICLR 22] Value Gradient weighted Model-Based Reinforcement Learning.

https://github.com/pairlab/vagram

Science Score: 52.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
✓ Institutional organization owner: Organization pairlab has institutional domain (pair.toronto.edu)
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (14.0%) to scientific vocabulary

Repository

Basic Info

Host: GitHub
Owner: pairlab
License: mit
Language: Python
Default Branch: main
Homepage: https://www.pair.toronto.edu/blog/2022/vagram-voelcker/
Size: 16.9 MB

Statistics

Stars: 24
Watchers: 2
Forks: 5
Open Issues: 0
Releases: 0

Created about 4 years ago · Last pushed about 3 years ago

Metadata Files

Readme, Changelog, Contributing, License, Code of conduct, Citation

Value Gradient weighted Model-Based Reinforcement Learning.

This is the official code for VaGraM, published at ICLR 2022.
The code framework builds on MBRL-lib.

Experiments

To run the experiments presented in the paper, install the required libraries found in requirements.txt and use the vagram/mbrl/examples/main.py script provided by mbrl-lib.

The exact settings for the hopper experiments can be found in vagram/scripts:

Distraction (the second command-line parameter sets the number of distracting dimensions):

```
python3 -m mbrl.examples.main \
    seed=$1 \
    algorithm=mbpo \
    overrides=mbpo_hopper_distraction \
    overrides.num_steps=500000 \
    overrides.model_batch_size=1024 \
    overrides.distraction_dimensions=$2
```

Reduced model size (num_layers sets the model size):

```
python3 -m mbrl.examples.main \
    seed=$RANDOM \
    algorithm=mbpo \
    overrides=mbpo_hopper \
    dynamics_model.model.num_layers=3 \
    dynamics_model.model.hid_size=64 \
    overrides.model_batch_size=1024
```

To use MSE/MLE instead of VaGraM, run:

```
python3 -m mbrl.examples.main \
    seed=$1 \
    algorithm=mbpo \
    overrides=mbpo_hopper_distraction \
    overrides.num_steps=500000 \
    overrides.model_batch_size=256 \
    dynamics_model=gaussian_mlp_ensemble \
    overrides.distraction_dimensions=$2
```
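
For sweeps over seeds or distraction dimensions, the distraction commands above can also be assembled programmatically. The following is a minimal sketch and not part of the repository: `build_command` and its parameters are hypothetical names, and it uses only the override keys that appear in the commands above. It assumes the MSE/MLE baseline differs from the VaGraM run only by the `dynamics_model=gaussian_mlp_ensemble` override and the model batch size of 256.

```python
from typing import List


def build_command(seed: int, distraction_dims: int, use_vagram: bool = True) -> List[str]:
    """Assemble the Hopper distraction command as an argument list (hypothetical helper)."""
    # The commands above use a model batch size of 1024 for VaGraM and 256 for MSE/MLE.
    batch_size = 1024 if use_vagram else 256
    args = [
        "python3", "-m", "mbrl.examples.main",
        f"seed={seed}",
        "algorithm=mbpo",
        "overrides=mbpo_hopper_distraction",
        "overrides.num_steps=500000",
        f"overrides.model_batch_size={batch_size}",
    ]
    if not use_vagram:
        # Switch the dynamics model to the MSE/MLE baseline.
        args.append("dynamics_model=gaussian_mlp_ensemble")
    args.append(f"overrides.distraction_dimensions={distraction_dims}")
    return args
```

The returned list can be passed to `subprocess.run` from the directory in which the `mbrl` package is importable; the seed and dimension values are up to the caller.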

Using VaGraM

The core implementation of the VaGraM algorithm can be found in vagram/mbrl/models/vaml_mlp.py. The code offers three variants: one for IterVAML, one for the unbounded VaGraM objective, and one for the bounded VaGraM objective used in the paper. The default configuration used in all experiments can be found in vagram/mbrl/examples/conf/dynamics_model/vaml_ensemble.yaml.
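
To give a rough idea of how the three variants differ, here is a simplified, single-sample sketch in plain Python. It is not the code in vaml_mlp.py, which works on batched tensors and model ensembles and may differ in detail. The sketch assumes that IterVAML penalizes the squared value difference between predicted and observed next state, that the unbounded VaGraM objective replaces this difference with a first-order Taylor approximation (the inner product of the value gradient with the prediction error), and that the bounded objective weights each state dimension's squared error by the squared value gradient. All function names are hypothetical.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def itervaml_loss(value_fn: Callable[[Vector], float], pred: Vector, target: Vector) -> float:
    """Squared difference between the values of the predicted and observed next state."""
    return (value_fn(pred) - value_fn(target)) ** 2


def vagram_unbounded_loss(value_grad: Vector, pred: Vector, target: Vector) -> float:
    """Squared inner product of the value gradient with the prediction error."""
    inner = sum(g * (p - t) for g, p, t in zip(value_grad, pred, target))
    return inner ** 2


def vagram_bounded_loss(value_grad: Vector, pred: Vector, target: Vector) -> float:
    """Per-dimension squared error, each weighted by the squared value gradient."""
    return sum((g * (p - t)) ** 2 for g, p, t in zip(value_grad, pred, target))


# Toy example (assumed): V(s) = s[0] + s[1], so the gradient is [1, 1] everywhere.
value_fn = lambda s: s[0] + s[1]
value_grad = [1.0, 1.0]
target = [1.0, 1.0]
pred = [1.5, 0.5]  # wrong in both dimensions, but the errors cancel in value

print(itervaml_loss(value_fn, pred, target))            # 0.0
print(vagram_unbounded_loss(value_grad, pred, target))  # 0.0
print(vagram_bounded_loss(value_grad, pred, target))    # 0.5
```

In this toy case the errors cancel along the value gradient, so the first two losses are zero although the prediction is wrong, while the per-dimension form still penalizes it.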

In addition to the implementation details in the paper, we introduced a cache for the computed value function gradients. This does not change the optimization: the cache stores the gradients of the sampled states and reuses them until the value function is updated, which speeds up computation.
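
The caching idea can be illustrated with a small memoization helper. This is a minimal sketch and not the repository's code: `ValueGradientCache`, `grad_fn`, and `invalidate` are hypothetical names, and keying the cache on the state tuple is an assumption; the actual implementation may index samples differently.

```python
from typing import Callable, Dict, Sequence, Tuple

Vector = Sequence[float]
Gradient = Tuple[float, ...]


class ValueGradientCache:
    """Memoize value-function gradients until the value function changes (hypothetical helper)."""

    def __init__(self, grad_fn: Callable[[Vector], Sequence[float]]) -> None:
        self._grad_fn = grad_fn
        self._cache: Dict[Tuple[float, ...], Gradient] = {}
        self.num_computed = 0  # counts how often grad_fn was actually called

    def get(self, state: Vector) -> Gradient:
        key = tuple(state)
        if key not in self._cache:
            # Cache miss: compute the gradient once and store it.
            self._cache[key] = tuple(self._grad_fn(state))
            self.num_computed += 1
        return self._cache[key]

    def invalidate(self) -> None:
        """Call after every value-function update; stored gradients are stale."""
        self._cache.clear()


# Toy value function (assumed): V(s) = sum(s_i^2), with gradient 2 * s.
cache = ValueGradientCache(lambda s: tuple(2.0 * x for x in s))
state = [1.0, -2.0]
first = cache.get(state)   # computed
second = cache.get(state)  # served from the cache
cache.invalidate()         # value function was updated
third = cache.get(state)   # recomputed
print(first, cache.num_computed)  # (2.0, -4.0) 2
```

Because the cache is cleared whenever the value function changes, the gradients used in the model loss are the same as they would be without caching.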

Citing

If you use this project in your research, please cite:

    @inproceedings{voelcker2022vagram,
      title={{Value Gradient weighted Model-Based Reinforcement Learning}},
      author={Claas A Voelcker and Victor Liao and Animesh Garg and Amir-massoud Farahmand},
      booktitle={International Conference on Learning Representations (ICLR)},
      year={2022},
      url={https://openreview.net/forum?id=4-D6CZkRXxI}
    }

License

VaGraM is released under the MIT license. See LICENSE for details.

Owner

Name: PAIR Lab
Login: pairlab
Kind: organization
Email: garg@cs.toronto.edu

Website: pair.toronto.edu
Twitter: animesh_garg
Repositories: 18
Profile: https://github.com/pairlab

PAIR Lab works on machine learning & perception in robotics with implications on interactions with people.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite our paper at ICLR."
authors:
- family-names: "Voelcker"
  given-names: "Claas"
- family-names: "Liao"
  given-names: "Victor"
- family-names: "Garg"
  given-names: "Animesh"
- family-names: "Farahmand"
  given-names: "Amir-massoud"
title: "vagram"
license: "MIT"
url: "https://github.com/pairlab/vagram"
preferred-citation:
  type: conference-paper
  authors:
  - family-names: "Voelcker"
    given-names: "Claas"
  - family-names: "Liao"
    given-names: "Victor"
  - family-names: "Garg"
    given-names: "Animesh"
  - family-names: "Farahmand"
    given-names: "Amir-massoud"
  collection-title: "International Conference on Learning Representations (ICLR)"
  title: "Value Gradient weighted Model-Based Reinforcement Learning"
  year: 2022
  url: "https://openreview.net/forum?id=4-D6CZkRXxI"

Dependencies

mbrl/third_party/dmc2gym/setup.py pypi

dm_control *
gym *

mbrl/third_party/pytorch_sac/setup.py pypi

line.rstrip *

requirements/dev.txt pypi

black >=21.4b2 development
flake8 >=3.8.4 development
mypy >=0.902 development
nbsphinx >=0.8.0 development
pytest >=6.0.1 development
sphinx >=3.3.1 development
sphinx-rtd-theme >=0.5.0 development
types-pyyaml >=0.1.6 development
types-termcolor >=0.1.0 development

requirements/main.txt pypi

gym ==0.17.2
hydra-core ==1.0.3
imageio >=2.9.0
jupyter >=1.0.0
matplotlib >=3.3.1
mujoco-py ==2.1.2.14
numpy >=1.19.1
pytest >=6.0.1
sk-video >=1.1.10
tensorboard >=2.4.0
termcolor >=1.1.0
torch ==1.11.0

.github/workflows/ci.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite

setup.py pypi
