imitation

Clean PyTorch implementations of imitation and reward learning algorithms

https://github.com/humancompatibleai/imitation

Science Score: 51.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    1 of 35 committers (2.9%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.3%) to scientific vocabulary

Keywords

gymnasium imitation-learning inverse-reinforcement-learning reward-learning

Keywords from Contributors

gym reinforcement-learning
Last synced: 6 months ago

Repository

Clean PyTorch implementations of imitation and reward learning algorithms

Basic Info
Statistics
  • Stars: 1,576
  • Watchers: 17
  • Forks: 288
  • Open Issues: 95
  • Releases: 8
Topics
gymnasium imitation-learning inverse-reinforcement-learning reward-learning
Created about 7 years ago · Last pushed about 1 year ago
Metadata Files
Readme License Citation

README.md


Imitation Learning Baseline Implementations

This project aims to provide clean implementations of imitation and reward learning algorithms. We currently implement the algorithms below; 'Discrete' and 'Continuous' indicate whether the algorithm supports discrete or continuous action/state spaces, respectively.

| Algorithm (+ link to paper) | API Docs | Discrete | Continuous |
|-----------------------------|----------|----------|------------|
| Behavioral Cloning | algorithms.bc | ✅ | ✅ |
| DAgger | algorithms.dagger | ✅ | ✅ |
| Density-Based Reward Modeling | algorithms.density | ✅ | ✅ |
| Maximum Causal Entropy Inverse Reinforcement Learning | algorithms.mce_irl | ✅ | ❌ |
| Adversarial Inverse Reinforcement Learning | algorithms.airl | ✅ | ✅ |
| Generative Adversarial Imitation Learning | algorithms.gail | ✅ | ✅ |
| Deep RL from Human Preferences | algorithms.preference_comparisons | ✅ | ✅ |
| Soft Q Imitation Learning | algorithms.sqil | ✅ | ❌ |

You can find the documentation here.

You can read the latest benchmark results here.
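Several of the algorithms above share a common interactive-learning pattern. As a library-free illustration of one of them, here is a minimal sketch of the DAgger loop (roll out the current learner, have the expert relabel every visited state, aggregate the data, and refit) on a toy 1-D grid. All names here are hypothetical illustrations, not part of the imitation API:

```python
from collections import Counter, defaultdict

def expert_action(state):
    """Toy expert on a 1-D grid {0..4}: always move toward state 2."""
    return 1 if state < 2 else (-1 if state > 2 else 0)

def fit(dataset):
    """Majority-vote tabular policy from (state, action) pairs;
    unseen states default to moving right."""
    counts = defaultdict(Counter)
    for s, a in dataset:
        counts[s][a] += 1
    return lambda s: counts[s].most_common(1)[0][0] if s in counts else 1

def dagger(n_rounds=3, horizon=6):
    """DAgger loop: roll out the current learner, relabel every visited
    state with the expert's action, aggregate, and refit."""
    dataset, policy = [], fit([])
    for _ in range(n_rounds):
        state = 0
        for _ in range(horizon):
            dataset.append((state, expert_action(state)))  # expert relabels
            state = max(0, min(4, state + policy(state)))  # learner acts
        policy = fit(dataset)
    return policy

policy = dagger()
print(policy(0), policy(2), policy(3))  # 1 0 -1
```

Because the expert labels states the *learner* actually visits, the aggregated dataset covers the learner's own state distribution, which is the key difference from plain behavioral cloning.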

Installation

Prerequisites

  • Python 3.8+
  • (Optional) OpenGL (to render Gymnasium environments)
  • (Optional) FFmpeg (to encode videos of renders)

Note: imitation is only compatible with the newer Gymnasium environment API and does not support the older Gym API.

Installing PyPI release

Installing the PyPI release is the standard way to use imitation, and the recommended way for most users.

pip install imitation

Install from source

If you like, you can install imitation from source, either to contribute to the project or to access the latest features before a stable release. To do so, clone the GitHub repository and run the installer directly. First run: git clone https://github.com/HumanCompatibleAI/imitation && cd imitation.

For development mode, run:

pip install -e ".[dev]"

This will run setup.py in development mode and install the additional dependencies required for development. For regular use, instead run:

pip install .

Additional extras are available depending on your needs: tests for running the test suite, docs for building the documentation, parallel for parallelizing training, and atari for the Atari environments. The dev extra automatically installs the tests, docs, and atari dependencies, and tests installs the atari dependencies.

For macOS users, some packages are required to run experiments (see ./experiments/README.md for details). First, install Homebrew if not available (see Homebrew). Then, run:

brew install coreutils gnu-getopt parallel

CLI Quickstart

We provide several CLI scripts as a front-end to the algorithms implemented in imitation. These use Sacred for configuration and replicability.

From examples/quickstart.sh:

```bash
# Train PPO agent on pendulum and collect expert demonstrations. Tensorboard logs saved in quickstart/rl/
python -m imitation.scripts.train_rl with pendulum environment.fast policy_evaluation.fast rl.fast fast logging.log_dir=quickstart/rl/

# Train GAIL from demonstrations. Tensorboard logs saved in output/ (default log directory).
python -m imitation.scripts.train_adversarial gail with pendulum environment.fast demonstrations.fast policy_evaluation.fast rl.fast fast demonstrations.path=quickstart/rl/rollouts/final.npz demonstrations.source=local

# Train AIRL from demonstrations. Tensorboard logs saved in output/ (default log directory).
python -m imitation.scripts.train_adversarial airl with pendulum environment.fast demonstrations.fast policy_evaluation.fast rl.fast fast demonstrations.path=quickstart/rl/rollouts/final.npz demonstrations.source=local
```

Tips:

  • Remove the "fast" options from the commands above to allow training run to completion.
  • python -m imitation.scripts.train_rl print_config will list Sacred script options. These configuration options are documented in each script's docstrings.

For more information on how to configure Sacred CLI options, see the Sacred docs.

Python Interface Quickstart

See examples/quickstart.py for an example script that loads CartPole-v1 demonstrations and trains BC, GAIL, and AIRL models on that data.
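For intuition about what such a script does, the core idea behind behavioral cloning is just supervised learning on expert state-action pairs. The following is a deliberately tiny, library-free sketch of that idea (a tabular majority-vote "policy"; all names here are hypothetical and not part of the imitation API):

```python
from collections import Counter, defaultdict

def fit_bc_policy(demonstrations):
    """Fit a tabular behavioral-cloning policy: for each observed state,
    predict the action the expert chose most often in that state."""
    actions_per_state = defaultdict(Counter)
    for state, action in demonstrations:
        actions_per_state[state][action] += 1
    return {s: c.most_common(1)[0][0] for s, c in actions_per_state.items()}

# Toy expert demonstrations on a 1-D grid: move right until state 3, then stop.
demos = [(0, "right"), (1, "right"), (2, "right"), (3, "stay"),
         (0, "right"), (1, "right"), (3, "stay")]
policy = fit_bc_policy(demos)
print(policy[0], policy[3])  # right stay
```

The real library replaces the lookup table with a neural network policy trained by gradient descent, but the supervised-learning structure is the same.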

Density reward baseline

We also implement a density-based reward baseline. You can find an example notebook here.
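The idea behind a density-based reward baseline is to fit a density model to states (or state-action pairs) visited by the expert and reward the agent for landing in high-density regions. A minimal sketch of that idea using a 1-D Gaussian kernel density estimate (illustrative only; not the imitation implementation):

```python
import math

def make_density_reward(expert_states, bandwidth=0.5):
    """Return a reward function that scores a state by the log of a
    Gaussian kernel density estimate fit to the expert's visited states."""
    norm = 1.0 / (len(expert_states) * bandwidth * math.sqrt(2 * math.pi))
    def reward(state):
        density = norm * sum(
            math.exp(-0.5 * ((state - s) / bandwidth) ** 2)
            for s in expert_states
        )
        return math.log(density + 1e-12)  # small floor avoids log(0)
    return reward

expert_states = [0.0, 0.1, -0.1, 0.05]  # expert stays near the origin
reward = make_density_reward(expert_states)
print(reward(0.0) > reward(3.0))  # True: states near the expert score higher
```

An RL algorithm trained against such a reward is pushed toward the expert's state distribution, which is what makes this a useful (if simple) baseline for imitation.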

Citations (BibTeX)

@misc{gleave2022imitation,
  author = {Gleave, Adam and Taufeeque, Mohammad and Rocamonde, Juan and Jenner, Erik and Wang, Steven H. and Toyer, Sam and Ernestus, Maximilian and Belrose, Nora and Emmons, Scott and Russell, Stuart},
  title = {imitation: Clean Imitation Learning Implementations},
  year = {2022},
  howPublished = {arXiv:2211.11972v1 [cs.LG]},
  archivePrefix = {arXiv},
  eprint = {2211.11972},
  primaryClass = {cs.LG},
  url = {https://arxiv.org/abs/2211.11972},
}

Contributing

See Contributing to imitation for more information.

Owner

  • Name: Center for Human-Compatible AI
  • Login: HumanCompatibleAI
  • Kind: organization

CHAI seeks to develop the conceptual and technical wherewithal to reorient the general thrust of AI research towards provably beneficial systems.

Citation (CITATION.bib)

@misc{gleave2022imitation,
  author = {Gleave, Adam and Taufeeque, Mohammad and Rocamonde, Juan and Jenner, Erik and Wang, Steven H. and Toyer, Sam and Ernestus, Maximilian and Belrose, Nora and Emmons, Scott and Russell, Stuart},
  title = {imitation: Clean Imitation Learning Implementations},
  year = {2022},
  howPublished = {arXiv:2211.11972v1 [cs.LG]},
  archivePrefix = {arXiv},
  eprint = {2211.11972},
  primaryClass = {cs.LG},
  url = {https://arxiv.org/abs/2211.11972},
}

GitHub Events

Total
  • Create event: 1
  • Release event: 1
  • Issues event: 9
  • Watch event: 249
  • Delete event: 1
  • Issue comment event: 19
  • Push event: 13
  • Pull request event: 3
  • Pull request review comment event: 1
  • Pull request review event: 1
  • Fork event: 39
Last Year
  • Create event: 1
  • Release event: 1
  • Issues event: 9
  • Watch event: 249
  • Delete event: 1
  • Issue comment event: 19
  • Push event: 13
  • Pull request event: 3
  • Pull request review comment event: 1
  • Pull request review event: 1
  • Fork event: 39

Committers

Last synced: 10 months ago

All Time
  • Total Commits: 657
  • Total Committers: 35
  • Avg Commits per committer: 18.771
  • Development Distribution Score (DDS): 0.607
Past Year
  • Commits: 1
  • Committers: 1
  • Avg Commits per committer: 1.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Steven H. Wang w****h@g****m 258
Adam Gleave a****m@g****e 166
M. Ernestus m****n@e****e 50
zenev n****s@g****m 30
Juan Rocamonde j****e@g****m 20
Sam Toyer q****v 17
Mohammad Taufeeque 9****9@g****m 15
Yawen Duan 3****n 10
Daniel Pandori 9****i 9
Tom Tseng t****g 8
Lev McKinney l****y@g****m 8
Erik Jenner e****9@g****m 8
Daniel Filan df@d****m 7
Ian Fan i****0@g****m 7
Ansh Radhakrishnan a****n@g****m 5
Nora Belrose 3****e 5
Yawen Duan 3****d 4
zajaczajac e****c@g****m 4
Ariel Kwiatkowski a****i@g****m 3
Mifeet m****t 3
hacobe 9****e 3
Cody Wild c****d@b****u 2
Pavel C p****t@p****t 2
lukasberglund l****d@g****m 2
rk1a 9****a 1
pedrofreire p****x@g****m 1
Ziyue Wang 3****5 1
Yulong Lin l****7@g****m 1
Xiaotan Zhu 3****u 1
Tim Bauman t****n@t****m 1
and 5 more...

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 148
  • Total pull requests: 119
  • Average time to close issues: 2 months
  • Average time to close pull requests: about 1 month
  • Total issue authors: 87
  • Total pull request authors: 30
  • Average comments per issue: 2.19
  • Average comments per pull request: 2.51
  • Merged pull requests: 76
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 9
  • Pull requests: 3
  • Average time to close issues: about 1 month
  • Average time to close pull requests: less than a minute
  • Issue authors: 9
  • Pull request authors: 2
  • Average comments per issue: 0.44
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • ernestum (22)
  • AdamGleave (11)
  • mertalbaba (6)
  • ZiyueWang25 (6)
  • Liuzy0908 (4)
  • Rocamonde (3)
  • levmckinney (3)
  • dfilan (3)
  • kavinwkp (2)
  • azafar1991 (2)
  • spearsheep (2)
  • PavelCz (2)
  • hhroberthdaniel (2)
  • mifeet (2)
  • kierad (2)
Pull Request Authors
  • ernestum (39)
  • taufeeque9 (9)
  • AdamGleave (6)
  • ZiyueWang25 (5)
  • hacobe (5)
  • RedTachyon (5)
  • timbauman (4)
  • zajaczajac (4)
  • jas-ho (3)
  • timokau (3)
  • iwishiwasaneagle (2)
  • Ivan-267 (2)
  • Rocamonde (2)
  • mifeet (2)
  • qxcv (2)
Top Labels
Issue Labels
enhancement (63) bug (55) docs (13) low priority (1)
Pull Request Labels
docs (4)

Packages

  • Total packages: 4
  • Total downloads:
    • pypi 3,912 last-month
  • Total docker downloads: 102
  • Total dependent packages: 2
    (may contain duplicates)
  • Total dependent repositories: 10
    (may contain duplicates)
  • Total versions: 28
  • Total maintainers: 4
pypi.org: imitation

Implementation of modern reward and imitation learning algorithms.

  • Versions: 9
  • Dependent Packages: 2
  • Dependent Repositories: 10
  • Downloads: 3,905 Last month
  • Docker Downloads: 102
Rankings
Stargazers count: 2.1%
Docker downloads count: 3.2%
Forks count: 3.7%
Average: 4.5%
Dependent repos count: 4.6%
Dependent packages count: 4.8%
Downloads: 8.8%
Maintainers (3)
Last synced: 6 months ago
proxy.golang.org: github.com/humancompatibleai/imitation
  • Versions: 9
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 6.4%
Average: 6.7%
Dependent repos count: 6.9%
Last synced: 6 months ago
proxy.golang.org: github.com/HumanCompatibleAI/imitation
  • Versions: 9
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 6.4%
Average: 6.7%
Dependent repos count: 6.9%
Last synced: 6 months ago
pypi.org: imitation-qjm

Implementation of modern reward and imitation learning algorithms.

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 7 Last month
Rankings
Dependent packages count: 9.2%
Average: 30.5%
Dependent repos count: 51.8%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/publish-to-pypi.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
  • pypa/gh-action-pypi-publish release/v1 composite
Dockerfile docker
  • base latest build
  • nvidia/cuda 11.6.2-cudnn8-runtime-ubuntu20.04 build
  • python-req latest build
setup.py pypi
  • gym *
pyproject.toml pypi