Recent Releases of imitation

imitation - v1.0.1

Fix bug with tensors being on the wrong device when there is more than one device available (#831).

What's Changed

  • Update the README files by @ernestum in https://github.com/HumanCompatibleAI/imitation/pull/817
  • Add more benchmarking documentation by @ernestum in https://github.com/HumanCompatibleAI/imitation/pull/822
  • Clarify in README.md that we switched to gymnasium. by @ernestum in https://github.com/HumanCompatibleAI/imitation/pull/824
  • Switch to lualatex to generate the documentation PDF by @ernestum in https://github.com/HumanCompatibleAI/imitation/pull/826
  • Fix documentation pipeline by @ernestum in https://github.com/HumanCompatibleAI/imitation/pull/827
  • Fix warning in quickstart.py by @ernestum in https://github.com/HumanCompatibleAI/imitation/pull/823
  • Fix coverage issue in BC tests by @ernestum in https://github.com/HumanCompatibleAI/imitation/pull/830
  • Remove FloatReward by @ernestum in https://github.com/HumanCompatibleAI/imitation/pull/829
  • Ensure safe_to_tensor moves tensors to the specified device by @ernestum in https://github.com/HumanCompatibleAI/imitation/pull/831

Full Changelog: https://github.com/HumanCompatibleAI/imitation/compare/v1.0.0...v1.0.1

- Python
Published by tomtseng about 1 year ago

imitation - v1.0.0 -- first stable release

We're pleased to announce the first stable release of imitation. Key improvements include:

  • Gymnasium compatibility, which has superseded Gym.
  • Tuned hyperparameters and benchmark results for common algorithm-environment pairs (see release artifact attached).
  • New algorithm (beta): SQIL.

For more information, see the changelog below.
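For readers migrating their own code along with the Gymnasium switch, the sketch below illustrates the two return shapes that differ from classic Gym: `reset` returns `(observation, info)` and `step` returns five values with separate `terminated` and `truncated` flags. `CoinFlipEnv` and `rollout` are hypothetical names invented for this illustration; the toy class only mimics the Gymnasium calling convention and is not part of imitation or Gymnasium.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment that follows Gymnasium's reset/step shapes."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self._rng = random.Random()
        self._steps = 0

    def reset(self, seed=None):
        # Gymnasium-style reset: seeding happens here, and (obs, info) is returned.
        if seed is not None:
            self._rng = random.Random(seed)
        self._steps = 0
        return 0, {}

    def step(self, action):
        # Gymnasium-style step: five return values instead of Gym's four.
        self._steps += 1
        coin = self._rng.randint(0, 1)
        reward = 1.0 if action == coin else 0.0
        terminated = False  # this toy task has no terminal state
        truncated = self._steps >= self.max_steps  # time limit reached
        return coin, reward, terminated, truncated, {}


def rollout(env, seed):
    """Run one episode, always guessing that the previous coin repeats."""
    obs, info = env.reset(seed=seed)
    total_reward, num_steps, done = 0.0, 0, False
    while not done:
        obs, reward, terminated, truncated, info = env.step(obs)
        total_reward += reward
        num_steps += 1
        # An episode ends when either flag is set.
        done = terminated or truncated
    return total_reward, num_steps
```

Calling `rollout(CoinFlipEnv(max_steps=5), seed=0)` runs exactly five steps, because only the truncation flag ever fires in this toy task.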

What's Changed

  • Updated Installation Instructions by @ernestum in https://github.com/HumanCompatibleAI/imitation/pull/760
  • Download experts from hf inside tutorials and docs by @jas-ho in https://github.com/HumanCompatibleAI/imitation/pull/766
  • Implementation of the SQIL algorithm by @RedTachyon in https://github.com/HumanCompatibleAI/imitation/pull/744
  • Additional examples of CLI usage by @EdoardoPona in https://github.com/HumanCompatibleAI/imitation/pull/761
  • Dependency fixes by @ernestum in https://github.com/HumanCompatibleAI/imitation/pull/775
  • Tune hyperparameters for kernel density estimation tutorial by @michalzajac-ml in https://github.com/HumanCompatibleAI/imitation/pull/774
  • Tune hyperparameters in tutorials for GAIL and AIRL by @michalzajac-ml in https://github.com/HumanCompatibleAI/imitation/pull/772
  • Introduce interactive policies to gather data from a user by @michalzajac-ml in https://github.com/HumanCompatibleAI/imitation/pull/776
  • Add an option to run SQIL with various off-policy algorithms by @michalzajac-ml in https://github.com/HumanCompatibleAI/imitation/pull/778
  • Complete PR #771 (Tune preference comparison example hyperparameters) by @lukasberglund in https://github.com/HumanCompatibleAI/imitation/pull/782
  • Add CLI for SQIL by @lukasberglund in https://github.com/HumanCompatibleAI/imitation/pull/784
  • Gymnasium Compatibility by @ernestum in https://github.com/HumanCompatibleAI/imitation/pull/735
  • Ensure MyST-NB raises an error when rendering a notebook fails. by @ernestum in https://github.com/HumanCompatibleAI/imitation/pull/803
  • Add a test timeout by @ernestum in https://github.com/HumanCompatibleAI/imitation/pull/779
  • Fix MacOS Pipeline: Include tests not in subdirectories by @AdamGleave in https://github.com/HumanCompatibleAI/imitation/pull/797
  • Remove MuJoCo dependency from SQIL notebook by @AdamGleave in https://github.com/HumanCompatibleAI/imitation/pull/800
  • Add partial support for dictionary observation spaces (bc, density) by @NixGD in https://github.com/HumanCompatibleAI/imitation/pull/785
  • Update gymnasium dependency and render_mode in gym.make by @taufeeque9 in https://github.com/HumanCompatibleAI/imitation/pull/806
  • Upgrade pytype by @ZiyueWang25 in https://github.com/HumanCompatibleAI/imitation/pull/801
  • Reduce training time and improve expert loading code in the tutorials by @ernestum in https://github.com/HumanCompatibleAI/imitation/pull/810
  • Add scripts and configs for hyperparameter tuning by @taufeeque9 in https://github.com/HumanCompatibleAI/imitation/pull/675
  • SQIL and PC performance check fixes by @ernestum in https://github.com/HumanCompatibleAI/imitation/pull/811
  • Running benchmarks by @ernestum in https://github.com/HumanCompatibleAI/imitation/pull/812

New Contributors

  • @jas-ho made their first contribution in https://github.com/HumanCompatibleAI/imitation/pull/766
  • @EdoardoPona made their first contribution in https://github.com/HumanCompatibleAI/imitation/pull/761
  • @michalzajac-ml made their first contribution in https://github.com/HumanCompatibleAI/imitation/pull/774
  • @lukasberglund made their first contribution in https://github.com/HumanCompatibleAI/imitation/pull/782
  • @NixGD made their first contribution in https://github.com/HumanCompatibleAI/imitation/pull/785
  • @ZiyueWang25 made their first contribution in https://github.com/HumanCompatibleAI/imitation/pull/801

Full Changelog: https://github.com/HumanCompatibleAI/imitation/compare/v0.4.0...v1.0.0

- Python
Published by AdamGleave over 2 years ago

imitation - v0.4.0

What's Changed

  • Continuous Integration: Add support for Mac OS; remove dependency on MuJoCo
  • Preference comparison: improved logging, support for active learning based on variance of ensemble.
  • HuggingFace integration for model and dataset loading.
  • Benchmarking: add results and example configs.
  • Documentation: add notebook tutorials; other general improvements.
  • General changes: migrate to pathlib; add more type hints to enable mypy as well as pytype.
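The variance-based active learning mentioned for preference comparison can be pictured roughly as follows: score each candidate query by how much the ensemble members disagree about it, then ask the human about the most contested ones. This is a minimal sketch of that general idea, not imitation's implementation; the function name `select_queries_by_variance`, the data layout, and the use of population variance are all assumptions made for illustration.

```python
from statistics import pvariance


def select_queries_by_variance(ensemble_predictions, num_queries):
    """Return indices of the candidates the ensemble disagrees on most.

    ensemble_predictions[m][i] is (hypothetically) member m's predicted
    probability that the first fragment of candidate pair i is preferred.
    """
    num_candidates = len(ensemble_predictions[0])
    # Variance across ensemble members, computed separately per candidate.
    variances = [
        pvariance([member[i] for member in ensemble_predictions])
        for i in range(num_candidates)
    ]
    # Rank candidates by descending variance; ties keep their original order.
    ranked = sorted(range(num_candidates), key=lambda i: -variances[i])
    return ranked[:num_queries]


# Three ensemble members scoring four candidate pairs (made-up numbers).
predictions = [
    [0.9, 0.5, 0.1, 0.6],
    [0.9, 0.1, 0.1, 0.5],
    [0.9, 0.9, 0.2, 0.4],
]
chosen = select_queries_by_variance(predictions, num_queries=2)
```

Here all members agree on candidate 0, so it is never selected, while candidate 1 has the widest spread and is ranked first.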

Full Changelog: https://github.com/HumanCompatibleAI/imitation/compare/v0.3.1...v0.4.0

- Python
Published by AdamGleave over 2 years ago

imitation - v0.3.1

What's Changed

Main changes:

  • Added reward ensembles and conservative reward functions by @levmckinney in https://github.com/HumanCompatibleAI/imitation/pull/460
  • Dropping support for python 3.7 by @levmckinney in https://github.com/HumanCompatibleAI/imitation/pull/505
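One common way to build a conservative reward from an ensemble is to penalize the mean prediction by the members' disagreement, so that states the ensemble is unsure about look less attractive. The sketch below shows that idea only; the release notes do not say how the library computes it, and the name `conservative_reward`, the `alpha` parameter, and the use of population standard deviation are assumptions for illustration.

```python
from statistics import mean, pstdev


def conservative_reward(member_rewards, alpha=1.0):
    """Mean ensemble reward minus alpha times the ensemble's standard deviation.

    member_rewards holds one predicted reward per ensemble member for a
    single transition; a larger alpha is more pessimistic under disagreement.
    """
    return mean(member_rewards) - alpha * pstdev(member_rewards)


agreeing = conservative_reward([1.0, 1.0, 1.0])        # no penalty applied
disagreeing = conservative_reward([0.0, 2.0], alpha=1.0)  # same mean, penalized
```

Both inputs have a mean of 1.0, but the second is pulled down because its members disagree.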

Minor changes:

  • Docstring and other fixes after #472 by @Rocamonde in https://github.com/HumanCompatibleAI/imitation/pull/497
  • Improve Windows CI by @AdamGleave in https://github.com/HumanCompatibleAI/imitation/pull/495

Full Changelog: https://github.com/HumanCompatibleAI/imitation/compare/v0.3.0...v0.3.1

- Python
Published by AdamGleave over 3 years ago

imitation - Major improvements

New features:

  • New algorithm: Deep RL from Human Preferences (thanks to @ejnnr @norabelrose et al.)
  • Notebooks with examples (thanks to @ernestum)
  • Serialized trajectories using NumPy arrays rather than pickles, ensuring stability across versions and saving space on disk (thanks to @norabelrose)
  • Weights and Biases logging support (thanks to @yawen-d)

Improvements:

  • Port MCE IRL from JAX to Torch, eliminating the JAX dependency. (thanks to @qxcv)
  • Refactor RewardNet code to be independent from AIRL, and shared across algorithms. (thanks to @ejnnr)
  • Add Windows support including continuous integration. (thanks to @taufeeque9)

- Python
Published by AdamGleave over 3 years ago

imitation - First PyTorch release

- Python
Published by shwang over 5 years ago

imitation - Final TF1 release

- Python
Published by shwang over 5 years ago

imitation - Initial release

Prototype versions of AIRL, GAIL, BC, DAGGER.

- Python
Published by AdamGleave almost 6 years ago