Recent Releases of torchrl

torchrl - v0.9.2: Bug fixes and perf improvements

TorchRL 0.9.2 Release Notes

This release focuses on bug fixes, performance improvements, and code quality enhancements.

🚀 New Features

  • LineariseRewards: Now supports negative weights for more flexible reward shaping (#3064)
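As a rough illustration of what a negative weight means for reward shaping, the sketch below scalarises a multi-objective reward as a weighted sum. The `linearise` helper and the example values are hypothetical and written in plain Python; this is not the `LineariseRewards` implementation.

```python
from typing import Sequence


def linearise(rewards: Sequence[float], weights: Sequence[float]) -> float:
    """Collapse a multi-objective reward into one scalar as a weighted sum."""
    if len(rewards) != len(weights):
        raise ValueError("rewards and weights must have the same length")
    return sum(w * r for w, r in zip(weights, rewards))


# A negative weight turns the second objective into a penalty.
progress, energy_used = 2.0, 4.0
scalar_reward = linearise([progress, energy_used], [1.0, -0.25])
```

With these values the energy term subtracts from the progress term instead of adding to it.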

🐛 Bug Fixes

  • Fixed policy reference handling in state dictionaries (#3043)
  • Improved unbatched data handling in LLM wrappers (#3070)
  • Fixed cross-entropy log-probability computation for batched inputs (#3080)
  • Fixed Binary clone() operations (#3077)
  • Fixed in-place spec modifications in TransformedEnv (#3076)
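For context on the batched log-probability fix (#3080): the log-probability of a token is the negative cross-entropy between the logits and that token, computed independently for every position of every sequence. The plain-Python sketch below shows that relationship on nested lists; the function names are hypothetical and this is not the code changed in the PR.

```python
import math
from typing import List


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-softmax over one vocabulary-sized vector."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def token_log_probs(
    batch_logits: List[List[List[float]]],  # [batch][time][vocab]
    batch_tokens: List[List[int]],          # [batch][time]
) -> List[List[float]]:
    """Log-probability of each selected token, i.e. minus its cross-entropy."""
    return [
        [log_softmax(step)[tok] for step, tok in zip(seq_logits, seq_tokens)]
        for seq_logits, seq_tokens in zip(batch_logits, batch_tokens)
    ]


# Two sequences of one step each, uniform logits over a vocabulary of 4.
uniform_lp = token_log_probs([[[0.0] * 4], [[1.0] * 4]], [[2], [0]])
```

The output keeps the `[batch][time]` layout of the token ids, which is the property a batched implementation has to preserve.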

⚡ Performance Improvements

  • Optimized distribution sampling by avoiding unnecessary log-probability computations (#3081)
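The idea behind this kind of optimisation can be sketched as follows: the log-probability of a sample is only computed when the caller asks for it. The `sample_categorical` function and its `return_log_prob` flag are hypothetical names used for illustration, not the TorchRL API touched by #3081.

```python
import math
import random
from typing import Optional, Sequence, Tuple


def sample_categorical(
    probs: Sequence[float],
    rng: random.Random,
    return_log_prob: bool = False,
) -> Tuple[int, Optional[float]]:
    """Draw an index from `probs`; compute its log-probability only on request."""
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    log_prob = math.log(probs[index]) if return_log_prob else None
    return index, log_prob


rng = random.Random(0)
action, no_lp = sample_categorical([0.0, 1.0], rng)
action_lp, lp = sample_categorical([0.0, 1.0], rng, return_log_prob=True)
```

During pure data collection the extra `math.log` call (or, for real distributions, a much more expensive density evaluation) is skipped entirely.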

🔧 Code Quality

  • Standardized coefficient naming in A2C and PPO algorithms (#3079)

📦 Installation

```bash
pip install torchrl==0.9.2
```

Thanks to all contributors: @felixy12, @Xmaster6y, @louisfaury and @LCarmi

- Python
Published by vmoens 11 months ago

torchrl - v0.9.1: fix for history-based vLLM and Transformers wrappers

Fixes a critical issue with vLLMWrapper and TransformersWrapper, where a stack of History objects was sent to stack a second time, resulting in a bug.

Published by vmoens 11 months ago

torchrl - TorchRL 0.9.0 Release Notes

We are excited to announce the release of TorchRL 0.9.0! This release introduces a comprehensive LLM API for language model fine-tuning, extensive torch.compile compatibility across all algorithms, and numerous performance improvements.

🚀 Major Features

🤖 LLM API - Complete Framework for Language Model Fine-tuning

TorchRL now includes a comprehensive LLM API for post-training and fine-tuning of language models! This new framework provides everything you need for RLHF, supervised fine-tuning, and tool-augmented training.

The LLM API follows TorchRL's modular design principles, allowing you to mix and match components for your specific use case. Check out the complete documentation and GRPO implementation example to get started!

Unified LLM Wrappers

  • TransformersWrapper: Seamless integration with Hugging Face models
  • vLLMWrapper: High-performance inference with vLLM engines
  • Consistent API: Both wrappers provide unified input/output interfaces using TensorClass objects
  • Multiple input modes: Support for history, text, and tokenized inputs
  • Configurable outputs: Text, tokens, masks, and log probabilities

Advanced Conversation Management

  • History class: Advanced bidirectional conversation management with automatic chat template detection
  • Multi-model support: Automatic template detection for various model families (Qwen, DialoGPT, Falcon, DeepSeek, etc.)
  • Assistant token masking: Identify which tokens were generated by the assistant for RL applications
  • Tool calling support: Handle function calls and tool responses in conversations
  • Batch operations: Efficient tensor operations for processing multiple conversations
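To make the idea of assistant token masking concrete, here is a minimal plain-Python sketch. It assumes a conversation is represented as a list of `(role, token_ids)` pairs, which is a simplification chosen for illustration and not the layout of the `History` class.

```python
from typing import List, Tuple

Turn = Tuple[str, List[int]]  # (role, token ids) - hypothetical representation


def assistant_mask(turns: List[Turn]) -> Tuple[List[int], List[bool]]:
    """Flatten a conversation and flag the tokens produced by the assistant."""
    tokens: List[int] = []
    mask: List[bool] = []
    for role, ids in turns:
        tokens.extend(ids)
        mask.extend([role == "assistant"] * len(ids))
    return tokens, mask


conversation = [
    ("system", [1, 2]),
    ("user", [3]),
    ("assistant", [4, 5, 6]),
    ("user", [7]),
]
flat_tokens, flat_mask = assistant_mask(conversation)
```

An RL or SFT loss would then be restricted to the positions where the mask is true, so that prompts and tool outputs do not contribute gradients.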

🛠️ Tool Integration

  • PythonInterpreter transform: Built-in Python code execution capabilities
  • MCPToolTransform: General tool calling support
  • Extensible architecture: Easy to add custom tool transforms
  • Safe execution: Controlled environment for tool execution

🎯 Specialized Objectives

  • GRPOLoss: Group Relative Policy Optimization loss function optimized for language models
  • SFTLoss: Supervised fine-tuning loss with assistant token masking support
  • MCAdvantage: Monte-Carlo advantage estimation for LLM training
  • KL divergence rewards: Built-in KL penalty computation
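The "group relative" part of GRPO can be sketched in a few lines: rewards of several completions for the same prompt are normalised against each other. The sketch below follows the commonly published formulation and uses the population standard deviation, which is an assumption; it is not the code of `GRPOLoss` or `MCAdvantage`.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Normalise the rewards of one group of completions sharing a prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division finite when every completion gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt: two judged correct, two incorrect.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that beat the group average receive a positive advantage and the others a negative one, without requiring a learned value function.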

⚡ High-Performance Collectors

  • LLMCollector: Async data collection with distributed training support
  • RayLLMCollector: Multi-node distributed collection using Ray
  • Weight synchronization: Automatic model weight updates across distributed setups
  • Trajectory management: Efficient handling of variable-length conversations

🔄 Flexible Environments

  • ChatEnv: Transform-based architecture for conversation management
  • Transform-based rewards: Modular reward computation and data loading
  • Dataset integration: Built-in support for loading prompts from datasets
  • Thinking prompts: Chain-of-thought reasoning support

📚 Complete Implementation Example

A full GRPO implementation is provided in sota-implementations/grpo/ with:

  • Multi-GPU support with efficient device management
  • Mixed precision training
  • Gradient accumulation
  • Automatic checkpointing
  • Comprehensive logging with Weights & Biases
  • Hydra configuration system
  • Asynchronous training support with Ray

🆕 New Features

LLM API Components

  • LLMMaskedCategorical (#3041) - Categorical distribution with masking for LLM token selection
  • AddThinkingPrompt transform (#3027) - Add chain-of-thought reasoning prompts
  • MCPToolTransform (#2993) - Model Context Protocol tool integration
  • PythonInterpreter transform (#2988) - Python code execution in LLM environments
  • ContentBase (#2985) - Base class for structured content in LLM workflows
  • LLM Tooling (#2966) - Comprehensive tool integration framework
  • History API (#2965) - Advanced conversation management system
  • LLM collector (#2879) - Specialized data collection for language models
  • vLLM wrapper (#2830) - High-performance vLLM integration
  • Transformers policy (#2825) - Hugging Face transformers integration
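As an illustration of what a masked categorical distribution does, the sketch below sets the logits of disallowed tokens to negative infinity before the softmax, so they receive zero probability. It is written in plain Python under that assumption and does not reproduce the `LLMMaskedCategorical` implementation.

```python
import math
from typing import List, Sequence


def masked_softmax(logits: Sequence[float], mask: Sequence[bool]) -> List[float]:
    """Softmax restricted to the entries where `mask` is True."""
    if not any(mask):
        raise ValueError("at least one entry must be allowed")
    masked = [x if keep else -math.inf for x, keep in zip(logits, mask)]
    m = max(masked)
    exps = [math.exp(x - m) for x in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


# The third token has the largest logit but is masked out.
probs = masked_softmax([0.0, 0.0, 5.0], [True, True, False])
```

Sampling from `probs` can then never select a masked token, however large its raw logit.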

Environment Enhancements

  • IsaacLab wrapper (#2937) - NVIDIA Isaac Lab environment support
  • Complete PettingZooWrapper state support (#2953) - Full state management for multi-agent environments
  • ConditionalPolicySwitch transform (#2711) - Dynamic policy switching based on conditions
  • Async environments (#2864) - Asynchronous environment execution
  • VecNormV2 (#2867) - Improved vector normalization with batched environment support

Algorithm Improvements

  • Async GRPO (#2997) - Asynchronous Group Relative Policy Optimization
  • Expert Iteration and SFT (#3017) - Expert iteration and supervised fine-tuning algorithms
  • Async SAC (#2946) - Asynchronous Soft Actor-Critic implementation
  • Multi-node Ray support for GRPO (#3040) - Distributed GRPO training

Data Management

  • RayReplayBuffer (#2835) - Distributed replay buffer using Ray
  • RayReplayBuffer usage examples (#2949) - Comprehensive usage examples
  • Policy factory for collectors (#2841) - Flexible policy creation in collectors
  • Local and Remote WeightUpdaters (#2848) - Distributed weight synchronization

Performance Optimizations

  • Deactivate vmap in objectives (#2957) - Improved performance by disabling vectorized operations
  • Hold a single copy of low/high in bounded specs (#2977) - Memory optimization for bounded specifications
  • Use TensorDict._new_unsafe in step (#2905) - Performance improvement in environment steps
  • Memoize calls to encode and related methods (#2907) - Caching for improved performance
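Memoization of an encoding function can be sketched with the standard library. The notes do not say which methods are cached or how, so the `encode` function and the call counter below are purely illustrative.

```python
from functools import lru_cache
from typing import Tuple

CALLS = {"encode": 0}  # counts how often the expensive body actually runs


@lru_cache(maxsize=1024)
def encode(text: str) -> Tuple[int, ...]:
    """Toy encoder: maps each character to its code point."""
    CALLS["encode"] += 1
    return tuple(ord(ch) for ch in text)


first = encode("reset")
second = encode("reset")  # served from the cache, the body is not re-run
```

Caching only helps when the same inputs recur and the result is immutable, which is why the toy encoder returns a tuple rather than a list.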

Utility Features

  • Compose.pop (#3026) - Remove transforms from composition
  • Add optional Explained Variance logging (#3010) - Enhanced logging capabilities
  • Enable worker-level control of frames_per_batch (#3020) - Granular control over data collection
  • collector.start() (#2935) - Explicit collector lifecycle management
  • Timer transform (#2806) - Timing capabilities for environments
  • MultiAction transform (#2779) - Multi-action environment support
  • Transform for partial steps (#2777) - Partial step execution support

🔧 Performance Improvements

  • VecNormV2: Improved vector normalization with better bias correction timing (#2900, #2901)
  • MaskedCategorical cross_entropy: Faster loss computation (#2882)
  • Avoid padding in transformer wrapper: Memory and performance optimization (#2881)
  • Set padded token log-prob to 0.0: Improved numerical stability (#2857)
  • Better device checks: Enhanced device management (#2909)
  • Local dtype maps: Optimized dtype handling (#2936)
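The phrase "bias correction timing" can be illustrated with a running normaliser that updates its statistics before normalising and applies the bias correction when the statistics are read. The sketch assumes exponential moving averages with an Adam-style correction term; the statistics actually tracked by VecNormV2 may differ, and the `RunningNorm` class is hypothetical.

```python
class RunningNorm:
    """Scalar exponential-moving-average normaliser (illustrative only)."""

    def __init__(self, decay: float = 0.99, eps: float = 1e-4) -> None:
        self.decay = decay
        self.eps = eps
        self.mean = 0.0     # biased EMA of x
        self.sq_mean = 0.0  # biased EMA of x ** 2
        self.steps = 0

    def update(self, x: float) -> None:
        self.steps += 1
        self.mean = self.decay * self.mean + (1.0 - self.decay) * x
        self.sq_mean = self.decay * self.sq_mean + (1.0 - self.decay) * x * x

    def normalize(self, x: float) -> float:
        if self.steps == 0:
            raise RuntimeError("call update() before normalize()")
        # The stored averages start at zero, so they are rescaled on read.
        correction = 1.0 - self.decay ** self.steps
        mean = self.mean / correction
        var = max(self.sq_mean / correction - mean * mean, 0.0)
        return (x - mean) / (var ** 0.5 + self.eps)

    def __call__(self, x: float) -> float:
        self.update(x)            # update first ...
        return self.normalize(x)  # ... then normalise with fresh statistics


norm = RunningNorm()
first_out = norm(3.0)
second_out = norm(5.0)
```

Without the correction, the first estimate of the mean would be 0.03 instead of 3.0, and early observations would be normalised against badly underestimated statistics.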

🐛 Bug Fixes

LLM API Fixes

  • Variable length vllm wrapper answer stacking (#3049) - Fixed stacking issues with variable-length responses
  • LLMCollector trajectory collection methods (#3018) - Fixed trajectory collection when multiple trajectories complete simultaneously
  • Fix IFEval GRPO runs (#3012) - Resolved issues with IFEval dataset runs
  • Fix cuda cache empty in GRPO scripts (#3016) - Memory management improvements
  • Right log-prob size in transformer wrapper (#2856) - Fixed log probability tensor sizing
  • Fix gc import (#2862) - Import error resolution

Environment Fixes

  • Brax memory leak fix (#3052) - Resolved memory leaks in Brax environments
  • Fix behavior of partial, nested dones in PEnv and TEnv (#2959) - Improved done state handling
  • Fix shifted value computation with an LSTM (#2941) - LSTM value computation fixes
  • Fix single action pass to gym when action key is not "action" (#2942) - Action key handling improvements
  • Fix PEnv device copies (#2840) - Device management in parallel environments

Data Management Fixes

  • Fix minari dataloading (#3054) - Resolved Minari dataset loading issues
  • RB.add unsqueezes tds when applying the transform (#3047) - Replay buffer transform handling
  • Fix PRB serialization (#2963) - Prioritized replay buffer serialization
  • Fix lazy-stack in RBs (#2880) - Lazy stacking in replay buffers
  • Keep original class in LazyStackStorage through lazy_stack (#2873) - Class preservation in lazy stacking

Algorithm Fixes

  • Fix deprecated list index (#3005) - Updated deprecated list indexing
  • update_policy_weights_() with cudagraph (#3003) - CUDA graph compatibility
  • Fix compile compatibility of PPO losses (#2889) - Compilation compatibility
  • Fix .item() warning on tensors that require grad (#2885) - Gradient tensor handling
  • Fix KL penalty (#2908) - KL divergence computation fixes

Specification and Type Fixes

  • Fixes the Categorical is_in with non-long integer (#2981) - Type compatibility improvements
  • Categorical spec samples the right dtype when masked (#2980) - Masked categorical sampling
  • Binary can have empty shape (#2979) - Empty shape handling
  • ActionMask is compatible with composite action specs (#3022) - Composite action specification support
  • Fix composite setitem (#2778) - Composite specification item setting

General Fixes

  • Fix various test failures (#2994) - Test suite improvements
  • Fix wrong split_trajectories import (#3023) - Import error resolution
  • Fix typo (#2969) - Documentation typo fixes
  • Fix device in PPO tests (#2971) - Device handling in tests
  • Fix device in args of PPO losses (#2969) - PPO loss device arguments

📚 Documentation

  • Document the LLM env and transform API (#2991) - Comprehensive LLM API documentation
  • Update documentation for _AcceptedKeys in a2c.py (#2987) - A2C documentation improvements
  • WeightUpdaterBase docs update after renaming (#3007) - Updated documentation for renamed components
  • Fix doc pipeline (#2992) - Documentation build improvements
  • Fix Doc (#2919) - General documentation fixes
  • Fix doc setup (#2922) - Documentation setup improvements
  • Better doc for Transform class (#2797) - Transform class documentation
  • Add docstring for MCTSForest.extend (#2795) - MCTS documentation
  • Fix tutorials (#2772, #2768) - Tutorial fixes and improvements

🧪 Testing and Quality

  • Fix wrong import (#3033) - Import error fixes
  • Fix error catches (#2982) - Error handling improvements
  • Fix warnings in tests (#2886) - Warning suppression in tests
  • Test and fix life cycle of env with dynamic non-tensor spec (#2812) - Environment lifecycle testing
  • Capture deprec warnings (#2799) - Deprecation warning handling
  • Fix old deps tests (#2500) - Dependency testing improvements

🔄 Refactoring and Code Quality

  • Refactor the weight update logic (#2914) - Improved weight update architecture
  • Refactor LLM data structures (#2834) - LLM data structure improvements
  • Rename RLHF files to LLM (#2833) - File organization improvements
  • Refactor TransformersWrapper class (#2871) - Transformers wrapper improvements
  • Refactor vLLMWrapper class (#2870) - vLLM wrapper improvements
  • Remove from_vllm and from_hf_transformers (#2874) - Cleanup of deprecated methods
  • Simplify LLMEnv (#2897) - LLM environment simplification

🚀 CI and Infrastructure

  • Fix win CI (#3028) - Windows CI improvements
  • Fix SDL install (#2978) - SDL installation fixes
  • Build wheels on osx 15 (#2934) - macOS 15 compatibility
  • Fix tensordict upper version to 0.9 (#2933) - Dependency version management
  • Fix nightly and benchmark CIs (#2930) - CI pipeline improvements
  • Fix envnames in SOTA tests (#2921) - Test environment naming
  • egl for all (#2915) - EGL support improvements
  • Fix LLM tests (#2918) - LLM test suite fixes
  • Fix old deps (#2916) - Dependency management
  • Upgrade to cuda 12.8 (#2820) - CUDA version upgrade
  • Fix libs workflows (#2800) - Library workflow improvements

📦 Dependencies and Setup

  • Remove distutils imports (#2836) - Modern Python compatibility
  • Fix no_python_abi_suffix error (#2863) - Python ABI suffix handling
  • Upgrade to v0.7 (#2745) - Dependency version updates
  • Fix Cairo-2 Chess import error (#2743) - Chess environment dependencies

🗑️ Deprecations and Removals

  • Enact deprecations (#2917) - Implementation of planned deprecations
  • Remove LLM features for release (#2912) - Temporary removal for release stability
  • Softly change default behavior of auto_unwrap (#2793) - Default behavior changes
  • Gracing old *Spec with v0.8 versioning (#2751) - Specification versioning
  • Remove InPlaceSampler (#2750) - Deprecated sampler removal
  • Remove OrnsteinUhlenbeckProcessWrapper (#2749) - Deprecated wrapper removal
  • Remove AdditiveGaussianWrapper (#2748) - Deprecated wrapper removal
  • Remove NormalParamWrapper (#2747) - Deprecated wrapper removal
  • Change the default MLP depth (#2746) - Default configuration changes

🔧 Minor Improvements

  • Fix sota runs (#3042) - SOTA implementation improvements
  • remove unused variables in GRPO scripts (#3038) - Code cleanup
  • Fix deprecated list index (#3005) - Deprecation warning fixes
  • gitignore ipynb (#2954) - Git ignore improvements
  • Quick edits to .md files (#2931) - Documentation improvements
  • Fix typos in advantages.py (#2492) - Documentation typo fixes
  • Remove redundant return (#2925) - Code cleanup
  • Fix some typos (#2811) - Documentation improvements

📊 Migration Guide

LLM API Usage

The new LLM API provides a complete framework for language model fine-tuning. Key components include:

```python
from torchrl.envs.llm import ChatEnv
from torchrl.envs.llm.transforms import PythonInterpreter  # assumed import path; missing from the original snippet
from torchrl.modules.llm import TransformersWrapper
from torchrl.objectives.llm import GRPOLoss
from torchrl.collectors.llm import LLMCollector

# `tokenizer`, `model`, `critic` and `optimizer` are assumed to be defined elsewhere.

# Create environment with Python tool execution
env = ChatEnv(
    tokenizer=tokenizer,
    system_prompt="You are an assistant that can execute Python code.",
    batch_size=[1],
).append_transform(PythonInterpreter())

# Wrap your language model
llm = TransformersWrapper(model=model, tokenizer=tokenizer, input_mode="history")

# Set up GRPO training
loss_fn = GRPOLoss(llm, critic, gamma=0.99)
collector = LLMCollector(env, llm, frames_per_batch=100)

# Training loop
for data in collector:
    loss = loss_fn(data)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()  # clear gradients before the next batch
```

Breaking Changes

  • Some deprecated wrappers have been removed (NormalParamWrapper, AdditiveGaussianWrapper, etc.)
  • Default MLP depth has been changed
  • Default behavior of auto_unwrap has been modified

Performance Recommendations

  • Use the new VecNormV2 for improved normalization performance. It can be enabled through the new_api keyword argument of the regular VecNorm transform.
  • Leverage async environments and collectors for better throughput.
  • Consider using RayReplayBuffer for distributed training scenarios.

🙏 Acknowledgments

We would like to thank all contributors who made this release possible, especially those who contributed to the LLM API framework and the comprehensive testing and documentation improvements.


For detailed usage examples and tutorials, please refer to the TorchRL documentation and the LLM API reference.

Published by vmoens 11 months ago

torchrl - v0.8.1: Async collectors patch

Async Collector execution

The major upgrade in this release is a patch to collector.start() that allows collectors (single- or multi-process) to run asynchronously (#2935).

An example is provided in the async SAC example. #2946

Single-agent reset

Fixes #2958, where partial resets were not handled correctly when a BatchedEnv was transformed because the "done" checks were inconsistent. We now enforce that root "_reset" entries always precede their respective leaves.
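The ordering constraint can be pictured with nested keys written as tuples: a shorter key sits closer to the root, so sorting by depth places each root "_reset" before the "_reset" entries nested under it. The helper below is a hypothetical plain-Python sketch of that ordering, not the logic used inside TorchRL.

```python
from typing import Iterable, List, Tuple

Key = Tuple[str, ...]


def order_reset_keys(keys: Iterable[Key]) -> List[Key]:
    """Sort nested "_reset" keys so that roots come before their leaves."""
    return sorted(keys, key=lambda k: (len(k), k))


unordered = [
    ("agents", "agent_0", "_reset"),
    ("_reset",),
    ("agents", "_reset"),
]
ordered = order_reset_keys(unordered)
```

Processing the keys in this order lets a root-level reset be resolved before the per-agent entries that depend on it.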

Fix shifted values in GAE using LSTMs

Using an LSTM within GAE is facilitated by ensuring that shifted=True and shifted=False work properly (with appropriate warnings/errors if other hyperparameters need to be set). #2941
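For reference, generalized advantage estimation can be sketched in plain Python. The sketch assumes that "shifted" refers to reading V(s_{t+1}) from the same value sequence offset by one step, so `values` holds one more entry than `rewards`; that reading, and the function itself, are assumptions rather than the TorchRL implementation.

```python
from typing import List, Sequence


def gae(
    rewards: Sequence[float],
    values: Sequence[float],  # V(s_0) ... V(s_T): one more entry than rewards
    dones: Sequence[bool],
    gamma: float = 0.99,
    lmbda: float = 0.95,
) -> List[float]:
    """Generalized advantage estimates computed backwards in time."""
    horizon = len(rewards)
    if len(values) != horizon + 1 or len(dones) != horizon:
        raise ValueError("expected len(values) == len(rewards) + 1 == len(dones) + 1")
    advantages = [0.0] * horizon
    running = 0.0
    for t in reversed(range(horizon)):
        not_done = 0.0 if dones[t] else 1.0
        # The next-state value is the current value sequence shifted by one step.
        delta = rewards[t] + gamma * values[t + 1] * not_done - values[t]
        running = delta + gamma * lmbda * not_done * running
        advantages[t] = running
    return advantages


undiscounted = gae([1.0, 1.0], [0.0, 0.0, 0.0], [False, True], gamma=1.0, lmbda=1.0)
```

With a recurrent value network, evaluating the whole sequence once and shifting the result keeps the hidden state consistent between V(s_t) and V(s_{t+1}).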

Full Changelog: https://github.com/pytorch/rl/compare/v0.8.0...v0.8.1

Published by vmoens about 1 year ago

torchrl - v0.8.0: Async envs and better weight update API

TorchRL v0.8.0: Async envs and better weight update API

  • Async environments: #2864 introduces asynchronous environments, which can be built using different backends (currently "threading" or "multiprocessing"). Instantiating an async env is roughly the same as instantiating a parallel one:

```python
from functools import partial

from torchrl.envs import AsyncEnvPool, GymEnv

env = AsyncEnvPool(
    [partial(GymEnv, "Pendulum-v1"), partial(GymEnv, "Pendulum-v1")],
    backend="threading",
)
```

    These environments support the regular environment methods (reset, step or rollout), but their main advantage lies in their new async methods:

```python
s0 = env.rand_action(env.reset())
env.async_step_send(s0)
# receive
result = env.async_step_recv()
```

    In this example, result will contain the results of the call to step for one or two environments. The environment indices can be found in the result['env_index'] entry (the name of that key is stored in env._env_idx_key).
  • Support for environments with tensorclass attributes (#2788)
  • Distributed RayReplayBuffer (#2835)
  • Gymnasium 1.1 compatibility (#2898): we managed to make TorchRL compatible with Gymnasium 1.1 as this version lets users choose how to handle partial resets, which facilitates integration in the library.
  • VecNormV2, a new version of VecNorm that is more numerically stable and easier to handle. It can be created directly through the usual VecNorm by passing the new_api keyword argument.
  • Policy factory for collectors: you can now pass a factory for your policy instead of the policy object itself. Because the collector updates the weights of its policy when asked to, this will in most cases not cause any synchronization problem with the copy used by the training pipeline.
  • An update API for policy weights in collectors: we have isolated the weight update API in a torchrl.collectors.WeightUpdaterBase abstract class. This should be the entry point for any user wanting to implement their own weight update strategy, alleviating the need to subclass or patch the collector or the policy directly.
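The policy-factory pattern described in the list above can be sketched without any TorchRL dependency. `ToyPolicy`, `ToyCollector` and `update_policy_weights` are hypothetical stand-ins: the collector builds its own policy from a factory, and an explicit weight update keeps it in step with the copy held by the trainer.

```python
from typing import Callable, Dict


class ToyPolicy:
    """Stand-in for a policy: its parameters are a plain dict."""

    def __init__(self) -> None:
        self.weights: Dict[str, float] = {"w": 0.0}

    def load(self, weights: Dict[str, float]) -> None:
        self.weights = dict(weights)  # copy, so later trainer edits do not leak in


class ToyCollector:
    """Stand-in for a collector that owns a policy built from a factory."""

    def __init__(self, policy_factory: Callable[[], ToyPolicy]) -> None:
        self.policy = policy_factory()  # constructed inside the collector

    def update_policy_weights(self, weights: Dict[str, float]) -> None:
        self.policy.load(weights)


trainer_policy = ToyPolicy()
collector = ToyCollector(policy_factory=ToyPolicy)

trainer_policy.weights["w"] = 1.5                        # a training step
collector.update_policy_weights(trainer_policy.weights)  # explicit synchronisation
```

Passing a factory avoids shipping a live policy object to worker processes, while a dedicated update hook gives a single place to customise how weights travel.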

Packaging

We relaxed the TorchRL dependency requirements to make the library compatible with any PyTorch version. The current status is:

  • The tensordict dependency will from now on be enforced (>=0.8.1,<0.9.0 for this release).
  • For PyTorch prior to 2.7.0, backward compatibility is guaranteed to some extent (most classes should work, unless new features are used), but C++ binaries (for prioritized replay buffers) will not work.
  • For PyTorch >= 2.7.0, C++ binaries should work across versions. In other words, torchrl binaries for 0.8.0 will work with PyTorch 2.7.0, 2.8.0 etc., and the same goes for the future TorchRL 0.9.0.

A big thanks to @janeyx99 for enabling this!
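The version rules above can be expressed as two small checks. The helper names are hypothetical, and the parser assumes plain release strings such as "2.7.0" or "2.8.0+cu128"; pre-release or development versions would need a proper version parser.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse 'X.Y.Z' (optionally followed by a '+local' tag) into integers."""
    return tuple(int(part) for part in version.split("+")[0].split(".")[:3])


def cpp_binaries_expected_to_work(torch_version: str) -> bool:
    """C++ binaries are stated to work for PyTorch >= 2.7.0."""
    return parse_version(torch_version) >= (2, 7, 0)


def tensordict_in_range(tensordict_version: str) -> bool:
    """tensordict must satisfy >=0.8.1,<0.9.0 for TorchRL 0.8.0."""
    return (0, 8, 1) <= parse_version(tensordict_version) < (0, 9, 0)
```

For example, `cpp_binaries_expected_to_work("2.6.0")` is false, which corresponds to the case where prioritized replay buffers relying on the C++ extension are unavailable.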

New features

  • [Feature] Add EnvBase.all_actions (#2780) (67c3e9a4a) by @kurtamohler
  • [Feature] Add MCTSForest/Tree.to_string (#2794) (f8626690c) by @kurtamohler
  • [Feature] Add include_hash_inv arg to ChessEnv (#2766) (3be85c691) by @kurtamohler
  • [Feature] Add option for auto-resetting envs in GAE (#2851) (f5f3ae4e2) by @lin-erica, Co-authored-by: Erica Lin <elin@theaiinstitute.com>
  • [Feature] Async environments (#2864) (4f00025af) by @vmoens
  • [Feature] Capture wrong spec transforms (1/N) (#2805) (d3dca73f3) by @vmoens
  • [Feature] Collectors for async envs (#2893) (4ba50667e) by @vmoens
  • [Feature] DensifyReward postproc (#2823) (53065cf56) by @vmoens
  • [Feature] Dynamic specs for make_composite_from_td (#2829) (413571b8d) by @vmoens
  • [Feature] Enable Hash.inv (#2757) (32c4623b3) by @kurtamohler
  • [Feature] Env with tensorclass attributes (#2788) (ab76027c1) by @vmoens
  • [Feature] Gymnasium 1.1 compatibility (#2898) (78cd7550d) by @vmoens
  • [Feature] History API (#2890) (fd10fe213) by @vmoens
  • [Feature] History.default_spec (#2894) (8ce11a859) by @vmoens
  • [Feature] Local and Remote WeightUpdaters (#2848) (27d3680e7) by @vmoens
  • [Feature] Make PPO ready for text-based data (#2857) (595ddb4fa) by @vmoens
  • [Feature] MultiAction transform (#2779) (621776a21) by @vmoens
  • [Feature] NonTensor batched arg (#2816) (b97bdb5e6) by @vmoens
  • [Feature] Pass lists of policy_factory (#2888) (82f8ec26d) by @vmoens
  • [Feature] RayReplayBuffer (#2835) (50af98432) by @vmoens
  • [Feature] Set padded token log-prob to 0.0 (#2856) (b9ddfa967) by @vmoens
  • [Feature] Support lazy tensordict inputs in ppo loss (#2883) (c9caf3d9c) by @vmoens
  • [Feature] TensorDictPrimer with single default_value callable (#2732) (59e85458b) by @vmoens
  • [Feature] Timer transform (#2806) (104b88092) by @vmoens
  • [Feature] Transform for partial steps (#2777) (7c034e331) by @vmoens
  • [Feature] VecNormV2 (#2867) (40fcdb6bb) by @vmoens
  • [Feature] VecNormV2: Usage with batched envs (#2901) (b08e7ace3) by @vmoens
  • [Feature] pass policy-factory in mp data collectors (#2859) (31af2c529) by @vmoens
  • [Feature] policy factory for collectors (#2841) (49a8a42a0) by @vmoens
  • [Feature] reset_time in Timer (#2807) (5a4637969) by @vmoens
  • [Feature] transformers policy (#2825) (eea932c3c) by @vmoens

Fixes

  • [BugFix] Apply inverse transform to input of TransformedEnv._reset (#2787) (1ed5d293a) by @kurtamohler
  • [BugFix] Avoid calling reset during env init (#2770) (09e93c19c) by @vmoens
  • [BugFix] Ensure that Composite.set returns self as TensorDict does (#2784) (e084c02dd) by @vmoens
  • [BugFix] Fix .item() warning on tensors that require grad (#2885) (b66fcd4cf) by @vmoens
  • [BugFix] Fix KL penalty (#2908) (96c300322) by @vmoens
  • [BugFix] Fix MultiAction reset (#2789) (76aa9bc0c) by @kurtamohler
  • [BugFix] Fix PEnv device copies (#2840) (6e40548ab) by @vmoens
  • [BugFix] Fix batch_locked check in check_env_specs + error message callable (#2817) (9c98b82c3) by @vmoens
  • [BugFix] Fix calls to _reset_env_preprocess (#2798) (ea76ffb62) by @vmoens
  • [BugFix] Fix collector timeouts (#2774) (f6084b657) by @vmoens
  • [BugFix] Fix collector with no buffers and devices (#2809) (d4f88460a) by @vmoens
  • [BugFix] Fix compile compatibility of PPO losses (#2889) (9bc85f4f0) by @vmoens
  • [BugFix] Fix composite setitem (#2778) (c2a149d66) by @vmoens
  • [BugFix] Fix env.full_done_spec~s~ (#2815) (f5c0666c2) by @vmoens
  • [BugFix] Fix forced batch-size in _skiptensordict (#2808) (3acf49101) by @vmoens
  • [BugFix] Fix gc import (#2862) (a183f02a5) by @vmoens
  • [BugFix] Fix lazy-stack in RBs (#2880) (e80732ede) by @vmoens
  • [BugFix] Fix property getter in RayReplayBuffer (#2869) (04d70c1e4) by @vmoens
  • [BugFix] Fix slow and flaky non-tensor parallel env test (#2926) by @vmoens
  • [BugFix] Fix update shape mismatch in skiptensordict (#2792) (3e42e7a2f) by @vmoens
  • [BugFix] Fixed VideoRecorder crash when passing fps (#2827) (5ec9bc56e) by Alexandre Brown
  • [BugFix] GAE warning when gamma/lmbda are tensors (#2838) (d56111599) by @louisfaury, Co-authored-by: Louis Faury <louis.faury@helsing.ai>
  • [BugFix] Keep original class in LazyStackStorage through lazy_stack (#2873) (70f5c0649) by @vmoens
  • [BugFix] NonTensor should not convert anything to numpy (#2771) (3da27506c) by @vmoens
  • [BugFix] PPOs with composite distribution (#2791) (edfa25d96) by @louisfaury, Co-authored-by: Louis Faury <louis.faury@helsing.ai>
  • [BugFix] Refactor _skiptensordict to avoid update calls (#2802) (e0d3eee3b) by @vmoens
  • [BugFix] Remove neg dim checks in expand for all specs (#2906) (c5afe3c90) by @vmoens
  • [BugFix] Right log-prob size in transformer wrapper (#2854) (f81deacd1) by @vmoens
  • [BugFix] Test and fix life cycle of env with dynamic non-tensor spec (#2812) (b538c66c5) by @vmoens
  • [BugFix] Tree make node fix (#2839) (ba8be9c44) by Rolo
  • [BugFix] Use brackets to get non-tensor data in gym envs (#2769) (84f6b0483) by @vmoens
  • [BugFix] correct dim for resolving dtype in split_and_pad_sequence (#2801) (21c4d87c7) by KubaMichalczyk, Co-authored-by: Jakub Michalczyk <jakub.michalczyk@sportec-solutions.com>

Refactors

  • [Refactor] Avoid padding in transformer wrapper (#2881) (9c4c086a5) by @vmoens
  • [Refactor] Fix repeats order (#2887) (93ba865c2) by @vmoens
  • [Refactor] MaskedCategorical cross_entropy usage for faster loss (#2882) (3e1f4ff1c) by @vmoens
  • [Refactor] Refactor the weight update logic (#2914) (0da904425) by @vmoens
  • [Refactor] Rename weight updaters (#2892) (efe938956) by @vmoens
  • [Refactor] TransformersWrapper class (#2871) (5d7256122) by @vmoens
  • [Refactor] VecNormV2: update before norm, bias_correction at the right time (#2900) (c3310b87f) by @vmoens

Miscellaneous

  • [BE] Ensure abstractmethods are implemented for specs (#2790) (bd78913fe) by @vmoens
  • [BE] Fix some typos (#2811) (0ae140568) by @antoinebrl
  • [BE] Make better logits in cost tests (#2775) (42ed42c71) by @vmoens
  • [BE] Remove deprec specs from tests (#2767) (27a8ecc29) by @vmoens
  • [BE] _set_seed returns None + type annotations (#2903) (3a9f244de) by @antoinebrl
  • [CI] Fix envnames in SOTA tests (#2921) by @vmoens
  • [CI] Fix libs workflows (#2800) (8dd1be7c3) by @vmoens
  • [CI] Fix nightly and benchmark CIs (#2930) by @vmoens
  • [CI] Fix old deps (#2916) (4162db690) by @vmoens
  • [CI] Fix wheels (#2876) (e1d3fd488) by @vmoens
  • [CI] Upgrade to cuda 12.8 (#2820) (8c9dc050f) by @vmoens
  • [CI] egl for all (#2915) (425952b96) by @vmoens
  • [Deprecation] Enact deprecations (#2917) (b247526a1) by @vmoens
  • [Deprecation] Softly change default behavior of auto_unwrap (#2793) (2046bc536) by @vmoens
  • [Doc] Add docstring for MCTSForest.extend (#2795) (a3a1ebefe) by @kurtamohler
  • [Doc] Better doc for Transform class (#2797) (dd59290d9) by @vmoens
  • [Doc] Fix (and deactivate) tutorials (#2785) (f1c42e083) by @vmoens
  • [Doc] Fix Doc (#2919) by @vmoens
  • [Doc] Fix EnvCreator's doc (#2868) (586a5413d) by @louisfaury, Co-authored-by: Louis Faury <louis.faury@helsing.ai>
  • [Doc] Fix doc CI (#2932) by @vmoens
  • [Doc] Fix formatting errors (#2786) (03d658630) by @vmoens
  • [Doc] Fix tutorials (#2768) (75f113ff5) by @vmoens
  • [Doc] Fix tutos (#2772) (b27ee6d8d) by @vmoens
  • [Doc] Solve ref issues in docstrings (#2776) (f5445a4bd) by @vmoens
  • [Doc] Update discrete.py (#2850) (619fec69c) by oswald
  • [Docs] Fix doc setup (#2922) by @vmoens
  • [Environment] Fix lib CI failures (#2923) by @vmoens
  • [Environment] Fix lib CI failures (#2929) by @vmoens
  • [Lint] pyupgrade (#2819) (40b147e81) by @vmoens
  • [Minor] Quick edits to .md files (#2931) by @vmoens
  • [Performance] Memoize calls to encode and related methods within step (#2907) (0475cbf64) by @vmoens
  • [Performance] Use TensorDict._new_unsafe in step (#2905) (e5cba04df) by @vmoens
  • [Quality] Better device checks (#2909) (382430db3) by @vmoens
  • [Quality] Limit warning filter to torchrl (#2762) (85d1e70d3) by @antoinebrl
  • [Quality] Remove redundant return (#2925) (21ef7253c) by @b10902118
  • [Setup] Fix no_python_abi_suffix error (#2863) (7df831753) by @vmoens
  • [Setup] Remove distutils imports (#2836) (4c55b65b9) by @antoinebrl
  • [Test] Capture deprec warnings (#2799) (fb641de13) by @vmoens
  • [Test] Fix warnings in tests (#2886) (6f634c6fb) by @vmoens
  • [Versioning] Bump 0.8.0 (#2920) by @vmoens

A big thanks to the community supporting this project! There would be no TorchRL if it wasn't for its users.

- Python
Published by vmoens about 1 year ago

torchrl - 0.7.2: ParallelEnv fix

We are releasing TorchRL 0.7.2, a minor update that addresses several important bug fixes to improve the stability and reliability of our library.

This release is particularly crucial as it resolves a critical issue (#2840) where, under certain conditions, the device setting of the parallel environment would prevent the tensors in the buffers from being properly cloned. This resulted in rollouts returning the same tensor instances across steps, potentially leading to incorrect behavior and results.
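The failure mode described above can be reproduced with a toy, library-free sketch. It does not use TorchRL or the actual ParallelEnv buffers: the `rollout` function and the `clone_buffer` flag are hypothetical names chosen for illustration. It only shows why returning an un-cloned, reused buffer makes every step of a rollout point at the same object.

```python
import copy


def rollout(num_steps, clone_buffer):
    """Collect per-step outputs from a single preallocated buffer (toy model)."""
    buffer = {"observation": [0.0]}  # reused by every step
    steps = []
    for t in range(num_steps):
        buffer["observation"][0] = float(t)  # in-place write into the shared buffer
        # Without a copy, every entry in `steps` is the very same object.
        steps.append(copy.deepcopy(buffer) if clone_buffer else buffer)
    return steps


aliased = rollout(3, clone_buffer=False)
cloned = rollout(3, clone_buffer=True)

print([s["observation"][0] for s in aliased])  # [2.0, 2.0, 2.0]
print([s["observation"][0] for s in cloned])   # [0.0, 1.0, 2.0]
```

In the aliased case all steps report the last value written, which is the kind of silent corruption that makes this fix important.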

Due to the severity of this bug, we strongly recommend that all users upgrade to TorchRL 0.7.2 to ensure the accuracy and reliability of their experiments.

The full list of changes can be found below:

  • [Doc] Fix formatting errors by @vmoens (#2786)
  • [BugFix] correct dim for resolving dtype in split_and_pad_sequence by @KubaMichalczyk and @vmoens (#2801)
  • [BugFix] Fix collector with no buffers and devices by @vmoens (#2809)
  • [BE] Fix some typos by @antoinebrl and @vmoens (#2811)
  • [Doc] Add docstring for MCTSForest.extend by @kurtamohler and @vmoens (#2795)
  • [CI] Fix libs workflows by @vmoens (#2800)
  • [BugFix] Fix env.full_done_spec by @vmoens (#2815)
  • [BugFix] Fix batch_locked check in check_env_specs + error message ca… by @vmoens (#2817)
  • [BugFix] GAE warning when gamma/lmbda are tensors by @louisfaury and @vmoens (#2838)
  • [BugFix] Tree make node fix by rolo and @vmoens (#2839)
  • [BugFix] Fix PEnv device copies by @vmoens (#2840)

Full Changelog: https://github.com/pytorch/rl/compare/v0.7.1...v0.7.2

- Python
Published by vmoens about 1 year ago

torchrl - 0.7.1: Bug fixes and documentation improvements

We are pleased to announce the release of torchrl v0.7.1, which includes several bug fixes, documentation updates, and backend improvements.

Bug Fixes

  • Fixed collector timeouts (#2774)
  • Fixed composite setitem (#2778)
  • Ensured that Composite.set returns self as TensorDict does (#2784)
  • Fixed PPOs with composite distribution (#2791)
  • Used brackets to get non-tensor data in gym envs (#2769)
  • Avoided calling reset during env init (#2770)
  • NonTensor should not convert anything to numpy (#2771)

Documentation Updates

  • Fixed tutorials (#2768)
  • Solved ref issues in docstrings (#2776)
  • Fixed formatting errors (#2786)

Backend Improvements

  • Made better logits in cost tests (#2775)
  • Ensured abstractmethods are implemented for specs (#2790)
  • Removed deprec specs from tests (#2767)

Thank you to @antoinebrl and @louisfaury for contributing to this release!

Full Changelog: https://github.com/pytorch/rl/compare/v0.7.0...v0.7.1

- Python
Published by vmoens over 1 year ago

torchrl - 0.7.0: Compile compatibility, Chess and better multi-head policies

As always, we want to warmly thank the RL community that supports this project. A special thanks to our first-time contributors:

  • @priba made their first contribution in https://github.com/pytorch/rl/pull/2543
  • @carschandler made their first contribution in https://github.com/pytorch/rl/pull/2545
  • @4d616e61 made their first contribution in https://github.com/pytorch/rl/pull/2624
  • @valterschutz made their first contribution in https://github.com/pytorch/rl/pull/2626
  • @raresdan made their first contribution in https://github.com/pytorch/rl/pull/2616
  • @oslumbers made their first contribution in https://github.com/pytorch/rl/pull/2609
  • @codingWhale13 made their first contribution in https://github.com/pytorch/rl/pull/2682

as well as all the users who opened issues, made suggestions, or started discussions here, on Discord, on the PyTorch forum, or elsewhere! We value your feedback!

BC-Breaking changes and Deprecated behaviors

Removed classes

As announced, we removed the following classes:

  • AdditiveGaussianWrapper
  • InPlaceSampler
  • NormalParamWrapper
  • OrnsteinUhlenbeckProcessWrapper

Default MLP config

The default MLP depth has changed from 3 to 0 (i.e., MLP(in_features=3, out_features=4) is now equivalent to a regular nn.Linear layer).
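As a rough illustration of what this change means, the sketch below computes the linear-layer shapes for a given depth. It assumes that depth counts hidden layers; `mlp_layer_shapes` is a hypothetical helper, not a TorchRL function, and the width of 32 is an arbitrary value chosen for the example.

```python
def mlp_layer_shapes(in_features, out_features, depth, num_cells):
    """Return the (in, out) shape of each linear layer for `depth` hidden layers."""
    # Assumption: `depth` is the number of hidden layers of width `num_cells`.
    widths = [in_features] + [num_cells] * depth + [out_features]
    return list(zip(widths[:-1], widths[1:]))


# New default (depth=0): a single layer, i.e. the shape of a plain linear map.
print(mlp_layer_shapes(3, 4, depth=0, num_cells=32))  # [(3, 4)]

# Old default (depth=3): three hidden layers plus the output layer.
print(mlp_layer_shapes(3, 4, depth=3, num_cells=32))
# [(3, 32), (32, 32), (32, 32), (32, 4)]
```

Scripts that relied on the old default need to pass the depth explicitly to keep the same architecture.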

Locking envs

Environment specs are now carefully locked by default (#2729, #2730). This means that env.observation_spec = spec is allowed (specs will be unlocked/re-locked automatically) but env.observation_spec["value"] = spec will not work. The core idea is that we want to cache as much information as we can, such as action keys or whether the env has dynamic specs. We can only do that if we can guarantee that the env has not been modified, and locking the specs gives us that guarantee. Note that a version of this already existed, but it was not as robust as the new one.
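The locking behavior can be pictured with a small toy model. This is not TorchRL's implementation: `ToySpec`, `ToyEnv`, and `LockedSpecError` are hypothetical classes written only to show why replacing the whole spec can safely invalidate a cache while in-place edits must be rejected.

```python
class LockedSpecError(RuntimeError):
    """Raised when a locked spec is modified in place (toy error type)."""


class ToySpec:
    def __init__(self, entries):
        self._entries = dict(entries)
        self.locked = False

    def __setitem__(self, key, value):
        if self.locked:
            raise LockedSpecError(f"cannot set {key!r} on a locked spec")
        self._entries[key] = value

    def keys(self):
        return sorted(self._entries)


class ToyEnv:
    def __init__(self, observation_spec):
        self._cache = {}
        self.observation_spec = observation_spec  # goes through the setter

    @property
    def observation_spec(self):
        return self._observation_spec

    @observation_spec.setter
    def observation_spec(self, spec):
        # Whole-spec replacement: drop cached info, store the spec, lock it.
        self._cache.clear()
        self._observation_spec = spec
        spec.locked = True

    @property
    def observation_keys(self):
        # Safe to cache because the locked spec cannot change underneath us.
        if "keys" not in self._cache:
            self._cache["keys"] = self._observation_spec.keys()
        return self._cache["keys"]


env = ToyEnv(ToySpec({"pixels": (3, 64, 64)}))
keys_before = env.observation_keys

# Allowed: assigning a whole new spec refreshes the cache.
env.observation_spec = ToySpec({"pixels": (3, 64, 64), "value": (1,)})

# Not allowed: editing one entry of the locked spec in place.
try:
    env.observation_spec["value"] = (2,)
    in_place_rejected = False
except LockedSpecError:
    in_place_rejected = True

print(keys_before, env.observation_keys, in_place_rejected)
```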

Changes to composite distributions

TL;DR: We're changing the way log-probs and entropies are collected and written in ProbabilisticTensorDictModule and in CompositeDistribution. The "sample_log_prob" default key will soon be "<value>_log_prob" (or ("path", "to", "<value>_log_prob") for nested keys). For CompositeDistribution, a different log-prob will be written for each leaf tensor in the distribution. This new behavior is controlled by the tensordict.nn.set_composite_lp_aggregate(mode: bool) function or by the COMPOSITE_LP_AGGREGATE environment variable. We strongly encourage users to adopt the new behavior by setting tensordict.nn.set_composite_lp_aggregate(False).set() at the beginning of their training script.
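To make the naming rule concrete, here is a minimal sketch of how a sample key maps to its log-prob key under the new convention. `log_prob_key` is a hypothetical helper written for this note, not part of tensordict or TorchRL; it only encodes the pattern stated above.

```python
def log_prob_key(sample_key):
    """Map a sample key to its per-leaf log-prob key (naming convention only)."""
    if isinstance(sample_key, str):
        return f"{sample_key}_log_prob"
    # Nested keys are tuples: only the last element gets the suffix.
    *parents, leaf = sample_key
    return (*parents, f"{leaf}_log_prob")


print(log_prob_key("action"))              # action_log_prob
print(log_prob_key(("agents", "action")))  # ('agents', 'action_log_prob')
```

With aggregation disabled, a composite distribution would write one such entry per leaf rather than a single "sample_log_prob" entry.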

The behavior of CompositeDistribution and its interaction with on-policy losses such as PPO has changed. The PPO documentation now includes a section about multi-head policies and the examples also give such information.

See the tensordict v0.7.0 release notes or #2707 to know more.

  • [Deprecation] Change the default MLP depth (#2746) (12e6bce60) by @vmoens ghstack-source-id: bd34b8e9112c4fc3a30bd095e3ac073a7d0b5469
  • [Deprecation] Gracing old *Spec with v0.8 versioning (#2751) (fa697fe59) by @vmoens ghstack-source-id: e7c6e0a4b8520da887fe7e602a351c3c72a08c4c
  • [Deprecation] Remove AdditiveGaussianWrapper (#2748) (6c7f4fbda) by @vmoens ghstack-source-id: 78f248e1239a04fc5213aa4418a158f741679593
  • [Deprecation] Remove InPlaceSampler (#2750) (0feef11f9) by @vmoens ghstack-source-id: eeae1bf0611a5d293f533767eee7b9700e720cc8
  • [Deprecation] Remove NormalParamWrapper (#2747) (a38604e47) by @vmoens ghstack-source-id: 4a70178f54f9e25d602c86a0b61248d66f3e39bd
  • [Deprecation] Remove OrnsteinUhlenbeckProcessWrapper (#2749) (0111a8795) by @vmoens ghstack-source-id: 401fdfaca2e27122d5a67fc7177e1015047f0098

New features

Compile compatibility

We focused on better compatibility with torch.compile across the SOTA training scripts, which now all accept a compile=1 argument. The overall speedups range from 1x to 4x.

Loss module speedups are displayed in the README.md page.

Replay buffers are now also mostly compatible with compile (with the notable exception of distributed and memory-mapped ones).

Specs: auto_spec_, <attr>_spec_unbatched

You can now use env.auto_spec_ to set the specs automatically based on a dummy rollout.

For batched environments, the unbatched spec can now be accessed via env.<attr>_spec_unbatched. This is useful to create random policies, for example.

New transforms

We added TrajCounter (#2532), Hash and Tokenizer (#2648, #2700) and LineariseRewards (#2681).

LazyStackStorage

We provide a new ListStorage-based storage (LazyStackStorage) that automatically represents samples as a LazyStackedTensorDict, which makes it easy to store ragged tensors, although not contiguously in memory (#2723).

ChessEnv

A new torchrl.envs.ChessEnv allows users to train agents to play chess!

Tutorials on exporting torchrl modules

We also open-sourced a tutorial on exporting TorchRL modules to hardware (#2557).

Full list of features

[Feature, Test] Adding tests for envs that have no specs (#2621) (c72583f75) by @vmoens ghstack-source-id: 4c75691baa1e70f417e518df15c4208cff189950 [Feature,Refactor] Chess improvements: fen, pgn, pixels, san, action mask (#2702) (d425777b8) by @vmoens ghstack-source-id: f294a2bc99a17911c9b62558d530b148d3c0350f [Feature] A2C compatibility with compile (#2464) (507766a88) by @vmoens ghstack-source-id: 66a7f0d1dd82d6463d61c1671e8e0a14ac9a55e7 [Feature] ActionDiscretizer custom sampling (#2609) (3da76f006) @oslumbers Co-authored-by: Oliver Slumbers oliver.slumbers@helsing.ai [Feature] Add Hash transform (#2648) (50011dcf1) @kurtamohler ghstack-source-id: dccf63fe4f9d5f76947ddb7d5dedcff87ff8cdc5 [Feature] Add Choice spec (#2713) (9368ca68e) @kurtamohler ghstack-source-id: afa315a311845ab39ade3e75046f32757f9d94f1 [Feature] Add LossModule.reset_parameters_recursive (#2546) (218d5bf70) by @kurtamohler [Feature] Add Stack transform (#2567) (594462d6b) by @kurtamohler [Feature] Add deterministicsample to masked categorical (#2708) (49d9897af) by @vmoens ghstack-source-id: d34fcf9b44d7a7c60dbde80b0835189f990ef226 [Feature] Adds ordinal distributions (#2520) (c851e1698) by @louisfaury Co-authored-by: @louisfaury [Feature] Avoid some recompiles of ReplayBuffer.extend/sample (#2504) (0f29c7e93) @kurtamohler [Feature] CQL compatibility with compile (#2553) (e2be42e82) by @vmoens ghstack-source-id: d362d6c17faa0eb609009bce004bb4766e345d5e [Feature] CROSSQ compatibility with compile (#2554) (01a421e76) by @vmoens ghstack-source-id: 98a2b30e8f6a1b0bc583a9f3c51adc2634eb8028 [Feature] CatFrames.makerbtransformandsampler (#2643) (9ee1ae7ee) by @vmoens ghstack-source-id: 7ecf952ec9f102a831aefdba533027ff8c4c29cc [Feature] ChessEnv (#2641) (17983d43e) by @vmoens ghstack-source-id: 087c3b12cd621ea11a252b34c4896133697bce1a [Feature] Composite.batchsize (#2597) (2e82cab19) by @vmoens ghstack-source-id: 621884a559a71e80a4be36c7ba984fd08be47952 [Feature] Composite.pop (#2598) (8d16c12bd) by 
@vmoens ghstack-source-id: 64d5bd736657ef56e37d57726dfcfd25b16b699f [Feature] Composite.separates (#2599) (83e0b0568) by @vmoens ghstack-source-id: fbfc4308a81cd96ecc61723df8c0eb870c442def [Feature] Custom conversion tool for gym specs (#2726) (dbc8e2ee0) by @vmoens ghstack-source-id: d38bb02f15267a9b1637b3ed25fb44ef013e2456 [Feature] DDPG compatibility with compile (#2555) (7d7cd9538) by @vmoens ghstack-source-id: f18928a419f81794d6870fd4e9fe1205c1b137e1 [Feature] DQN compatibility with compile (#2571) (f149811da) by @vmoens ghstack-source-id: 113dc8c4a5562d217ed867ace1942b2f6b8a39f9 [Feature] DT compatibility with compile (#2556) (fbfe10488) by @vmoens ghstack-source-id: 362b6e88bad4397f35036391729e58f4f7e4a25d [Feature] Discrete SAC compatibility with compile (#2569) (9e2d214fa) by @vmoens ghstack-source-id: ddc131acedbbe451b28758e757a8c240ebd72b80 [Feature] Ensure out-place policy compatibility in rollout and collectors (#2717) (ec370c6b6) by @vmoens ghstack-source-id: 41a6aa56e0a045a20224b96f9537a7ae3ae14494 [Feature] EnvBase.autospecs (#2601) (d537dcb63) by @vmoens ghstack-source-id: 329679238c5172d7ff13097ceaa189479d4f4145 [Feature] EnvBase.checkenvspecs (#2600) (00d3199ec) by @vmoens ghstack-source-id: 332dbf92db496c71c5ce6aba340ad123eac0f5d6 [Feature] GAIL compatibility with compile (#2573) (6482766b8) by @vmoens ghstack-source-id: 98c7602ec0343d7a83cb19bddeb579484c42e77e [Feature] IQL compatibility with compile (#2649) (2cfc2abd6) by @vmoens ghstack-source-id: 77bca166701d28dd69ef3964f55ab4f3e4b17fed [Feature] LLMHashingEnv (#2635) (30d21e599) by @vmoens ghstack-source-id: d1a20ecd023008683cf18cf9e694340cfdbdac8a [Feature] LazyStackStorage (#2723) (fe3f00c6c) by @vmoens ghstack-source-id: e9c031470aa0bdafbb2b26c73c06b25685a128e5 [Feature] Linearise reward transform (#2681) (ff1ff7e9c) by @louisfaury Co-authored-by: @louisfaury [Feature] Log each entropy for composite distributions in PPO (#2707) (319bb68f0) by @louisfaury Co-authored-by: @louisfaury 
[Feature] Log pbar rate in SOTA implementations (#2662) (1ce25f19a) by @vmoens ghstack-source-id: 283cc1bb4ad2d60281296d2cfb78ec41c77f4129 [Feature] MCTSForest (#2307) (e9d167711) by @vmoens ghstack-source-id: 9ac5cd3de39a4dbe1c7c33cb71ff6f45a886ae65 [Feature] Make PPO compatible with composite actions and log-probs (#2665) (256a7002c) by @vmoens ghstack-source-id: c41718e697f9b6edda17d4ddb5bd6d41402b7c30 [Feature] PPO compatibility with compile (#2652) (f5a187d7d) by @vmoens ghstack-source-id: 0ed29f352fcd85f0dc0683d90e95bdbecf6c14f9 [Feature] Re-enable cache for specs (#2730) (4262ab91e) by @vmoens ghstack-source-id: 797132312bfd9749f8926a2dd0b03eff65b8f51c [Feature] SAC compatibility with compile (#2655) (87a59fb30) by @vmoens ghstack-source-id: b57caeaf6e2d3690fb3311f4c9b8cca8575d3974 [Feature] Send info dict to the storage device in RBs (#2527) (d524d0d6b) by @vmoens ghstack-source-id: 4ed60d649b17f96b49f90d234e679937c60a3c32 [Feature] TD3 compatibility with compile (#2658) (1b7eda199) by @vmoens ghstack-source-id: fb94307557f2b8604403b48211e3da6fb2139e28 [Feature] TD3-bc compatibility with compile (#2657) (91064bc27) by @vmoens ghstack-source-id: 8a33e39829f620c1e1a579a0255162ba93eaca91 [Feature] TensorSpec.enumerate() (#2354) (14b63e4f0) by @vmoens ghstack-source-id: 9db2f5ee47a197eb0403cb4622266fb03b99360f [Feature] TrajCounter transform (#2532) (05aeb8975) by @vmoens ghstack-source-id: 62a3091e5c9072f26266143319f30de1729c0d4e [Feature] UnaryTransform for input entries (#2700) (093a1599f) by @vmoens ghstack-source-id: bb0ea97f47bdad6ba5e73692969fece4e2efbfb4 [Feature] example_data for NonTensor spec (#2698) (80690d221) by @vmoens ghstack-source-id: 6fe5d82763dfcc9044d6debe88f0f34bb739c987 [Feature] automatically determine returncontiguous (#2724) (cac93eb0e) by @vmoens ghstack-source-id: 6d1fc31d87cb021e6286cdb07db2d9b0e2302f7d [Feature] env.stepmdp (#2636) (4bc40a808) by @vmoens ghstack-source-id: 145e37cd772fdd74e35e5ffe6accc5c81ad689f3 [Feature] flexible 
batchlocked for jumanji (#2382) (35a78139b) by @vmoens ghstack-source-id: e356b6511ff3da8a6c583747214cfa90f42c9083 [Feature] lock / unlock_ graphs (#2729) (601483e71) by @vmoens ghstack-source-id: 01e375e636b97b26a89f9bbab2e955db6c85978a [Feature] multiagent data standardization: PPO advantages (#2677) (b7a0d11e5) by @matteobettini Co-authored-by: Vincent Moens vmoens@meta.com [Feature] nocudasync arg in collectors (#2727) (280297aee) by @vmoens ghstack-source-id: 9baba31b3ee844882fd4b6a6f69874946caf3b3e [Feature] singlespec (#2549) (58c384713) by @vmoens ghstack-source-id: 27e247ea1775e455999a114dd6d95fac748376c4 [Feature] spec.cardinality (#2638) (dd26ae79f) by @vmoens ghstack-source-id: 1160900f8a81dd51dc72436e1af69c8248bff162 [Feature] spec.isempty(recurse) (#2596) (097d8ad98) by @vmoens ghstack-source-id: faa3b1df5133c77462d6dd013d3854d684cc7e94 [Feature] timeit.printevery (#2653) (187de7c8b) by @vmoens ghstack-source-id: 19165bbfbea5cdc0a6b159493fb02571bab872f3 [Minor,Feature] Add prefix arg to timeit.todict (#2576) (7bc84d15d) by @vmoens ghstack-source-id: f1ff685caf6e8950d02dfc44ad2c1eb496495ad1 [Minor,Feature] `groupoptimizers` (#2577) (7829bd3f3) by @vmoens ghstack-source-id: 81a94ed641544a420bb1c455921ca6a17ecd6a22

Doc

  • [Doc] Add AOTInductor back (#2564) (9f8f77cdb) by @vmoens ghstack-source-id: 774eb5973045861f284fdc67f74945b1eecdeaf2
  • [Doc] Add Tokenizer and auto-reset doc link (#2754) (ee4006a6b) by @vmoens ghstack-source-id: 90f55b568e85ae151bea4370025144c19e74602b
  • [Doc] Add Stack transform link in docs (#2689) (c5f1565de) by @kurtamohler
  • [Doc] Adding recurrent policies to export tutorial (#2559) (705123870) by @vmoens ghstack-source-id: 1f1af399b120db8bbb1789748641f44fd3b1bd5e
  • [Doc] Better doc for SliceSampler (#2607) (90572ac11) by @vmoens ghstack-source-id: 7d79ef7d37c4dc2ffbdff5b422cf5da24d93c0da
  • [Doc] Fix broken links and formatting issues in doc (#2574) (5a2d9e205) by @vmoens ghstack-source-id: 4e3f84fe436de6a6e9696894cd06318a98e4a23b
  • [Doc] Fix modules doc (#2531) (edbf3dee3) by @vmoens
  • [Doc] Fix tutorials (#2560) (2f3b4cd4d) by @vmoens ghstack-source-id: 6c9114384015e76e96b3bbd0c8893cc42344537a
  • [Doc] Fix typo in torchrl/modules/distributions/continuous.py (#2624) (b2e9f291a) by @Mana
  • [Doc] Fix typos (#2682) (f672c708f) by Nils Kiele Co-authored-by: Vincent Moens vincentmoens@gmail.com
  • [Doc] MADDPG bug fix of buffer device and improve explaination (#2519) (3e4b2928e) by @matteobettini
  • [Doc] Minor fixes to the docs and type hints (#2548) (50a35f69b) by @thomasbbrunner
  • [Doc] Tutorial on exporting TorchRL models (#2557) (c0187a93e) by @vmoens ghstack-source-id: b93146e22d8376563e7ac302b5cff95f09ae50d4
  • [Doc] Typo in docs for actors.py (#2545) (19dbeebf0) by @carschandler
  • [Doc] Update docstring for TruncatedNormal with correct parameter names (#2625) (d22266d05) by @valterschutz Co-authored-by: Valter Schutz valterschutz@proton.me
  • [Doc] actor docstrings (#2626) (825779935) by @valterschutz Co-authored-by: Valter Schutz valterschutz@proton.me
  • [Doc] fix several typos (#2603) (de153bf45) by @carschandler
  • [Doc] torchrl_demo.py revamp (#2561) (304e707ef) by @vmoens ghstack-source-id: 2f0087850e4a7d4d4393f0662156af9bfca8e3e1
  • [Example] Efficient Trajectory Sampling with CompletedTrajRepertoire (#2642) (b840a772c) by @vmoens ghstack-source-id: 4d5c587c69230aa8f3a1b9b6fe19f52fa683d703
  • [Example] RNN-based policy example (#2675) (d009835b4) by @vmoens ghstack-source-id: ef0087e9b5cba40be428f57ef70ecd2f63483d03
  • [Example] Using Collector's device args (#2705) (539c2158d) by @vmoens ghstack-source-id: 9aec8daa53000bdfd6091be706c7bc46778d5983

Performance

  • [Performance] Accelerate slice sampler on GPU (#2672) (84c3ec322) by @vmoens ghstack-source-id: a4dc1515d8b51f5ec150b2fae4e1a84254f2af09
  • [Performance] Avoid cloning trajs in SliceSampler (#2671) (4fd54fef4) by @vmoens ghstack-source-id: 2e133fcea716b202694cfa84df3f6e4ba3507bbc
  • [Performance] Improve performance of compiled ReplayBuffer (#2529) (2a07f4c0f) by @kurtamohler
  • [Benchmark] Add benchmark for compiled ReplayBuffer.extend/sample (#2514) (5e03a5518) by @kurtamohler ghstack-source-id: d4562697e2c1a8392cf5bdcadb50f8b7b6939e41

Better engineering

  • [BE] Add trailing spaces when necessary (#2581) (600760f5b) by @vmoens ghstack-source-id: 198b5b5668cce8336d44206c10dacb8a9b1a9785
  • [BE] Add type annotation for tensor_keys to facilitate auto-complete (#2696) (4b3279a3f) by @vmoens ghstack-source-id: b4a8fe38e7c6b028759eef082f65f26036bc0250
  • [Refactor,CI] Refactor SOTA tests (#2583) (c0ba3ff54) by @vmoens ghstack-source-id: b14c59bb1ca7bf056bde05fa0abd01fa7e9b3710
  • [Refactor] Allow safe-tanh for torch >= 2.6.0 (#2580) (1474f8517) by @vmoens ghstack-source-id: 92df1954451453ee051bbde499f6db5ebaafed49
  • [Refactor] Deprecate recurrent_mode API to use decorators/CMs instead (#2584) (14b277513) by @vmoens ghstack-source-id: 80f705e022abc111df3960fc09576d5e266ed4dd
  • [Refactor] Refactor trees (#2634) (57dc25a44) by @vmoens ghstack-source-id: 368ba4c4402b6db0bc8b0688802ce161db9776b7
  • [Refactor] Rename Recorder and LogReward (#2616) (607ebc52d) by Goia Rares Dan Tiago
  • [Refactor] Use unbatched in VMAS (#2593) (a126a6f94) by @vmoens ghstack-source-id: 2190278de44ba59a3bc8d38398fddae9ecc42a84
  • [Refactor] Use default device instead of CPU in losses (#2687) (c3b9d1dc7) by @vmoens ghstack-source-id: 8b98062c3ae88d8780ef7428fdfa07e305c790b9
  • [Refactor] compile compatibility improvements (#2578) (db7f08d76) by @vmoens ghstack-source-id: 95f8241b56e42b80e828485cb5f377288bff6f5e
  • [Quality,BE] Better doc for step_mdp (#2639) (ef5a37d8a) by @vmoens ghstack-source-id: 1f5aed6fb2e97ead9d379f9545ae742f7728c585
  • [Quality] Better TD construction in codebase (#2565) (a4c1ee3b3) by @vmoens ghstack-source-id: 9e280d9d7d4a735e5055beb0450d933547530e55
  • [Quality] Better warning when c++ binaries failed to be imported (#2541) (0a13cbd5e) by @vmoens
  • [Quality] IMPALA auto-device (#2654) (526b38d5c) by @vmoens ghstack-source-id: abbb3048f33c9f7f6a623e32e139871093ea74fa
  • [Minor] Fix doc and MARL tests (#2759) (ad7d2a10b) by @vmoens ghstack-source-id: 9308be3ebc7fac30b5bde321792eb97069d55996
  • [Minor] Fix fbcode imports of mocking classes (#2526) (da0bf1897) by @vmoens ghstack-source-id: 74f9f3bedf8f48988a1956084548f6cd2f720934
  • [Minor] Make fbcode happy with imports (#2517) (a70b258cd) by @vmoens ghstack-source-id: d4bfce9d51269bc0ab6154ee4c2d1e1ff7af0895

Bug fixes

[BugFix, BE] Document and fix fps passing in recorder and loggers (#2694) (61e05b3d9) by @vmoens ghstack-source-id: b3996a9a27643eb5da8a78135f6b9fcef3685f17 [BugFix,Doc] Fix BATCHEDPIPETIMEOUT refs and doc (#2695) (dc25a55a7) by @vmoens ghstack-source-id: 6e43c4ff1c319545cf0952abf6f35f3e7ed473e0 [BugFix,Doc] Revert dynamic shape in export tutorial (#2563) (9d292a007) by @vmoens ghstack-source-id: fc856218e840469a5bb0143241d100e9cc612538 [BugFix,Test,Benchmark] Fix graph breaks induced by device context manager (#2602) (152bc81b7) by @vmoens ghstack-source-id: 0df2728928280a43de4abd30afed20826b0af091 [BugFix,Test] test chess rendering (#2721) (ddbb6fdd5) by @vmoens ghstack-source-id: 59b37e6fa2f8c11f600eea334da0bd8257ed382c [BugFix] Account for composite actions in gym (#2718) (1246db197) by @vmoens ghstack-source-id: c09b59904a89d45fa24a61a5e8a24fe307320794 [BugFix] Account for terminating data in SAC losses (#2606) (c8676f4a8) by @vmoens ghstack-source-id: dc1870292786c262b4ab6a221b3afb551e0efb9b [BugFix] ActionDiscretizer scalar integration (#2619) (830f2f26c) by @vmoens ghstack-source-id: b22102f3730914b125ef0f813f4d2f22dec0b26e [BugFix] Allow expanding TensorDictPrimer transforms shape with parent batch size (#2552) (83a7a57da) by Albert Bou Co-authored-by: Vincent Moens vmoens@meta.com [BugFix] Avoid KeyError in slice sampler (for compile) (#2670) (21eeca42c) by @vmoens ghstack-source-id: 6e2a3036f0e50d365387cced50a761b97a47317d [BugFix] Better account of composite distributions in PPO (#2622) (90c8e40f6) by @vmoens ghstack-source-id: 3d86f99bc5b20a53e4092d786e96a5f7e83405ac [BugFix] Compatibility of tensordict primers with batched envs (specifically for LSTM and GRU) (#2668) (f4709c143) by @vmoens ghstack-source-id: e1da58ecfd36ca01b8a11fe90e5f3c5fe77f064c [BugFix] Fix MARL PPO tutorial actionspec call (#2628) (1ca134cc3) by @vmoens ghstack-source-id: 1d9058c45b28c0f0279e4243a2a0f96c622a51d8 [BugFix] Fix batching envs with non tensor data (#2674) (ab4250ec7) 
by @vmoens ghstack-source-id: daba8a95459cfa978da09291757b6380fab4f308 [BugFix] Fix call to tree.plot in tests (#2547) (09d6866e0) by @vmoens ghstack-source-id: 4a5babbf46294ab6ed4a791e26cfacaf3a41a2e0 [BugFix] Fix collector length with non-empty batch size (#2575) (b87597922) by @vmoens ghstack-source-id: 0c6a7a49f0570fad083340a64dd89c0f4c220c06 [BugFix] Fix compile weakrefs errors (#2742) (ffa99b2a2) by @vmoens ghstack-source-id: 3cb4c62f465a3c0581064b3ff89290b9d225eb3f [BugFix] Fix device transfer for collectors with initrandomframes mixed devices (#2704) (1d45117ba) by @vmoens ghstack-source-id: 1684399a7c84dd19b396db6c903fbf68c971c73d [BugFix] Fix export aoticompileandpackage API change (#2629) (1cffffee9) by @vmoens ghstack-source-id: 07a0f063f8955815157c2a3eac02c6460a82f672 [BugFix] Fix failing tests (#2582) (863121a27) by @vmoens ghstack-source-id: a43a2e3dbf76cd63c57ae00028df04b41a4e2f2b [BugFix] Fix getdefaultdevice calls in older PT versions (#2586) (705ecc2bb) by @vmoens ghstack-source-id: fd3a739d38feba075073801dda362be598822a94 [BugFix] Fix imports (#2605) (d90b9e3d1) by @vmoens ghstack-source-id: db85f2611c1c0b22e9179b4fdd6c2dcea78ac8dd [BugFix] Fix initrandomframes=0 (#2645) (19dfefc84) by @vmoens ghstack-source-id: 38a544ea15631f9affb4c385c09e7c4df94af55d [BugFix] Fix missing min/max alpha clamps in losses (#2684) (ed656a15f) by @vmoens [BugFix] Fix output of SipHash(as_tensor=False) (#2664) (1fc9577c4) by @kurtamohler [BugFix] Fix partial device transfers in collector (#2703) (afb81de51) by @vmoens ghstack-source-id: 2cd74c2d6fceaf079122ae801b67bdbfc29cddaf [BugFix] Fix pendulum device (#2516) (6799a7f5d) by @vmoens ghstack-source-id: bcaf20de6e317d4bda0e1511e0b1e46653a6f352 [BugFix] Fix safe probabilistic backward by removing in-place modif (#2755) (2f8c118e3) by @vmoens ghstack-source-id: 574eb1f9b662c1eb5be25e97020e11b3fadf625e [BugFix] Fix tests failing because of https://github.com/pytorch/pytorch/pull/137602 (165163abe) by @vmoens [BugFix] 
Fix typing for python 3.9 (#2631) (e7062a1d6) by @vmoens ghstack-source-id: 663da84096214611804a726e2d38d27a6f21c958 [BugFix] Fix typing in chess env (#2646) (cb8e241b2) by @vmoens ghstack-source-id: ad6086bbb7d1ee528ca24ec1d1232da47372e2b5 [BugFix] Fix typing in llm env (#2647) (e3c304733) by @vmoens ghstack-source-id: b5608f91756b5a81141941903158417a111e0710 [BugFix] Fix version parsing in extensions (#2542) (997d90e1b) by @vmoens ghstack-source-id: 903f2b01b508b81b1b4f92c4297d390da79fe8a2 [BugFix] PettingZoo dict action spaces (#2692) (1a6c9e2d0) by @matteobettini [BugFix] Remove erroneous python 3.8 compatibility classifier (#2540) (528875a9f) by @vmoens [BugFix] Remove raisers in specs (#2651) (bb6f87adb) by @vmoens ghstack-source-id: a005a62847aa2ff1d286f2c4ad13fd14f9e631d3 [BugFix] Rename RayCollector example file to avoid ImportError (#2525) (8eac84ad2) by Albert Bou [BugFix] Support for tensor collection in the PPOLoss (#2543) (0eabb7897) by Pau Riba Co-authored-by: Pau Riba pau.riba@helsing.ai [BugFix] Temporarily remove unsafe caching in envs (#2728) (dc63e820d) by @vmoens ghstack-source-id: a139cf6dc9fcfcfa525a6aa6375163d379593550 [BugFix] Wrong spec returned (#2604) (a1e21f598) by @matteobettini [BugFix] actionspecunbatched whenever necessary (#2592) (d30599ec0) by @vmoens ghstack-source-id: ec87794dabaf5023dac85cfc898a7c000e93331d [BugFix] adapt log-prob TD batch-size to advantage shape in PPO (#2756) (cb37521e1) by @vmoens ghstack-source-id: 8ccd12f65f4a74a42356a630e0e5a1f015337d4a [BugFix] make buffers zero-dim in exploration modules (#2591) (a47b32c07) by @vmoens ghstack-source-id: fd2705eb9132169da4871b27b354f7895c644061 [BugFix] patch randaction in TransformedEnv to read the baseenv method (#2699) (2c19fcc70) by @vmoens ghstack-source-id: 04e2e85e2675cf34c349ebadb8fa85a5aff2e532 [BugFix] requestedframesperbatch in distributed collectors (#2579) (408cf7d04) by @vmoens ghstack-source-id: 49289de6956460d9aed13d982eb8003eafc35118 [BugFix] 
skipdone_states in SAC (#2613) (de61e4d5e) by @vmoens ghstack-source-id: 39d97360e3b0e45dd8c327487eac50ddafe2254d

CI and Tests

[Test] Add tests and a few fixes for ChessEnv (#2661) (7bbd7e3b6) @kurtamohler ghstack-source-id: d0fbb520e35c74305041340722a7560ac2f958f2 [Test] Add tests for CatFrames with PermuteTransform (#2715) (d4e401993) @kurtamohler ghstack-source-id: e554d1cda8d7e4458c9397f1f93345c855e68e5c [Test] Add tests for Tree (#2738) (bb9440b40) @kurtamohler ghstack-source-id: 8f7aa07a4d36aa3664eaa19cc35bd66fb9e61c24 [Test] Fix warnings in SOTA tests (#2710) (a90106475) by @vmoens ghstack-source-id: c79223b5d6548a6c5a6ef649f6eb8e1703258815 [Test] More comprehensive tests for autospec (#2640) (6c7d233a4) by @vmoens ghstack-source-id: 75352490436fd706af3d36f9b8016e80a8a3f46a [Test] Skip tokenizer tests if transformers is not in workspace (#2744) (20a19fe2a) by @vmoens ghstack-source-id: b92facfd14cba62511e7888567c94d3986419ab5 [Test] Str2StrEnv test (#2725) (5fd509232) by @vmoens ghstack-source-id: 45a0e5f4b33c4624758171b9fe31f1e3932ff5e4 [CI, BugFix] Py3.8 for old deps (#2568) (f3275dab0) by @vmoens ghstack-source-id: 13c7923c0e5c8725c12c3bacc6c21b250d9f7457 [CI] Change doc image (#2632) (2511c04a5) by @vmoens ghstack-source-id: eceab242294ec55135d79f29e848345a5d5d455e [CI] Cuda 12.4 (#2733) (37a514d6c) by @vmoens ghstack-source-id: 2f3842a17d03e530add9608ee4525347a7c6a0e5 [CI] Fix Cairo-2 Chess import error (#2743) (10f015e0c) by @vmoens ghstack-source-id: c2bcbfc4522bd1b4f1fea3dbb006dc9552b09cb4 [CI] Fix docs upload (#2587) (0f592266f) by @vmoens ghstack-source-id: 49d7df06340fc432c29cd9f2d0ed2ae3d5619a38 [CI] Fix dreamer run in SOTA tests (#2627) (aed03fda4) by @vmoens ghstack-source-id: dfe3ab6fe0d29fcdcaf57f31f84d04e07e36bad3 [CI] Fix nightly build (#2666) (133d70936) by @vmoens ghstack-source-id: 5502fa94b6abcc154e020dcb165093fdc30ca025 [CI] Fix olddeps dmcontrol (#2734) (3ac61270f) by @vmoens ghstack-source-id: 750edcb8cd6b17167f77fb7c9ebd538608cfbde6 [CI] Fix windows build (#2760) (03f56ffb0) by @vmoens [CI] Install stable torch/tv in docs when on release branch (#2761) 
(57bdc6aec) by @vmoens ghstack-source-id: 7c39c049c7cff0ee112be2d07597f2e291d2fafd [CI] Local import of PIL (#2720) (d628a507f) by @vmoens ghstack-source-id: 6eb4ace11022632e902a7277dd51344bb9fe1f65 [CI] Longer timeout for windows (#2765) (4c06ce2b8) by @vmoens ghstack-source-id: 381e7e39d650e0178178a78076321a2210237b39 [CI] Make MAXIDLECOUNT a feature of tests (#2752) (963f3cdf6) by @vmoens ghstack-source-id: 2bf31dfff3d7862a54abeea86c8c5cc47a0f302d [CI] Remove gym import in testlibs.py (#2719) (f2cf5e044) by @vmoens ghstack-source-id: b0474588cfc81ed135d70efb58203c0b503f4ff0 [CI] Revert upgrade of upload image in docs (#2585) (236d38f8a) by @vmoens ghstack-source-id: f323dd2667a073b6c763ed17a793ecd0eec6b7be [CI] Upgrade GHA versions (#2740) (cd4f359ef) by @vmoens ghstack-source-id: 1876f1f0c18cb11c74edc9d96c17fdc985bc7b1a [CI] Upgrade cu121 to cu124 (#2764) (5da1f6522) by @vmoens ghstack-source-id: 4b3c9c0c31a60a5e151ff13b21e54853dc426416 [CI] Upgrade to v0.7 (#2745) (0ecfbe36e) by @vmoens ghstack-source-id: e548bbbb4578d44a8eee000ab0a40c89713afc27 [CI] linuxjob_v2.yml (#2570) (527a26a27) by @vmoens ghstack-source-id: ae13b53bd2885263e80019c087171421f5f7d0d5 [CI] minarihf (dda0df165) by @vmoens ghstack-source-id: 6eb84d906dfbc66839706f328e214014aef7b65f [CI] workflow permissions (#2706) (b000685f3) by @vmoens ghstack-source-id: f520a1b1e7697b1147cb29e66e2ecb1d07cb4cbc

- Python
Published by vmoens over 1 year ago

torchrl - v0.6.0: compiled losses and partial steps

What's Changed

We introduce wrappers for ML-Agents and OpenSpiel. See the doc here for OpenSpiel and here for MLAgents.

We introduce support for partial steps, allowing you to run rollouts that end only when all envs are done, without resetting those that have already reached a termination point.
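The idea can be sketched without TorchRL using a toy batch of environments with fixed episode lengths. The `rollout` function below is a hypothetical stand-in for illustration, not the library's rollout method: finished environments are skipped rather than reset, and the loop stops once every environment is done.

```python
def rollout(episode_lengths, max_steps):
    """Step a batch of toy envs until all are done, without resetting any."""
    num_envs = len(episode_lengths)
    done = [False] * num_envs
    steps_taken = [0] * num_envs
    for _ in range(max_steps):
        if all(done):
            break  # stop only once every env has terminated
        for i in range(num_envs):
            if done[i]:
                continue  # partial step: finished envs are skipped, not reset
            steps_taken[i] += 1
            done[i] = steps_taken[i] >= episode_lengths[i]
    return steps_taken, done


steps_taken, done = rollout([2, 5, 3], max_steps=10)
print(steps_taken)  # [2, 5, 3]
print(done)         # [True, True, True]
```

Each toy env contributes exactly one episode, which is the property that makes this mode convenient for evaluation-style rollouts.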

We add the capability of passing replay buffers directly to data collectors, to avoid inter-process synced communications - thereby drastically speeding up data collection. See the doc of the collectors for more info.

The GAIL algorithm has also been integrated in the library (#2273).

We ensure that all loss modules are compatible with torch.compile without graph breaks (for a typical build). Compiled losses typically run about 2x faster than their eager counterparts.

Finally, we have sadly decided not to support Gymnasium v1.0 and future releases as the new autoreset API is fundamentally incompatible with TorchRL. Furthermore, it does not guarantee the same level of reproducibility as previous releases. See this discussion for more information.

We provide wheels for aarch64 machines. Since we cannot upload them to PyPI, they are attached to these release notes.

Deprecations

  • [Deprecation] Deprecate default num_cells in MLP (#2395) by @vmoens
  • [Deprecations] Deprecate in view of v0.6 release (#2446) by @vmoens

New environments

  • [Feature] Add OpenSpielWrapper and OpenSpielEnv (#2345) by @kurtamohler
  • [Feature] Add env wrapper for Unity MLAgents (#2469) by @kurtamohler

New features

  • [Feature] Add group_map support to MLAgents wrappers (#2491) by @kurtamohler
  • [Feature] Add scheduler for alpha/beta parameters of PrioritizedSampler (#2452) Co-authored-by: Vincent Moens by @LTluttmann
  • [Feature] Check number of kwargs matches num_workers (#2465) Co-authored-by: Vincent Moens by @antoine.broyelle
  • [Feature] Compiled and cudagraph for policies #2478 by @vmoens
  • [Feature] Consistent Dropout (#2399) Co-authored-by: Vincent Moens by @depictiger
  • [Feature] Deterministic sample for Masked one-hot #2440 by @vmoens
  • [Feature] Dict specs in vmas (#2415) Co-authored-by: Vincent Moens by @matteobettini
  • [Feature] Ensure transformation keys have the same number of elements (#2466) by @antoinebrl
  • [Feature] Make benchmarked losses compatible with torch.compile #2405 by @vmoens
  • [Feature] Partial steps in batched envs #2377 by @vmoens
  • [Feature] Pass replay buffers to MultiaSyncDataCollector #2387 by @vmoens
  • [Feature] Pass replay buffers to SyncDataCollector #2384 by @vmoens
  • [Feature] Prevent loading existing mmap files in storages if they already exist #2438 by @vmoens
  • [Feature] RNG for RBs (#2379) by @vmoens
  • [Feature] Randint on device for buffers #2470 by @vmoens
  • [Feature] SAC compatibility with composite distributions. (#2447) by @albertbou92
  • [Feature] Store MARL parameters in module (#2351) by @vmoens
  • [Feature] Support wrapping IsaacLab environments with GymEnv (#2380) by @yu-fz
  • [Feature] TensorDictMap #2306 by @vmoens
  • [Feature] TensorDictMap Query module #2305 by @vmoens
  • [Feature] TensorDictMap hashing functions #2304 by @vmoens
  • [Feature] break_when_all_done in rollout #2381 by @vmoens
  • [Feature] inline hold_out_net #2499 by @vmoens
  • [Feature] replay_buffer_chunk #2388 by @vmoens
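As a rough illustration of what scheduling the alpha/beta parameters of a prioritized sampler means (#2452), the sketch below anneals beta linearly and shows its effect on importance weights. It follows the standard prioritized experience replay formulas; linear_schedule and importance_weights are hypothetical helpers, not the scheduler classes added by the PR.

```python
def linear_schedule(start, end, num_steps):
    """Return a function mapping a step index to a linearly annealed value."""

    def value_at(step):
        frac = min(max(step, 0), num_steps) / num_steps
        return start + frac * (end - start)

    return value_at


def importance_weights(priorities, alpha, beta):
    """Standard PER weights: w_i = (N * P(i)) ** -beta, scaled so max(w) == 1."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    probs = [s / total for s in scaled]
    n = len(priorities)
    weights = [(n * p) ** (-beta) for p in probs]
    max_w = max(weights)
    return [w / max_w for w in weights]


beta_at = linear_schedule(start=0.4, end=1.0, num_steps=100)
priorities = [1.0, 2.0, 4.0]
early = importance_weights(priorities, alpha=0.6, beta=beta_at(0))
late = importance_weights(priorities, alpha=0.6, beta=beta_at(100))
print(early, late)
```

As beta grows toward 1, the correction for non-uniform sampling gets stronger, so high-priority items receive smaller weights late in training than early on.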

New Algorithms

  • [Algorithm] GAIL (#2273) Co-authored-by: Vincent Moens by @Sebastian.dittert

Fixes

  • [BugFix, CI] Set TD_GET_DEFAULTS_TO_NONE=1 in all CIs (#2363) by @vmoens
  • [BugFix] Add MultiCategorical support in PettingZoo action masks (#2485) Co-authored-by: Vincent Moens by @matteobettini
  • [BugFix] Allow for composite action distributions in PPO/A2C losses (#2391) by @albertbou92
  • [BugFix] Avoid reshape(-1) for inputs to DreamerActorLoss (#2496) by @kurtamohler
  • [BugFix] Avoid reshape(-1) for inputs to objectives modules (#2494) Co-authored-by: Vincent Moens by @kurtamohler
  • [BugFix] Better dumps/loads (#2343) by @vmoens
  • [BugFix] Extend RB with lazy stack #2453 by @vmoens
  • [BugFix] Extend RB with lazy stack (revamp) #2454 by @vmoens
  • [BugFix] Fix Compose input spec transform (#2463) Co-authored-by: Louis Faury by @louisfaury
  • [BugFix] Fix DeviceCastTransform #2471 by @vmoens
  • [BugFix] Fix LSTM in GAE with vmap (#2376) by @vmoens
  • [BugFix] Fix MARL-DDPG tutorial and other MODE usages (#2373) by @vmoens
  • [BugFix] Fix displaying of tensor sizes in buffers #2456 by @vmoens
  • [BugFix] Fix dumps for SamplerWithoutReplacement (#2506) by @vmoens
  • [BugFix] Fix get-related errors (#2361) by @vmoens
  • [BugFix] Fix invalid CUDA ID error when loading Bounded variables across devices (#2421) by @cbhua
  • [BugFix] Fix listing of updated keys in collectors (#2460) by @vmoens
  • [BugFix] Fix old deps tests #2500 by @vmoens
  • [BugFix] Fix support for MiniGrid envs (#2416) by @kurtamohler
  • [BugFix] Fix tictactoeenv.py #2417 by @vmoens
  • [BugFix] Fixes to RenameTransform (#2442) Co-authored-by: Vincent Moens by @thomasbbrunner
  • [BugFix] Make sure keys are exclusive in envs (#1912) by @vmoens
  • [BugFix] TensorDictPrimer updates spec instead of overwriting (#2332) Co-authored-by: Vincent Moens by @matteobettini
  • [BugFix] Use a RL-specific NO_DEFAULT instead of TD's one (#2367) by @vmoens
  • [BugFix] compatibility to new Composite dist log_prob/entropy APIs #2435 by @vmoens
  • [BugFix] torch 2.0 compatibility fix #2475 by @vmoens

Performance

  • [Performance] Faster CatFrames.unfolding with padding="same" (#2407) by @kurtamohler
  • [Performance] Faster PrioritizedSliceSampler._padded_indices (#2433) by @kurtamohler
  • [Performance] Faster SliceSampler._tensor_slices_from_startend (#2423) by @kurtamohler
  • [Performance] Faster target update using foreach (#2046) by @vmoens
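For context on the target update entry (#2046), a soft (Polyak) target update blends each source parameter into its target counterpart. The plain-Python sketch below shows only the arithmetic; soft_update is a hypothetical helper, and the convention used for tau here is an assumption rather than TorchRL's exact parameterization. A foreach-style implementation applies the same arithmetic to all parameter tensors in one batched call instead of a Python loop.

```python
def soft_update(target_params, source_params, tau):
    """Blend source into target: t <- (1 - tau) * t + tau * s, per parameter."""
    return [(1.0 - tau) * t + tau * s for t, s in zip(target_params, source_params)]


target = [0.0, 10.0]
source = [1.0, 0.0]
for _ in range(3):
    target = soft_update(target, source, tau=0.5)
print(target)
```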

Documentation

  • [Doc] Better doc for inverse transform semantic #2459 by @vmoens
  • [Doc] Correct minor erratum in knowledge_base entry (#2383) by @depictiger
  • [Doc] Document losses in README.md #2408 by @vmoens
  • [Doc] Fix README example (#2398) by @vmoens
  • [Doc] Fix links to tutos (#2409) by @vmoens
  • [Doc] Fix pip3install typos in Readme (#2342) by @TheRisenPhoenix
  • [Doc] Fix policy in getting started (#2429) by @vmoens
  • [Doc] Fix tutorials for release #2476 by @vmoens
  • [Doc] Fix wrong default value for flatten_tensordicts in ReplayBufferTrainer (#2502) by @vmoens
  • [Doc] Minor fixes to comments and docstrings (#2443) by @thomasbbrunner
  • [Doc] Refactor README (#2352) by @vmoens
  • [Docs] Use more appropriate ActorValueOperator in PPOLoss documentation (#2350) by @GaetanLepage
  • [Documentation] README rewrite and broken links (#2023) by @vmoens

Not user facing

  • [CI, BugFix] Fix CI (#2489) by @vmoens
  • [CI] Add aarch64-linux wheels (#2434) by @vmoens
  • [CI] Disable compile tests on windows #2510 by @vmoens
  • [CI] Fix 3.12 gymnasium installation #2474 by @vmoens
  • [CI] Fix CI errors (#2394) by @vmoens
  • [CI] Fix GPU benchmark upload (#2508) by @vmoens
  • [CI] Fix Minari tests (#2419) Co-authored-by: Vincent Moens by @younik
  • [CI] Fix benchmark workflows (#2488) by @vmoens
  • [CI] Fix broken workflows (#2418) by @vmoens
  • [CI] Fix broken workflows (#2428) by @vmoens
  • [CI] Fix gymnasium version in minari #2512 by @vmoens
  • [CI] Fix h5py dependency in olddeps #2513 by @vmoens
  • [CI] Fix windows build legacy #2450 by @vmoens
  • [CI] Fix winndows compile tests #2511 by @vmoens
  • [CI] Remove 3.8 jobs #2412 by @vmoens
  • [CI] Resolve DMC and mujoco pinned versions (#2396) by @vmoens
  • [CI] Run docs on all PRs (#2413) by @vmoens
  • [CI] pin DMC and mujoco (#2374) by @vmoens
  • [Minor] Fix test_compose_action_spec (#2493) Co-authored-by: Louis Faury by @louisfaury
  • [Minor] Fix typos in advantages.py (#2492) Co-authored-by: Louis Faury by @louisfaury
  • [Quality] Split utils.h and utils.cpp (#2348) by @vmoens
  • [Refactor] Limit the deepcopies in collectors #2451 by @vmoens
  • [Refactor] Refactor calls to get without default that raise KeyError (#2353) by @vmoens
  • [Refactor] Rename specs to simpler names (#2368) by @vmoens
  • [Refactor] Use empty_like in storage construction #2455 by @vmoens
  • [Versioning] Fix torch deps (#2340) by @vmoens
  • [Versioning] Gymnasium 1.0 incompatibility errors #2484 by @vmoens
  • [Versioning] Versions for 0.6 (#2509) by @vmoens

New Contributors

As always, we want to show how appreciative we are of the vibrant open-source community that keeps TorchRL alive.

  • @yu-fz made their first contribution in https://github.com/pytorch/rl/pull/2380
  • @cbhua made their first contribution in https://github.com/pytorch/rl/pull/2421
  • @younik made their first contribution in https://github.com/pytorch/rl/pull/2419
  • @thomasbbrunner made their first contribution in https://github.com/pytorch/rl/pull/2442
  • @LTluttmann made their first contribution in https://github.com/pytorch/rl/pull/2452
  • @louisfaury made their first contribution in https://github.com/pytorch/rl/pull/2463
  • @antoinebrl made their first contribution in https://github.com/pytorch/rl/pull/2466

Full Changelog: https://github.com/pytorch/rl/compare/v0.5.0...v0.6.0

- Python
Published by vmoens over 1 year ago

torchrl - v0.5.0: Dynamic specs, envs with non-tensor data and replay buffer checkpointers

What's Changed

This new release makes it possible to run environments that output non-tensor data. #1944

We also introduce dynamic specs, allowing environments to change the size of the observations / actions during the course of a rollout. This feature is compatible with parallel environments and collectors! #2143

Additionally, it is now possible to update a Replay Buffer in-place by assigning values at a given index. #2224
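The sketch below illustrates the general pattern of index assignment on a buffer. ToyReplayBuffer is a hypothetical, list-backed class written for this example and says nothing about how TorchRL's storages implement the feature (see #2224).

```python
class ToyReplayBuffer:
    """Hypothetical list-backed buffer that supports in-place index assignment."""

    def __init__(self):
        self._storage = []

    def extend(self, items):
        """Append items and return the indices they were written to."""
        start = len(self._storage)
        self._storage.extend(items)
        return list(range(start, len(self._storage)))

    def __getitem__(self, index):
        return self._storage[index]

    def __setitem__(self, index, value):
        self._storage[index] = value

    def __len__(self):
        return len(self._storage)


rb = ToyReplayBuffer()
indices = rb.extend([{"reward": 0.0}, {"reward": 1.0}, {"reward": 0.0}])
# Relabel a stored transition in place, without re-adding it to the buffer.
rb[indices[1]] = {"reward": 5.0}
print(rb[1], len(rb))
```

The buffer length is unchanged after the assignment; only the content at the given index is overwritten.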

Finally, TorchRL is now compatible with Python 3.12 (#2282, #2281).

As always, a huge thanks to the vibrant OSS community that helps us develop this library!

New algorithms

  • [Algorithm] CrossQ by @BY571 in https://github.com/pytorch/rl/pull/2033
  • [Algorithm] TD3+BC by @BY571 in https://github.com/pytorch/rl/pull/2249

Features

  • [Feature] ActionDiscretizer by @vmoens in https://github.com/pytorch/rl/pull/2247
  • [Feature] Add KL approximation in PPO loss metadata by @albertbou92 in https://github.com/pytorch/rl/pull/2166
  • [Feature] Add modules.AdditiveGaussianModule by @kurtamohler in https://github.com/pytorch/rl/pull/2296
  • [Feature] Add modules.OrnsteinUhlenbeckProcessModule by @kurtamohler in https://github.com/pytorch/rl/pull/2297
  • [Feature] Autocomplete for losses by @vmoens in https://github.com/pytorch/rl/pull/2148
  • [Feature] Crop Transform by @albertbou92 in https://github.com/pytorch/rl/pull/2336
  • [Feature] Dynamic specs by @vmoens in https://github.com/pytorch/rl/pull/2143
  • [Feature] Extract primers from modules that contain RNNs by @albertbou92 in https://github.com/pytorch/rl/pull/2127
  • [Feature] Jumanji from_pixels=True by @vmoens in https://github.com/pytorch/rl/pull/2129
  • [Feature] Make ProbabilisticActor compatible with Composite distributions by @vmoens in https://github.com/pytorch/rl/pull/2220
  • [Feature] Replay buffer checkpointers by @vmoens in https://github.com/pytorch/rl/pull/2137
  • [Feature] Some improvements to VecNorm by @vmoens in https://github.com/pytorch/rl/pull/2251
  • [Feature] Split-trajectories and represent as nested tensor by @vmoens in https://github.com/pytorch/rl/pull/2043
  • [Feature] _make_ordinal_device by @vmoens in https://github.com/pytorch/rl/pull/2237
  • [Feature] assigning values to RB storage by @vmoens in https://github.com/pytorch/rl/pull/2224

Bug fixes

  • [BugFix,Feature] Allow non-tensor data in envs by @vmoens in https://github.com/pytorch/rl/pull/1944
  • [BugFix] Allow zero alpha value for PrioritizedSampler by @albertbou92 in https://github.com/pytorch/rl/pull/2164
  • [BugFix] Expose MARL modules by @vmoens in https://github.com/pytorch/rl/pull/2321
  • [BugFix] Fit vecnorm out_keys by @vmoens in https://github.com/pytorch/rl/pull/2157
  • [BugFix] Fix Brax by @vmoens in https://github.com/pytorch/rl/pull/2233
  • [BugFix] Fix OOB sampling in PrioritizedSliceSampler by @vmoens in https://github.com/pytorch/rl/pull/2239
  • [BugFix] Fix VecNorm test in test_collectors.py by @vmoens in https://github.com/pytorch/rl/pull/2162
  • [BugFix] Fix to in MultiDiscreteTensorSpec by @Quinticx in https://github.com/pytorch/rl/pull/2204
  • [BugFix] Fix and test PRB priority update across dims and rb types by @vmoens in https://github.com/pytorch/rl/pull/2244
  • [BugFix] Fix another ctx test by @vmoens in https://github.com/pytorch/rl/pull/2284
  • [BugFix] Fix async gym env with non-sync resets by @vmoens in https://github.com/pytorch/rl/pull/2170
  • [BugFix] Fix async gym when all reset by @vmoens in https://github.com/pytorch/rl/pull/2144
  • [BugFix] Fix brax wrapping by @vmoens in https://github.com/pytorch/rl/pull/2190
  • [BugFix] Fix collector tests where device ordinal is needed by @vmoens in https://github.com/pytorch/rl/pull/2240
  • [BugFix] Fix collectors with non tensors by @vmoens in https://github.com/pytorch/rl/pull/2232
  • [BugFix] Fix done/terminated computation in slice samplers by @vmoens in https://github.com/pytorch/rl/pull/2213
  • [BugFix] Fix info reading with async gym by @vmoens in https://github.com/pytorch/rl/pull/2150
  • [BugFix] Fix isaac - bis by @vmoens in https://github.com/pytorch/rl/pull/2119
  • [BugFix] Fix lib tests by @vmoens in https://github.com/pytorch/rl/pull/2218
  • [BugFix] Fix max value within buffer during update priority by @vmoens in https://github.com/pytorch/rl/pull/2242
  • [BugFix] Fix max-priority update by @vmoens in https://github.com/pytorch/rl/pull/2215
  • [BugFix] Fix non-tensor passage in _StepMDP by @vmoens in https://github.com/pytorch/rl/pull/2260
  • [BugFix] Fix non-tensor passage in _StepMDP by @vmoens in https://github.com/pytorch/rl/pull/2262
  • [BugFix] Fix prefetch in samples without replacement - .sample() compatibility issues by @vmoens in https://github.com/pytorch/rl/pull/2226
  • [BugFix] Fix sampling in NonTensorSpec by @vmoens in https://github.com/pytorch/rl/pull/2172
  • [BugFix] Fix sampling of values from NonTensorSpec by @vmoens in https://github.com/pytorch/rl/pull/2169
  • [BugFix] Fix slice sampler end computation at the cursor place by @vmoens in https://github.com/pytorch/rl/pull/2225
  • [BugFix] Fix sliced PRB when only traj is provided by @vmoens in https://github.com/pytorch/rl/pull/2228
  • [BugFix] Fix strict length in PRB+SliceSampler by @vmoens in https://github.com/pytorch/rl/pull/2202
  • [BugFix] Fix strict_length in prioritized slice sampler by @vmoens in https://github.com/pytorch/rl/pull/2194
  • [BugFix] Fix tanh normal mode by @vmoens in https://github.com/pytorch/rl/pull/2198
  • [BugFix] Fix tensordict private imports by @vmoens in https://github.com/pytorch/rl/pull/2275
  • [BugFix] Fix test_specs.py by @vmoens in https://github.com/pytorch/rl/pull/2214
  • [BugFix] Fix torch 2.3 compatibility of padding indices by @vmoens in https://github.com/pytorch/rl/pull/2216
  • [BugFix] Fix truncated normal by @vmoens in https://github.com/pytorch/rl/pull/2147
  • [BugFix] Fix typo in weight assignment in PRB by @vmoens in https://github.com/pytorch/rl/pull/2241
  • [BugFix] Fix update_priority generic signature for Samplers by @vmoens in https://github.com/pytorch/rl/pull/2252
  • [BugFix] Fix vecnorm state-dicts by @vmoens in https://github.com/pytorch/rl/pull/2158
  • [BugFix] Global import of optional library by @matteobettini in https://github.com/pytorch/rl/pull/2217
  • [BugFix] Gym async with _reset full of True by @vmoens in https://github.com/pytorch/rl/pull/2145
  • [BugFix] MLFlow logger by @GJBoth in https://github.com/pytorch/rl/pull/2152
  • [BugFix] Make DMControlEnv aware of truncated signals by @vmoens in https://github.com/pytorch/rl/pull/2196
  • [BugFix] Make _reset follow done shape by @matteobettini in https://github.com/pytorch/rl/pull/2189
  • [BugFix] EnvBase._complete_done to complete "terminated" key properly by @kurtamohler in https://github.com/pytorch/rl/pull/2294
  • [BugFix] LazyTensorStorage only allocates data on the given device by @matteobettini in https://github.com/pytorch/rl/pull/2188
  • [BugFix] done = done | truncated in collector by @vmoens in https://github.com/pytorch/rl/pull/2333
  • [BugFix] buffer __iter__ for samplers without replacement + prefetch by @JulianKu in https://github.com/pytorch/rl/pull/2185
  • [BugFix] buffer __iter__ for samplers without replacement + prefetch by @JulianKu in https://github.com/pytorch/rl/pull/2178
  • [BugFix] missing deprecated kwargs by @fedebotu in https://github.com/pytorch/rl/pull/2125

Docs

  • [Doc] Add Custom Options for VideoRecorder by @N00bcak in https://github.com/pytorch/rl/pull/2259
  • [Doc] Add documentation for masks in tensor specs by @kurtamohler in https://github.com/pytorch/rl/pull/2289
  • [Doc] Better doc for make_tensordict_primer by @vmoens in https://github.com/pytorch/rl/pull/2324
  • [Doc] Dynamic envs by @vmoens in https://github.com/pytorch/rl/pull/2191
  • [Doc] Edit README for local installs by @vmoens in https://github.com/pytorch/rl/pull/2255
  • [Doc] Fix algorithms references in tutos by @vmoens in https://github.com/pytorch/rl/pull/2320
  • [Doc] Fix documentation mismatch for default argument by @TheRisenPhoenix in https://github.com/pytorch/rl/pull/2149
  • [Doc] Fix links in doc by @vmoens in https://github.com/pytorch/rl/pull/2151
  • [Doc] Fix mistakes in docs for Trainer checkpointing backends by @kurtamohler in https://github.com/pytorch/rl/pull/2285
  • [Doc] Indicate necessary context to run multiprocessed collectors in doc by @GJBoth in https://github.com/pytorch/rl/pull/2126
  • [Doc] Restore colab links by @vmoens in https://github.com/pytorch/rl/pull/2197
  • [Doc] Update README.md by @KPCOFGS in https://github.com/pytorch/rl/pull/2155
  • [Doc] default_interaction_type doc by @vmoens in https://github.com/pytorch/rl/pull/2177
  • [Docs] InitTracker cleanup by @matteobettini in https://github.com/pytorch/rl/pull/2330
  • [Docs] Reintroduce BenchMARL pointers in MARL tutos by @matteobettini in https://github.com/pytorch/rl/pull/2159

Performance

  • [Performance, Refactor, BugFix] Faster loading of uninitialized storages by @vmoens in https://github.com/pytorch/rl/pull/2221
  • [Performance] consolidate TDs in ParallelEnv without buffers by @vmoens in https://github.com/pytorch/rl/pull/2231

Others

  • Fix "Run in Colab" and "Download Notebook" links in tutorials by @kurtamohler in https://github.com/pytorch/rl/pull/2268
  • Fix brax examples by @Jendker in https://github.com/pytorch/rl/pull/2318
  • Fixed several broken links in readme.md by @drMJ in https://github.com/pytorch/rl/pull/2156
  • Revert "[BugFix] Fix non-tensor passage in _StepMDP" by @vmoens in https://github.com/pytorch/rl/pull/2261
  • Revert "[BugFix] Fix tensordict private imports" by @vmoens in https://github.com/pytorch/rl/pull/2276
  • Revert "[BugFix] buffer __iter__ for samplers without replacement + prefetch" by @vmoens in https://github.com/pytorch/rl/pull/2182
  • [CI, Tests] Fix windows tests by @vmoens in https://github.com/pytorch/rl/pull/2337
  • [CI] Bump jinja2 from 3.1.3 to 3.1.4 in /docs by @dependabot in https://github.com/pytorch/rl/pull/2250
  • [CI] Fix CI by @vmoens in https://github.com/pytorch/rl/pull/2245
  • [CI] Fix nightly by @vmoens in https://github.com/pytorch/rl/pull/2279
  • [CI] Fix wheels by @vmoens in https://github.com/pytorch/rl/pull/2274
  • [CI] Pin transformers version to < 4.42.0 to make vmap happy by @vmoens in https://github.com/pytorch/rl/pull/2278
  • [CI] Upgrade SDL to install pygame 2.6 by @vmoens in https://github.com/pytorch/rl/pull/2248
  • [CI] Windows build fix by @vmoens in https://github.com/pytorch/rl/pull/2335
  • [CI] python 3.12 nightlies by @vmoens in https://github.com/pytorch/rl/pull/2281
  • [Example,BugFix] Add a Async gym env example by @vmoens in https://github.com/pytorch/rl/pull/2139
  • [MINOR] Fix unclear language by @software-samurai in https://github.com/pytorch/rl/pull/2165
  • [Minor] Code quality improvements by @vmoens in https://github.com/pytorch/rl/pull/2140
  • [Quality] Fix low/high in SOTA implementations by @vmoens in https://github.com/pytorch/rl/pull/2266
  • [Quality] Fix repr of MARL modules by @vmoens in https://github.com/pytorch/rl/pull/2192
  • [Quality] Remove global seeding in set_seed by @vmoens in https://github.com/pytorch/rl/pull/2195
  • [Quality] Warn if the sampler is not prioritized but update_priority is called by @vmoens in https://github.com/pytorch/rl/pull/2253
  • [Quality] better error message for CompositeSpec shape mismatch by @vmoens in https://github.com/pytorch/rl/pull/2223
  • [Refactor] Deprecate NormalParamWrapper by @vmoens in https://github.com/pytorch/rl/pull/2308
  • [Refactor] Remove _run_checks from TensorDict.__init__ by @vmoens in https://github.com/pytorch/rl/pull/2256
  • [Refactor] Update all instances of exploration *Wrapper to *Module by @kurtamohler in https://github.com/pytorch/rl/pull/2298
  • [Refactor] Use td.transpose in multi-step transform by @vmoens in https://github.com/pytorch/rl/pull/2288
  • [Refactor] tensordict.tensordict -> tensordict.C by @vmoens in https://github.com/pytorch/rl/pull/2286
  • [Tests] Fix VMAS tests by @matteobettini in https://github.com/pytorch/rl/pull/2287
  • [Tests] Fix windows tests by @vmoens in https://github.com/pytorch/rl/pull/2219
  • [Versioning] Add python 3.12 to setup.py by @vmoens in https://github.com/pytorch/rl/pull/2282
  • [Versioning] Allow any torch version for local builds by @vmoens in https://github.com/pytorch/rl/pull/2130
  • [Versioning] Bump torch 2.0 as minimal version by @vmoens in https://github.com/pytorch/rl/pull/2200
  • [Versioning] v0.5 bump by @vmoens in https://github.com/pytorch/rl/pull/2267
  • [Versioning] windows build - add legacy back and .bat env-script by @vmoens in https://github.com/pytorch/rl/pull/2339
  • init by @vmoens in https://github.com/pytorch/rl/pull/2322

New Contributors

  • @GJBoth made their first contribution in https://github.com/pytorch/rl/pull/2126
  • @TheRisenPhoenix made their first contribution in https://github.com/pytorch/rl/pull/2149
  • @drMJ made their first contribution in https://github.com/pytorch/rl/pull/2156
  • @KPCOFGS made their first contribution in https://github.com/pytorch/rl/pull/2155
  • @software-samurai made their first contribution in https://github.com/pytorch/rl/pull/2165
  • @JulianKu made their first contribution in https://github.com/pytorch/rl/pull/2178
  • @Quinticx made their first contribution in https://github.com/pytorch/rl/pull/2204
  • @kurtamohler made their first contribution in https://github.com/pytorch/rl/pull/2268
  • @N00bcak made their first contribution in https://github.com/pytorch/rl/pull/2259
  • @Jendker made their first contribution in https://github.com/pytorch/rl/pull/2318

Full Changelog: https://github.com/pytorch/rl/compare/v0.4.0...v0.5.0

- Python
Published by vmoens almost 2 years ago

torchrl - v0.4.0

New Features:

  • Better video rendering
    • [Feature] A PixelRenderTransform by @vmoens in https://github.com/pytorch/rl/pull/2099
    • [Feature] Video recording in SOTA examples by @vmoens in https://github.com/pytorch/rl/pull/2070
    • [Feature] VideoRecorder for datasets and replay buffers by @vmoens in https://github.com/pytorch/rl/pull/2069
  • Replay buffer: sampling trajectories is now much easier, cleaner and faster
    • [Benchmark] Benchmark slice sampler by @vmoens in https://github.com/pytorch/rl/pull/1992
    • [Feature] Add PrioritizedSliceSampler by @Cadene in https://github.com/pytorch/rl/pull/1875
    • [Feature] Span slice indices on the left and on the right by @vmoens in https://github.com/pytorch/rl/pull/2107
    • [Feature] batched trajectories - SliceSampler compatibility by @vmoens in https://github.com/pytorch/rl/pull/1775
    • [Performance] Faster slice sampler by @vmoens in https://github.com/pytorch/rl/pull/2031
  • Datasets: allow preprocessing datasets after download
    • [Feature] Preproc for datasets by @vmoens in https://github.com/pytorch/rl/pull/1989
  • Losses: reduction parameters and non-functional execution
    • [Feature] Add reduction parameter to On-Policy losses. by @albertbou92 in https://github.com/pytorch/rl/pull/1890
    • [Feature] Adds value clipping in ClipPPOLoss loss by @albertbou92 in https://github.com/pytorch/rl/pull/2005
    • [Feature] Offline objectives reduction parameter by @albertbou92 in https://github.com/pytorch/rl/pull/1984
  • Environment API: support "fork" start method in ParallelEnv, better handling of auto-resetting envs.
    • [Feature] Use non-default mp start method in ParallelEnv by @vmoens in https://github.com/pytorch/rl/pull/1966
    • [Feature] Auto-resetting envs by @vmoens in https://github.com/pytorch/rl/pull/2073
  • Transforms
    • [Feature] Allow any callable to be used as transform by @vmoens in https://github.com/pytorch/rl/pull/2027
    • [Feature] invert transforms appended to a RB by @vmoens in https://github.com/pytorch/rl/pull/2111
    • [Feature] Extend TensorDictPrimer default_value options by @albertbou92 in https://github.com/pytorch/rl/pull/2071
    • [Feature] Fine grained DeviceCastTransform by @vmoens in https://github.com/pytorch/rl/pull/2041
    • [Feature] BatchSizeTransform by @vmoens in https://github.com/pytorch/rl/pull/2030
    • [Feature] Allow non-sorted keys in CatFrames by @vmoens in https://github.com/pytorch/rl/pull/1913
    • [Feature] env.append_transform by @vmoens in https://github.com/pytorch/rl/pull/2040
  • New environment and improvements:
    • [Environment] Meltingpot by @matteobettini in https://github.com/pytorch/rl/pull/2054
    • [Feature] Return depth from RoboHiveEnv by @sriramsk1999 in https://github.com/pytorch/rl/pull/2058
    • [Feature] PettingZoo possibility to choose reset strategy by @matteobettini in https://github.com/pytorch/rl/pull/2048

Other features

  • [Feature] Add time_dim arg in value modules by @vmoens in https://github.com/pytorch/rl/pull/1946
  • [Feature] Batched actions wrapper by @vmoens in https://github.com/pytorch/rl/pull/2018
  • [Feature] Better repr of RBs by @vmoens in https://github.com/pytorch/rl/pull/1991
  • [Feature] Execute rollouts with regular nn.Module instances by @vmoens in https://github.com/pytorch/rl/pull/1947
  • [Feature] Logger by @vmoens in https://github.com/pytorch/rl/pull/1858
  • [Feature] Passing lists of keyword arguments in reset for batched envs by @vmoens in https://github.com/pytorch/rl/pull/2076
  • [Feature] RB MultiStep transform by @vmoens in https://github.com/pytorch/rl/pull/2008
  • [Feature] Replace RewardClipping with SignTransform in Atari examples by @albertbou92 in https://github.com/pytorch/rl/pull/1870
  • [Feature] reset_parameters for multiagent nets by @matteobettini in https://github.com/pytorch/rl/pull/1970
  • [Feature] optionally set truncated = True at the end of rollouts by @vmoens in https://github.com/pytorch/rl/pull/2042

Miscellaneous

  • Fix onw typo by @kit1980 in https://github.com/pytorch/rl/pull/1917
  • Rename SOTA-IMPLEMENTATIONS.md to README.md by @matteobettini in https://github.com/pytorch/rl/pull/2093
  • Revert "[BugFix] Fix Isaac" by @vmoens in https://github.com/pytorch/rl/pull/2118
  • Update getting-started-5.py by @vmoens in https://github.com/pytorch/rl/pull/1894
  • [BugFix, Performance] Fewer imports at root by @vmoens in https://github.com/pytorch/rl/pull/1930
  • [BugFix,CI] Fix Windows CI by @vmoens in https://github.com/pytorch/rl/pull/1983
  • [BugFix,CI] Fix sporadically failing tests in CI by @vmoens in https://github.com/pytorch/rl/pull/2098
  • [BugFix,Refactor] Dreamer refactor by @BY571 in https://github.com/pytorch/rl/pull/1918
  • [BugFix] Adaptable non-blocking for mps and non cuda device in batched-envs by @vmoens in https://github.com/pytorch/rl/pull/1900
  • [BugFix] Call contiguous on rollout results in TestMultiStepTransform by @vmoens in https://github.com/pytorch/rl/pull/2025
  • [BugFix] Dedicated tests for on policy losses reduction parameter by @albertbou92 in https://github.com/pytorch/rl/pull/1974
  • [BugFix] Extend with a list of tensordicts by @vmoens in https://github.com/pytorch/rl/pull/2032
  • [BugFix] Fix Atari DQN ensembling by @vmoens in https://github.com/pytorch/rl/pull/1981
  • [BugFix] Fix CQL/IQL pbar update by @vmoens in https://github.com/pytorch/rl/pull/2020
  • [BugFix] Fix Exclude / Double2Float transforms by @vmoens in https://github.com/pytorch/rl/pull/2101
  • [BugFix] Fix Isaac by @vmoens in https://github.com/pytorch/rl/pull/2072
  • [BugFix] Fix KLPENPPOLoss KL computation by @vmoens in https://github.com/pytorch/rl/pull/1922
  • [BugFix] Fix MPS sync in device transform by @vmoens in https://github.com/pytorch/rl/pull/2061
  • [BugFix] Fix OOB TruncatedNormal LP by @vmoens in https://github.com/pytorch/rl/pull/1924
  • [BugFix] Fix R2Go once more by @vmoens in https://github.com/pytorch/rl/pull/2089
  • [BugFix] Fix Ray collector example error by @albertbou92 in https://github.com/pytorch/rl/pull/1908
  • [BugFix] Fix Ray collector on Python > 3.8 by @albertbou92 in https://github.com/pytorch/rl/pull/2015
  • [BugFix] Fix RoboHiveEnv tests by @sriramsk1999 in https://github.com/pytorch/rl/pull/2062
  • [BugFix] Fix _reset data passing in parallel env by @vmoens in https://github.com/pytorch/rl/pull/1880
  • [BugFix] Fix a bug in SliceSampler, indexes outside sampler lengths were produced by @vladisai in https://github.com/pytorch/rl/pull/1874
  • [BugFix] Fix args/kwargs passing in advantages by @vmoens in https://github.com/pytorch/rl/pull/2001
  • [BugFix] Fix batch-size expansion in functionalization by @vmoens in https://github.com/pytorch/rl/pull/1959
  • [BugFix] Fix broken gym tests by @vmoens in https://github.com/pytorch/rl/pull/1980
  • [BugFix] Fix clip_fraction in PO losses by @vmoens in https://github.com/pytorch/rl/pull/2021
  • [BugFix] Fix colab in tutos by @vmoens in https://github.com/pytorch/rl/pull/2113
  • [BugFix] Fix env.shape regex matches by @vmoens in https://github.com/pytorch/rl/pull/1940
  • [BugFix] Fix examples by @vmoens in https://github.com/pytorch/rl/pull/1945
  • [BugFix] Fix exploration in losses by @vmoens in https://github.com/pytorch/rl/pull/1898
  • [BugFix] Fix flaky rb tests by @vmoens in https://github.com/pytorch/rl/pull/1901
  • [BugFix] Fix habitat by @vmoens in https://github.com/pytorch/rl/pull/1941
  • [BugFix] Fix jumanji by @vmoens in https://github.com/pytorch/rl/pull/2064
  • [BugFix] Fix load_state_dict and is_empty td bugfix impact by @vmoens in https://github.com/pytorch/rl/pull/1869
  • [BugFix] Fix mp_start_method for ParallelEnv with single_for_serial by @vmoens in https://github.com/pytorch/rl/pull/2007
  • [BugFix] Fix multiple context syntax in multiagent examples by @matteobettini in https://github.com/pytorch/rl/pull/1943
  • [BugFix] Fix offline CatFrames by @vmoens in https://github.com/pytorch/rl/pull/1953
  • [BugFix] Fix offline CatFrames for pixels by @vmoens in https://github.com/pytorch/rl/pull/1964
  • [BugFix] Fix prints of size error when no file is associated with memmap by @vmoens in https://github.com/pytorch/rl/pull/2090
  • [BugFix] Fix replay buffer extension with lists by @vmoens in https://github.com/pytorch/rl/pull/1937
  • [BugFix] Fix reward2go for nd tensors by @vmoens in https://github.com/pytorch/rl/pull/2087
  • [BugFix] Fix robohive by @vmoens in https://github.com/pytorch/rl/pull/2080
  • [BugFix] Fix sampling without replacement with ndim storages by @vmoens in https://github.com/pytorch/rl/pull/1999
  • [BugFix] Fix slice sampler compatibility with split_trajs and MultiStep by @vmoens in https://github.com/pytorch/rl/pull/1961
  • [BugFix] Fix slicesampler terminated/truncated signaling by @vmoens in https://github.com/pytorch/rl/pull/2044
  • [BugFix] Fix strict-length for spanning trajectories by @vmoens in https://github.com/pytorch/rl/pull/1982
  • [BugFix] Fix strict_length=True in SliceSampler by @vmoens in https://github.com/pytorch/rl/pull/2037
  • [BugFix] Fix unwanted lazy stacks by @vmoens in https://github.com/pytorch/rl/pull/2102
  • [BugFix] Fix update in serial / parallel env by @vmoens in https://github.com/pytorch/rl/pull/1866
  • [BugFix] Fix vmas stacks by @vmoens in https://github.com/pytorch/rl/pull/2105
  • [BugFix] Fixed import for importlib by @DanilBaibak in https://github.com/pytorch/rl/pull/1914
  • [BugFix] Make KL-controllers independent of the model by @vmoens in https://github.com/pytorch/rl/pull/1903
  • [BugFix] Make sure ParallelEnv does not overflow mem when policy requires grad by @vmoens in https://github.com/pytorch/rl/pull/1909
  • [BugFix] More robust _StepMDP and multi-purpose envs by @vmoens in https://github.com/pytorch/rl/pull/2038
  • [BugFix] No grad on collector reset by @matteobettini in https://github.com/pytorch/rl/pull/1927
  • [BugFix] Non exclusive terminated and truncated by @vmoens in https://github.com/pytorch/rl/pull/1911
  • [BugFix] Refactor reductions by @vmoens in https://github.com/pytorch/rl/pull/1968
  • [BugFix] Remove split_trajectories's reference to ("next", "done"). by @initmaks in https://github.com/pytorch/rl/pull/2094
  • [BugFix] Remove reset on last step of a rollout by @matteobettini in https://github.com/pytorch/rl/pull/1936
  • [BugFix] Robust sync for non_blocking=True by @vmoens in https://github.com/pytorch/rl/pull/2034
  • [BugFix] Set default value for normalize_advantage to False. by @DobromirM in https://github.com/pytorch/rl/pull/2050
  • [BugFix] Set strict=False in tensordict.select() calls for objective classes by @albertbou92 in https://github.com/pytorch/rl/pull/2004
  • [BugFix] SliceSampler device and index mesh by @vmoens in https://github.com/pytorch/rl/pull/1996
  • [BugFix] Solve recursion issue in losses hook by @vmoens in https://github.com/pytorch/rl/pull/1897
  • [BugFix] Update cql docstring example by @BY571 in https://github.com/pytorch/rl/pull/1951
  • [BugFix] Update iql docstring example by @BY571 in https://github.com/pytorch/rl/pull/1950
  • [BugFix] Use same signature for append_transform in all cases by @vmoens in https://github.com/pytorch/rl/pull/2091
  • [BugFix] Use setdefault in _cache_values by @vmoens in https://github.com/pytorch/rl/pull/1910
  • [BugFix] Use traj_terminated in SliceSampler by @Cadene in https://github.com/pytorch/rl/pull/1884
  • [BugFix] Vmap randomness for value estimator by @BY571 in https://github.com/pytorch/rl/pull/1942
  • [BugFix] better device consistency in EGreedy by @vmoens in https://github.com/pytorch/rl/pull/1867
  • [BugFix] check_env_specs seeding logic by @vmoens in https://github.com/pytorch/rl/pull/1872
  • [BugFix] fix formatting for VideoRecorder docstring by @sriramsk1999 in https://github.com/pytorch/rl/pull/1985
  • [BugFix] fix trunc normal device by @vmoens in https://github.com/pytorch/rl/pull/1931
  • [BugFix] missing annotations import by @vmoens in https://github.com/pytorch/rl/pull/2074
  • [BugFix] state typo in RNG control module by @vmoens in https://github.com/pytorch/rl/pull/1878
  • [BugFix] to_observation_norm now works with keys which are not strings by @maxweissenbacher in https://github.com/pytorch/rl/pull/2045
  • [BugFix] union -> intersection in _StepMDP check by @vmoens in https://github.com/pytorch/rl/pull/2039
  • [CI,Doc] Sanitize version by @vmoens in https://github.com/pytorch/rl/pull/2120
  • [CI] Doc on release tag by @vmoens in https://github.com/pytorch/rl/pull/2116
  • [CI] Fix CI issues by @vmoens in https://github.com/pytorch/rl/pull/2084
  • [CI] Fix Doc CI by @matteobettini in https://github.com/pytorch/rl/pull/2106
  • [CI] Fixes sympy error by fixing mpmath version by @vmoens in https://github.com/pytorch/rl/pull/1988
  • [CI] Install ffmpeg in Robohive tests by @vmoens in https://github.com/pytorch/rl/pull/2063
  • [CI] Install stable torch and tensordict for release tests by @vmoens in https://github.com/pytorch/rl/pull/1978
  • [CI] Remove all macos x86 jobs by @vmoens in https://github.com/pytorch/rl/pull/2117
  • [CI] Remove x86 OSX jobs by @vmoens in https://github.com/pytorch/rl/pull/2112
  • [CI] Schedule workflows for releases by @vmoens in https://github.com/pytorch/rl/pull/2114
  • [CI] Temporarily remove snapshot from CI by @vmoens in https://github.com/pytorch/rl/pull/2000
  • [CI] Unpin mpmath by @vmoens in https://github.com/pytorch/rl/pull/1997
  • [CI] Upgrade 3.8 to 3.10 GPU jobs by @vmoens in https://github.com/pytorch/rl/pull/2013
  • [Deprecation] Deprecate in prep for release by @vmoens in https://github.com/pytorch/rl/pull/1820
  • [Doc,Feature] Better doc for modules and list of kwargs when possible by @vmoens in https://github.com/pytorch/rl/pull/1990
  • [Doc] Fix tutos by @vmoens in https://github.com/pytorch/rl/pull/1863
  • [Doc] Getting started tutos by @vmoens in https://github.com/pytorch/rl/pull/1886
  • [Doc] Improve PrioritizedSampler doc and get rid of np dependency as much as possible by @vmoens in https://github.com/pytorch/rl/pull/1881
  • [Doc] Installation instructions in API ref by @vmoens in https://github.com/pytorch/rl/pull/1871
  • [Doc] Per-release doc by @vmoens in https://github.com/pytorch/rl/pull/2108
  • [Documentation] Correct MaskedEnv Example in ActionMask Transform Documentation by @Jonathanace in https://github.com/pytorch/rl/pull/2060
  • [Examples] Move examples to sota-implementations by @vmoens in https://github.com/pytorch/rl/pull/2016
  • [Minor] Add env.shape attribute by @vmoens in https://github.com/pytorch/rl/pull/1938
  • [Minor] Lint by @vmoens in https://github.com/pytorch/rl/pull/2096
  • [Minor] Move distributed examples to examples by @vmoens in https://github.com/pytorch/rl/pull/2097
  • [Minor] Remove duplicate if statement in storages by @vmoens in https://github.com/pytorch/rl/pull/2066
  • [Minor] Remove warnings in test_cost by @vmoens in https://github.com/pytorch/rl/pull/1902
  • [Minor] Support init lazy storages with add by @vmoens in https://github.com/pytorch/rl/pull/2028
  • [Minor] Use the main branch for the M1 build wheels by @DanilBaibak in https://github.com/pytorch/rl/pull/1965
  • [Performance] Faster DMC by @vmoens in https://github.com/pytorch/rl/pull/2002
  • [Quality] Capture errors in specs transforms by @vmoens in https://github.com/pytorch/rl/pull/2092
  • [Quality] Make sure deprec warnings are displayed by @vmoens in https://github.com/pytorch/rl/pull/2088
  • [Refactor,Feature] Refactor collector shapes and stack_result in sync collector by @vmoens in https://github.com/pytorch/rl/pull/1994
  • [Refactor] Clearer separation between single_task and share_individual_td by @vmoens in https://github.com/pytorch/rl/pull/2026
  • [Refactor] Faster and more generic multi-agent nets by @vmoens in https://github.com/pytorch/rl/pull/1921
  • [Refactor] Refactor split_trajectories by @vmoens in https://github.com/pytorch/rl/pull/1955
  • [Refactor] Remove remnant legacy functional calls by @vmoens in https://github.com/pytorch/rl/pull/1973
  • [Refactor] Use filter_empty=False in apply for params by @vmoens in https://github.com/pytorch/rl/pull/1882
  • [Refactor] Use filter_empty=True in apply by @vmoens in https://github.com/pytorch/rl/pull/1879
  • [Tutorial] PettingZoo Parallel competitive tutorial by @matteobettini in https://github.com/pytorch/rl/pull/2047
  • [Versioning] Deprecations for 0.4 by @vmoens in https://github.com/pytorch/rl/pull/2109
  • [Versioning] New torch version by @vmoens in https://github.com/pytorch/rl/pull/2110
  • [Versioning] v0.4.0 by @vmoens in https://github.com/pytorch/rl/pull/1860

New Contributors

  • @vladisai made their first contribution in https://github.com/pytorch/rl/pull/1874
  • @Cadene made their first contribution in https://github.com/pytorch/rl/pull/1884
  • @sriramsk1999 made their first contribution in https://github.com/pytorch/rl/pull/1985
  • @DobromirM made their first contribution in https://github.com/pytorch/rl/pull/2050
  • @Jonathanace made their first contribution in https://github.com/pytorch/rl/pull/2060
  • @maxweissenbacher made their first contribution in https://github.com/pytorch/rl/pull/2045
  • @initmaks made their first contribution in https://github.com/pytorch/rl/pull/2094

A big thanks to our dear contributors as well as the entire user base for helping with this lib!

Full Changelog: https://github.com/pytorch/rl/compare/v0.3.0...v0.4.0

- Python
Published by vmoens about 2 years ago

torchrl - v0.3.1

This release provides a bunch of bug fixes and speedups.

What's Changed

  • [BugFix] Fix broken gym tests (#1980)
  • [BugFix,CI] Fix Windows CI (#1983)
  • [Minor] Cleanup
  • [CI] Install stable torch and tensordict for release tests (#1978)
  • [Refactor] Remove remnant legacy functional calls (#1973)
  • [Minor] Use the main branch for the M1 build wheels (#1965)
  • [BugFix] Fixed import for importlib (#1914)
  • [BugFix] Fix offline CatFrames for pixels (#1964)
  • [BugFix] Fix offline CatFrames (#1953)
  • [BugFix] Fix batch-size expansion in functionalization (#1959)
  • [BugFix] Update iql docstring example (#1950)
  • [BugFix] Update cql docstring example (#1951)
  • [BugFix] Fix examples (#1945)
  • [BugFix] Remove reset on last step of a rollout (#1936)
  • [BugFix] Vmap randomness for value estimator (#1942)
  • [BugFix] Fix multiple context syntax in multiagent examples (#1943)
  • [BugFix] Fix habitat (#1941)
  • [BugFix] Fix env.shape regex matches (#1940)
  • [Minor] Add env.shape attribute (#1938)
  • [BugFix] Fix replay buffer extension with lists (#1937)
  • [BugFix] No grad on collector reset (#1927)
  • [BugFix] fix trunc normal device (#1931)
  • [BugFix, Performance] Fewer imports at root (#1930)
  • [BugFix] Fix OOB TruncatedNormal LP (#1924)
  • [BugFix] Fix KLPENPPOLoss KL computation (#1922)
  • [Doc] Fix onw typo (#1917)
  • [BugFix] Make sure ParallelEnv does not overflow mem when policy requires grad (#1909)
  • [BugFix] Non exclusive terminated and truncated (#1911)
  • [BugFix] Use setdefault in _cache_values (#1910)
  • [BugFix] Fix Ray collector example error (#1908)
  • [BugFix] Make KL-controllers independent of the model (#1903)
  • [Minor] Remove warnings in test_cost (#1902)
  • [BugFix] Adaptable non-blocking for mps and non cuda device in batched-envs (#1900)
  • [BugFix] Fix flaky rb tests (#1901)
  • [BugFix] Fix exploration in losses (#1898)
  • [BugFix] Solve recursion issue in losses hook (#1897)
  • [Doc] Update getting-started-5.py (#1894)
  • [Doc] Getting started tutos (#1886)
  • [BugFix] Use traj_terminated in SliceSampler (#1884)
  • [Doc] Improve PrioritizedSampler doc and get rid of np dependency as much as possible (#1881)
  • [BugFix] Fix reset data passing in parallel env (#1880)
  • [BugFix] state typo in RNG control module (#1878)
  • [BugFix] Fix a bug in SliceSampler, indexes outside sampler lengths were produced (#1874)
  • [BugFix] check_env_specs seeding logic (#1872)
  • [BugFix] Fix update in serial / parallel env (#1866)
  • [Doc] Installation instructions in API ref (#1871)
  • [BugFix] better device consistency in EGreedy (#1867)
  • [BugFix] Fix load_state_dict and is_empty td bugfix impact (#1869)
  • [Doc] Fix tutos (#1863)

Full Changelog: https://github.com/pytorch/rl/compare/v0.3.0...v0.3.1

- Python
Published by vmoens about 2 years ago

torchrl - v0.3.0: Data hub, universal env converter and more!

In this release, we focused on building a Data Hub for offline RL, providing a universal 2gym conversion tool (#1795) and improving the doc.

TorchRL Data Hub

TorchRL now offers many offline datasets in robotics and control or gaming, all under a single data format (TED, for TorchRL Episode Data format). All datasets are one step away from being downloaded: dataset = <Name>ExperienceReplay(dataset_id, root="/path/to/storage", download=True) is all you need to get started. This means that you can now download OpenX (#1751) or Roboset (#1743) datasets and combine them in a single replay buffer (#1768), or swap one for another in no time and with no extra code. We also support many new sampling techniques, such as sampling slices of trajectories with or without repetition. As always, you can append your favourite transforms to these datasets.
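To make "sampling slices of trajectories" concrete, here is a minimal, library-free sketch of the idea. It is not TorchRL's SliceSampler implementation: the function name `sample_slices` and its parameters are hypothetical, and it assumes each trajectory is stored contiguously and that slices are drawn with repetition.

```python
import random


def sample_slices(traj_ids, num_slices, slice_len, rng=None):
    """Return `num_slices` lists of consecutive storage indices.

    `traj_ids[i]` is the trajectory id of the transition stored at index i.
    Assumption: transitions of one trajectory are stored contiguously.
    """
    rng = rng or random.Random()
    # Group contiguous runs of identical trajectory ids into (start, stop) spans.
    spans = []
    start = 0
    for i in range(1, len(traj_ids) + 1):
        if i == len(traj_ids) or traj_ids[i] != traj_ids[start]:
            spans.append((start, i))
            start = i
    # Only trajectories long enough to hold a full slice are eligible.
    eligible = [(a, b) for a, b in spans if b - a >= slice_len]
    if not eligible:
        raise ValueError("no trajectory is at least `slice_len` steps long")
    slices = []
    for _ in range(num_slices):
        a, b = rng.choice(eligible)  # with repetition across slices
        first = rng.randint(a, b - slice_len)  # randint bounds are inclusive
        slices.append(list(range(first, first + slice_len)))
    return slices


# Three trajectories of lengths 4, 2 and 5 stored back to back.
traj_ids = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2]
batch = sample_slices(traj_ids, num_slices=3, slice_len=3, rng=random.Random(0))
```

Each element of `batch` is a run of consecutive indices that never crosses a trajectory boundary, which is the property a recurrent or multi-step loss typically relies on.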

TorchRL2Gym universal converter

#1795 introduces a new universal converter for simulation libraries to gym.

As an RL practitioner, it's sometimes difficult to accommodate the many different environment APIs that exist. TorchRL now provides a way of registering any env in gym(nasium). This allows users to build their datasets in torchrl and integrate them in their code base with no effort if they are already using gym as a backend. It also makes it possible to convert DMControl or Brax envs (among others) to gym without the need for an extra library.

PPO and A2C compatibility with distributed models

Functional calls can now be turned off for PPO and A2C loss modules, allowing users to run RLHF training loops at scale! #1804

TensorDict-free replay buffers

You can now use TorchRL's replay buffer with ANY tensor-based structure, whether it involves dict, tuples or lists. In principle, storing data contiguously on disk given any gym environment is as simple as

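```python
from torchrl.data import LazyMemmapStorage, ReplayBuffer

# `capacity`, `env`, `obs` and `action` are assumed to come from the surrounding training loop
rb = ReplayBuffer(storage=LazyMemmapStorage(capacity))
obs_, reward, terminal, truncated, info = env.step(action)
rb.add((obs, obs_, reward, terminal, truncated, info, action))

# sampling a tuple (obs, obs_, reward, terminal, truncated, info, action)
obs, obs_, reward, terminal, truncated, info, action = rb.sample()
```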

This is independent of TensorDict and supports many components of our replay buffers as well as transforms. See the replay buffer documentation for details.

Multiprocessed replay buffers

TorchRL's replay buffers can now be shared across processes. Multiprocessed RBs can not only be read from but also extended on different workers. #1724

SOTA checks

We introduce a list of scripts to check that our training scripts work ok before each release: #1822

Throughput of Gym and DMControl

We now skip many of the checks in GymLikeEnv when some basic conditions are met, which improves the throughput significantly for simple envs. #1803

Algorithms

We introduce discrete CQL (#1666), discrete IQL (#1793) and Impala (#1506).

What's Changed: PR description

  • [BugFix] Fix incorrect deprecation warning by @mikemykhaylov in https://github.com/pytorch/rl/pull/1655
  • [Bug] TensorDictMaxValueWriter raises error when no sample in a batch is accepted by @albertbou92 in https://github.com/pytorch/rl/pull/1664
  • [BugFix] Fix "done" instead of "terminated" mistakes by @MarCnu in https://github.com/pytorch/rl/pull/1661
  • [Feature] CatFrames constant padding by @albertbou92 in https://github.com/pytorch/rl/pull/1663
  • doc(README): remove typo by @Deep145757 in https://github.com/pytorch/rl/pull/1665
  • [Docs] Update README.md by @vaibhav-009 in https://github.com/pytorch/rl/pull/1667
  • [Minor] Update dreamer example tests by @vmoens in https://github.com/pytorch/rl/pull/1668
  • [Feature] Introduce grouping in VMAS by @matteobettini in https://github.com/pytorch/rl/pull/1658
  • [BugFix] assertion error message, envs/util.py by @laszloKopits in https://github.com/pytorch/rl/pull/1669
  • [Doc] Set action_spec instead of input_spec by @FrankTianTT in https://github.com/pytorch/rl/pull/1657
  • [BugFix] Fix submitit IP address/node name retrieval by @vmoens in https://github.com/pytorch/rl/pull/1672
  • [Doc] Document (and test) compound actor by @vmoens in https://github.com/pytorch/rl/pull/1673
  • [Doc] Update rollout_recurrent.png to account for terminal by @vmoens in https://github.com/pytorch/rl/pull/1677
  • [Doc] Add EGreedyWrapper back in the doc by @vmoens in https://github.com/pytorch/rl/pull/1680
  • [Doc] Fix TanhDelta docstring by @matteobettini in https://github.com/pytorch/rl/pull/1683
  • [Doc] Add discord badge on README by @vmoens in https://github.com/pytorch/rl/pull/1686
  • [CI] Downgrade RAY to fix CI by @vmoens in https://github.com/pytorch/rl/pull/1687
  • [BugFix] MaxValueWriter cuda compatibility by @albertbou92 in https://github.com/pytorch/rl/pull/1689
  • Upload docs for preview on HUD by @DanilBaibak in https://github.com/pytorch/rl/pull/1682
  • [Doc] Update pendulum and rnn tutos by @vmoens in https://github.com/pytorch/rl/pull/1691
  • [Algorithm] Discrete CQL by @BY571 in https://github.com/pytorch/rl/pull/1666
  • [BugFix] Minor fix in the logging of PPO and A2C examples by @albertbou92 in https://github.com/pytorch/rl/pull/1693
  • [CI] Enable retry mechanism by @DanilBaibak in https://github.com/pytorch/rl/pull/1681
  • [Refactor] Minor changes in prep of https://github.com/pytorch/tensordict/pull/541 by @vmoens in https://github.com/pytorch/rl/pull/1696
  • [BugFix] fix dreamer actor by @FrankTianTT in https://github.com/pytorch/rl/pull/1697
  • [Refactor] Deprecate direct usage of memmap tensors by @vmoens in https://github.com/pytorch/rl/pull/1684
  • Revert "[Refactor] Deprecate direct usage of memmap tensors" by @vmoens in https://github.com/pytorch/rl/pull/1698
  • [Refactor] Deprecate direct usage of memmap tensors by @vmoens in https://github.com/pytorch/rl/pull/1699
  • [Doc] Fix discord link by @vmoens in https://github.com/pytorch/rl/pull/1701
  • [BugFix] make sure the params of exploration-wrapper is float by @FrankTianTT in https://github.com/pytorch/rl/pull/1700
  • [Fix] EndOfLifeTransform fix in end of life detection by @albertbou92 in https://github.com/pytorch/rl/pull/1705
  • [CI] Fix benchmark on gpu by @vmoens in https://github.com/pytorch/rl/pull/1706
  • [Algorithm] IMPALA and VTrace module by @albertbou92 in https://github.com/pytorch/rl/pull/1506
  • [Doc] Fix discord link by @vmoens in https://github.com/pytorch/rl/pull/1712
  • [Refactor] Refactor functional calls in losses by @vmoens in https://github.com/pytorch/rl/pull/1707
  • [CI] Fix CI by @vmoens in https://github.com/pytorch/rl/pull/1711
  • [BugFix] Make casting to 'meta' device uniform across cost modules by @vmoens in https://github.com/pytorch/rl/pull/1715
  • [BugFix] Change ppo mujoco example to match paper results by @albertbou92 in https://github.com/pytorch/rl/pull/1714
  • [Minor] Hide params in ddpg actor-critic by @vmoens in https://github.com/pytorch/rl/pull/1716
  • [BugFix] Fix hold_out_net by @vmoens in https://github.com/pytorch/rl/pull/1719
  • [BugFix] RewardSum key check by @matteobettini in https://github.com/pytorch/rl/pull/1718
  • [Feature] Allow usage of a different device on main and sub-envs in ParallelEnv and SerialEnv by @vmoens in https://github.com/pytorch/rl/pull/1626
  • [Refactor] Better weight update in collectors by @vmoens in https://github.com/pytorch/rl/pull/1723
  • [Feature] Shared replay buffers by @vmoens in https://github.com/pytorch/rl/pull/1724
  • [CI] FIx nightly builds on osx by @vmoens in https://github.com/pytorch/rl/pull/1726
  • [BugFix] _call_actor_net does not handle multiple inputs by @albertbou92 in https://github.com/pytorch/rl/pull/1728
  • [Feature] Python-based RNN Modules by @albertbou92 in https://github.com/pytorch/rl/pull/1720
  • [BugFix, Test] Fix flaky gym vecenvs tests by @vmoens in https://github.com/pytorch/rl/pull/1727
  • [BugFix] Fix non-full TensorStorage indexing by @vmoens in https://github.com/pytorch/rl/pull/1730
  • [Feature] Minari datasets by @vmoens in https://github.com/pytorch/rl/pull/1721
  • [Feature] All VMAS scenarios available by @matteobettini in https://github.com/pytorch/rl/pull/1731
  • [Feature] pickle-free RB checkpointing by @vmoens in https://github.com/pytorch/rl/pull/1733
  • [CI] Fix doc upload by @vmoens in https://github.com/pytorch/rl/pull/1738
  • [BugFix] Fix RNNs trajectory split in VMAP calls by @vmoens in https://github.com/pytorch/rl/pull/1736
  • [CI] Fix doc upload by @vmoens in https://github.com/pytorch/rl/pull/1739
  • [BugFix, Feature] Fix DDQN implementation by @vmoens in https://github.com/pytorch/rl/pull/1737
  • [Algorithm] Update DQN example by @albertbou92 in https://github.com/pytorch/rl/pull/1512
  • [BugFix] Use rsync in doc workflow by @vmoens in https://github.com/pytorch/rl/pull/1741
  • [BugFix] Fix compat with new memmap API by @vmoens in https://github.com/pytorch/rl/pull/1744
  • [Feature] Roboset datasets by @vmoens in https://github.com/pytorch/rl/pull/1743
  • [Algorithm] Simpler IQL example by @BY571 in https://github.com/pytorch/rl/pull/998
  • [Performance] Faster RNNs by @vmoens in https://github.com/pytorch/rl/pull/1732
  • [BugFix, Test] Fix torch.vmap call in RNN tests by @vmoens in https://github.com/pytorch/rl/pull/1749
  • [BugFix] Fix discrete SAC log-prob by @vmoens in https://github.com/pytorch/rl/pull/1750
  • [Minor] Remove dead code in RolloutFromModel by @ianbarber in https://github.com/pytorch/rl/pull/1752
  • [Minor] Fix runnability of RLHF example in examples/rlhf by @ianbarber in https://github.com/pytorch/rl/pull/1753
  • [Feature] SliceSampler by @vmoens in https://github.com/pytorch/rl/pull/1748
  • [CI] Fix windows CI by @vmoens in https://github.com/pytorch/rl/pull/1746
  • [CI] Fix CI for optional dependencies by @vmoens in https://github.com/pytorch/rl/pull/1754
  • [Feature] V-D4RL by @vmoens in https://github.com/pytorch/rl/pull/1756
  • [Benchmark] Fix RB benchmarks by @vmoens in https://github.com/pytorch/rl/pull/1760
  • [BugFix] Fix RLHF by @vmoens in https://github.com/pytorch/rl/pull/1757
  • [BugFix] Fix slice sampler by @vmoens in https://github.com/pytorch/rl/pull/1762
  • [Feature] BurnInTransform by @albertbou92 in https://github.com/pytorch/rl/pull/1765
  • [Bug] Minor change burnin transform by @albertbou92 in https://github.com/pytorch/rl/pull/1770
  • [BugFix] Fix sampling of last item in SliceSampler by @vmoens in https://github.com/pytorch/rl/pull/1774
  • [Feature] Open-X Embodiement datasets by @vmoens in https://github.com/pytorch/rl/pull/1751
  • [BugFix] Fix documentation of threads for batched envs. by @skandermoalla in https://github.com/pytorch/rl/pull/1776
  • [BugFix, CI] Fix OpenML datasets runs by @vmoens in https://github.com/pytorch/rl/pull/1779
  • [Versioning] Bump v0.3.0 and fix m1-wheels by @vmoens in https://github.com/pytorch/rl/pull/1780
  • [Feature] Composite replay buffers by @vmoens in https://github.com/pytorch/rl/pull/1768
  • [BugFix, Feature] Vmap randomness in losses by @BY571 in https://github.com/pytorch/rl/pull/1740
  • [Algorithm] Update discrete SAC example by @BY571 in https://github.com/pytorch/rl/pull/1745
  • [Docs] Pointers to BenchMARL by @matteobettini in https://github.com/pytorch/rl/pull/1710
  • [Feature] Immutable writer for datasets by @vmoens in https://github.com/pytorch/rl/pull/1781
  • [Feature] Remove and check for prints in codebase using flake8-print by @vmoens in https://github.com/pytorch/rl/pull/1758
  • [BUG] Missing import for some Samplers in Data module by @albertbou92 in https://github.com/pytorch/rl/pull/1784
  • [BugFix] Ensure that infos and samples have the same batch-size in SamplerEnsemble by @vmoens in https://github.com/pytorch/rl/pull/1786
  • [BugFix] Writers extend() method should always return indices in data.device by @albertbou92 in https://github.com/pytorch/rl/pull/1785
  • [Doc] Revamp envs doc by @vmoens in https://github.com/pytorch/rl/pull/1787
  • [BugFix] Less flaky gym vecenv test by @vmoens in https://github.com/pytorch/rl/pull/1790
  • [CI] Regroup tests by @vmoens in https://github.com/pytorch/rl/pull/1791
  • [CI] Remove stable GPU tests from CI by @vmoens in https://github.com/pytorch/rl/pull/1792
  • Update README.md to fix CI banner by @vmoens in https://github.com/pytorch/rl/pull/1794
  • [Feature] SamplerWithoutReplacement state dictionary by @matteobettini in https://github.com/pytorch/rl/pull/1788
  • [BugFix] Higher time threshold for PEnv by @vmoens in https://github.com/pytorch/rl/pull/1799
  • [Feature] SignTransform by @albertbou92 in https://github.com/pytorch/rl/pull/1798
  • [Feature] Extend MaxValueWriter with reduce parameter for the rank_key by @albertbou92 in https://github.com/pytorch/rl/pull/1796
  • [BugFix] Fixes bug in MaxValueWriter tests by @albertbou92 in https://github.com/pytorch/rl/pull/1801
  • [Performance] faster gym-like class by @vmoens in https://github.com/pytorch/rl/pull/1803
  • [Feature] GenDGRL by @vmoens in https://github.com/pytorch/rl/pull/1773
  • [Performance] Minor improvements to step_and_maybe_reset in batched envs by @vmoens in https://github.com/pytorch/rl/pull/1807
  • [Algorithm] Discrete IQL by @BY571 in https://github.com/pytorch/rl/pull/1793
  • [Doc] More depth in VMAS docs by @matteobettini in https://github.com/pytorch/rl/pull/1802
  • [BugFix] Remove select() in favor of empty() by @vmoens in https://github.com/pytorch/rl/pull/1811
  • Bump jinja2 from 3.1.2 to 3.1.3 in /docs by @dependabot in https://github.com/pytorch/rl/pull/1812
  • [BugFix] Make TransformedEnv mirror allow_done_after_reset property of base env by @matteobettini in https://github.com/pytorch/rl/pull/1810
  • [Doc] Update StepCounter doc by @skandermoalla in https://github.com/pytorch/rl/pull/1813
  • [Feature] Improve info_dict reader by @vmoens in https://github.com/pytorch/rl/pull/1809
  • [CI, Minor] Regroup Gen-DGRL CI with other libs by @vmoens in https://github.com/pytorch/rl/pull/1814
  • [Versioning] Housekeeping in setup.py by @vmoens in https://github.com/pytorch/rl/pull/1816
  • [Feature] TorchRL2Gym conversion by @vmoens in https://github.com/pytorch/rl/pull/1795
  • [BugFix, CI] Fix snapshop imports in stable CI by @vmoens in https://github.com/pytorch/rl/pull/1821
  • [Feature] More flexibility in loading PettingZoo by @matteobettini in https://github.com/pytorch/rl/pull/1817
  • [Docs] Fix doc of ToTensorImage transforms.py by @skandermoalla in https://github.com/pytorch/rl/pull/1824
  • [BugFix] Fix device of container generated values in transforms by @vmoens in https://github.com/pytorch/rl/pull/1827
  • [Feature] Atari DQN dataset by @vmoens in https://github.com/pytorch/rl/pull/1815
  • [Feature] Non-functional objectives (PPO, A2C, Reinforce) by @vmoens in https://github.com/pytorch/rl/pull/1804
  • [Refactor] change default CKPT_BACKEND to torch by @vmoens in https://github.com/pytorch/rl/pull/1830
  • pyproject.toml: remove unknown properties by @GaetanLepage in https://github.com/pytorch/rl/pull/1828
  • [Doc, Feature] Doc improvements for video recording and CSV video formats by @vmoens in https://github.com/pytorch/rl/pull/1829
  • [Feature] PyTrees in replay buffers by @vmoens in https://github.com/pytorch/rl/pull/1831
  • [BugFix] Fix sequential step counts by @vmoens in https://github.com/pytorch/rl/pull/1838
  • [Doc] TED format by @vmoens in https://github.com/pytorch/rl/pull/1836
  • [Doc] References to TED by @vmoens in https://github.com/pytorch/rl/pull/1839
  • [BugFix] Temporarily set lazy legacy to True by @vmoens in https://github.com/pytorch/rl/pull/1840
  • [BugFix] Fix gym info scalar infos by @vmoens in https://github.com/pytorch/rl/pull/1842
  • [Refactor] LAZY_LEGACY_OP=False by @vmoens in https://github.com/pytorch/rl/pull/1832
  • [Feature] serial_for_single arg in batched envs by @vmoens in https://github.com/pytorch/rl/pull/1846
  • [BugFix] Fix VD4RL by @vmoens in https://github.com/pytorch/rl/pull/1834
  • [Doc] Make tutos runnable without colab by @vmoens in https://github.com/pytorch/rl/pull/1826
  • [Feature] Fine control over devices in collectors by @vmoens in https://github.com/pytorch/rl/pull/1835
  • [Feature, BugFix] Better thread control in penv and collectors by @vmoens in https://github.com/pytorch/rl/pull/1848
  • [CI] Update macos image by @vmoens in https://github.com/pytorch/rl/pull/1849
  • [BugFix] thread setting bug by @vmoens in https://github.com/pytorch/rl/pull/1852
  • Remove unused completed_keys property from StepCounter. by @skandermoalla in https://github.com/pytorch/rl/pull/1854
  • [Feature] Submitit run script by @albertbou92 in https://github.com/pytorch/rl/pull/1822
  • [BugFix] Fix flaky gym penv test by @vmoens in https://github.com/pytorch/rl/pull/1853
  • [CI] Fix macos build by @vmoens in https://github.com/pytorch/rl/pull/1856

New Contributors

  • @mikemykhaylov made their first contribution in https://github.com/pytorch/rl/pull/1655
  • @MarCnu made their first contribution in https://github.com/pytorch/rl/pull/1661
  • @Deep145757 made their first contribution in https://github.com/pytorch/rl/pull/1665
  • @vaibhav-009 made their first contribution in https://github.com/pytorch/rl/pull/1667
  • @laszloKopits made their first contribution in https://github.com/pytorch/rl/pull/1669
  • @ianbarber made their first contribution in https://github.com/pytorch/rl/pull/1752
  • @dependabot made their first contribution in https://github.com/pytorch/rl/pull/1812
  • @GaetanLepage made their first contribution in https://github.com/pytorch/rl/pull/1828

Full Changelog: https://github.com/pytorch/rl/compare/v0.2.1...v0.3.0

- Python
Published by vmoens over 2 years ago

torchrl - v0.2.1: Faster parallel envs, fixes in transforms and M1 wheel fix

What's Changed

  • [Feature] Warning for init_random_frames rounding in collectors by @matteobettini in https://github.com/pytorch/rl/pull/1616
  • [Feature] Add support of non-pickable gym env by @duburcqa in https://github.com/pytorch/rl/pull/1615
  • [BugFix] Add keys to GAE in PPO/A2C by @vmoens in https://github.com/pytorch/rl/pull/1618
  • [BugFix] Fix gym benchmark by @vmoens in https://github.com/pytorch/rl/pull/1619
  • [BugFix] Fix shape setting in CompositeSpec by @vmoens in https://github.com/pytorch/rl/pull/1620
  • [Deprecation] Deprecate ambiguous device for memmap replay buffer by @vmoens in https://github.com/pytorch/rl/pull/1624
  • [CI] Fix CI (python and cuda versions) by @vmoens in https://github.com/pytorch/rl/pull/1621
  • [Feature] Max Value Writer by @albertbou92 in https://github.com/pytorch/rl/pull/1622
  • [CI] Cython<3 for d4rl by @vmoens in https://github.com/pytorch/rl/pull/1634
  • [BugFix] make cursor a torch.long tensor by @vmoens in https://github.com/pytorch/rl/pull/1639
  • [BugFix] Gracefully handle C++ import error in TorchRL by @vmoens in https://github.com/pytorch/rl/pull/1640
  • [Feature] step_and_maybe_reset in env by @vmoens in https://github.com/pytorch/rl/pull/1611
  • [BugFix] Avoid overlapping temporary dirs during training by @vmoens in https://github.com/pytorch/rl/pull/1635
  • [Feature] Exclude all private keys in collectors by @vmoens in https://github.com/pytorch/rl/pull/1644
  • [BugFix] Fix tutos by @vmoens in https://github.com/pytorch/rl/pull/1648
  • [Feature] Lazy imports for implement_for during torchrl import by @vmoens in https://github.com/pytorch/rl/pull/1646
  • [Refactor] Put all buffers on CPU in examples by @vmoens in https://github.com/pytorch/rl/pull/1645
  • [BugFix] Fix storage device by @vmoens in https://github.com/pytorch/rl/pull/1650
  • [BugFix] Fix EXAMPLES.md by @vmoens in https://github.com/pytorch/rl/pull/1649
  • [Release] 0.2.1 by @vmoens in https://github.com/pytorch/rl/pull/1642

New Contributors

  • @duburcqa made their first contribution in https://github.com/pytorch/rl/pull/1615

Full Changelog: https://github.com/pytorch/rl/compare/v0.2.0...v0.2.1

- Python
Published by vmoens over 2 years ago

torchrl - 0.2.0: Faster collection, MARL compatibility and RLHF prototype

TorchRL 0.2.0

This release provides many new features and bug fixes.

TorchRL now publishes Apple Silicon compatible wheels. We drop coverage of python 3.7 in favour of 3.11.

New and updated algorithms

Most algorithms have been cleaned and designed to reach (at least) SOTA results.

Compatibility with MARL settings has been drastically improved, and we provide a good number of MARL examples within the library.

A prototype RLHF training script is also proposed (#1597)

A whole new category of offline RL algorithms has been integrated: Decision transformers.

  • [Algorithm] Update offpolicy examples by @BY571 in https://github.com/pytorch/rl/pull/1206
  • [Algorithm] Online Decision transformer by @BY571 in https://github.com/pytorch/rl/pull/1149
  • [Algorithm] QMixer loss and multiagent models by @matteobettini in https://github.com/pytorch/rl/pull/1378
  • [Algorithm] RLHF end-to-end, clean by @vmoens in https://github.com/pytorch/rl/pull/1597
  • [Algorithm] Update A2C examples by @albertbou92 in https://github.com/pytorch/rl/pull/1521
  • [Algorithm] Update DDPG Example by @BY571 in https://github.com/pytorch/rl/pull/1525
  • [Algorithm] Update DT by @BY571 in https://github.com/pytorch/rl/pull/1560
  • [Algorithm] Update PPO examples by @albertbou92 in https://github.com/pytorch/rl/pull/1495
  • [Algorithm] Update SAC Example by @BY571 in https://github.com/pytorch/rl/pull/1524
  • [Algorithm] Update TD3 Example by @BY571 in https://github.com/pytorch/rl/pull/1523

New features

One of the major new features of the library is the introduction of the terminated / truncated / done distinction at no cost within the library. All third-party and primary environments are now compatible with this, as well as losses and data collection primitives (collectors, etc.). This feature is also compatible with complex data structures, such as those found in MARL training pipelines.
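As an illustration of why the distinction matters, the sketch below uses the usual convention (as in the Gymnasium step API) that an episode is done when it is either terminated or truncated, while only true termination should stop value bootstrapping. The helper names `done_flag` and `td_target` are hypothetical and are not part of TorchRL.

```python
def done_flag(terminated: bool, truncated: bool) -> bool:
    """An episode is over (done) if it terminated or was truncated."""
    return terminated or truncated


def td_target(reward: float, next_value: float, terminated: bool, gamma: float = 0.99) -> float:
    """One-step TD target that bootstraps unless the episode truly terminated.

    A truncated episode (e.g. a time limit) still bootstraps from the
    estimated value of the next state; a terminated one does not.
    """
    bootstrap = 0.0 if terminated else 1.0
    return reward + gamma * bootstrap * next_value


# Time-limit truncation: the trajectory ends, but the target keeps the bootstrap term.
truncated_target = td_target(reward=1.0, next_value=10.0, terminated=False)
# True termination: the target reduces to the immediate reward.
terminated_target = td_target(reward=1.0, next_value=10.0, terminated=True)
```

Treating a truncated step as terminated would drop the bootstrap term and bias the value estimate downward for episodes cut short by a time limit.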

All losses are now compatible with tensordict-free inputs, for a more generic deployment.

New transforms

Atari games can now benefit from an EndOfLifeTransform that allows using the end-of-life signal as a done state in the loss (#1605).

We provide a KL transform to add a KL factor to the reward in RLHF settings.
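A minimal sketch of the idea, assuming the common RLHF formulation in which the reward is penalised by a coefficient times a per-sample estimate of the KL divergence between the trained policy and a frozen reference policy. The function name `kl_shaped_reward` and the `coef` parameter are hypothetical; this is not the transform's actual code.

```python
def kl_shaped_reward(reward: float, log_prob_policy: float, log_prob_ref: float, coef: float = 0.1) -> float:
    """Subtract a KL penalty from the reward.

    The log-ratio log pi(a|s) - log pi_ref(a|s) of the sampled action is a
    single-sample estimate of KL(pi || pi_ref).
    """
    kl_estimate = log_prob_policy - log_prob_ref
    return reward - coef * kl_estimate


# The policy assigns a higher log-probability to the action than the reference does,
# so the reward is reduced.
shaped = kl_shaped_reward(reward=1.0, log_prob_policy=-1.0, log_prob_ref=-2.0, coef=0.5)
```

When the two policies agree on the sampled action, the penalty vanishes and the reward is left unchanged.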

Action masking is made possible through the ActionMask transform (#1421)
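Conceptually, action masking restricts both random and greedy action selection to the actions currently allowed by the environment. The sketch below shows the idea in plain Python; `sample_masked_action` and `masked_argmax` are hypothetical helpers and do not reflect the ActionMask transform's API.

```python
import random


def _allowed_indices(mask):
    """Indices of the actions whose mask entry is truthy."""
    allowed = [i for i, ok in enumerate(mask) if ok]
    if not allowed:
        raise ValueError("the mask forbids every action")
    return allowed


def sample_masked_action(mask, rng=None):
    """Draw a uniformly random action among the allowed ones."""
    rng = rng or random.Random()
    return rng.choice(_allowed_indices(mask))


def masked_argmax(values, mask):
    """Greedy action restricted to the allowed ones."""
    return max(_allowed_indices(mask), key=lambda i: values[i])


mask = [False, True, True]
q_values = [5.0, 1.0, 3.0]
greedy = masked_argmax(q_values, mask)  # action 0 has the highest value but is masked out
samples = [sample_masked_action(mask, random.Random(seed)) for seed in range(20)]
```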

VC1 is also integrated for better image embedding.

  • [Feature] Allow sequential transforms to work offline by @vmoens in https://github.com/pytorch/rl/pull/1136
  • [Feature] ClipTransform + rename min/maximum -> low/high by @vmoens in https://github.com/pytorch/rl/pull/1500
  • [Feature] End-of-life transform by @vmoens in https://github.com/pytorch/rl/pull/1605
  • [Feature] KL Transform for RLHF by @vmoens in https://github.com/pytorch/rl/pull/1196
  • [Features] Conv3dNet and PermuteTransform by @xmaples in https://github.com/pytorch/rl/pull/1398
  • [Feature, Refactor] Scale in ToTensorImage based on the dtype and new from_int parameter by @hyerra in https://github.com/pytorch/rl/pull/1208
  • [Feature] CatFrames used as inverse by @BY571 in https://github.com/pytorch/rl/pull/1321
  • [Feature] Masking actions by @vmoens in https://github.com/pytorch/rl/pull/1421
  • [Feature] VC1 integration by @vmoens in https://github.com/pytorch/rl/pull/1211

New models

We provide GRU alongside LSTM for POMDP training.

MARL model coverage is now richer, with MultiAgentMLP and MultiAgentCNN! Other improvements for MARL include support for nested keys in most places of the library (losses, data collection, environments...).
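
For readers unfamiliar with nested keys: a nested key is a tuple of strings that addresses an entry inside a nested container, such as per-agent data grouped under an `"agents"` entry. The sketch below mimics that lookup with plain dictionaries; `get_nested` and the data layout are illustrative assumptions rather than library code.

```python
def get_nested(data, key):
    """Look up a plain string key or a tuple-of-strings nested key."""
    if isinstance(key, str):
        key = (key,)
    for part in key:
        data = data[part]
    return data


step = {
    "agents": {
        "observation": [[0.1, 0.2], [0.3, 0.4]],  # one row per agent
        "reward": [1.0, 0.5],
    },
    "done": False,
}

agent_rewards = get_nested(step, ("agents", "reward"))
episode_done = get_nested(step, "done")
```

Components that accept nested keys can therefore read a per-agent reward at `("agents", "reward")` instead of expecting a top-level `"reward"` entry.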

  • [Feature] Support for GRU by @vmoens in https://github.com/pytorch/rl/pull/1586
  • [Feature] TanhModule by @vmoens in https://github.com/pytorch/rl/pull/1213
  • [Features] Conv3dNet and PermuteTransform by @xmaples in https://github.com/pytorch/rl/pull/1398
  • [Feature] CNN version of MultiAgentMLP by @MarkHaoxiang in https://github.com/pytorch/rl/pull/1479

Other features (misc)

  • [Feature] RLHF Rollouts (reopened) by @vmoens in https://github.com/pytorch/rl/pull/1329
  • [Feature] Add CQL by @BY571 in https://github.com/pytorch/rl/pull/1239
  • [Feature] Allow multiple (nested) action, reward, done keys in env,vec_env and collectors by @matteobettini in https://github.com/pytorch/rl/pull/1462
  • [Feature] Auto-DoubleToFloat by @vmoens in https://github.com/pytorch/rl/pull/1442
  • [Feature] CompositeSpec.lock by @vmoens in https://github.com/pytorch/rl/pull/1143
  • [Feature] Device transform by @vmoens in https://github.com/pytorch/rl/pull/1472
  • [Feature] Dispatch DiscreteSAC loss module by @Blonck in https://github.com/pytorch/rl/pull/1248
  • [Feature] Dispatch PPO loss module by @Blonck in https://github.com/pytorch/rl/pull/1249
  • [Feature] Dispatch REDQ loss module by @Blonck in https://github.com/pytorch/rl/pull/1251
  • [Feature] Dispatch SAC loss module by @Blonck in https://github.com/pytorch/rl/pull/1244
  • [Feature] Dispatch TD3 loss module by @Blonck in https://github.com/pytorch/rl/pull/1254
  • [Feature] Dispatch for DDPG loss module by @Blonck in https://github.com/pytorch/rl/pull/1215
  • [Feature] Dispatch for SAC loss module by @Blonck in https://github.com/pytorch/rl/pull/1223
  • [Feature] Dispatch reinforce loss module by @Blonck in https://github.com/pytorch/rl/pull/1252
  • [Feature] Distpatch IQL loss module by @Blonck in https://github.com/pytorch/rl/pull/1230
  • [Feature] Fix DType casting lazy init by @vmoens in https://github.com/pytorch/rl/pull/1589
  • [Feature] Heterogeneous Environments compatibility by @matteobettini in https://github.com/pytorch/rl/pull/1411
  • [Feature] Log hparams from python dict by @matteobettini in https://github.com/pytorch/rl/pull/1517
  • [Feature] MARL exploration e-greedy compatibility by @matteobettini in https://github.com/pytorch/rl/pull/1277
  • [Feature] Make advantages compatible with Terminated, Truncated, Done by @vmoens in https://github.com/pytorch/rl/pull/1581
  • [Feature] Make losses inherit from TDMBase by @vmoens in https://github.com/pytorch/rl/pull/1246
  • [Feature] Making action masks compatible with q value modules and e-greedy by @matteobettini in https://github.com/pytorch/rl/pull/1499
  • [Feature] Nested keys in OrnsteinUhlenbeckProcess by @matteobettini in https://github.com/pytorch/rl/pull/1305
  • [Feature] Optional mapping of "state" in gym specs by @matteobettini in https://github.com/pytorch/rl/pull/1431
  • [Feature] Parallel environments lazy heterogenous data compatibility by @matteobettini in https://github.com/pytorch/rl/pull/1436
  • [Feature] Pettingzoo: add multiagent dimension to single agent groups by @matteobettini in https://github.com/pytorch/rl/pull/1550
  • [Feature] RLHF Reward Model (reopened) by @vmoens in https://github.com/pytorch/rl/pull/1328
  • [Feature] RLHF dataloading by @vmoens in https://github.com/pytorch/rl/pull/1309
  • [Feature] RLHF networks by @apbard in https://github.com/pytorch/rl/pull/1319
  • [Feature] Refactor categorical dists: Masked one-hot and pass-through gradients by @vmoens in https://github.com/pytorch/rl/pull/1488
  • [Feature] ReplayBuffer.empty by @vmoens in https://github.com/pytorch/rl/pull/1238
  • [Feature] Separate losses by @MateuszGuzek in https://github.com/pytorch/rl/pull/1240
  • [Feature] Single call to value network in advantages [bis] by @vmoens in https://github.com/pytorch/rl/pull/1263
  • [Feature] Single call to value network in advantages by @vmoens in https://github.com/pytorch/rl/pull/1256
  • [Feature] TensorStorage by @vmoens in https://github.com/pytorch/rl/pull/1310
  • [Feature] Threaded collection and parallel envs by @vmoens in https://github.com/pytorch/rl/pull/1559
  • [Feature] Unbind specs by @vmoens in https://github.com/pytorch/rl/pull/1555
  • [Feature] VMAS obs dict by @matteobettini in https://github.com/pytorch/rl/pull/1419
  • [Feature] VMAS: choose between categorical or one-hot actions by @matteobettini in https://github.com/pytorch/rl/pull/1484
  • [Feature] dispatch for DQNLoss by @vmoens in https://github.com/pytorch/rl/pull/1194
  • [Feature] log histograms by @vmoens in https://github.com/pytorch/rl/pull/1306
  • [Feature] make csv logger exist_ok on logging folder by @matteobettini in https://github.com/pytorch/rl/pull/1561
  • [Feature] shifted for all adv by @vmoens in https://github.com/pytorch/rl/pull/1276

New environments and third-party improvements

We now cover SMAC-v2, PettingZoo, IsaacGymEnvs (prototype) and RoboHive. The D4RL dataset can now be used without the eponymous library, which permits training with more recent or older versions of gym.

  • [Environment, Docs] SMACv2 and docs on action masking by @matteobettini in https://github.com/pytorch/rl/pull/1466
  • [Environment] Petting zoo by @matteobettini in https://github.com/pytorch/rl/pull/1471
  • [Feature] D4rl direct download by @MateuszGuzek in https://github.com/pytorch/rl/pull/1430
  • [Feature] Gym 'vectorized' envs compatibility by @vmoens in https://github.com/pytorch/rl/pull/1519
  • [Feature] Gym compatibility: Terminal and truncated by @vmoens in https://github.com/pytorch/rl/pull/1539
  • [Feature] IsaacGymEnvs integration by @vmoens in https://github.com/pytorch/rl/pull/1443
  • [Feature] RoboHive integration by @vmoens in https://github.com/pytorch/rl/pull/1119

Performance improvements

We provide several speed improvements, in particular for data collection.

  • [Performance] Accelerate GAE by @Blonck in https://github.com/pytorch/rl/pull/1142
  • [Performance] Accelerate TD lambda return estimate by @Blonck in https://github.com/pytorch/rl/pull/1158
  • [Performance] Accelerate _split_and_pad_sequence by @Blonck in https://github.com/pytorch/rl/pull/1147
  • [Performance] Faster GAE by @vmoens in https://github.com/pytorch/rl/pull/1153
  • [Performance] Faster losses by @vmoens in https://github.com/pytorch/rl/pull/1272
  • [Performance] Improve performance and streamline the generating of the gammalambda tensor by @Blonck in https://github.com/pytorch/rl/pull/1171
  • [Performance] Miscellaneous efficiency improvements by @vmoens in https://github.com/pytorch/rl/pull/1513
  • [Performance] Reduce key accessing in transforms by @matteobettini in https://github.com/pytorch/rl/pull/1590
  • [Performance] Some efficiency improvements by @vmoens in https://github.com/pytorch/rl/pull/1250
  • [Performance] Vmas vectorized reset by @matteobettini in https://github.com/pytorch/rl/pull/1146

Bug fixes

  • [BugFIx] Fix entropy signature in truncated normal by @vmoens in https://github.com/pytorch/rl/pull/1536
  • [BugFix,CI] Fix virtualenv not found by @vmoens in https://github.com/pytorch/rl/pull/1280
  • [BugFix] Add torch.no_grad() for rendering in multiagent PPO tutorial by @matteobettini in https://github.com/pytorch/rl/pull/1511
  • [BugFix] Batched envs compatibility with custom keys by @matteobettini in https://github.com/pytorch/rl/pull/1348
  • [BugFix] C++17 by @vmoens in https://github.com/pytorch/rl/pull/1169
  • [BugFix] Check env specs for nested envs by @matteobettini in https://github.com/pytorch/rl/pull/1332
  • [BugFix] CompositeSpec.unsqueeze by @btx0424 in https://github.com/pytorch/rl/pull/1464
  • [BugFix] DDPG select also critic input for actor loss by @matteobettini in https://github.com/pytorch/rl/pull/1563
  • [BugFix] DQN loss dispatch respect configured tensordict keys by @Blonck in https://github.com/pytorch/rl/pull/1285
  • [BugFix] Discrete SAC rewrite by @matteobettini in https://github.com/pytorch/rl/pull/1461
  • [BugFix] Empty-spec tolerance by @vmoens in https://github.com/pytorch/rl/pull/1501
  • [BugFix] Fix Brax reset by @vmoens in https://github.com/pytorch/rl/pull/1195
  • [BugFix] Fix CatFrames by @vmoens in https://github.com/pytorch/rl/pull/1336
  • [BugFix] Fix ClipTransform device by @vmoens in https://github.com/pytorch/rl/pull/1508
  • [BugFix] Fix Cython for D4RL by @vmoens in https://github.com/pytorch/rl/pull/1429
  • [BugFix] Fix DDPG by @vmoens in https://github.com/pytorch/rl/pull/1183
  • [BugFix] Fix DDPG squeezing by @matteobettini in https://github.com/pytorch/rl/pull/1487
  • [BugFix] Fix Dreamer test error by @vmoens in https://github.com/pytorch/rl/pull/1558
  • [BugFix] Fix Gym Categorical/One-hot issues by @vmoens in https://github.com/pytorch/rl/pull/1482
  • [BugFix] Fix KL import errors by @vmoens in https://github.com/pytorch/rl/pull/1207
  • [BugFix] Fix KLTransform execution with LSTM by @vmoens in https://github.com/pytorch/rl/pull/1426
  • [BugFix] Fix KeyError in inverse transform replay buffer by @BY571 in https://github.com/pytorch/rl/pull/1165
  • [BugFix] Fix LSTM - VecEnv compatibility by @vmoens in https://github.com/pytorch/rl/pull/1427
  • [BugFix] Fix LSTM use with padded/masked segments by @smorad in https://github.com/pytorch/rl/pull/1399
  • [BugFix] Fix NoopResetEnv behavior when trials exceeded. by @skandermoalla in https://github.com/pytorch/rl/pull/1477
  • [BugFix] Fix QValueModule multionehot by @smorad in https://github.com/pytorch/rl/pull/1439
  • [BugFix] Fix RLHF tests - transformers v4.34 by @vmoens in https://github.com/pytorch/rl/pull/1601
  • [BugFix] Fix RewardSum spec transform to mimic reward spec by @matteobettini in https://github.com/pytorch/rl/pull/1478
  • [BugFix] Fix SAC alpha optim by @vmoens in https://github.com/pytorch/rl/pull/1192
  • [BugFix] Fix SAC by @vmoens in https://github.com/pytorch/rl/pull/1189
  • [BugFix] Fix SAC by @vmoens in https://github.com/pytorch/rl/pull/1190
  • [BugFix] Fix SACv2 by @vmoens in https://github.com/pytorch/rl/pull/1191
  • [BugFix] Fix SMAC-v2 by @vmoens in https://github.com/pytorch/rl/pull/1538
  • [BugFix] Fix TD3 and compat with https://github.com/pytorch-labs/tensordict/pull/482 by @vmoens in https://github.com/pytorch/rl/pull/1375
  • [BugFix] Fix TD3 inplace updates by @vmoens in https://github.com/pytorch/rl/pull/1219
  • [BugFix] Fix TD3 target net by @vmoens in https://github.com/pytorch/rl/pull/1186
  • [BugFix] Fix LazyStackedCompositeSpec and introducing consolidate_spec by @matteobettini in https://github.com/pytorch/rl/pull/1392
  • [BugFix] Fix step_mdp() by @matteobettini in https://github.com/pytorch/rl/pull/1334
  • [BugFix] Fix action mask test by @vmoens in https://github.com/pytorch/rl/pull/1492
  • [BugFix] Fix brax by @vmoens in https://github.com/pytorch/rl/pull/1346
  • [BugFix] Fix bug in ppo example config by @degensean in https://github.com/pytorch/rl/pull/1396
  • [BugFix] Fix envpool by @vmoens in https://github.com/pytorch/rl/pull/1530
  • [BugFix] Fix error message of .set_keys() in advantage modules by @Blonck in https://github.com/pytorch/rl/pull/1218
  • [BugFix] Fix examples by @vmoens in https://github.com/pytorch/rl/pull/1173
  • [BugFix] Fix locked params modif by @vmoens in https://github.com/pytorch/rl/pull/1307
  • [BugFix] Fix max length by @vmoens in https://github.com/pytorch/rl/pull/1233
  • [BugFix] Fix missing ("next", "observation") key in dispatch of losses by @Blonck in https://github.com/pytorch/rl/pull/1235
  • [BugFix] Fix nested CompositeSpec creation by @vmoens in https://github.com/pytorch/rl/pull/1261
  • [BugFix] Fix nightly tensordict dependency by @skandermoalla in https://github.com/pytorch/rl/pull/1302
  • [BugFix] Fix ppo example by @vmoens in https://github.com/pytorch/rl/pull/1225
  • [BugFix] Fix ppo training NaN occurences by @vmoens in https://github.com/pytorch/rl/pull/1403
  • [BugFix] Fix reward sum within parallel envs by @vmoens in https://github.com/pytorch/rl/pull/1454
  • [BugFix] Fix run_type_checks by @vmoens in https://github.com/pytorch/rl/pull/1570
  • [BugFix] Fix safe tanh for older torch versions by @vmoens in https://github.com/pytorch/rl/pull/1220
  • [BugFix] Fix serialization of parallel envs by @vmoens in https://github.com/pytorch/rl/pull/1197
  • [BugFix] Fix split_trajs by @vmoens in https://github.com/pytorch/rl/pull/1444
  • [BugFix] Fix tanh/atanh vmap compatibility by @vmoens in https://github.com/pytorch/rl/pull/1217
  • [BugFix] Fix the bug of RoundRobinWriter.extend(data) by @xmaples in https://github.com/pytorch/rl/pull/1295
  • [BugFix] Fix tutorials by @vmoens in https://github.com/pytorch/rl/pull/1382
  • [BugFix] Fix typo in CatFrames Transform error message. by @skandermoalla in https://github.com/pytorch/rl/pull/1491
  • [BugFix] Fix vmap in VmapModule (torch 1.13 compat) by @vmoens in https://github.com/pytorch/rl/pull/1350
  • [BugFix] Improve collector buffer initialisation when policy spec is unavailable by @matteobettini in https://github.com/pytorch/rl/pull/1547
  • [BugFix] Instantiate 2 losses with different keys by @matteobettini in https://github.com/pytorch/rl/pull/1553
  • [BugFix] KL module integration by @vmoens in https://github.com/pytorch/rl/pull/1212
  • [BugFix] Key selection in batched envs by @vmoens in https://github.com/pytorch/rl/pull/1253
  • [BugFix] Load collector frames and iter by @matteobettini in https://github.com/pytorch/rl/pull/1557
  • [BugFix] Make VecNorm Transform pickable by @albertbou92 in https://github.com/pytorch/rl/pull/1596
  • [BugFix] Minor fixes PPO / A2C examples by @albertbou92 in https://github.com/pytorch/rl/pull/1591
  • [BugFix] Multiagent "auto" entropy fix in SAC by @matteobettini in https://github.com/pytorch/rl/pull/1494
  • [BugFix] Nested envs compatibility by @matteobettini in https://github.com/pytorch/rl/pull/1347
  • [BugFix] Nested key in replay buffer by @matteobettini in https://github.com/pytorch/rl/pull/1485
  • [BugFix] Nested keys in transforms by @matteobettini in https://github.com/pytorch/rl/pull/1355
  • [BugFix] Nested keys to probabilistic modules by @matteobettini in https://github.com/pytorch/rl/pull/1363
  • [BugFix] Parametric rand_action() in BaseEnv by @matteobettini in https://github.com/pytorch/rl/pull/1267
  • [BugFix] Parametric collectors by @matteobettini in https://github.com/pytorch/rl/pull/1303
  • [BugFix] Patch SAC to allow state_dict manipulation before exec by @vmoens in https://github.com/pytorch/rl/pull/1607
  • [BugFix] PettingZoo seeding by @matteobettini in https://github.com/pytorch/rl/pull/1554
  • [BugFix] Pickable buffer by @albertbou92 in https://github.com/pytorch/rl/pull/1410
  • [BugFix] QValue modules and nested action by @matteobettini in https://github.com/pytorch/rl/pull/1351
  • [BugFix] Reward sum custom key by @matteobettini in https://github.com/pytorch/rl/pull/1413
  • [BugFix] SafeModule not safely handling specs by @matteobettini in https://github.com/pytorch/rl/pull/1352
  • [BugFix] Small patches to SMAC by @matteobettini in https://github.com/pytorch/rl/pull/1533
  • [BugFix] Sparse info in SMACv2 by @matteobettini in https://github.com/pytorch/rl/pull/1546
  • [BugFix] ToTensorImage unsqueeze would not update the observation spec by @hyerra in https://github.com/pytorch/rl/pull/1161
  • [BugFix] Torch 1.13 compat by @vmoens in https://github.com/pytorch/rl/pull/1294
  • [BugFix] Unbreak tensordict import by @vmoens in https://github.com/pytorch/rl/pull/1231
  • [BugFix] Vectorized priority update in replay buffers by @matteobettini in https://github.com/pytorch/rl/pull/1598
  • [BugFix] _transpose_time with single dim by @vmoens in https://github.com/pytorch/rl/pull/1155
  • [BugFix] RewardSum transform for multiple reward keys by @matteobettini in https://github.com/pytorch/rl/pull/1544
  • [BugFix] step_mdp nested keys by @matteobettini in https://github.com/pytorch/rl/pull/1339
  • [BugFix] include buffers in policy_weights by @vmoens in https://github.com/pytorch/rl/pull/1185
  • [BugFix] load_state_dict in param updates for collectors by @vmoens in https://github.com/pytorch/rl/pull/1145
  • [BugFix] make value estimator with value_key from the PPOLoss init arg by @xmaples in https://github.com/pytorch/rl/pull/1144
  • [BugFix] unlock in tensordictmodules tests by @vmoens in https://github.com/pytorch/rl/pull/1417
  • [BugFix] valid_size not saved as attribute by @tcbegley in https://github.com/pytorch/rl/pull/1337

Miscellaneous

  • Envpool Tests to Nova by @osalpekar in https://github.com/pytorch/rl/pull/1283
  • Fix CI by @matteobettini in https://github.com/pytorch/rl/pull/1368
  • Fix MacOS Mujoco Failure by @osalpekar in https://github.com/pytorch/rl/pull/1450
  • Linux GPU Brax Unittests by @osalpekar in https://github.com/pytorch/rl/pull/1133
  • Linux Gym Unittests to GHA by @osalpekar in https://github.com/pytorch/rl/pull/1139
  • Linux Olddeps tests to Nova by @osalpekar in https://github.com/pytorch/rl/pull/1289
  • Move to More Efficient Windows Runner by @osalpekar in https://github.com/pytorch/rl/pull/1476
  • OptDeps Tests to Nova by @osalpekar in https://github.com/pytorch/rl/pull/1290
  • Remove Distributed CCI job by @osalpekar in https://github.com/pytorch/rl/pull/1374
  • Remove Envpool from CCI by @osalpekar in https://github.com/pytorch/rl/pull/1390
  • Remove old CircleCI Lint by @osalpekar in https://github.com/pytorch/rl/pull/1134
  • Removing Migrated and Unused CCI jobs by @osalpekar in https://github.com/pytorch/rl/pull/1288
  • Revert "[Feature] Single call to value network in advantages" by @vmoens in https://github.com/pytorch/rl/pull/1262
  • Revert "[Refactor,Performance] Faster collectors" by @vmoens in https://github.com/pytorch/rl/pull/1330
  • Sklearn test to Nova by @osalpekar in https://github.com/pytorch/rl/pull/1291
  • Windows Unittests on GHA by @osalpekar in https://github.com/pytorch/rl/pull/1086
  • [Benchmark,CI] Benchmarks in PR (pre) by @vmoens in https://github.com/pytorch/rl/pull/1342
  • [Benchmark,CI] Benchmarks in PR by @vmoens in https://github.com/pytorch/rl/pull/1341
  • [Benchmark] Benchmark Gym vs TorchRL by @vmoens in https://github.com/pytorch/rl/pull/1602
  • [Benchmark] Benchmark losses by @vmoens in https://github.com/pytorch/rl/pull/1287
  • [Benchmark] Benchmark number GPU vectorised environments in VMAS (TorchRL vs RLlib) by @matteobettini in https://github.com/pytorch/rl/pull/1446
  • [Benchmark] Improve benchmark precision + step_mdp + fix GPU by @vmoens in https://github.com/pytorch/rl/pull/1340
  • [CI] Add macOS M1 binaries Wheels by @DanilBaibak in https://github.com/pytorch/rl/pull/1504
  • [CI] Add ninja for MacOS builts by @vmoens in https://github.com/pytorch/rl/pull/1564
  • [CI] Concurrency on gha by @vmoens in https://github.com/pytorch/rl/pull/1152
  • [CI] Deprecate Windows GPU CCI by @osalpekar in https://github.com/pytorch/rl/pull/1387
  • [CI] Doc CI fix by @matteobettini in https://github.com/pytorch/rl/pull/1384
  • [CI] Fix CI PettingZoo by @matteobettini in https://github.com/pytorch/rl/pull/1528
  • [CI] Fix CI by @vmoens in https://github.com/pytorch/rl/pull/1529
  • [CI] Fix GHA gpu tests by @vmoens in https://github.com/pytorch/rl/pull/1356
  • [CI] Fix Jax version in Jumanji by @vmoens in https://github.com/pytorch/rl/pull/1242
  • [CI] Fix Mujoco version by @vmoens in https://github.com/pytorch/rl/pull/1475
  • [CI] Fix RoboHive CI by @vmoens in https://github.com/pytorch/rl/pull/1541
  • [CI] Fix brax and habitat by @vmoens in https://github.com/pytorch/rl/pull/1353
  • [CI] Fix examples CI by @matteobettini in https://github.com/pytorch/rl/pull/1489
  • [CI] Fix failing jobs by @vmoens in https://github.com/pytorch/rl/pull/1318
  • [CI] Fix failing jobs by @vmoens in https://github.com/pytorch/rl/pull/1335
  • [CI] Fix habitat CI by @vmoens in https://github.com/pytorch/rl/pull/1537
  • [CI] Fix jumanji by @vmoens in https://github.com/pytorch/rl/pull/1566
  • [CI] Fix nightly build dependency on tensordict by @vmoens in https://github.com/pytorch/rl/pull/1300
  • [CI] Fix opt deps machine and docker by @vmoens in https://github.com/pytorch/rl/pull/1362
  • [CI] Fix tuto deps by @matteobettini in https://github.com/pytorch/rl/pull/1416
  • [CI] Fix wheels by @vmoens in https://github.com/pytorch/rl/pull/1301
  • [CI] Less old deps by @vmoens in https://github.com/pytorch/rl/pull/1255
  • [CI] Less warnings in CI (costs) by @vmoens in https://github.com/pytorch/rl/pull/1349
  • [CI] Merge Distributed and Linux GPU job by @osalpekar in https://github.com/pytorch/rl/pull/1182
  • [CI] Migrate examples by @vmoens in https://github.com/pytorch/rl/pull/1364
  • [CI] Move linux stable to GHA by @vmoens in https://github.com/pytorch/rl/pull/1503
  • [CI] Reduce CI time by @vmoens in https://github.com/pytorch/rl/pull/1226
  • [CI] Remove CCI Config by @osalpekar in https://github.com/pytorch/rl/pull/1456
  • [CI] Remove examples from CCI by @vmoens in https://github.com/pytorch/rl/pull/1367
  • [CI] Update cuda version by @vmoens in https://github.com/pytorch/rl/pull/1380
  • [CI] Windows GPU Tests by @osalpekar in https://github.com/pytorch/rl/pull/1386
  • [Doc] Add link to paper in readme by @giadefa in https://github.com/pytorch/rl/pull/1298
  • [Doc] Add paper refs in doc and KB by @vmoens in https://github.com/pytorch/rl/pull/1241
  • [Doc] CITATION.cff by @vmoens in https://github.com/pytorch/rl/pull/1229
  • [Doc] Do not clean gh-pages by @vmoens in https://github.com/pytorch/rl/pull/1150
  • [Doc] Fix GPU benchmark by @vmoens in https://github.com/pytorch/rl/pull/1151
  • [Doc] Fix advantage examples by @vmoens in https://github.com/pytorch/rl/pull/1600
  • [Doc] Fix default value of tanh_loc in the documentation of TruncatedNormal. by @skandermoalla in https://github.com/pytorch/rl/pull/1205
  • [Doc] Fix doctest examples by @degensean in https://github.com/pytorch/rl/pull/1393
  • [Doc] Fix exploration modules docstrings by @vmoens in https://github.com/pytorch/rl/pull/1326
  • [Doc] Fix tanh_loc in docstrings by @vmoens in https://github.com/pytorch/rl/pull/1203
  • [Doc] TorchRL Logo by @vmoens in https://github.com/pytorch/rl/pull/1234
  • [Doc] Update citation by @vmoens in https://github.com/pytorch/rl/pull/1228
  • [Doc] Update coding_ppo.py by @kushaangupta in https://github.com/pytorch/rl/pull/1483
  • [Doc] correct typos in pendulum tutorial by @kushaangupta in https://github.com/pytorch/rl/pull/1502
  • [Doc] fixed typos in ppo tutorial by @MatteoGaetzner in https://github.com/pytorch/rl/pull/1314
  • [Docs] Fix multi-agent tutorial by @matteobettini in https://github.com/pytorch/rl/pull/1599
  • [Docs] Multi-agent environments by @matteobettini in https://github.com/pytorch/rl/pull/1383
  • [Example] Multiagent examples: MAPPO-IPPO-MADDPG-IDDPG-IQL-QMIX-VDN by @matteobettini in https://github.com/pytorch/rl/pull/1027
  • [Fix] Remove loss device by @matteobettini in https://github.com/pytorch/rl/pull/1395
  • [Lint] Add TorchFix linter by @kit1980 in https://github.com/pytorch/rl/pull/1580
  • [Minor] Capture error in CatFrame edit by @vmoens in https://github.com/pytorch/rl/pull/1498
  • [Minor] Fix prints by @vmoens in https://github.com/pytorch/rl/pull/1257
  • [Minor] Fix typo by @vmoens in https://github.com/pytorch/rl/pull/1193
  • [Minor] Missing commit from #1488 by @vmoens in https://github.com/pytorch/rl/pull/1490
  • [Minor] Missing lint by @vmoens in https://github.com/pytorch/rl/pull/1556
  • [Minor] More efficient SAC v1 by @vmoens in https://github.com/pytorch/rl/pull/1507
  • [Minor] Remove ya gymnasium deprecation warning in vectorized envs by @vmoens in https://github.com/pytorch/rl/pull/1573
  • [Minor] small fixes by @vmoens in https://github.com/pytorch/rl/pull/1237
  • [Nova] Jumanji Tests to GHA by @osalpekar in https://github.com/pytorch/rl/pull/1282
  • [Nova] Remove windows Unittests from CCI by @osalpekar in https://github.com/pytorch/rl/pull/1159
  • [Nova] Removing CircleCI Gym Unittests by @osalpekar in https://github.com/pytorch/rl/pull/1179
  • [Nova] Vmas Tests to GHA by @osalpekar in https://github.com/pytorch/rl/pull/1284
  • [Quality] Filter out warnings in subprocs by @vmoens in https://github.com/pytorch/rl/pull/1552
  • [Refacto] Migration due to tensordict 473 and 474 by @vmoens in https://github.com/pytorch/rl/pull/1354
  • [Refactor,Performance] Faster collectors (bis) by @vmoens in https://github.com/pytorch/rl/pull/1331
  • [Refactor,Performance] Faster collectors by @vmoens in https://github.com/pytorch/rl/pull/1327
  • [Refactor] Better GymLikeEnv by @vmoens in https://github.com/pytorch/rl/pull/1168
  • [Refactor] Better batch-size handling by RBs by @vmoens in https://github.com/pytorch/rl/pull/1311
  • [Refactor] Better updaters by @vmoens in https://github.com/pytorch/rl/pull/1184
  • [Refactor] Change objectives parameter/buffer/target logic by @vmoens in https://github.com/pytorch/rl/pull/1424
  • [Refactor] Edit ppo params by @vmoens in https://github.com/pytorch/rl/pull/1322
  • [Refactor] Expose all wrappers in torchrl.envs by @vmoens in https://github.com/pytorch/rl/pull/1532
  • [Refactor] Faster envs (2) by @vmoens in https://github.com/pytorch/rl/pull/1457
  • [Refactor] Fix imports by @vmoens in https://github.com/pytorch/rl/pull/1551
  • [Refactor] Follow-up on tensordict PR 473 by @vmoens in https://github.com/pytorch/rl/pull/1361
  • [Refactor] More unravel fixes by @vmoens in https://github.com/pytorch/rl/pull/1357
  • [Refactor] Nested reward and done specs by @vmoens in https://github.com/pytorch/rl/pull/1115
  • [Refactor] Refactor DDPG loss in standalone methods by @vmoens in https://github.com/pytorch/rl/pull/1603
  • [Refactor] Refactor _reset in ParallelEnv by @vmoens in https://github.com/pytorch/rl/pull/1172
  • [Refactor] Refactor losses for generalization by @vmoens in https://github.com/pytorch/rl/pull/1286
  • [Refactor] Remove pkg_resources import by @vmoens in https://github.com/pytorch/rl/pull/1379
  • [Refactor] Remove private calls to _set by @vmoens in https://github.com/pytorch/rl/pull/1370
  • [Refactor] Shape ops in LSTM based on tensor shape, not tensordict by @vmoens in https://github.com/pytorch/rl/pull/1170
  • [Refactor] Use settuple for faster set by @vmoens in https://github.com/pytorch/rl/pull/1372
  • [Refactor] Use wait instead of is_set to get results in ParallelEnv by @vmoens in https://github.com/pytorch/rl/pull/1562
  • [Refactor] Use masking in collectors by @vmoens in https://github.com/pytorch/rl/pull/1412
  • [Refactor] Vmas nested by @matteobettini in https://github.com/pytorch/rl/pull/1366
  • [Refactor] the usage of tensordict keys in loss modules by @Blonck in https://github.com/pytorch/rl/pull/1175
  • [Setup] Update setup.py python versions by @vmoens in https://github.com/pytorch/rl/pull/1496
  • [Test,BugFix] Fix Jax backend tests by @vmoens in https://github.com/pytorch/rl/pull/1162
  • [Test,CI,Feature] Total time per test by @vmoens in https://github.com/pytorch/rl/pull/1232
  • [Test] Remove import of test class by @matteobettini in https://github.com/pytorch/rl/pull/1549
  • [Test] Skip tests in python 3.11 by @vmoens in https://github.com/pytorch/rl/pull/1535
  • [Test] Skip threading tests in OSX by @vmoens in https://github.com/pytorch/rl/pull/1571
  • [Test] Test split trajs by @vmoens in https://github.com/pytorch/rl/pull/1445
  • [Test] Test state_dict and loss modules by @vmoens in https://github.com/pytorch/rl/pull/1527
  • [Tests] Collector compatibility for heterogeneous environments by @matteobettini in https://github.com/pytorch/rl/pull/1414
  • [Tests] DDPG extra critic input tests by @matteobettini in https://github.com/pytorch/rl/pull/1568
  • [Tutorial] Multiagent PPO tutorial by @matteobettini in https://github.com/pytorch/rl/pull/1385
  • [Versioning] Python 3.11 by @vmoens in https://github.com/pytorch/rl/pull/1433
  • [Versioning] Use python 3.8 for GPU tests by @vmoens in https://github.com/pytorch/rl/pull/1577
  • [Versioning] Write version all cases in setup.py by @vmoens in https://github.com/pytorch/rl/pull/1579
  • d4rl Test to Nova by @osalpekar in https://github.com/pytorch/rl/pull/1293
  • python 3.11 in README by @vmoens in https://github.com/pytorch/rl/pull/1434

New Contributors

  • @Blonck made their first contribution in https://github.com/pytorch/rl/pull/1142
  • @hyerra made their first contribution in https://github.com/pytorch/rl/pull/1161
  • @skandermoalla made their first contribution in https://github.com/pytorch/rl/pull/1205
  • @giadefa made their first contribution in https://github.com/pytorch/rl/pull/1298
  • @MatteoGaetzner made their first contribution in https://github.com/pytorch/rl/pull/1314
  • @MateuszGuzek made their first contribution in https://github.com/pytorch/rl/pull/1240
  • @degensean made their first contribution in https://github.com/pytorch/rl/pull/1393
  • @smorad made their first contribution in https://github.com/pytorch/rl/pull/1399
  • @kushaangupta made their first contribution in https://github.com/pytorch/rl/pull/1483
  • @kit1980 made their first contribution in https://github.com/pytorch/rl/pull/1580
  • @MarkHaoxiang made their first contribution in https://github.com/pytorch/rl/pull/1479
  • @DanilBaibak made their first contribution in https://github.com/pytorch/rl/pull/1504

A great THANKS to our contributors, in particular (but not in any particular order) @skandermoalla, @matteobettini, @BY571 and @albertbou92 for their tremendous dedication.

Full Changelog: https://github.com/pytorch/rl/compare/v0.1.1...v0.2.0

- Python
Published by vmoens over 2 years ago

torchrl - v0.1.1

What's Changed

  • [Feature] Stacking specs by @vmoens in https://github.com/pytorch/rl/pull/892
  • [Feature] Multicollector interruptor by @albertbou92 in https://github.com/pytorch/rl/pull/963
  • [BugFix] VMAS api fix by @matteobettini in https://github.com/pytorch/rl/pull/978
  • [CI] Fix D4RL tests in CI by @vmoens in https://github.com/pytorch/rl/pull/976
  • [CI] Fix CI by @vmoens in https://github.com/pytorch/rl/pull/982
  • [Refactor] Binary spec inherits from discrete spec by @matteobettini in https://github.com/pytorch/rl/pull/984
  • [Feature] _DataCollector -> DataCollectorBase by @vmoens in https://github.com/pytorch/rl/pull/985
  • [Feature] Discrete SAC by @BY571 in https://github.com/pytorch/rl/pull/882
  • [Refactor, Doc] Refactor refs to SafeModule to TensorDictModule unless necessary by @vmoens in https://github.com/pytorch/rl/pull/986
  • [BugFix] Quickfix by @vmoens in https://github.com/pytorch/rl/pull/991
  • [Feature] Add Dropout to MLP module by @BY571 in https://github.com/pytorch/rl/pull/988
  • [Feature] Warn when collectors collect more frames than requested by @matteobettini in https://github.com/pytorch/rl/pull/989
  • [BugFix] make "_reset", "step_count", and other done_based keys follow done_spec by @matteobettini in https://github.com/pytorch/rl/pull/981
  • [Feature] Bandit datasets by @vmoens in https://github.com/pytorch/rl/pull/912
  • [BugFix] Fix sampling in PPO tutorial by @vmoens in https://github.com/pytorch/rl/pull/996
  • [Refactor] Refactor losses (value function, doc, input batch size) by @vmoens in https://github.com/pytorch/rl/pull/987
  • [BugFix,Feature,Doc] Fix replay buffers sampling info, docstrings and iteration by @vmoens in https://github.com/pytorch/rl/pull/1003
  • [Feature] Replace ValueError by warning in collectors when total_frames is not an exact multiple of frames_per_batch by @albertbou92 in https://github.com/pytorch/rl/pull/999
  • [BugFix] Only call replay buffer transforms when there are by @vmoens in https://github.com/pytorch/rl/pull/1008
  • [BugFix] Patch tests in 1008 by @vmoens in https://github.com/pytorch/rl/pull/1009
  • [Feature] Multidim value functions by @vmoens in https://github.com/pytorch/rl/pull/1007
  • [BugFix] Fix exploration (OU and Gaussian) by @vmoens in https://github.com/pytorch/rl/pull/1006
  • [CI] Fix python version in habitat by @vmoens in https://github.com/pytorch/rl/pull/1010
  • Advantages pass time_dim and docfix by @matteobettini in https://github.com/pytorch/rl/pull/1014
  • [Refactor] Faster transformed distributions by @vmoens in https://github.com/pytorch/rl/pull/1017
  • [WIP, CI] Upgrade cuda channel by @vmoens in https://github.com/pytorch/rl/pull/1019
  • [BugFix] Fix collector reset with truncation by @vmoens in https://github.com/pytorch/rl/pull/1021
  • [Refactor] Improve collector performance by @matteobettini in https://github.com/pytorch/rl/pull/1020
  • [BugFix] Fix params and buffer casting for policies by @vmoens in https://github.com/pytorch/rl/pull/1022
  • [Feature] PPO allow entropy logging when entropy_coeff is 0 by @matteobettini in https://github.com/pytorch/rl/pull/1025
  • [Feature] Distributed data collector (ray) by @albertbou92 in https://github.com/pytorch/rl/pull/930
  • [Refactor] Minor changes in tensordict construction by @vmoens in https://github.com/pytorch/rl/pull/1029
  • [CI] Fix Brax 0.9.0 by @vmoens in https://github.com/pytorch/rl/pull/1011
  • [Feature] Multiagent API in vmas by @matteobettini in https://github.com/pytorch/rl/pull/983
  • [Feature] Benchmarking workflow by @vmoens in https://github.com/pytorch/rl/pull/1028
  • [Benchmark] Fix adv benchmark by @vmoens in https://github.com/pytorch/rl/pull/1030
  • [Doc] Refactor DDPG and DQN tutos to narrow the scope by @vmoens in https://github.com/pytorch/rl/pull/979
  • Revert "[Doc] Refactor DDPG and DQN tutos to narrow the scope" by @vmoens in https://github.com/pytorch/rl/pull/1032
  • [BugFix] Advantage normalisation in ClipPPOLoss is done after computing gain1 by @albertbou92 in https://github.com/pytorch/rl/pull/1033
  • [BugFix] Codecov SHA error by @vmoens in https://github.com/pytorch/rl/pull/1035
  • [Doc] DDPG and DQN refactoring -- Doc cleaning by @vmoens in https://github.com/pytorch/rl/pull/1036
  • [BugFix,CI] Fix macos codecov install by @vmoens in https://github.com/pytorch/rl/pull/1039
  • [BugFix] kwargs update in distributed collectors by @vmoens in https://github.com/pytorch/rl/pull/1040
  • [Feature] make_composite_from_td by @vmoens in https://github.com/pytorch/rl/pull/1042
  • [Refactor] Import envpool locally to avoid importing gym at root level by @vmoens in https://github.com/pytorch/rl/pull/1041
  • [Minor] Fix a typo by @FrankTianTT in https://github.com/pytorch/rl/pull/1046
  • [BugFix] Fix param tying in loss modules by @vmoens in https://github.com/pytorch/rl/pull/1037
  • [Refactor] less ad-hoc disable_env_checker check by @vmoens in https://github.com/pytorch/rl/pull/1047
  • [Refactor] Improve distributed collectors by @vmoens in https://github.com/pytorch/rl/pull/1044
  • [Doc] Document tensordict modules by @vmoens in https://github.com/pytorch/rl/pull/1053
  • [Doc] Minor changes to contributing.md by @vmoens in https://github.com/pytorch/rl/pull/1054
  • [Doc] A bit more doc on modules by @vmoens in https://github.com/pytorch/rl/pull/1056
  • [Refactor] Import enum and interaction_type utils by @Goldspear in https://github.com/pytorch/rl/pull/1055
  • [Feature] Deduplicate calls to common layers in PPO by @vmoens in https://github.com/pytorch/rl/pull/1057
  • [BugFix] CompositeSpec nested key deletion by @btx0424 in https://github.com/pytorch/rl/pull/1059
  • [Feature] Add MaskedCategorical distribution by @xiaomengy in https://github.com/pytorch/rl/pull/1012
  • [Refactor] resetting envs in collectors always passes the _reset entry by @vmoens in https://github.com/pytorch/rl/pull/1061
  • [Refactor] Better integration of QValue tools by @vmoens in https://github.com/pytorch/rl/pull/1063
  • MUJOCO_INSTALLATION.md: Fix typo by @traversaro in https://github.com/pytorch/rl/pull/1064
  • [Refactor] Removes "reward" from root tensordicts by @vmoens in https://github.com/pytorch/rl/pull/1065
  • [Test] Fix tests for older pytorch versions by @vmoens in https://github.com/pytorch/rl/pull/1066
  • [Feature] Reward2go Transform by @BY571 in https://github.com/pytorch/rl/pull/1038
  • [CI] Reduce tests by @vmoens in https://github.com/pytorch/rl/pull/1071
  • [Feature] Skip existing for advantage modules by @vmoens in https://github.com/pytorch/rl/pull/1070
  • [BugFix] Fix parallel env data passing on cuda by @vmoens in https://github.com/pytorch/rl/pull/1024
  • [Refactor] Deprecate interaction_mode by @vmoens in https://github.com/pytorch/rl/pull/1067
  • [Doc] Update KB: cannot find -lGL by @vmoens in https://github.com/pytorch/rl/pull/1073
  • [Doc] fix figures display issues in documentation of actors.py by @DamienAllonsius in https://github.com/pytorch/rl/pull/1074
  • [Example] PPO simplified example by @albertbou92 in https://github.com/pytorch/rl/pull/1004
  • [Feature] Update td in step (not overwrite) by @vmoens in https://github.com/pytorch/rl/pull/1075
  • [CI] Remove migrated CircleCI macOS jobs by @seemethere in https://github.com/pytorch/rl/pull/1069
  • [Feature] Target Return Transform by @BY571 in https://github.com/pytorch/rl/pull/1045
  • [Test] Fix tensorboard tests with ImageIO 2.26 by @vmoens in https://github.com/pytorch/rl/pull/1083
  • [Feature] LSTMModule by @vmoens in https://github.com/pytorch/rl/pull/1084
  • [BugFix] Change default of skip_existing to None by @tcbegley in https://github.com/pytorch/rl/pull/1082
  • [Example] A2C simplified example by @albertbou92 in https://github.com/pytorch/rl/pull/1076
  • [BugFix] Fix output_spec transform calls by @vmoens in https://github.com/pytorch/rl/pull/1091
  • [Feature] Indexing Discrete and OneHot specs by @remidomingues in https://github.com/pytorch/rl/pull/1081
  • [Refactor] Refactor DQN by @vmoens in https://github.com/pytorch/rl/pull/1085
  • [Feature] Auto-init updaters and raise a warning if not present by @vmoens in https://github.com/pytorch/rl/pull/1092
  • [BugFix] Remove false warnings in losses by @vmoens in https://github.com/pytorch/rl/pull/1096
  • [CI, BugFix] Fix CI warnings and errors by @vmoens in https://github.com/pytorch/rl/pull/1100
  • [Refactor] Update vmap imports to torch by @vmoens in https://github.com/pytorch/rl/pull/1102
  • [Refactor] Make advantages non-differentiable by default (except in losses) by @vmoens in https://github.com/pytorch/rl/pull/1104
  • [Feature] Indexing specs by @remidomingues in https://github.com/pytorch/rl/pull/1105
  • [BugFix] Fix EnvPool by @vmoens in https://github.com/pytorch/rl/pull/1106
  • [Feature,Doc] QValue refactoring and QNet + RNN tuto by @vmoens in https://github.com/pytorch/rl/pull/1060
  • [BugFix] Fix Gym imports by @vmoens in https://github.com/pytorch/rl/pull/1023
  • [CI] pytest should not skip tests for dependencies by @rohitnig in https://github.com/pytorch/rl/pull/1048
  • [BugFix, Doc] Fix tutos by @vmoens in https://github.com/pytorch/rl/pull/1107
  • [CI] Fix tutos (2) by @vmoens in https://github.com/pytorch/rl/pull/1109
  • [Doc] Fix doc rendering by @vmoens in https://github.com/pytorch/rl/pull/1112
  • Added the entry for skip-tests in the environment.yml by @rohitnig in https://github.com/pytorch/rl/pull/1113
  • [CI] Upgrade ubuntu version in GHA by @vmoens in https://github.com/pytorch/rl/pull/1116
  • Fix in windows unit test by @mischab in https://github.com/pytorch/rl/pull/1099
  • Revert "Fix in windows unit test" by @mischab in https://github.com/pytorch/rl/pull/1117
  • [Nova] Lint job on GHA by @osalpekar in https://github.com/pytorch/rl/pull/1114
  • [Nova] Remove CircleCI Wheels Builds by @osalpekar in https://github.com/pytorch/rl/pull/1121
  • [BugFix] Set exploration mode to MODE in all losses by default by @vmoens in https://github.com/pytorch/rl/pull/1123
  • [BugFix] Instruct the value key to PPOLoss by @vmoens in https://github.com/pytorch/rl/pull/1124
  • [Feature] CatFrames for offline data by @vmoens in https://github.com/pytorch/rl/pull/1122
  • [CI] Fix windows CI by @vmoens in https://github.com/pytorch/rl/pull/1128
  • [Refactor] Buffers tensorclass compat and tutorial by @vmoens in https://github.com/pytorch/rl/pull/1101
  • [Feature] Marking the time dimension by @vmoens in https://github.com/pytorch/rl/pull/1095
  • [Doc] Add tuto and time dim info in docs by @vmoens in https://github.com/pytorch/rl/pull/1130
  • [Doc] Fix locked samples from RBs and ccl of tuto by @vmoens in https://github.com/pytorch/rl/pull/1132
  • [BugFix] Fix unlock in RB by @vmoens in https://github.com/pytorch/rl/pull/1135
  • [BugFix] extract the info dict from a list by @xmaples in https://github.com/pytorch/rl/pull/1131
  • [Feature] Added support for vector-based rewards from environments in MO-Gymnasium by @dennismalmgren in https://github.com/pytorch/rl/pull/992
  • [Versioning] v0.1.1 by @vmoens in https://github.com/pytorch/rl/pull/1137

New Contributors

  • @FrankTianTT made their first contribution in https://github.com/pytorch/rl/pull/1046
  • @Goldspear made their first contribution in https://github.com/pytorch/rl/pull/1055
  • @btx0424 made their first contribution in https://github.com/pytorch/rl/pull/1059
  • @traversaro made their first contribution in https://github.com/pytorch/rl/pull/1064
  • @DamienAllonsius made their first contribution in https://github.com/pytorch/rl/pull/1074
  • @seemethere made their first contribution in https://github.com/pytorch/rl/pull/1069
  • @remidomingues made their first contribution in https://github.com/pytorch/rl/pull/1081
  • @rohitnig made their first contribution in https://github.com/pytorch/rl/pull/1048
  • @mischab made their first contribution in https://github.com/pytorch/rl/pull/1099
  • @osalpekar made their first contribution in https://github.com/pytorch/rl/pull/1114
  • @xmaples made their first contribution in https://github.com/pytorch/rl/pull/1131
  • @dennismalmgren made their first contribution in https://github.com/pytorch/rl/pull/992

Full Changelog: https://github.com/pytorch/rl/compare/v0.1.0...v0.1.1

- Python
Published by vmoens about 3 years ago

torchrl - v0.1.0 - Beta

First official beta release of the library!

What's Changed

  • QuickFix Versioning by @fedebotu in https://github.com/pytorch/rl/pull/958
  • Version 0.0.5 by @vmoens in https://github.com/pytorch/rl/pull/957
  • [Minor] Warning when loading memmap storage on uninitialized td by @vmoens in https://github.com/pytorch/rl/pull/961
  • [Refactor] Defaults split_trajs to False by @vmoens in https://github.com/pytorch/rl/pull/947
  • [Feature] InitTracker transform by @vmoens in https://github.com/pytorch/rl/pull/962
  • [Feature] RenameTransform by @vmoens in https://github.com/pytorch/rl/pull/964
  • [Feature] Implicit Q-Learning (IQL) by @BY571 in https://github.com/pytorch/rl/pull/933
  • [Refactor] Refactor data collectors constructors by @vmoens in https://github.com/pytorch/rl/pull/970
  • [Feature, Refactor] Iterable replay buffers by @vmoens in https://github.com/pytorch/rl/pull/968
  • [Doc] README rewrite by @vmoens in https://github.com/pytorch/rl/pull/971
  • [Refactor] A less verbose torchrl by @vmoens in https://github.com/pytorch/rl/pull/973
  • [Feature] torch.distributed collectors by @vmoens in https://github.com/pytorch/rl/pull/934
  • [Feature] Offline datasets: D4RL by @vmoens in https://github.com/pytorch/rl/pull/928

Full Changelog: https://github.com/pytorch/rl/compare/v0.0.5...v0.1.0

- Python
Published by vmoens about 3 years ago

torchrl - 0.0.5

This release changes the env.step API; see https://github.com/pytorch/rl/pull/941 for more info.
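
As a rough illustration of the change described in #941 (reward and done now live in the 'next' entry of the step output), the sketch below uses plain Python dicts in place of TensorDicts. The "observation" and "action" keys and the helper name `read_transition` are assumptions made for this example; they are not taken from the release notes and the helper is not part of TorchRL.

```python
# Illustrative sketch only: nested dicts stand in for TensorDicts, and
# `read_transition` is a hypothetical helper, not a TorchRL function.

def read_transition(step_output):
    """Read reward, done and the next observation from the nested "next" entry."""
    nxt = step_output["next"]
    return nxt["reward"], nxt["done"], nxt["observation"]


# Assumed layout after the change: reward and done sit under "next",
# next to the next observation, rather than at the root.
step_output = {
    "observation": [0.0, 1.0],
    "action": [0.5],
    "next": {"observation": [0.1, 0.9], "reward": 1.0, "done": False},
}

reward, done, next_obs = read_transition(step_output)
```

Code that previously read reward or done from the root of the step output would need to look them up under "next" instead.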

What's Changed

  • [BugFix] Fix dreamer training loop by @vmoens in https://github.com/pytorch/rl/pull/915
  • [Doc] PPO Tutorial by @vmoens in https://github.com/pytorch/rl/pull/913
  • [Doc] Create your pendulum tutorial by @vmoens in https://github.com/pytorch/rl/pull/911
  • [BugFix] Deploy doc by @vmoens in https://github.com/pytorch/rl/pull/920
  • [BugFix] Nvidia not found fix by @vmoens in https://github.com/pytorch/rl/pull/922
  • [Feature] Rework to_one_hot and to_categorical to take a tensor as parameter by @riiswa in https://github.com/pytorch/rl/pull/816
  • [Doc] Tutorial revamp by @vmoens in https://github.com/pytorch/rl/pull/926
  • [BugFix] Fix EnvPool spec shapes by @vmoens in https://github.com/pytorch/rl/pull/932
  • [BugFix] Fix CompositeSpec.to_numpy method by @riiswa in https://github.com/pytorch/rl/pull/931
  • [CI] Do not run nightly workflows on forked repos by @XuehaiPan in https://github.com/pytorch/rl/pull/936
  • [Refactor] set_default -> setdefault by @tcbegley in https://github.com/pytorch/rl/pull/935
  • [BugFix] Step and maybe reset by @vmoens in https://github.com/pytorch/rl/pull/938
  • [Doc] Minor doc improvements by @vmoens in https://github.com/pytorch/rl/pull/907
  • [Doc] Add debug doc by @acohen13 in https://github.com/pytorch/rl/pull/940
  • [BugFix] Propagate args to TransformedEnv's state_dict by @fedebotu in https://github.com/pytorch/rl/pull/944
  • [BugFix] Vmas expanded specs by @matteobettini in https://github.com/pytorch/rl/pull/942
  • [Quality] RB constructors cleanup by @vmoens in https://github.com/pytorch/rl/pull/945
  • [Doc] Refactor KB by @vmoens in https://github.com/pytorch/rl/pull/946
  • [BugFix] Upgrade vision's functional import by @vmoens in https://github.com/pytorch/rl/pull/948
  • [BugFix] Deprecate tensordict.set check skips in transforms by @vmoens in https://github.com/pytorch/rl/pull/951
  • [BugFix] Upgrade tensordict deps by @vmoens in https://github.com/pytorch/rl/pull/953
  • [CI] Fix windows CI by @vmoens in https://github.com/pytorch/rl/pull/954
  • [Refactor] Refactor composite spec keys to match tensordict by @vmoens in https://github.com/pytorch/rl/pull/956
  • [Refactor] Refactor the step to include reward and done in the 'next' tensordict by @vmoens in https://github.com/pytorch/rl/pull/941

New Contributors

  • @XuehaiPan made their first contribution in https://github.com/pytorch/rl/pull/936
  • @acohen13 made their first contribution in https://github.com/pytorch/rl/pull/940
  • @fedebotu made their first contribution in https://github.com/pytorch/rl/pull/944

Full Changelog: https://github.com/pytorch/rl/compare/v0.0.4...v0.0.5

- Python
Published by vmoens about 3 years ago

torchrl - v0.0.4

What's Changed

  • [CI, Doc] Update functorch source installation command by @zou3519 in https://github.com/pytorch/rl/pull/446
  • [BugFix] TransformedEnv attributes inheritance by @vmoens in https://github.com/pytorch/rl/pull/467
  • [Feature] Cleanup mocking envs init and new by @vmoens in https://github.com/pytorch/rl/pull/469
  • [Tests] Adding tensordict __repr__ tests by @sladebot in https://github.com/pytorch/rl/pull/435
  • [Logging]: implement MLFlow logging integration by @rayanht in https://github.com/pytorch/rl/pull/432
  • [BugFix] MLFlow import fix by @vmoens in https://github.com/pytorch/rl/pull/473
  • [BugFix] Fixed pip install by @brandonsj in https://github.com/pytorch/rl/pull/475
  • [Features]: Changed _inplace_update cls parameter passing in __new__ by @nicolas-dufour in https://github.com/pytorch/rl/pull/464
  • [Feature]: ModelBased Envs by @nicolas-dufour in https://github.com/pytorch/rl/pull/333
  • [Feature] make ReplayBufferTrainer compatible with storing trajectories by @vmoens in https://github.com/pytorch/rl/pull/476
  • [Tutorial] DQN tutorial by @vmoens in https://github.com/pytorch/rl/pull/474
  • [Feature] reader hooks for GymLike by @vmoens in https://github.com/pytorch/rl/pull/478
  • [BugFix] TensorSpec.zero(None) failure fix by @vmoens in https://github.com/pytorch/rl/pull/483
  • [Feature]: Support for planners and CEM by @nicolas-dufour in https://github.com/pytorch/rl/pull/384
  • [Feature] Replaced device_safe() with device by @ordinskiy in https://github.com/pytorch/rl/pull/485
  • [Feature]: TensorDictPrimer transform by @nicolas-dufour in https://github.com/pytorch/rl/pull/456
  • [Feature]: erase() method for torchrl.timeit by @nicolas-dufour in https://github.com/pytorch/rl/pull/480
  • [Feature] Added support for single collector in sync_async_collector by @nicolas-dufour in https://github.com/pytorch/rl/pull/482
  • [BugFix] removing unwanted device_safe() by @vmoens in https://github.com/pytorch/rl/pull/486
  • [Refactoring] Refactored get_stats_random_rollout by @nicolas-dufour in https://github.com/pytorch/rl/pull/481
  • [Feature] VIP Integration by @JasonMa2016 in https://github.com/pytorch/rl/pull/487
  • [Refactoring] Minor tweaks to recorder and logger by @nicolas-dufour in https://github.com/pytorch/rl/pull/489
  • [Feature]: Deactivate typechecks in envs by @nicolas-dufour in https://github.com/pytorch/rl/pull/490
  • [BugFix] Vectorized td_lambda with gamma tensor does not match the serial version by @vmoens in https://github.com/pytorch/rl/pull/400
  • [BugFix] Fix TensorDictPrimer init by @vmoens in https://github.com/pytorch/rl/pull/491
  • [Feature] Optional auto-reset when done for collectors and batched envs by @vmoens in https://github.com/pytorch/rl/pull/492
  • [BugFix] Defaulting passing_devices to None by @himjohntang in https://github.com/pytorch/rl/pull/477
  • Revert "[BugFix] Defaulting passing_devices to None" by @vmoens in https://github.com/pytorch/rl/pull/494
  • [BugFix] Multi-agent fixes by @vmoens in https://github.com/pytorch/rl/pull/488
  • [BugFix] Defaulting passing_devices to None by @vmoens in https://github.com/pytorch/rl/pull/495
  • [Feature] Lazy initialization of CatTensors by @vmoens in https://github.com/pytorch/rl/pull/497
  • [Cleanup] Removing cuda 10.2 references by @vmoens in https://github.com/pytorch/rl/pull/498
  • [BugFix] Migration to pytorch org by @vmoens in https://github.com/pytorch/rl/pull/499
  • [Refactoring] Import at root to enable vmap monkey-patching by @vmoens in https://github.com/pytorch/rl/pull/500
  • [BugFix] python version for linting checks by @vmoens in https://github.com/pytorch/rl/pull/502
  • [Feature] Replay Buffers refactor by @bamaxw in https://github.com/pytorch/rl/pull/330
  • [Feature] Rename step_tensordict in step_mdp by @romainjln in https://github.com/pytorch/rl/pull/512
  • [Lint] re-instantiate F821 by @vmoens in https://github.com/pytorch/rl/pull/516
  • [BugFix] run_type_checks for TransformedEnvs by @vmoens in https://github.com/pytorch/rl/pull/513
  • [BugFix] making first_dim and last_dim negative in FlattenObservation when a parent is set by @vmoens in https://github.com/pytorch/rl/pull/511
  • [Feature] Add info dict key-spec pairs to observation_spec by @tcbegley in https://github.com/pytorch/rl/pull/504
  • [BugFix] Changing the dm_control import to fail if not installed by @zeenolife in https://github.com/pytorch/rl/pull/515
  • [CI] Add coverage with codecov by @silvestrebahi in https://github.com/pytorch/rl/pull/523
  • Revert "[CI] Add coverage with codecov" by @vmoens in https://github.com/pytorch/rl/pull/525
  • [Quality] Use relative imports for local c++ deps by @apbard in https://github.com/pytorch/rl/pull/526
  • [Feature] Nightly release by @vmoens in https://github.com/pytorch/rl/pull/519
  • [Feature] Add make_tensordict() function by @sicong-huang in https://github.com/pytorch/rl/pull/522
  • [Doc] Misc readme fixes by @GavinPHR in https://github.com/pytorch/rl/pull/532
  • [BugFix] Replacing inference_mode decorator with no_grad to fix state_dict loading error by @GavinPHR in https://github.com/pytorch/rl/pull/530
  • [BugFix] Transformed ParallelEnv meta data are broken when passing to device by @vmoens in https://github.com/pytorch/rl/pull/531
  • [Doc] Add coverage banner by @vmoens in https://github.com/pytorch/rl/pull/533
  • [BugFix] Fix colab link of coding_dqn.ipynb by @Benjamin-eecs in https://github.com/pytorch/rl/pull/543
  • [BugFix] Fix optional imports by @vmoens in https://github.com/pytorch/rl/pull/535
  • [BugFix] Restore missing keys in data collector output by @tcbegley in https://github.com/pytorch/rl/pull/521
  • [Lint] reorganize imports by @apbard in https://github.com/pytorch/rl/pull/545
  • [BugFix] Single-cpu compatibility by @vmoens in https://github.com/pytorch/rl/pull/548
  • [BugFix] vision install and other deps in optdeps by @vmoens in https://github.com/pytorch/rl/pull/552
  • [Feature] Implemented device argument for modules.models by @yushiyangk in https://github.com/pytorch/rl/pull/524
  • [BugFix] Fix ellipsis indexing of 2d TensorDicts by @vmoens in https://github.com/pytorch/rl/pull/559
  • [BugFix] Additive gaussian exploration spec fix by @vmoens in https://github.com/pytorch/rl/pull/560
  • [BugFix] Disabling video step for wandb by @vmoens in https://github.com/pytorch/rl/pull/561
  • [BugFix] Various device fix by @vmoens in https://github.com/pytorch/rl/pull/558
  • [Feature] Allow collectors to accept regular modules as policies by @tcbegley in https://github.com/pytorch/rl/pull/546
  • [BugFix] Fix push binary nightly action by @psolikov in https://github.com/pytorch/rl/pull/566
  • [BugFix] TensorDict comparison by @vmoens in https://github.com/pytorch/rl/pull/567
  • [BugFix] Fix SyncDataCollector reset by @jrobine in https://github.com/pytorch/rl/pull/571
  • [Doc] Banners on README.md by @vmoens in https://github.com/pytorch/rl/pull/572
  • [Feature] Log printing in alphabetical order when creating a replay buffer by @nikhlrao in https://github.com/pytorch/rl/pull/573
  • [BugFix] Add eps to reward normalization by @vmoens in https://github.com/pytorch/rl/pull/574
  • [BugFix] Fix argument for PPOLoss.get_entropy_bonus() by @vmoens in https://github.com/pytorch/rl/pull/578
  • [Feature] Restructure torchrl/objectives by @sgrigory in https://github.com/pytorch/rl/pull/580
  • [Docs] Documentation revamp by @vmoens in https://github.com/pytorch/rl/pull/581
  • [Doc] Publishing on pytorch.org by @vmoens in https://github.com/pytorch/rl/pull/582
  • Revert "[Doc] Publishing on pytorch.org" by @vmoens in https://github.com/pytorch/rl/pull/584
  • [Doc] Publishing on pytorch.org by @vmoens in https://github.com/pytorch/rl/pull/585
  • Revert "[Doc] Publishing on pytorch.org" by @vmoens in https://github.com/pytorch/rl/pull/586
  • [Doc] Publishing on pytorch.org by @vmoens in https://github.com/pytorch/rl/pull/587
  • [Feature] More restrictive tests on docstrings by @vmoens in https://github.com/pytorch/rl/pull/457
  • [BugFix] Wrong stack import in tests by @vmoens in https://github.com/pytorch/rl/pull/590
  • [Feature] Exclude "_" out_keys in tensordictmodel by @jlesuffleur in https://github.com/pytorch/rl/pull/589
  • [Feature]: Dreamer support by @nicolas-dufour in https://github.com/pytorch/rl/pull/341
  • [Doc] Missing doc for prototype RB by @vmoens in https://github.com/pytorch/rl/pull/595
  • [Feature] Update list of supported libraries by @vmoens in https://github.com/pytorch/rl/pull/594
  • [BugFix] Fix timeit count registration by @vmoens in https://github.com/pytorch/rl/pull/598
  • [Naming] Renaming ProbabilisticTensorDictModule keys by @vmoens in https://github.com/pytorch/rl/pull/603
  • [Feature] Categorical encoding for action space by @artkorenev in https://github.com/pytorch/rl/pull/593
  • [BugFix] ReplayBuffer's storage now signal back when changes happen by @paulomarciano in https://github.com/pytorch/rl/pull/614
  • [Doc] Typos in tensordict tutorial by @PaLeroy in https://github.com/pytorch/rl/pull/621
  • [Doc] Integrate knowledge base in docs by @hatala91 in https://github.com/pytorch/rl/pull/622
  • [Doc] Updating docs requirements by @vmoens in https://github.com/pytorch/rl/pull/624
  • [Feature] Make torchrl runnable without functorch and with gym==0.13 by @vmoens in https://github.com/pytorch/rl/pull/386
  • [Feature] Habitat integration by @vmoens in https://github.com/pytorch/rl/pull/514
  • [Feature] Checkpointing by @vmoens in https://github.com/pytorch/rl/pull/549
  • Add support for null dim argument in TensorDict.squeeze by @jgonik in https://github.com/pytorch/rl/pull/608
  • [Version] Updating to torch 1.13 by @vmoens in https://github.com/pytorch/rl/pull/627
  • [Feature] Sub-memmap tensors by @vmoens in https://github.com/pytorch/rl/pull/626
  • [BugFix] copy_ changes the index if the dest and source memmap tensors share the same file location by @vmoens in https://github.com/pytorch/rl/pull/631
  • [Feature] Unfold transforms for folded TransformedEnv by @alexanderlobov in https://github.com/pytorch/rl/pull/630
  • [BugFix] make TensorDictReplayBuffer.extend call super().extend with stacked_td by @vmoens in https://github.com/pytorch/rl/pull/634
  • [BugFix] correct the use of step_mdp method in data collector by @adityagandhamal in https://github.com/pytorch/rl/pull/637
  • [Feature] Added implement_for decorator by @ordinskiy in https://github.com/pytorch/rl/pull/618
  • [Feature] Make DQN compatible with nn.Module by @svarolgunes in https://github.com/pytorch/rl/pull/632
  • [Example] Distributed Replay Buffer Prototype Example Implementation by @adityagoel4512 in https://github.com/pytorch/rl/pull/615
  • [Feature] Benchmark storage types by @adityagoel4512 in https://github.com/pytorch/rl/pull/633
  • [Feature] Remove wild imports in the library by @sosmond in https://github.com/pytorch/rl/pull/642
  • [BugFix] Prevent transform parent from being reassigned by @jasonfkut in https://github.com/pytorch/rl/pull/641
  • [Feature] Too many deepcopy in transforms.py by @romainjln in https://github.com/pytorch/rl/pull/625
  • [Naming] Rename keys_in to in_keys in transforms.py and related modules by @sardaankita in https://github.com/pytorch/rl/pull/656
  • [Refactoring] Refactor dreamer helper in smaller pieces by @vmoens in https://github.com/pytorch/rl/pull/662
  • [Feature] VIPRewardTransform by @vmoens in https://github.com/pytorch/rl/pull/658
  • [BugFix] make_trainer possible bug for on-policy cases by @albertbou92 in https://github.com/pytorch/rl/pull/655
  • [Naming] Fixing key names by @vmoens in https://github.com/pytorch/rl/pull/668
  • [Test] Check dtypes of envs by @vmoens in https://github.com/pytorch/rl/pull/666
  • [Refactor] Relying on the standalone tensordict -- phase 1 by @vmoens in https://github.com/pytorch/rl/pull/650
  • [Doc] More doc on trainers by @vmoens in https://github.com/pytorch/rl/pull/663
  • [BugFix] PPO example GAE import by @albertbou92 in https://github.com/pytorch/rl/pull/671
  • [BugFix] Use GitHub for flake8 pre-commit hook by @vmoens in https://github.com/pytorch/rl/pull/679
  • [BugFix] Update to strict select by @vmoens in https://github.com/pytorch/rl/pull/675
  • [Feature] Auto-compute stats for ObservationNorm by @romainjln in https://github.com/pytorch/rl/pull/669
  • [Doc] _make_collector helper function by @albertbou92 in https://github.com/pytorch/rl/pull/678
  • [Doc] BatchSubSampler class docstrings example by @albertbou92 in https://github.com/pytorch/rl/pull/677
  • [BugFix] PPO objective crashes if advantage_module is None by @albertbou92 in https://github.com/pytorch/rl/pull/676
  • [Refactor] Refactor 'next_' into nested tensordicts by @vmoens in https://github.com/pytorch/rl/pull/649
  • [Doc] More doc about environments by @vmoens in https://github.com/pytorch/rl/pull/683
  • [Doc] Fix missing tensordict install for doc by @vmoens in https://github.com/pytorch/rl/pull/685
  • [CI] Added CircleCI pipeline to test compatibility across supported gym versions by @ordinskiy in https://github.com/pytorch/rl/pull/645
  • [BugFix] ConvNet forward method with tensors of more than 4 dimensions by @albertbou92 in https://github.com/pytorch/rl/pull/686
  • [Feature] add standard_normal for RewardScaling by @adityagandhamal in https://github.com/pytorch/rl/pull/682
  • [Feature] Jumanji envs by @yingchenlin in https://github.com/pytorch/rl/pull/674
  • [Feature] Default collate_fn by @vmoens in https://github.com/pytorch/rl/pull/688
  • [BugFix] Fix Examples by @vmoens in https://github.com/pytorch/rl/pull/687
  • [Refactoring] Replace direct gym version checks with decorated functions (#) by @ordinskiy in https://github.com/pytorch/rl/pull/691
  • Version 0.0.3 by @vmoens in https://github.com/pytorch/rl/pull/696
  • [Docs] Host TensorDict docs inside TorchRL docs by @tcbegley in https://github.com/pytorch/rl/pull/693
  • [BugFix] Fix docs build by @tcbegley in https://github.com/pytorch/rl/pull/698
  • [BugFix] Proper error messages for orphan transform creation by @vmoens in https://github.com/pytorch/rl/pull/697
  • [Feature] Append, init and insert transforms in ReplayBuffer by @altre in https://github.com/pytorch/rl/pull/695
  • [Feature] A2C objective class and train example by @albertbou92 in https://github.com/pytorch/rl/pull/680
  • [Doc, Test] Add A2C script test and doc by @vmoens in https://github.com/pytorch/rl/pull/702
  • [BugFix] Initialising the classes LazyTensorStorage with a nested TensorDict raises error by @albertbou92 in https://github.com/pytorch/rl/pull/703
  • [BugFix] Fix init_random_frames in A2C example test by @vmoens in https://github.com/pytorch/rl/pull/706
  • [Formatting] Upgrade formatting libs by @vmoens in https://github.com/pytorch/rl/pull/705
  • [Doc] Document undefined symbol error with torch version < 1.13 by @nickspell in https://github.com/pytorch/rl/pull/707
  • [Doc] Tuto integration by @vmoens in https://github.com/pytorch/rl/pull/681
  • [Quality] Deprecate .ipynb tutos by @vmoens in https://github.com/pytorch/rl/pull/710
  • [Test] Fix wrong skip message when functorch is installed by @vmoens in https://github.com/pytorch/rl/pull/711
  • [BugFix, Doc] Clone TensorDict docs into localbuild by @tcbegley in https://github.com/pytorch/rl/pull/712
  • [Feature] Migrate to tensordict.nn.TensorDictModule by @tcbegley in https://github.com/pytorch/rl/pull/700
  • [Doc] Fix Tutos TODOs by @vmoens in https://github.com/pytorch/rl/pull/713
  • [BugFix] RoundRobinWriter, possible duplicated code in the extend method by @albertbou92 in https://github.com/pytorch/rl/pull/709
  • [Feature] Add OptimizerHook by @aakhundov in https://github.com/pytorch/rl/pull/716
  • [Feature] Support for in-place functionalization by @tcbegley in https://github.com/pytorch/rl/pull/714
  • [BugFix] Fix TorchRL demo tutorial by @vmoens in https://github.com/pytorch/rl/pull/721
  • [Docs] Update tutorial links in readme by @tcbegley in https://github.com/pytorch/rl/pull/724
  • [Feature] Extend PPO loss helper to allow for more customisation by @albertbou92 in https://github.com/pytorch/rl/pull/718
  • [BugFix] Model maker functions for A2C and PPO fail for discrete action space envs by @albertbou92 in https://github.com/pytorch/rl/pull/717
  • [Minor] docstrings and setup fixes by @vmoens in https://github.com/pytorch/rl/pull/726
  • [BugFix] Avoid wrongfully erasing observation keys from specs in CatTensors by @vmoens in https://github.com/pytorch/rl/pull/727
  • [BugFix] Avoid wrongfully erasing observation keys from tensordict in CatTensors by @vmoens in https://github.com/pytorch/rl/pull/729
  • [Doc] More doc for data collectors by @vmoens in https://github.com/pytorch/rl/pull/732
  • [Feature] Port test_fake_tensordict to torchrl by @vmoens in https://github.com/pytorch/rl/pull/731
  • [Feature] Use ObservationNorm.init_stats for stats computation in example scripts by @romainjln in https://github.com/pytorch/rl/pull/715
  • [BugFix] init_stats over multiple dimensions by @vmoens in https://github.com/pytorch/rl/pull/735
  • [Refactor] logger creation in examples by @acforvs in https://github.com/pytorch/rl/pull/733
  • [Feature] Brax envs by @yingchenlin in https://github.com/pytorch/rl/pull/722
  • [Refactor] Adopt prototype ProbabilisticTensorDictModule and ProbabilisticTensorDictSequential by @tcbegley in https://github.com/pytorch/rl/pull/728
  • [Doc] Link to doc in README by @vmoens in https://github.com/pytorch/rl/pull/740
  • [Feature] Make GAE return a 'value_target' entry by @vmoens in https://github.com/pytorch/rl/pull/741
  • [Feature] SamplerWithoutReplacement by @vmoens in https://github.com/pytorch/rl/pull/742
  • [Doc, CI] Update doc workflow to run on PR and only publish doc on main by @EmGarr in https://github.com/pytorch/rl/pull/745
  • [Feature] Better advantage API for higher order derivatives by @vmoens in https://github.com/pytorch/rl/pull/744
  • [Refactor] Cosmetic improvements to advantage modules by @vmoens in https://github.com/pytorch/rl/pull/746
  • [BugFix] Fix NoopReset in parallel settings by @vmoens in https://github.com/pytorch/rl/pull/747
  • [Refactor] Remove env.is_done attribute by @vmoens in https://github.com/pytorch/rl/pull/748
  • [Refactor] Drop prototype imports by @tcbegley in https://github.com/pytorch/rl/pull/738
  • [BugFix] Fixes for speed branch merge on tensordict by @vmoens in https://github.com/pytorch/rl/pull/755
  • [BugFix] Fix size-match unsqueeze deprecation by @vmoens in https://github.com/pytorch/rl/pull/750
  • [Feature] FrameSkipTransform by @vmoens in https://github.com/pytorch/rl/pull/749
  • [BugFix] Better memory management for collectors by @vmoens in https://github.com/pytorch/rl/pull/763
  • Minor cleaning in BaseEnv classes by @matteobettini in https://github.com/pytorch/rl/pull/767
  • Revert "Minor cleaning in BaseEnv classes" by @vmoens in https://github.com/pytorch/rl/pull/768
  • Cleaning in envs common.py by @matteobettini in https://github.com/pytorch/rl/pull/769
  • Making _set_seed abstract by @matteobettini in https://github.com/pytorch/rl/pull/770
  • [Feature] Remove the Nd*TensorSpec classes by @riiswa in https://github.com/pytorch/rl/pull/772
  • [BugFix] Reinstantiate custom value key for multioutput value networks by @vmoens in https://github.com/pytorch/rl/pull/754
  • [Feature] Add Step Counter transform by @riiswa in https://github.com/pytorch/rl/pull/756
  • [BugFix] Batched environments with non empty batch size by @matteobettini in https://github.com/pytorch/rl/pull/774
  • Allow unbounded boxes creation from gym spaces by @matteobettini in https://github.com/pytorch/rl/pull/778
  • [BugFix] Doc built cmake error by @vmoens in https://github.com/pytorch/rl/pull/780
  • [Feature] Lazy TensorClass storage by @tcbegley in https://github.com/pytorch/rl/pull/752
  • [BugFix] SyncDataCollector init when device and env_device are different by @albertbou92 in https://github.com/pytorch/rl/pull/765
  • [Feature] RewardSum transform by @albertbou92 in https://github.com/pytorch/rl/pull/751
  • [BugFix] Fix PPO clip by @vmoens in https://github.com/pytorch/rl/pull/786
  • [Feature] MultiDiscreteTensorSpec by @riiswa in https://github.com/pytorch/rl/pull/783
  • [Doc] Doc revamp by @vmoens in https://github.com/pytorch/rl/pull/782
  • [BugFix] ParallelEnv handling of done flag by @matteobettini in https://github.com/pytorch/rl/pull/788
  • [BugFix] Sorting nested keys by @matteobettini in https://github.com/pytorch/rl/pull/787
  • [Doc] README index by @vmoens in https://github.com/pytorch/rl/pull/791
  • Add windows wheel build to CircleCI by @yohann-benchetrit in https://github.com/pytorch/rl/pull/759
  • [Algorithm] MPPI planner by @vmoens in https://github.com/pytorch/rl/pull/701
  • [Doc] Better doc links by @vmoens in https://github.com/pytorch/rl/pull/795
  • [Doc] Missing headers by @vmoens in https://github.com/pytorch/rl/pull/796
  • [Doc] Knowledge base section by @vmoens in https://github.com/pytorch/rl/pull/797
  • [Feature] Vmas library wrapper by @matteobettini in https://github.com/pytorch/rl/pull/785
  • [Doc] Duplicate HabitatEnv entry in docs by @matteobettini in https://github.com/pytorch/rl/pull/798
  • [Feature] MultiDiscreteTensorSpec nvec with several axes by @riiswa in https://github.com/pytorch/rl/pull/789
  • [Refactor] Graduate Replay Buffer prototype by @KamilPiechowiak in https://github.com/pytorch/rl/pull/794
  • [BugFix] Solve R3MTransform init problem by @vmoens in https://github.com/pytorch/rl/pull/803
  • [Refactor] Simplify FlattenObservation default kwargs by @vmoens in https://github.com/pytorch/rl/pull/805
  • [Format] Fix lint by @vmoens in https://github.com/pytorch/rl/pull/811
  • [Doc, BugFix] Fix tutos errors by @vmoens in https://github.com/pytorch/rl/pull/817
  • [Doc] Pretrained models tutorial by @vmoens in https://github.com/pytorch/rl/pull/814
  • [Doc, BugFix] Fix tensordictmodule tutorial by @vmoens in https://github.com/pytorch/rl/pull/819
  • [BugFix] Fix MultOneHotDiscreteTensorSpec.is_in by @riiswa in https://github.com/pytorch/rl/pull/818
  • [Doc] Using R3M with a replay buffer by @vmoens in https://github.com/pytorch/rl/pull/820
  • [CodeQuality] call all() without making a list by @riiswa in https://github.com/pytorch/rl/pull/821
  • [BugFix] [Feature] "_reset" flag for env reset by @matteobettini in https://github.com/pytorch/rl/pull/800
  • [CI] Add unit test workflows for Windows by @yohann-benchetrit in https://github.com/pytorch/rl/pull/804
  • [BugFix] Fix habitat integration and doc by @vmoens in https://github.com/pytorch/rl/pull/812
  • [Minor] Better error reporting by @vmoens in https://github.com/pytorch/rl/pull/822
  • [Minor] Add ninja to deps in toml file by @vmoens in https://github.com/pytorch/rl/pull/823
  • [BugFix] Device of info specs by @vmoens in https://github.com/pytorch/rl/pull/824
  • [BugFix] Fix envs specs and info reading by @vmoens in https://github.com/pytorch/rl/pull/825
  • [Feature] Dtype in vmas tests by @matteobettini in https://github.com/pytorch/rl/pull/827
  • [BugFix] Fix R3M observation spec transform by @vmoens in https://github.com/pytorch/rl/pull/830
  • small change to make @robandpdx a contributor by @robandpdx in https://github.com/pytorch/rl/pull/831
  • [Feature] Exclude and select transforms by @vmoens in https://github.com/pytorch/rl/pull/832
  • [BugFix] Updating Recorder to accommodate "solved" key by @ShahRutav in https://github.com/pytorch/rl/pull/833
  • [BugFix] Changed "set_count" set in collectors by @matteobettini in https://github.com/pytorch/rl/pull/835
  • [Algorithm] Td3 by @BY571 in https://github.com/pytorch/rl/pull/684
  • [Doc] A Succinct Summary of Reinforcement Learning by @vmoens in https://github.com/pytorch/rl/pull/840
  • [Feature, BugFix] ObservationNorm keep_dims and RewardSum init by @vmoens in https://github.com/pytorch/rl/pull/839
  • [BugFix] Improve done checking of collectors by @matteobettini in https://github.com/pytorch/rl/pull/838
  • [BugFix] Sync with tensordict (meta-tensor deprecation) by @vmoens in https://github.com/pytorch/rl/pull/842
  • [Feature] Refactor CatFrames using a proper preallocated buffer by @vmoens in https://github.com/pytorch/rl/pull/847
  • [CI] Add Github-Actions workflows for Windows wheels & nightly-build by @yohann-benchetrit in https://github.com/pytorch/rl/pull/837
  • [Doc] Fix broken link Dreamer by @atonkamanda in https://github.com/pytorch/rl/pull/853
  • [BugFix] Loading state_dict on uninitialized CatFrames by @vmoens in https://github.com/pytorch/rl/pull/855
  • [Refactor] Move loggers to torchrl.record by @vmoens in https://github.com/pytorch/rl/pull/854
  • [Refactor] specs batch size refactoring by @vmoens in https://github.com/pytorch/rl/pull/829
  • [Feature] Max pool Transform by @albertbou92 in https://github.com/pytorch/rl/pull/841
  • [Feature] Refactor advantages for continuous batches by @vmoens in https://github.com/pytorch/rl/pull/848
  • [BugFix, Doc] Minor fix in doc by @vmoens in https://github.com/pytorch/rl/pull/858
  • [Versioning] Version 0.0.4a by @vmoens in https://github.com/pytorch/rl/pull/859
  • [Feature] Vmas to device by @matteobettini in https://github.com/pytorch/rl/pull/850
  • [BugFix] Fix zero-ing from specs in RewardSum by @vmoens in https://github.com/pytorch/rl/pull/860
  • [Feature] Loading R3M and VIP from ResNet by @vmoens in https://github.com/pytorch/rl/pull/863
  • [Feature] SAC V2 by @vmoens in https://github.com/pytorch/rl/pull/864
  • [BugFix] Avoid collision of "step_count" key from transform and collector by @vmoens in https://github.com/pytorch/rl/pull/868
  • [Refactor] Better init for CatFrames buffers + removing default init values by @vmoens in https://github.com/pytorch/rl/pull/874
  • [Refactor] Minor refactorings to envs by @vmoens in https://github.com/pytorch/rl/pull/872
  • [Refactor] Removing inplace transform attribute by @vmoens in https://github.com/pytorch/rl/pull/871
  • [BugFix] Run checks when creating fake_td by @vmoens in https://github.com/pytorch/rl/pull/877
  • [Refactor] Box device by @vmoens in https://github.com/pytorch/rl/pull/881
  • [Feature] Multithreaded env by @sgrigory in https://github.com/pytorch/rl/pull/734
  • [Refactor] Turn off default advantage normalization in PPO by @vmoens in https://github.com/pytorch/rl/pull/887
  • [CI] Fix habitat-gym imports by @vmoens in https://github.com/pytorch/rl/pull/890
  • [CI] Fix cuda versions by @vmoens in https://github.com/pytorch/rl/pull/889
  • [CI] Fix windows install by @vmoens in https://github.com/pytorch/rl/pull/888
  • MacOS CPU unit test workflow using GitHub Actions by @robandpdx in https://github.com/pytorch/rl/pull/886
  • Linux CPU unit test workflow using GitHub Actions by @robandpdx in https://github.com/pytorch/rl/pull/826
  • [Major, BugFix, Test] Refactor Transforms tests by @vmoens in https://github.com/pytorch/rl/pull/878
  • [Bugfix] Codecov does not cover multiprocessed tests #879 by @kadeng in https://github.com/pytorch/rl/pull/893
  • [CI, BugFix] Fix gym related errors by @vmoens in https://github.com/pytorch/rl/pull/895
  • [WIP] Linux GPU unit test workflow using GitHub Actions by @robandpdx in https://github.com/pytorch/rl/pull/885
  • [BugFix] Compose cloning fix by @vmoens in https://github.com/pytorch/rl/pull/899
  • [Feature] Simplifying collector envs by @vmoens in https://github.com/pytorch/rl/pull/870
  • [CI,Feature] Upgrade to gymnasium by @vmoens in https://github.com/pytorch/rl/pull/898
  • [Doc] Add record utils to doc by @vmoens in https://github.com/pytorch/rl/pull/904
  • [Test] Improve exception message match by @apbard in https://github.com/pytorch/rl/pull/906
  • [BugFix] Dreamer helpers are broken with batched envs by @vmoens in https://github.com/pytorch/rl/pull/903
  • [Feature] RandomCropTensorDict transform by @vmoens in https://github.com/pytorch/rl/pull/908
  • [Versioning] Version 0.0.4b by @vmoens in https://github.com/pytorch/rl/pull/909

New Contributors

  • @sladebot made their first contribution in https://github.com/pytorch/rl/pull/435
  • @rayanht made their first contribution in https://github.com/pytorch/rl/pull/432
  • @brandonsj made their first contribution in https://github.com/pytorch/rl/pull/475
  • @ordinskiy made their first contribution in https://github.com/pytorch/rl/pull/485
  • @JasonMa2016 made their first contribution in https://github.com/pytorch/rl/pull/487
  • @himjohntang made their first contribution in https://github.com/pytorch/rl/pull/477
  • @romainjln made their first contribution in https://github.com/pytorch/rl/pull/512
  • @apbard made their first contribution in https://github.com/pytorch/rl/pull/526
  • @sicong-huang made their first contribution in https://github.com/pytorch/rl/pull/522
  • @psolikov made their first contribution in https://github.com/pytorch/rl/pull/566
  • @jrobine made their first contribution in https://github.com/pytorch/rl/pull/571
  • @nikhlrao made their first contribution in https://github.com/pytorch/rl/pull/573
  • @sgrigory made their first contribution in https://github.com/pytorch/rl/pull/580
  • @jlesuffleur made their first contribution in https://github.com/pytorch/rl/pull/589
  • @artkorenev made their first contribution in https://github.com/pytorch/rl/pull/593
  • @paulomarciano made their first contribution in https://github.com/pytorch/rl/pull/614
  • @hatala91 made their first contribution in https://github.com/pytorch/rl/pull/622
  • @jgonik made their first contribution in https://github.com/pytorch/rl/pull/608
  • @adityagandhamal made their first contribution in https://github.com/pytorch/rl/pull/637
  • @svarolgunes made their first contribution in https://github.com/pytorch/rl/pull/632
  • @adityagoel4512 made their first contribution in https://github.com/pytorch/rl/pull/615
  • @jasonfkut made their first contribution in https://github.com/pytorch/rl/pull/641
  • @sardaankita made their first contribution in https://github.com/pytorch/rl/pull/656
  • @albertbou92 made their first contribution in https://github.com/pytorch/rl/pull/655
  • @yingchenlin made their first contribution in https://github.com/pytorch/rl/pull/674
  • @altre made their first contribution in https://github.com/pytorch/rl/pull/695
  • @nickspell made their first contribution in https://github.com/pytorch/rl/pull/707
  • @aakhundov made their first contribution in https://github.com/pytorch/rl/pull/716
  • @acforvs made their first contribution in https://github.com/pytorch/rl/pull/733
  • @EmGarr made their first contribution in https://github.com/pytorch/rl/pull/745
  • @matteobettini made their first contribution in https://github.com/pytorch/rl/pull/767
  • @riiswa made their first contribution in https://github.com/pytorch/rl/pull/772
  • @yohann-benchetrit made their first contribution in https://github.com/pytorch/rl/pull/759
  • @KamilPiechowiak made their first contribution in https://github.com/pytorch/rl/pull/794
  • @robandpdx made their first contribution in https://github.com/pytorch/rl/pull/831
  • @ShahRutav made their first contribution in https://github.com/pytorch/rl/pull/833
  • @BY571 made their first contribution in https://github.com/pytorch/rl/pull/684
  • @atonkamanda made their first contribution in https://github.com/pytorch/rl/pull/853
  • @kadeng made their first contribution in https://github.com/pytorch/rl/pull/893

Full Changelog: https://github.com/pytorch/rl/compare/v0.0.2a...v0.0.4b

- Python
Published by vmoens over 3 years ago

torchrl - v0.0.4-beta

What's Changed

  • [CI, Doc] Update functorch source installation command by @zou3519 in https://github.com/pytorch/rl/pull/446
  • [BugFix] TransformedEnv attributes inheritance by @vmoens in https://github.com/pytorch/rl/pull/467
  • [Feature] Cleanup mocking envs init and new by @vmoens in https://github.com/pytorch/rl/pull/469
  • [Tests] Adding tensordict __repr__ tests by @sladebot in https://github.com/pytorch/rl/pull/435
  • [Logging]: implement MLFlow logging integration by @rayanht in https://github.com/pytorch/rl/pull/432
  • [BugFix] MLFlow import fix by @vmoens in https://github.com/pytorch/rl/pull/473
  • [BugFix] Fixed pip install by @brandonsj in https://github.com/pytorch/rl/pull/475
  • [Features]: Changed _inplace_update cls parameter passing in __new__ by @nicolas-dufour in https://github.com/pytorch/rl/pull/464
  • [Feature]: ModelBased Envs by @nicolas-dufour in https://github.com/pytorch/rl/pull/333
  • [Feature] make ReplayBufferTrainer compatible with storing trajectories by @vmoens in https://github.com/pytorch/rl/pull/476
  • [Tutorial] DQN tutorial by @vmoens in https://github.com/pytorch/rl/pull/474
  • [Feature] reader hooks for GymLike by @vmoens in https://github.com/pytorch/rl/pull/478
  • [BugFix] TensorSpec.zero(None) failure fix by @vmoens in https://github.com/pytorch/rl/pull/483
  • [Feature]: Support for planners and CEM by @nicolas-dufour in https://github.com/pytorch/rl/pull/384
  • [Feature] Replaced device_safe() with device by @ordinskiy in https://github.com/pytorch/rl/pull/485
  • [Feature]: TensorDictPrimer transform by @nicolas-dufour in https://github.com/pytorch/rl/pull/456
  • [Feature]: erase() method for torchrl.timeit by @nicolas-dufour in https://github.com/pytorch/rl/pull/480
  • [Feature] Added support for single collector in sync_async_collector by @nicolas-dufour in https://github.com/pytorch/rl/pull/482
  • [BugFix] removing unwanted device_safe() by @vmoens in https://github.com/pytorch/rl/pull/486
  • [Refactoring] Refactored get_stats_random_rollout by @nicolas-dufour in https://github.com/pytorch/rl/pull/481
  • [Feature] VIP Integration by @JasonMa2016 in https://github.com/pytorch/rl/pull/487
  • [Refactoring] Minor tweaks to recorder and logger by @nicolas-dufour in https://github.com/pytorch/rl/pull/489
  • [Feature]: Deactivate typechecks in envs by @nicolas-dufour in https://github.com/pytorch/rl/pull/490
  • [BugFix] Vectorized td_lambda with gamma tensor does not match the serial version by @vmoens in https://github.com/pytorch/rl/pull/400
  • [BugFix] Fix TensorDictPrimer init by @vmoens in https://github.com/pytorch/rl/pull/491
  • [Feature] Optional auto-reset when done for collectors and batched envs by @vmoens in https://github.com/pytorch/rl/pull/492
  • [BugFix] Defaulting passing_devices to None by @himjohntang in https://github.com/pytorch/rl/pull/477
  • Revert "[BugFix] Defaulting passing_devices to None" by @vmoens in https://github.com/pytorch/rl/pull/494
  • [BugFix] Multi-agent fixes by @vmoens in https://github.com/pytorch/rl/pull/488
  • [BugFix] Defaulting passing_devices to None by @vmoens in https://github.com/pytorch/rl/pull/495
  • [Feature] Lazy initialization of CatTensors by @vmoens in https://github.com/pytorch/rl/pull/497
  • [Cleanup] Removing cuda 10.2 references by @vmoens in https://github.com/pytorch/rl/pull/498
  • [BugFix] Migration to pytorch org by @vmoens in https://github.com/pytorch/rl/pull/499
  • [Refactoring] Import at root to enable vmap monkey-patching by @vmoens in https://github.com/pytorch/rl/pull/500
  • [BugFix] python version for linting checks by @vmoens in https://github.com/pytorch/rl/pull/502
  • [Feature] Replay Buffers refactor by @bamaxw in https://github.com/pytorch/rl/pull/330
  • [Feature] Rename step_tensordict to step_mdp by @romainjln in https://github.com/pytorch/rl/pull/512
  • [Lint] re-instantiate F821 by @vmoens in https://github.com/pytorch/rl/pull/516
  • [BugFix] run_type_checks for TransformedEnvs by @vmoens in https://github.com/pytorch/rl/pull/513
  • [BugFix] making first_dim and last_dim negative in FlattenObservation when a parent is set by @vmoens in https://github.com/pytorch/rl/pull/511
  • [Feature] Add info dict key-spec pairs to observation_spec by @tcbegley in https://github.com/pytorch/rl/pull/504
  • [BugFix] Changing the dm_control import to fail if not installed by @zeenolife in https://github.com/pytorch/rl/pull/515
  • [CI] Add coverage with codecov by @silvestrebahi in https://github.com/pytorch/rl/pull/523
  • Revert "[CI] Add coverage with codecov" by @vmoens in https://github.com/pytorch/rl/pull/525
  • [Quality] Use relative imports for local c++ deps by @apbard in https://github.com/pytorch/rl/pull/526
  • [Feature] Nightly release by @vmoens in https://github.com/pytorch/rl/pull/519
  • [Feature] Add make_tensordict() function by @sicong-huang in https://github.com/pytorch/rl/pull/522
  • [Doc] Misc readme fixes by @GavinPHR in https://github.com/pytorch/rl/pull/532
  • [BugFix] Replacing inference_mode decorator with no_grad to fix state_dict loading error by @GavinPHR in https://github.com/pytorch/rl/pull/530
  • [BugFix] Transformed ParallelEnv meta data are broken when passing to device by @vmoens in https://github.com/pytorch/rl/pull/531
  • [Doc] Add coverage banner by @vmoens in https://github.com/pytorch/rl/pull/533
  • [BugFix] Fix colab link of coding_dqn.ipynb by @Benjamin-eecs in https://github.com/pytorch/rl/pull/543
  • [BugFix] Fix optional imports by @vmoens in https://github.com/pytorch/rl/pull/535
  • [BugFix] Restore missing keys in data collector output by @tcbegley in https://github.com/pytorch/rl/pull/521
  • [Lint] reorganize imports by @apbard in https://github.com/pytorch/rl/pull/545
  • [BugFix] Single-cpu compatibility by @vmoens in https://github.com/pytorch/rl/pull/548
  • [BugFix] vision install and other deps in optdeps by @vmoens in https://github.com/pytorch/rl/pull/552
  • [Feature] Implemented device argument for modules.models by @yushiyangk in https://github.com/pytorch/rl/pull/524
  • [BugFix] Fix ellipsis indexing of 2d TensorDicts by @vmoens in https://github.com/pytorch/rl/pull/559
  • [BugFix] Additive gaussian exploration spec fix by @vmoens in https://github.com/pytorch/rl/pull/560
  • [BugFix] Disabling video step for wandb by @vmoens in https://github.com/pytorch/rl/pull/561
  • [BugFix] Various device fix by @vmoens in https://github.com/pytorch/rl/pull/558
  • [Feature] Allow collectors to accept regular modules as policies by @tcbegley in https://github.com/pytorch/rl/pull/546
  • [BugFix] Fix push binary nightly action by @psolikov in https://github.com/pytorch/rl/pull/566
  • [BugFix] TensorDict comparison by @vmoens in https://github.com/pytorch/rl/pull/567
  • [BugFix] Fix SyncDataCollector reset by @jrobine in https://github.com/pytorch/rl/pull/571
  • [Doc] Banners on README.md by @vmoens in https://github.com/pytorch/rl/pull/572
  • [Feature] Log printing in alphabetical order when creating a replay buffer by @nikhlrao in https://github.com/pytorch/rl/pull/573
  • [BugFix] Add eps to reward normalization by @vmoens in https://github.com/pytorch/rl/pull/574
  • [BugFix] Fix argument for PPOLoss.get_entropy_bonus() by @vmoens in https://github.com/pytorch/rl/pull/578
  • [Feature] Restructure torchrl/objectives by @sgrigory in https://github.com/pytorch/rl/pull/580
  • [Docs] Documentation revamp by @vmoens in https://github.com/pytorch/rl/pull/581
  • [Doc] Publishing on pytorch.org by @vmoens in https://github.com/pytorch/rl/pull/582
  • Revert "[Doc] Publishing on pytorch.org" by @vmoens in https://github.com/pytorch/rl/pull/584
  • [Doc] Publishing on pytorch.org by @vmoens in https://github.com/pytorch/rl/pull/585
  • Revert "[Doc] Publishing on pytorch.org" by @vmoens in https://github.com/pytorch/rl/pull/586
  • [Doc] Publishing on pytorch.org by @vmoens in https://github.com/pytorch/rl/pull/587
  • [Feature] More restrictive tests on docstrings by @vmoens in https://github.com/pytorch/rl/pull/457
  • [BugFix] Wrong stack import in tests by @vmoens in https://github.com/pytorch/rl/pull/590
  • [Feature] Exclude "_" out_keys in tensordictmodel by @jlesuffleur in https://github.com/pytorch/rl/pull/589
  • [Feature]: Dreamer support by @nicolas-dufour in https://github.com/pytorch/rl/pull/341
  • [Doc] Missing doc for prototype RB by @vmoens in https://github.com/pytorch/rl/pull/595
  • [Feature] Update list of supported libraries by @vmoens in https://github.com/pytorch/rl/pull/594
  • [BugFix] Fix timeit count registration by @vmoens in https://github.com/pytorch/rl/pull/598
  • [Naming] Renaming ProbabilisticTensorDictModule keys by @vmoens in https://github.com/pytorch/rl/pull/603
  • [Feature] Categorical encoding for action space by @artkorenev in https://github.com/pytorch/rl/pull/593
  • [BugFix] ReplayBuffer's storage now signals back when changes happen by @paulomarciano in https://github.com/pytorch/rl/pull/614
  • [Doc] Typos in tensordict tutorial by @PaLeroy in https://github.com/pytorch/rl/pull/621
  • [Doc] Integrate knowledge base in docs by @hatala91 in https://github.com/pytorch/rl/pull/622
  • [Doc] Updating docs requirements by @vmoens in https://github.com/pytorch/rl/pull/624
  • [Feature] Make torchrl runnable without functorch and with gym==0.13 by @vmoens in https://github.com/pytorch/rl/pull/386
  • [Feature] Habitat integration by @vmoens in https://github.com/pytorch/rl/pull/514
  • [Feature] Checkpointing by @vmoens in https://github.com/pytorch/rl/pull/549
  • Add support for null dim argument in TensorDict.squeeze by @jgonik in https://github.com/pytorch/rl/pull/608
  • [Version] Updating to torch 1.13 by @vmoens in https://github.com/pytorch/rl/pull/627
  • [Feature] Sub-memmap tensors by @vmoens in https://github.com/pytorch/rl/pull/626
  • [BugFix] copy_ changes the index if the dest and source memmap tensors share the same file location by @vmoens in https://github.com/pytorch/rl/pull/631
  • [Feature] Unfold transforms for folded TransformedEnv by @alexanderlobov in https://github.com/pytorch/rl/pull/630
  • [BugFix] make TensorDictReplayBuffer.extend call super().extend with stacked_td by @vmoens in https://github.com/pytorch/rl/pull/634
  • [BugFix] correct the use of step_mdp method in data collector by @adityagandhamal in https://github.com/pytorch/rl/pull/637
  • [Feature] Added implement_for decorator by @ordinskiy in https://github.com/pytorch/rl/pull/618
  • [Feature] Make DQN compatible with nn.Module by @svarolgunes in https://github.com/pytorch/rl/pull/632
  • [Example] Distributed Replay Buffer Prototype Example Implementation by @adityagoel4512 in https://github.com/pytorch/rl/pull/615
  • [Feature] Benchmark storage types by @adityagoel4512 in https://github.com/pytorch/rl/pull/633
  • [Feature] Remove wild imports in the library by @sosmond in https://github.com/pytorch/rl/pull/642
  • [BugFix] Prevent transform parent from being reassigned by @jasonfkut in https://github.com/pytorch/rl/pull/641
  • [Feature] Too many deepcopy in transforms.py by @romainjln in https://github.com/pytorch/rl/pull/625
  • [Naming] Rename keys_in to in_keys in transforms.py and related modules by @sardaankita in https://github.com/pytorch/rl/pull/656
  • [Refactoring] Refactor dreamer helper in smaller pieces by @vmoens in https://github.com/pytorch/rl/pull/662
  • [Feature] VIPRewardTransform by @vmoens in https://github.com/pytorch/rl/pull/658
  • [BugFix] make_trainer possible bug for on-policy cases by @albertbou92 in https://github.com/pytorch/rl/pull/655
  • [Naming] Fixing key names by @vmoens in https://github.com/pytorch/rl/pull/668
  • [Test] Check dtypes of envs by @vmoens in https://github.com/pytorch/rl/pull/666
  • [Refactor] Relying on the standalone tensordict -- phase 1 by @vmoens in https://github.com/pytorch/rl/pull/650
  • [Doc] More doc on trainers by @vmoens in https://github.com/pytorch/rl/pull/663
  • [BugFix] PPO example GAE import by @albertbou92 in https://github.com/pytorch/rl/pull/671
  • [BugFix] Use GitHub for flake8 pre-commit hook by @vmoens in https://github.com/pytorch/rl/pull/679
  • [BugFix] Update to strict select by @vmoens in https://github.com/pytorch/rl/pull/675
  • [Feature] Auto-compute stats for ObservationNorm by @romainjln in https://github.com/pytorch/rl/pull/669
  • [Doc] makecollector helper function by @albertbou92 in https://github.com/pytorch/rl/pull/678
  • [Doc] BatchSubSampler class docstrings example by @albertbou92 in https://github.com/pytorch/rl/pull/677
  • [BugFix] PPO objective crashes if advantage_module is None by @albertbou92 in https://github.com/pytorch/rl/pull/676
  • [Refactor] Refactor 'next_' into nested tensordicts by @vmoens in https://github.com/pytorch/rl/pull/649
  • [Doc] More doc about environments by @vmoens in https://github.com/pytorch/rl/pull/683
  • [Doc] Fix missing tensordict install for doc by @vmoens in https://github.com/pytorch/rl/pull/685
  • [CI] Added CircleCI pipeline to test compatibility across supported gym versions by @ordinskiy in https://github.com/pytorch/rl/pull/645
  • [BugFix] ConvNet forward method with tensors of more than 4 dimensions by @albertbou92 in https://github.com/pytorch/rl/pull/686
  • [Feature] add standard_normal for RewardScaling by @adityagandhamal in https://github.com/pytorch/rl/pull/682
  • [Feature] Jumanji envs by @yingchenlin in https://github.com/pytorch/rl/pull/674
  • [Feature] Default collate_fn by @vmoens in https://github.com/pytorch/rl/pull/688
  • [BugFix] Fix Examples by @vmoens in https://github.com/pytorch/rl/pull/687
  • [Refactoring] Replace direct gym version checks with decorated functions (#) by @ordinskiy in https://github.com/pytorch/rl/pull/691
  • Version 0.0.3 by @vmoens in https://github.com/pytorch/rl/pull/696
  • [Docs] Host TensorDict docs inside TorchRL docs by @tcbegley in https://github.com/pytorch/rl/pull/693
  • [BugFix] Fix docs build by @tcbegley in https://github.com/pytorch/rl/pull/698
  • [BugFix] Proper error messages for orphan transform creation by @vmoens in https://github.com/pytorch/rl/pull/697
  • [Feature] Append, init and insert transforms in ReplayBuffer by @altre in https://github.com/pytorch/rl/pull/695
  • [Feature] A2C objective class and train example by @albertbou92 in https://github.com/pytorch/rl/pull/680
  • [Doc, Test] Add A2C script test and doc by @vmoens in https://github.com/pytorch/rl/pull/702
  • [BugFix] Initialising the classes LazyTensorStorage with a nested TensorDict raises error by @albertbou92 in https://github.com/pytorch/rl/pull/703
  • [BugFix] Fix init_random_frames in A2C example test by @vmoens in https://github.com/pytorch/rl/pull/706
  • [Formatting] Upgrade formatting libs by @vmoens in https://github.com/pytorch/rl/pull/705
  • [Doc] Document undefined symbol error with torch version < 1.13 by @nickspell in https://github.com/pytorch/rl/pull/707
  • [Doc] Tuto integration by @vmoens in https://github.com/pytorch/rl/pull/681
  • [Quality] Deprecate .ipynb tutos by @vmoens in https://github.com/pytorch/rl/pull/710
  • [Test] Fix wrong skip message when functorch is installed by @vmoens in https://github.com/pytorch/rl/pull/711
  • [BugFix, Doc] Clone TensorDict docs into localbuild by @tcbegley in https://github.com/pytorch/rl/pull/712
  • [Feature] Migrate to tensordict.nn.TensorDictModule by @tcbegley in https://github.com/pytorch/rl/pull/700
  • [Doc] Fix Tutos TODOs by @vmoens in https://github.com/pytorch/rl/pull/713
  • [BugFix] RoundRobinWriter, possible duplicated code in the extend method by @albertbou92 in https://github.com/pytorch/rl/pull/709
  • [Feature] Add OptimizerHook by @aakhundov in https://github.com/pytorch/rl/pull/716
  • [Feature] Support for in-place functionalization by @tcbegley in https://github.com/pytorch/rl/pull/714
  • [BugFix] Fix TorchRL demo tutorial by @vmoens in https://github.com/pytorch/rl/pull/721
  • [Docs] Update tutorial links in readme by @tcbegley in https://github.com/pytorch/rl/pull/724
  • [Feature] Extend PPO loss helper to allow for more customisation by @albertbou92 in https://github.com/pytorch/rl/pull/718
  • [BugFix] Model maker functions for A2C and PPO fail for discrete action space envs by @albertbou92 in https://github.com/pytorch/rl/pull/717
  • [Minor] docstrings and setup fixes by @vmoens in https://github.com/pytorch/rl/pull/726
  • [BugFix] Avoid wrongfully erasing observation keys from specs in CatTensors by @vmoens in https://github.com/pytorch/rl/pull/727
  • [BugFix] Avoid wrongfully erasing observation keys from tensordict in CatTensors by @vmoens in https://github.com/pytorch/rl/pull/729
  • [Doc] More doc for data collectors by @vmoens in https://github.com/pytorch/rl/pull/732
  • [Feature] Port test_fake_tensordict to torchrl by @vmoens in https://github.com/pytorch/rl/pull/731
  • [Feature] Use ObservationNorm.init_stats for stats computation in example scripts by @romainjln in https://github.com/pytorch/rl/pull/715
  • [BugFix] init_stats over multiple dimensions by @vmoens in https://github.com/pytorch/rl/pull/735
  • [Refactor] logger creation in examples by @acforvs in https://github.com/pytorch/rl/pull/733
  • [Feature] Brax envs by @yingchenlin in https://github.com/pytorch/rl/pull/722
  • [Refactor] Adopt prototype ProbabilisticTensorDictModule and ProbabilisticTensorDictSequential by @tcbegley in https://github.com/pytorch/rl/pull/728
  • [Doc] Link to doc in README by @vmoens in https://github.com/pytorch/rl/pull/740
  • [Feature] Make GAE return a 'value_target' entry by @vmoens in https://github.com/pytorch/rl/pull/741
  • [Feature] SamplerWithoutReplacement by @vmoens in https://github.com/pytorch/rl/pull/742
  • [Doc, CI] Update doc workflow to run on PR and only publish doc on main. by @EmGarr in https://github.com/pytorch/rl/pull/745
  • [Feature] Better advantage API for higher order derivatives by @vmoens in https://github.com/pytorch/rl/pull/744
  • [Refactor] Cosmetic improvements to advantage modules by @vmoens in https://github.com/pytorch/rl/pull/746
  • [BugFix] Fix NoopReset in parallel settings by @vmoens in https://github.com/pytorch/rl/pull/747
  • [Refactor] Remove env.is_done attribute by @vmoens in https://github.com/pytorch/rl/pull/748
  • [Refactor] Drop prototype imports by @tcbegley in https://github.com/pytorch/rl/pull/738
  • [BugFix] Fixes for speed branch merge on tensordict by @vmoens in https://github.com/pytorch/rl/pull/755
  • [BugFix] Fix size-match unsqueeze deprecation by @vmoens in https://github.com/pytorch/rl/pull/750
  • [Feature] FrameSkipTransform by @vmoens in https://github.com/pytorch/rl/pull/749
  • [BugFix] Better memory management for collectors by @vmoens in https://github.com/pytorch/rl/pull/763
  • Minor cleaning in BaseEnv classes by @matteobettini in https://github.com/pytorch/rl/pull/767
  • Revert "Minor cleaning in BaseEnv classes" by @vmoens in https://github.com/pytorch/rl/pull/768
  • Cleaning in envs common.py by @matteobettini in https://github.com/pytorch/rl/pull/769
  • Making _set_seed abstract by @matteobettini in https://github.com/pytorch/rl/pull/770
  • [Feature] Remove the Nd*TensorSpec classes by @riiswa in https://github.com/pytorch/rl/pull/772
  • [BugFix] Reinstantiate custom value key for multioutput value networks by @vmoens in https://github.com/pytorch/rl/pull/754
  • [Feature] Add Step Counter transform by @riiswa in https://github.com/pytorch/rl/pull/756
  • [BugFix] Batched environments with non empty batch size by @matteobettini in https://github.com/pytorch/rl/pull/774
  • Allow unbounded boxes creation from gym spaces by @matteobettini in https://github.com/pytorch/rl/pull/778
  • [BugFix] Doc built cmake error by @vmoens in https://github.com/pytorch/rl/pull/780
  • [Feature] Lazy TensorClass storage by @tcbegley in https://github.com/pytorch/rl/pull/752
  • [BugFix] SyncDataCollector init when device and env_device are different by @albertbou92 in https://github.com/pytorch/rl/pull/765
  • [Feature] RewardSum transform by @albertbou92 in https://github.com/pytorch/rl/pull/751
  • [BugFix] Fix PPO clip by @vmoens in https://github.com/pytorch/rl/pull/786
  • [Feature] MultiDiscreteTensorSpec by @riiswa in https://github.com/pytorch/rl/pull/783
  • [Doc] Doc revamp by @vmoens in https://github.com/pytorch/rl/pull/782
  • [BugFix] ParallelEnv handling of done flag by @matteobettini in https://github.com/pytorch/rl/pull/788
  • [BugFix] Sorting nested keys by @matteobettini in https://github.com/pytorch/rl/pull/787
  • [Doc] README index by @vmoens in https://github.com/pytorch/rl/pull/791
  • Add windows wheel build to CircleCI by @yohann-benchetrit in https://github.com/pytorch/rl/pull/759
  • [Algorithm] MPPI planner by @vmoens in https://github.com/pytorch/rl/pull/701
  • [Doc] Better doc links by @vmoens in https://github.com/pytorch/rl/pull/795
  • [Doc] Missing headers by @vmoens in https://github.com/pytorch/rl/pull/796
  • [Doc] Knowledge base section by @vmoens in https://github.com/pytorch/rl/pull/797
  • [Feature] Vmas library wrapper by @matteobettini in https://github.com/pytorch/rl/pull/785
  • [Doc] Duplicate HabitatEnv entry in docs by @matteobettini in https://github.com/pytorch/rl/pull/798
  • [Feature] MultiDiscreteTensorSpec nvec with several axes by @riiswa in https://github.com/pytorch/rl/pull/789
  • [Refactor] Graduate Replay Buffer prototype by @KamilPiechowiak in https://github.com/pytorch/rl/pull/794
  • [BugFix] Solve R3MTransform init problem by @vmoens in https://github.com/pytorch/rl/pull/803
  • [Refactor] Simplify FlattenObservation default kwargs by @vmoens in https://github.com/pytorch/rl/pull/805
  • [Format] Fix lint by @vmoens in https://github.com/pytorch/rl/pull/811
  • [Doc, BugFix] Fix tutos errors by @vmoens in https://github.com/pytorch/rl/pull/817
  • [Doc] Pretrained models tutorial by @vmoens in https://github.com/pytorch/rl/pull/814
  • [Doc, BugFix] Fix tensordictmodule tutorial by @vmoens in https://github.com/pytorch/rl/pull/819
  • [BugFix] Fix MultOneHotDiscreteTensorSpec.is_in by @riiswa in https://github.com/pytorch/rl/pull/818
  • [Doc] Using R3M with a replay buffer by @vmoens in https://github.com/pytorch/rl/pull/820
  • [CodeQuality] call all() without making a list by @riiswa in https://github.com/pytorch/rl/pull/821
  • [BugFix] [Feature] "_reset" flag for env reset by @matteobettini in https://github.com/pytorch/rl/pull/800
  • [CI] Add unit test workflows for Windows by @yohann-benchetrit in https://github.com/pytorch/rl/pull/804
  • [BugFix] Fix habitat integration and doc by @vmoens in https://github.com/pytorch/rl/pull/812
  • [Minor] Better error reporting by @vmoens in https://github.com/pytorch/rl/pull/822
  • [Minor] Add ninja to deps in toml file by @vmoens in https://github.com/pytorch/rl/pull/823
  • [BugFix] Device of info specs by @vmoens in https://github.com/pytorch/rl/pull/824
  • [BugFix] Fix envs specs and info reading by @vmoens in https://github.com/pytorch/rl/pull/825
  • [Feature] Dtype in vmas tests by @matteobettini in https://github.com/pytorch/rl/pull/827
  • [BugFix] Fix R3M observation spec transform by @vmoens in https://github.com/pytorch/rl/pull/830
  • small change to make @robandpdx a contributor by @robandpdx in https://github.com/pytorch/rl/pull/831
  • [Feature] Exclude and select transforms by @vmoens in https://github.com/pytorch/rl/pull/832
  • [BugFix] Updating Recorder to accommodate "solved" key by @ShahRutav in https://github.com/pytorch/rl/pull/833
  • [BugFix] Changed "set_count" set in collectors by @matteobettini in https://github.com/pytorch/rl/pull/835
  • [Algorithm] Td3 by @BY571 in https://github.com/pytorch/rl/pull/684
  • [Doc] A Succinct Summary of Reinforcement Learning by @vmoens in https://github.com/pytorch/rl/pull/840
  • [Feature, BugFix] ObservationNorm keep_dims and RewardSum init by @vmoens in https://github.com/pytorch/rl/pull/839
  • [BugFix] Improve done checking of collectors by @matteobettini in https://github.com/pytorch/rl/pull/838
  • [BugFix] Sync with tensordict (meta-tensor deprecation) by @vmoens in https://github.com/pytorch/rl/pull/842
  • [Feature] Refactor CatFrames using a proper preallocated buffer by @vmoens in https://github.com/pytorch/rl/pull/847
  • [CI] Add Github-Actions workflows for Windows wheels & nightly-build by @yohann-benchetrit in https://github.com/pytorch/rl/pull/837
  • [Doc] Fix broken link Dreamer by @atonkamanda in https://github.com/pytorch/rl/pull/853
  • [BugFix] Loading state_dict on uninitialized CatFrames by @vmoens in https://github.com/pytorch/rl/pull/855
  • [Refactor] Move loggers to torchrl.record by @vmoens in https://github.com/pytorch/rl/pull/854
  • [Refactor] specs batch size refactoring by @vmoens in https://github.com/pytorch/rl/pull/829
  • [Feature] Max pool Transform by @albertbou92 in https://github.com/pytorch/rl/pull/841
  • [Feature] Refactor advantages for continuous batches by @vmoens in https://github.com/pytorch/rl/pull/848
  • [BugFix, Doc] Minor fix in doc by @vmoens in https://github.com/pytorch/rl/pull/858
  • [Versioning] Version 0.0.4a by @vmoens in https://github.com/pytorch/rl/pull/859
  • [Feature] Vmas to device by @matteobettini in https://github.com/pytorch/rl/pull/850
  • [BugFix] Fix zero-ing from specs in RewardSum by @vmoens in https://github.com/pytorch/rl/pull/860
  • [Feature] Loading R3M and VIP from ResNet by @vmoens in https://github.com/pytorch/rl/pull/863
  • [Feature] SAC V2 by @vmoens in https://github.com/pytorch/rl/pull/864
  • [BugFix] Avoid collision of "step_count" key from transform and collector by @vmoens in https://github.com/pytorch/rl/pull/868
  • [Refactor] Better init for CatFrames buffers + removing default init values by @vmoens in https://github.com/pytorch/rl/pull/874
  • [Refactor] Minor refactorings to envs by @vmoens in https://github.com/pytorch/rl/pull/872
  • [Refactor] Removing inplace transform attribute by @vmoens in https://github.com/pytorch/rl/pull/871
  • [BugFix] Run checks when creating fake_td by @vmoens in https://github.com/pytorch/rl/pull/877
  • [Refactor] Box device by @vmoens in https://github.com/pytorch/rl/pull/881
  • [Feature] Multithreaded env by @sgrigory in https://github.com/pytorch/rl/pull/734
  • [Refactor] Turn off default advantage normalization in PPO by @vmoens in https://github.com/pytorch/rl/pull/887
  • [CI] Fix habitat-gym imports by @vmoens in https://github.com/pytorch/rl/pull/890
  • [CI] Fix cuda versions by @vmoens in https://github.com/pytorch/rl/pull/889
  • [CI] Fix windows install by @vmoens in https://github.com/pytorch/rl/pull/888
  • MacOS CPU unit test workflow using GitHub Actions by @robandpdx in https://github.com/pytorch/rl/pull/886
  • Linux CPU unit test workflow using GitHub Actions by @robandpdx in https://github.com/pytorch/rl/pull/826
  • [Major, BugFix, Test] Refactor Transforms tests by @vmoens in https://github.com/pytorch/rl/pull/878
  • [Bugfix] Codecov does not cover multiprocessed tests #879 by @kadeng in https://github.com/pytorch/rl/pull/893
  • [CI, BugFix] Fix gym related errors by @vmoens in https://github.com/pytorch/rl/pull/895
  • [WIP] Linux GPU unit test workflow using GitHub Actions by @robandpdx in https://github.com/pytorch/rl/pull/885
  • [BugFix] Compose cloning fix by @vmoens in https://github.com/pytorch/rl/pull/899
  • [Feature] Simplifying collector envs by @vmoens in https://github.com/pytorch/rl/pull/870
  • [CI,Feature] Upgrade to gymnasium by @vmoens in https://github.com/pytorch/rl/pull/898
  • [Doc] Add record utils to doc by @vmoens in https://github.com/pytorch/rl/pull/904
  • [Test] Improve exception message match by @apbard in https://github.com/pytorch/rl/pull/906
  • [BugFix] Dreamer helpers are broken with batched envs by @vmoens in https://github.com/pytorch/rl/pull/903
  • [Feature] RandomCropTensorDict transform by @vmoens in https://github.com/pytorch/rl/pull/908
  • [Versioning] Version 0.0.4b by @vmoens in https://github.com/pytorch/rl/pull/909

New Contributors

  • @sladebot made their first contribution in https://github.com/pytorch/rl/pull/435
  • @rayanht made their first contribution in https://github.com/pytorch/rl/pull/432
  • @brandonsj made their first contribution in https://github.com/pytorch/rl/pull/475
  • @ordinskiy made their first contribution in https://github.com/pytorch/rl/pull/485
  • @JasonMa2016 made their first contribution in https://github.com/pytorch/rl/pull/487
  • @himjohntang made their first contribution in https://github.com/pytorch/rl/pull/477
  • @romainjln made their first contribution in https://github.com/pytorch/rl/pull/512
  • @apbard made their first contribution in https://github.com/pytorch/rl/pull/526
  • @sicong-huang made their first contribution in https://github.com/pytorch/rl/pull/522
  • @psolikov made their first contribution in https://github.com/pytorch/rl/pull/566
  • @jrobine made their first contribution in https://github.com/pytorch/rl/pull/571
  • @nikhlrao made their first contribution in https://github.com/pytorch/rl/pull/573
  • @sgrigory made their first contribution in https://github.com/pytorch/rl/pull/580
  • @jlesuffleur made their first contribution in https://github.com/pytorch/rl/pull/589
  • @artkorenev made their first contribution in https://github.com/pytorch/rl/pull/593
  • @paulomarciano made their first contribution in https://github.com/pytorch/rl/pull/614
  • @hatala91 made their first contribution in https://github.com/pytorch/rl/pull/622
  • @jgonik made their first contribution in https://github.com/pytorch/rl/pull/608
  • @adityagandhamal made their first contribution in https://github.com/pytorch/rl/pull/637
  • @svarolgunes made their first contribution in https://github.com/pytorch/rl/pull/632
  • @adityagoel4512 made their first contribution in https://github.com/pytorch/rl/pull/615
  • @jasonfkut made their first contribution in https://github.com/pytorch/rl/pull/641
  • @sardaankita made their first contribution in https://github.com/pytorch/rl/pull/656
  • @albertbou92 made their first contribution in https://github.com/pytorch/rl/pull/655
  • @yingchenlin made their first contribution in https://github.com/pytorch/rl/pull/674
  • @altre made their first contribution in https://github.com/pytorch/rl/pull/695
  • @nickspell made their first contribution in https://github.com/pytorch/rl/pull/707
  • @aakhundov made their first contribution in https://github.com/pytorch/rl/pull/716
  • @acforvs made their first contribution in https://github.com/pytorch/rl/pull/733
  • @EmGarr made their first contribution in https://github.com/pytorch/rl/pull/745
  • @matteobettini made their first contribution in https://github.com/pytorch/rl/pull/767
  • @riiswa made their first contribution in https://github.com/pytorch/rl/pull/772
  • @yohann-benchetrit made their first contribution in https://github.com/pytorch/rl/pull/759
  • @KamilPiechowiak made their first contribution in https://github.com/pytorch/rl/pull/794
  • @robandpdx made their first contribution in https://github.com/pytorch/rl/pull/831
  • @ShahRutav made their first contribution in https://github.com/pytorch/rl/pull/833
  • @BY571 made their first contribution in https://github.com/pytorch/rl/pull/684
  • @atonkamanda made their first contribution in https://github.com/pytorch/rl/pull/853
  • @kadeng made their first contribution in https://github.com/pytorch/rl/pull/893

Full Changelog: https://github.com/pytorch/rl/compare/v0.0.2a...v0.0.4b

- Python
Published by vmoens over 3 years ago

torchrl - v0.0.4-alpha

What's Changed

  • [CI, Doc] Update functorch source installation command by @zou3519 in https://github.com/pytorch/rl/pull/446
  • [BugFix] TransformedEnv attributes inheritance by @vmoens in https://github.com/pytorch/rl/pull/467
  • [Feature] Cleanup mocking envs init and new by @vmoens in https://github.com/pytorch/rl/pull/469
  • [Tests] Adding tensordict __repr__ tests by @sladebot in https://github.com/pytorch/rl/pull/435
  • [Logging]: implement MLFlow logging integration by @rayanht in https://github.com/pytorch/rl/pull/432
  • [BugFix] MLFlow import fix by @vmoens in https://github.com/pytorch/rl/pull/473
  • [BugFix] Fixed pip install by @brandonsj in https://github.com/pytorch/rl/pull/475
  • [Features]: Changed _inplace_update cls parameter passing in __new__ by @nicolas-dufour in https://github.com/pytorch/rl/pull/464
  • [Feature]: ModelBased Envs by @nicolas-dufour in https://github.com/pytorch/rl/pull/333
  • [Feature] make ReplayBufferTrainer compatible with storing trajectories by @vmoens in https://github.com/pytorch/rl/pull/476
  • [Tutorial] DQN tutorial by @vmoens in https://github.com/pytorch/rl/pull/474
  • [Feature] reader hooks for GymLike by @vmoens in https://github.com/pytorch/rl/pull/478
  • [BugFix] TensorSpec.zero(None) failure fix by @vmoens in https://github.com/pytorch/rl/pull/483
  • [Feature]: Support for planners and CEM by @nicolas-dufour in https://github.com/pytorch/rl/pull/384
  • [Feature] Replaced device_safe() with device by @ordinskiy in https://github.com/pytorch/rl/pull/485
  • [Feature]: TensorDictPrimer transform by @nicolas-dufour in https://github.com/pytorch/rl/pull/456
  • [Feature]: erase() method for torchrl.timeit by @nicolas-dufour in https://github.com/pytorch/rl/pull/480
  • [Feature] Added support for single collector in syncasynccollector by @nicolas-dufour in https://github.com/pytorch/rl/pull/482
  • [BugFix] removing unwanted device_safe() by @vmoens in https://github.com/pytorch/rl/pull/486
  • [Refactoring] Refactored get_stats_random_rollout by @nicolas-dufour in https://github.com/pytorch/rl/pull/481
  • [Feature] VIP Integration by @JasonMa2016 in https://github.com/pytorch/rl/pull/487
  • [Refactoring] Minor tweaks to recorder and logger by @nicolas-dufour in https://github.com/pytorch/rl/pull/489
  • [Feature]: Deactivate typechecks in envs by @nicolas-dufour in https://github.com/pytorch/rl/pull/490
  • [BugFix] Vectorized td_lambda with gamma tensor does not match the serial version by @vmoens in https://github.com/pytorch/rl/pull/400
  • [BugFix] Fix TensorDictPrimer init by @vmoens in https://github.com/pytorch/rl/pull/491
  • [Feature] Optional auto-reset when done for collectors and batched envs by @vmoens in https://github.com/pytorch/rl/pull/492
  • [BugFix] Defaulting passing_devices to None by @himjohntang in https://github.com/pytorch/rl/pull/477
  • Revert "[BugFix] Defaulting passing_devices to None" by @vmoens in https://github.com/pytorch/rl/pull/494
  • [BugFix] Multi-agent fixes by @vmoens in https://github.com/pytorch/rl/pull/488
  • [BugFix] Defaulting passing_devices to None by @vmoens in https://github.com/pytorch/rl/pull/495
  • [Feature] Lazy initialization of CatTensors by @vmoens in https://github.com/pytorch/rl/pull/497
  • [Cleanup] Removing cuda 10.2 references by @vmoens in https://github.com/pytorch/rl/pull/498
  • [BugFix] Migration to pytorch org by @vmoens in https://github.com/pytorch/rl/pull/499
  • [Refactoring] Import at root to enable vmap monkey-patching by @vmoens in https://github.com/pytorch/rl/pull/500
  • [BugFix] python version for linting checks by @vmoens in https://github.com/pytorch/rl/pull/502
  • [Feature] Replay Buffers refactor by @bamaxw in https://github.com/pytorch/rl/pull/330
  • [Feature] Rename step_tensordict to step_mdp by @romainjln in https://github.com/pytorch/rl/pull/512
  • [Lint] re-instantiate F821 by @vmoens in https://github.com/pytorch/rl/pull/516
  • [BugFix] run_type_checks for TransformedEnvs by @vmoens in https://github.com/pytorch/rl/pull/513
  • [BugFix] making first_dim and last_dim negative in FlattenObservation when a parent is set by @vmoens in https://github.com/pytorch/rl/pull/511
  • [Feature] Add info dict key-spec pairs to observation_spec by @tcbegley in https://github.com/pytorch/rl/pull/504
  • [BugFix] Changing the dm_control import to fail if not installed by @zeenolife in https://github.com/pytorch/rl/pull/515
  • [CI] Add coverage with codecov by @silvestrebahi in https://github.com/pytorch/rl/pull/523
  • Revert "[CI] Add coverage with codecov" by @vmoens in https://github.com/pytorch/rl/pull/525
  • [Quality] Use relative imports for local c++ deps by @apbard in https://github.com/pytorch/rl/pull/526
  • [Feature] Nightly release by @vmoens in https://github.com/pytorch/rl/pull/519
  • [Feature] Add make_tensordict() function by @sicong-huang in https://github.com/pytorch/rl/pull/522
  • [Doc] Misc readme fixes by @GavinPHR in https://github.com/pytorch/rl/pull/532
  • [BugFix] Replacing inference_mode decorator with no_grad to fix state_dict loading error by @GavinPHR in https://github.com/pytorch/rl/pull/530
  • [BugFix] Transformed ParallelEnv meta data are broken when passing to device by @vmoens in https://github.com/pytorch/rl/pull/531
  • [Doc] Add coverage banner by @vmoens in https://github.com/pytorch/rl/pull/533
  • [BugFix] Fix colab link of coding_dqn.ipynb by @Benjamin-eecs in https://github.com/pytorch/rl/pull/543
  • [BugFix] Fix optional imports by @vmoens in https://github.com/pytorch/rl/pull/535
  • [BugFix] Restore missing keys in data collector output by @tcbegley in https://github.com/pytorch/rl/pull/521
  • [Lint] reorganize imports by @apbard in https://github.com/pytorch/rl/pull/545
  • [BugFix] Single-cpu compatibility by @vmoens in https://github.com/pytorch/rl/pull/548
  • [BugFix] vision install and other deps in optdeps by @vmoens in https://github.com/pytorch/rl/pull/552
  • [Feature] Implemented device argument for modules.models by @yushiyangk in https://github.com/pytorch/rl/pull/524
  • [BugFix] Fix ellipsis indexing of 2d TensorDicts by @vmoens in https://github.com/pytorch/rl/pull/559
  • [BugFix] Additive gaussian exploration spec fix by @vmoens in https://github.com/pytorch/rl/pull/560
  • [BugFix] Disabling video step for wandb by @vmoens in https://github.com/pytorch/rl/pull/561
  • [BugFix] Various device fix by @vmoens in https://github.com/pytorch/rl/pull/558
  • [Feature] Allow collectors to accept regular modules as policies by @tcbegley in https://github.com/pytorch/rl/pull/546
  • [BugFix] Fix push binary nightly action by @psolikov in https://github.com/pytorch/rl/pull/566
  • [BugFix] TensorDict comparison by @vmoens in https://github.com/pytorch/rl/pull/567
  • [BugFix] Fix SyncDataCollector reset by @jrobine in https://github.com/pytorch/rl/pull/571
  • [Doc] Banners on README.md by @vmoens in https://github.com/pytorch/rl/pull/572
  • [Feature] Log printing in alphabetical order when creating a replay buffer by @nikhlrao in https://github.com/pytorch/rl/pull/573
  • [BugFix] Add eps to reward normalization by @vmoens in https://github.com/pytorch/rl/pull/574
  • [BugFix] Fix argument for PPOLoss.get_entropy_bonus() by @vmoens in https://github.com/pytorch/rl/pull/578
  • [Feature] Restructure torchrl/objectives by @sgrigory in https://github.com/pytorch/rl/pull/580
  • [Docs] Documentation revamp by @vmoens in https://github.com/pytorch/rl/pull/581
  • [Doc] Publishing on pytorch.org by @vmoens in https://github.com/pytorch/rl/pull/582
  • Revert "[Doc] Publishing on pytorch.org" by @vmoens in https://github.com/pytorch/rl/pull/584
  • [Doc] Publishing on pytorch.org by @vmoens in https://github.com/pytorch/rl/pull/585
  • Revert "[Doc] Publishing on pytorch.org" by @vmoens in https://github.com/pytorch/rl/pull/586
  • [Doc] Publishing on pytorch.org by @vmoens in https://github.com/pytorch/rl/pull/587
  • [Feature] More restrictive tests on docstrings by @vmoens in https://github.com/pytorch/rl/pull/457
  • [BugFix] Wrong stack import in tests by @vmoens in https://github.com/pytorch/rl/pull/590
  • [Feature] Exclude "_" out_keys in tensordictmodel by @jlesuffleur in https://github.com/pytorch/rl/pull/589
  • [Feature]: Dreamer support by @nicolas-dufour in https://github.com/pytorch/rl/pull/341
  • [Doc] Missing doc for prototype RB by @vmoens in https://github.com/pytorch/rl/pull/595
  • [Feature] Update list of supported libraries by @vmoens in https://github.com/pytorch/rl/pull/594
  • [BugFix] Fix timeit count registration by @vmoens in https://github.com/pytorch/rl/pull/598
  • [Naming] Renaming ProbabilisticTensorDictModule keys by @vmoens in https://github.com/pytorch/rl/pull/603
  • [Feature] Categorical encoding for action space by @artkorenev in https://github.com/pytorch/rl/pull/593
  • [BugFix] ReplayBuffer's storage now signals back when changes happen by @paulomarciano in https://github.com/pytorch/rl/pull/614
  • [Doc] Typos in tensordict tutorial by @PaLeroy in https://github.com/pytorch/rl/pull/621
  • [Doc] Integrate knowledge base in docs by @hatala91 in https://github.com/pytorch/rl/pull/622
  • [Doc] Updating docs requirements by @vmoens in https://github.com/pytorch/rl/pull/624
  • [Feature] Make torchrl runnable without functorch and with gym==0.13 by @vmoens in https://github.com/pytorch/rl/pull/386
  • [Feature] Habitat integration by @vmoens in https://github.com/pytorch/rl/pull/514
  • [Feature] Checkpointing by @vmoens in https://github.com/pytorch/rl/pull/549
  • Add support for null dim argument in TensorDict.squeeze by @jgonik in https://github.com/pytorch/rl/pull/608
  • [Version] Updating to torch 1.13 by @vmoens in https://github.com/pytorch/rl/pull/627
  • [Feature] Sub-memmap tensors by @vmoens in https://github.com/pytorch/rl/pull/626
  • [BugFix] copy_ changes the index if the dest and source memmap tensors share the same file location by @vmoens in https://github.com/pytorch/rl/pull/631
  • [Feature] Unfold transforms for folded TransformedEnv by @alexanderlobov in https://github.com/pytorch/rl/pull/630
  • [BugFix] make TensorDictReplayBuffer.extend call super().extend with stacked_td by @vmoens in https://github.com/pytorch/rl/pull/634
  • [BugFix] correct the use of step_mdp method in data collector by @adityagandhamal in https://github.com/pytorch/rl/pull/637
  • [Feature] Added implement_for decorator by @ordinskiy in https://github.com/pytorch/rl/pull/618
  • [Feature] Make DQN compatible with nn.Module by @svarolgunes in https://github.com/pytorch/rl/pull/632
  • [Example] Distributed Replay Buffer Prototype Example Implementation by @adityagoel4512 in https://github.com/pytorch/rl/pull/615
  • [Feature] Benchmark storage types by @adityagoel4512 in https://github.com/pytorch/rl/pull/633
  • [Feature] Remove wild imports in the library by @sosmond in https://github.com/pytorch/rl/pull/642
  • [BugFix] Prevent transform parent from being reassigned by @jasonfkut in https://github.com/pytorch/rl/pull/641
  • [Feature] Too many deepcopy in transforms.py by @romainjln in https://github.com/pytorch/rl/pull/625
  • [Naming] Rename keys_in to in_keys in transforms.py and related modules by @sardaankita in https://github.com/pytorch/rl/pull/656
  • [Refactoring] Refactor dreamer helper in smaller pieces by @vmoens in https://github.com/pytorch/rl/pull/662
  • [Feature] VIPRewardTransform by @vmoens in https://github.com/pytorch/rl/pull/658
  • [BugFix] make_trainer possible bug for on-policy cases by @albertbou92 in https://github.com/pytorch/rl/pull/655
  • [Naming] Fixing key names by @vmoens in https://github.com/pytorch/rl/pull/668
  • [Test] Check dtypes of envs by @vmoens in https://github.com/pytorch/rl/pull/666
  • [Refactor] Relying on the standalone tensordict -- phase 1 by @vmoens in https://github.com/pytorch/rl/pull/650
  • [Doc] More doc on trainers by @vmoens in https://github.com/pytorch/rl/pull/663
  • [BugFix] PPO example GAE import by @albertbou92 in https://github.com/pytorch/rl/pull/671
  • [BugFix] Use GitHub for flake8 pre-commit hook by @vmoens in https://github.com/pytorch/rl/pull/679
  • [BugFix] Update to strict select by @vmoens in https://github.com/pytorch/rl/pull/675
  • [Feature] Auto-compute stats for ObservationNorm by @romainjln in https://github.com/pytorch/rl/pull/669
  • [Doc] makecollector helper function by @albertbou92 in https://github.com/pytorch/rl/pull/678
  • [Doc] BatchSubSampler class docstrings example by @albertbou92 in https://github.com/pytorch/rl/pull/677
  • [BugFix] PPO objective crashes if advantage_module is None by @albertbou92 in https://github.com/pytorch/rl/pull/676
  • [Refactor] Refactor 'next_' into nested tensordicts by @vmoens in https://github.com/pytorch/rl/pull/649
  • [Doc] More doc about environments by @vmoens in https://github.com/pytorch/rl/pull/683
  • [Doc] Fix missing tensordict install for doc by @vmoens in https://github.com/pytorch/rl/pull/685
  • [CI] Added CircleCI pipeline to test compatibility across supported gym versions by @ordinskiy in https://github.com/pytorch/rl/pull/645
  • [BugFix] ConvNet forward method with tensors of more than 4 dimensions by @albertbou92 in https://github.com/pytorch/rl/pull/686
  • [Feature] add standard_normal for RewardScaling by @adityagandhamal in https://github.com/pytorch/rl/pull/682
  • [Feature] Jumanji envs by @yingchenlin in https://github.com/pytorch/rl/pull/674
  • [Feature] Default collate_fn by @vmoens in https://github.com/pytorch/rl/pull/688
  • [BugFix] Fix Examples by @vmoens in https://github.com/pytorch/rl/pull/687
  • [Refactoring] Replace direct gym version checks with decorated functions (#) by @ordinskiy in https://github.com/pytorch/rl/pull/691
  • Version 0.0.3 by @vmoens in https://github.com/pytorch/rl/pull/696
  • [Docs] Host TensorDict docs inside TorchRL docs by @tcbegley in https://github.com/pytorch/rl/pull/693
  • [BugFix] Fix docs build by @tcbegley in https://github.com/pytorch/rl/pull/698
  • [BugFix] Proper error messages for orphan transform creation by @vmoens in https://github.com/pytorch/rl/pull/697
  • [Feature] Append, init and insert transforms in ReplayBuffer by @altre in https://github.com/pytorch/rl/pull/695
  • [Feature] A2C objective class and train example by @albertbou92 in https://github.com/pytorch/rl/pull/680
  • [Doc, Test] Add A2C script test and doc by @vmoens in https://github.com/pytorch/rl/pull/702
  • [BugFix] Initialising the classes LazyTensorStorage with a nested TensorDict raises error by @albertbou92 in https://github.com/pytorch/rl/pull/703
  • [BugFix] Fix init_random_frames in A2C example test by @vmoens in https://github.com/pytorch/rl/pull/706
  • [Formatting] Upgrade formatting libs by @vmoens in https://github.com/pytorch/rl/pull/705
  • [Doc] Document undefined symbol error with torch version < 1.13 by @nickspell in https://github.com/pytorch/rl/pull/707
  • [Doc] Tuto integration by @vmoens in https://github.com/pytorch/rl/pull/681
  • [Quality] Deprecate .ipynb tutos by @vmoens in https://github.com/pytorch/rl/pull/710
  • [Test] Fix wrong skip message when functorch is installed by @vmoens in https://github.com/pytorch/rl/pull/711
  • [BugFix, Doc] Clone TensorDict docs into localbuild by @tcbegley in https://github.com/pytorch/rl/pull/712
  • [Feature] Migrate to tensordict.nn.TensorDictModule by @tcbegley in https://github.com/pytorch/rl/pull/700
  • [Doc] Fix Tutos TODOs by @vmoens in https://github.com/pytorch/rl/pull/713
  • [BugFix] RoundRobinWriter, possible duplicated code in the extend method by @albertbou92 in https://github.com/pytorch/rl/pull/709
  • [Feature] Add OptimizerHook by @aakhundov in https://github.com/pytorch/rl/pull/716
  • [Feature] Support for in-place functionalization by @tcbegley in https://github.com/pytorch/rl/pull/714
  • [BugFix] Fix TorchRL demo tutorial by @vmoens in https://github.com/pytorch/rl/pull/721
  • [Docs] Update tutorial links in readme by @tcbegley in https://github.com/pytorch/rl/pull/724
  • [Feature] Extend PPO loss helper to allow for more customisation by @albertbou92 in https://github.com/pytorch/rl/pull/718
  • [BugFix] Model maker functions for A2C and PPO fail for discrete action space envs by @albertbou92 in https://github.com/pytorch/rl/pull/717
  • [Minor] docstrings and setup fixes by @vmoens in https://github.com/pytorch/rl/pull/726
  • [BugFix] Avoid wrongfully erasing observation keys from specs in CatTensors by @vmoens in https://github.com/pytorch/rl/pull/727
  • [BugFix] Avoid wrongfully erasing observation keys from tensordict in CatTensors by @vmoens in https://github.com/pytorch/rl/pull/729
  • [Doc] More doc for data collectors by @vmoens in https://github.com/pytorch/rl/pull/732
  • [Feature] Port test_fake_tensordict to torchrl by @vmoens in https://github.com/pytorch/rl/pull/731
  • [Feature] Use ObservationNorm.init_stats for stats computation in example scripts by @romainjln in https://github.com/pytorch/rl/pull/715
  • [BugFix] init_stats over multiple dimensions by @vmoens in https://github.com/pytorch/rl/pull/735
  • [Refactor] logger creation in examples by @acforvs in https://github.com/pytorch/rl/pull/733
  • [Feature] Brax envs by @yingchenlin in https://github.com/pytorch/rl/pull/722
  • [Refactor] Adopt prototype ProbabilisticTensorDictModule and ProbabilisticTensorDictSequential by @tcbegley in https://github.com/pytorch/rl/pull/728
  • [Doc] Link to doc in README by @vmoens in https://github.com/pytorch/rl/pull/740
  • [Feature] Make GAE return a 'value_target' entry by @vmoens in https://github.com/pytorch/rl/pull/741
  • [Feature] SamplerWithoutReplacement by @vmoens in https://github.com/pytorch/rl/pull/742
  • [Doc, CI] Update doc workflow to run on PR and only publish doc on main by @EmGarr in https://github.com/pytorch/rl/pull/745
  • [Feature] Better advantage API for higher order derivatives by @vmoens in https://github.com/pytorch/rl/pull/744
  • [Refactor] Cosmetic improvements to advantage modules by @vmoens in https://github.com/pytorch/rl/pull/746
  • [BugFix] Fix NoopReset in parallel settings by @vmoens in https://github.com/pytorch/rl/pull/747
  • [Refactor] Remove env.is_done attribute by @vmoens in https://github.com/pytorch/rl/pull/748
  • [Refactor] Drop prototype imports by @tcbegley in https://github.com/pytorch/rl/pull/738
  • [BugFix] Fixes for speed branch merge on tensordict by @vmoens in https://github.com/pytorch/rl/pull/755
  • [BugFix] Fix size-match unsqueeze deprecation by @vmoens in https://github.com/pytorch/rl/pull/750
  • [Feature] FrameSkipTransform by @vmoens in https://github.com/pytorch/rl/pull/749
  • [BugFix] Better memory management for collectors by @vmoens in https://github.com/pytorch/rl/pull/763
  • Minor cleaning in BaseEnv classes by @matteobettini in https://github.com/pytorch/rl/pull/767
  • Revert "Minor cleaning in BaseEnv classes" by @vmoens in https://github.com/pytorch/rl/pull/768
  • Cleaning in envs common.py by @matteobettini in https://github.com/pytorch/rl/pull/769
  • Making _set_seed abstract by @matteobettini in https://github.com/pytorch/rl/pull/770
  • [Feature] Remove the Nd*TensorSpec classes by @riiswa in https://github.com/pytorch/rl/pull/772
  • [BugFix] Reinstantiate custom value key for multioutput value networks by @vmoens in https://github.com/pytorch/rl/pull/754
  • [Feature] Add Step Counter transform by @riiswa in https://github.com/pytorch/rl/pull/756
  • [BugFix] Batched environments with non empty batch size by @matteobettini in https://github.com/pytorch/rl/pull/774
  • Allow unbounded boxes creation from gym spaces by @matteobettini in https://github.com/pytorch/rl/pull/778
  • [BugFix] Doc built cmake error by @vmoens in https://github.com/pytorch/rl/pull/780
  • [Feature] Lazy TensorClass storage by @tcbegley in https://github.com/pytorch/rl/pull/752
  • [BugFix] SyncDataCollector init when device and env_device are different by @albertbou92 in https://github.com/pytorch/rl/pull/765
  • [Feature] RewardSum transform by @albertbou92 in https://github.com/pytorch/rl/pull/751
  • [BugFix] Fix PPO clip by @vmoens in https://github.com/pytorch/rl/pull/786
  • [Feature] MultiDiscreteTensorSpec by @riiswa in https://github.com/pytorch/rl/pull/783
  • [Doc] Doc revamp by @vmoens in https://github.com/pytorch/rl/pull/782
  • [BugFix] ParallelEnv handling of done flag by @matteobettini in https://github.com/pytorch/rl/pull/788
  • [BugFix] Sorting nested keys by @matteobettini in https://github.com/pytorch/rl/pull/787
  • [Doc] README index by @vmoens in https://github.com/pytorch/rl/pull/791
  • Add windows wheel build to CircleCI by @yohann-benchetrit in https://github.com/pytorch/rl/pull/759
  • [Algorithm] MPPI planner by @vmoens in https://github.com/pytorch/rl/pull/701
  • [Doc] Better doc links by @vmoens in https://github.com/pytorch/rl/pull/795
  • [Doc] Missing headers by @vmoens in https://github.com/pytorch/rl/pull/796
  • [Doc] Knowledge base section by @vmoens in https://github.com/pytorch/rl/pull/797
  • [Feature] Vmas library wrapper by @matteobettini in https://github.com/pytorch/rl/pull/785
  • [Doc] Duplicate HabitatEnv entry in docs by @matteobettini in https://github.com/pytorch/rl/pull/798
  • [Feature] MultiDiscreteTensorSpec nvec with several axes by @riiswa in https://github.com/pytorch/rl/pull/789
  • [Refactor] Graduate Replay Buffer prototype by @KamilPiechowiak in https://github.com/pytorch/rl/pull/794
  • [BugFix] Solve R3MTransform init problem by @vmoens in https://github.com/pytorch/rl/pull/803
  • [Refactor] Simplify FlattenObservation default kwargs by @vmoens in https://github.com/pytorch/rl/pull/805
  • [Format] Fix lint by @vmoens in https://github.com/pytorch/rl/pull/811
  • [Doc, BugFix] Fix tutos errors by @vmoens in https://github.com/pytorch/rl/pull/817
  • [Doc] Pretrained models tutorial by @vmoens in https://github.com/pytorch/rl/pull/814
  • [Doc, BugFix] Fix tensordictmodule tutorial by @vmoens in https://github.com/pytorch/rl/pull/819
  • [BugFix] Fix MultOneHotDiscreteTensorSpec.is_in by @riiswa in https://github.com/pytorch/rl/pull/818
  • [Doc] Using R3M with a replay buffer by @vmoens in https://github.com/pytorch/rl/pull/820
  • [CodeQuality] call all() without making a list by @riiswa in https://github.com/pytorch/rl/pull/821
  • [BugFix] [Feature] "_reset" flag for env reset by @matteobettini in https://github.com/pytorch/rl/pull/800
  • [CI] Add unit test workflows for Windows by @yohann-benchetrit in https://github.com/pytorch/rl/pull/804
  • [BugFix] Fix habitat integration and doc by @vmoens in https://github.com/pytorch/rl/pull/812
  • [Minor] Better error reporting by @vmoens in https://github.com/pytorch/rl/pull/822
  • [Minor] Add ninja to deps in toml file by @vmoens in https://github.com/pytorch/rl/pull/823
  • [BugFix] Device of info specs by @vmoens in https://github.com/pytorch/rl/pull/824
  • [BugFix] Fix envs specs and info reading by @vmoens in https://github.com/pytorch/rl/pull/825
  • [Feature] Dtype in vmas tests by @matteobettini in https://github.com/pytorch/rl/pull/827
  • [BugFix] Fix R3M observation spec transform by @vmoens in https://github.com/pytorch/rl/pull/830
  • small change to make @robandpdx a contributor by @robandpdx in https://github.com/pytorch/rl/pull/831
  • [Feature] Exclude and select transforms by @vmoens in https://github.com/pytorch/rl/pull/832
  • [BugFix] Updating Recorder to accommodate "solved" key by @ShahRutav in https://github.com/pytorch/rl/pull/833
  • [BugFix] Changed "set_count" set in collectors by @matteobettini in https://github.com/pytorch/rl/pull/835
  • [Algorithm] Td3 by @BY571 in https://github.com/pytorch/rl/pull/684
  • [Doc] A Succinct Summary of Reinforcement Learning by @vmoens in https://github.com/pytorch/rl/pull/840
  • [Feature, BugFix] ObservationNorm keep_dims and RewardSum init by @vmoens in https://github.com/pytorch/rl/pull/839
  • [BugFix] Improve done checking of collectors by @matteobettini in https://github.com/pytorch/rl/pull/838
  • [BugFix] Sync with tensordict (meta-tensor deprecation) by @vmoens in https://github.com/pytorch/rl/pull/842
  • [Feature] Refactor CatFrames using a proper preallocated buffer by @vmoens in https://github.com/pytorch/rl/pull/847
  • [CI] Add Github-Actions workflows for Windows wheels & nightly-build by @yohann-benchetrit in https://github.com/pytorch/rl/pull/837
  • [Doc] Fix broken link Dreamer by @atonkamanda in https://github.com/pytorch/rl/pull/853
  • [BugFix] Loading state_dict on uninitialized CatFrames by @vmoens in https://github.com/pytorch/rl/pull/855
  • [Refactor] Move loggers to torchrl.record by @vmoens in https://github.com/pytorch/rl/pull/854
  • [Refactor] specs batch size refactoring by @vmoens in https://github.com/pytorch/rl/pull/829
  • [Feature] Max pool Transform by @albertbou92 in https://github.com/pytorch/rl/pull/841
  • [Feature] Refactor advantages for continuous batches by @vmoens in https://github.com/pytorch/rl/pull/848
  • [BugFix, Doc] Minor fix in doc by @vmoens in https://github.com/pytorch/rl/pull/858

New Contributors

  • @sladebot made their first contribution in https://github.com/pytorch/rl/pull/435
  • @rayanht made their first contribution in https://github.com/pytorch/rl/pull/432
  • @brandonsj made their first contribution in https://github.com/pytorch/rl/pull/475
  • @ordinskiy made their first contribution in https://github.com/pytorch/rl/pull/485
  • @JasonMa2016 made their first contribution in https://github.com/pytorch/rl/pull/487
  • @himjohntang made their first contribution in https://github.com/pytorch/rl/pull/477
  • @romainjln made their first contribution in https://github.com/pytorch/rl/pull/512
  • @apbard made their first contribution in https://github.com/pytorch/rl/pull/526
  • @sicong-huang made their first contribution in https://github.com/pytorch/rl/pull/522
  • @psolikov made their first contribution in https://github.com/pytorch/rl/pull/566
  • @jrobine made their first contribution in https://github.com/pytorch/rl/pull/571
  • @nikhlrao made their first contribution in https://github.com/pytorch/rl/pull/573
  • @sgrigory made their first contribution in https://github.com/pytorch/rl/pull/580
  • @jlesuffleur made their first contribution in https://github.com/pytorch/rl/pull/589
  • @artkorenev made their first contribution in https://github.com/pytorch/rl/pull/593
  • @paulomarciano made their first contribution in https://github.com/pytorch/rl/pull/614
  • @hatala91 made their first contribution in https://github.com/pytorch/rl/pull/622
  • @jgonik made their first contribution in https://github.com/pytorch/rl/pull/608
  • @adityagandhamal made their first contribution in https://github.com/pytorch/rl/pull/637
  • @svarolgunes made their first contribution in https://github.com/pytorch/rl/pull/632
  • @adityagoel4512 made their first contribution in https://github.com/pytorch/rl/pull/615
  • @jasonfkut made their first contribution in https://github.com/pytorch/rl/pull/641
  • @sardaankita made their first contribution in https://github.com/pytorch/rl/pull/656
  • @albertbou92 made their first contribution in https://github.com/pytorch/rl/pull/655
  • @yingchenlin made their first contribution in https://github.com/pytorch/rl/pull/674
  • @altre made their first contribution in https://github.com/pytorch/rl/pull/695
  • @nickspell made their first contribution in https://github.com/pytorch/rl/pull/707
  • @aakhundov made their first contribution in https://github.com/pytorch/rl/pull/716
  • @acforvs made their first contribution in https://github.com/pytorch/rl/pull/733
  • @EmGarr made their first contribution in https://github.com/pytorch/rl/pull/745
  • @matteobettini made their first contribution in https://github.com/pytorch/rl/pull/767
  • @riiswa made their first contribution in https://github.com/pytorch/rl/pull/772
  • @yohann-benchetrit made their first contribution in https://github.com/pytorch/rl/pull/759
  • @KamilPiechowiak made their first contribution in https://github.com/pytorch/rl/pull/794
  • @robandpdx made their first contribution in https://github.com/pytorch/rl/pull/831
  • @ShahRutav made their first contribution in https://github.com/pytorch/rl/pull/833
  • @BY571 made their first contribution in https://github.com/pytorch/rl/pull/684
  • @atonkamanda made their first contribution in https://github.com/pytorch/rl/pull/853

Full Changelog: https://github.com/pytorch/rl/compare/v0.0.2a...0.0.4a

- Python
Published by vmoens over 3 years ago

torchrl - v0.0.3

The main changes introduced by this release are:

  • Dependency on the standalone tensordict repo
  • Refactoring of the "next" API

What's Changed

  • [Versioning] MacOs versioning and release bugfix by @vmoens in https://github.com/pytorch/rl/pull/247
  • [Versioning] Setup metadata by @vmoens in https://github.com/pytorch/rl/pull/248
  • [BugFix] Fix setup instructions by @vmoens in https://github.com/pytorch/rl/pull/250
  • [BugFix] Fix a bug when segment_tree size is exactly 2^N by @xiaomengy in https://github.com/pytorch/rl/pull/251
  • [Feature] Added test for RewardRescale transform by @nicolas-dufour in https://github.com/pytorch/rl/pull/252
  • [Feature] Empty TensorDict population in loops by @vmoens in https://github.com/pytorch/rl/pull/253
  • [BugFix] Memmap del bugfix by @vmoens in https://github.com/pytorch/rl/pull/254
  • [Feature] Implement padding for tensordicts by @ajhinsvark in https://github.com/pytorch/rl/pull/257
  • [BugFix]: recursion error when calling permute(...).to_tensordict() by @vmoens in https://github.com/pytorch/rl/pull/260
  • [Feature] Differentiable PPOLoss for IRL by @vmoens in https://github.com/pytorch/rl/pull/240
  • [BugFix]: avoid deleting true in_keys in TensorDictSequence by @vmoens in https://github.com/pytorch/rl/pull/261
  • [Feature] Add issue and pull request template by @Benjamin-eecs in https://github.com/pytorch/rl/pull/263
  • [Feature] Nested tensordicts by @vmoens in https://github.com/pytorch/rl/pull/256
  • [Feature]: Index nested tensordicts using tuples by @vmoens in https://github.com/pytorch/rl/pull/262
  • [Feature]: flatten nested tensordicts by @vmoens in https://github.com/pytorch/rl/pull/264
  • [Test]: test nested CompositeSpec by @vmoens in https://github.com/pytorch/rl/pull/265
  • [Test]: test squeezed TensorDict by @vmoens in https://github.com/pytorch/rl/pull/269
  • [Doc] Added TensorDict tutorial by @nicolas-dufour in https://github.com/pytorch/rl/pull/255
  • [Test]: TensorDict: test tensordict created on cuda and sub-tensordict indexed along 2nd dimension by @vmoens in https://github.com/pytorch/rl/pull/268
  • Refactor the torch.stack with destination by @khmigor in https://github.com/pytorch/rl/pull/245
  • [Feature]: faster meta-tensor API for TensorDict by @vmoens in https://github.com/pytorch/rl/pull/272
  • [Feature]: Refactored logging to be able to support other loggers easily by @nicolas-dufour in https://github.com/pytorch/rl/pull/270
  • Small tweaks to make the replay buffer code more consistent by @shagunsodhani in https://github.com/pytorch/rl/pull/275
  • [BugFix]: Minor bugs in docstrings by @vmoens in https://github.com/pytorch/rl/pull/276
  • [Doc]: TorchRL demo by @vmoens in https://github.com/pytorch/rl/pull/284
  • [BugFix]: update wrong links in issue and pull request template by @Benjamin-eecs in https://github.com/pytorch/rl/pull/286
  • [BugFix]: quickfix: force gym 0.24 installation until issue with rendering is resolved by @vmoens in https://github.com/pytorch/rl/pull/283
  • [Doc]: remove pip install from CONTRIBUTING.md by @vmoens in https://github.com/pytorch/rl/pull/288
  • [Feature]: faster safetanh transform via C++ bindings by @vmoens in https://github.com/pytorch/rl/pull/289
  • [BugFix]: fix GLFW3 error when installing dm_control by @vmoens in https://github.com/pytorch/rl/pull/291
  • [BugFix]: Fix examples by @vmoens in https://github.com/pytorch/rl/pull/290
  • [Doc] Simplify PR template by @vmoens in https://github.com/pytorch/rl/pull/292
  • [BugFix]: Replay buffer bugfixes by @vmoens in https://github.com/pytorch/rl/pull/294
  • [Doc] MacOs M1 troubleshooting by @ramonmedel in https://github.com/pytorch/rl/pull/296
  • [Feature]: Improving training efficiency by @vmoens in https://github.com/pytorch/rl/pull/293
  • [Feature] Wandb logger by @nicolas-dufour in https://github.com/pytorch/rl/pull/274
  • [QuickFix]: update issue and pr template by @Benjamin-eecs in https://github.com/pytorch/rl/pull/303
  • [Test] tests for BinarizeReward by @srikanthmg85 in https://github.com/pytorch/rl/pull/302
  • [BugFix]: L2-priority for PRB by @vmoens in https://github.com/pytorch/rl/pull/305
  • [Feature] Transforms: Compose.insert and TransformedEnv.insert_transform by @rmartimov in https://github.com/pytorch/rl/pull/304
  • [BugFix] Fix flaky test by waiting for procs instead of sleep by @nairbv in https://github.com/pytorch/rl/pull/306
  • [BugFix] Fix a build warning, setuptools/distutils import order by @nairbv in https://github.com/pytorch/rl/pull/307
  • ufmt issue if imports in order requested by distutils by @nairbv in https://github.com/pytorch/rl/pull/308
  • [BugFix]: Conda to pip for circleci by @vmoens in https://github.com/pytorch/rl/pull/310
  • [BugFix] Support list-based boolean masks for TensorDict by @benoitdescamps in https://github.com/pytorch/rl/pull/299
  • [Feature] Truly invertible tensordict permutation of dimensions by @ramonmedel in https://github.com/pytorch/rl/pull/295
  • [Doc] Tensordictmodule tutorial by @nicolas-dufour in https://github.com/pytorch/rl/pull/267
  • [Feature] Rename _TensorDict into TensorDictBase by @yoavnavon in https://github.com/pytorch/rl/pull/316
  • [Release]: v0.0.1b versioning by @vmoens in https://github.com/pytorch/rl/pull/317
  • [Feature] Adding additional checks to TensorDict.view to remove unnecessary ViewedTensorDict object creation by @bamaxw in https://github.com/pytorch/rl/pull/319
  • [BugFix]: Safe state normalization when std=0 by @vmoens in https://github.com/pytorch/rl/pull/323
  • [BugFix]: gradient propagation in advantage estimates by @vmoens in https://github.com/pytorch/rl/pull/322
  • [BugFix]: make training example gracefully exit by @vmoens in https://github.com/pytorch/rl/pull/326
  • [Setup]: Exclude tutorials from wheels by @vmoens in https://github.com/pytorch/rl/pull/325
  • [BugFix]: Tensor map for subtensordict.set_ by @vmoens in https://github.com/pytorch/rl/pull/324
  • [Versioning]: Wheels v0.0.1c by @vmoens in https://github.com/pytorch/rl/pull/327
  • [BugFix] Fixed compose which ignored inv_transforms of child by @nicolas-dufour in https://github.com/pytorch/rl/pull/328
  • [BugFix] functorch installation in CircleCI by @vmoens in https://github.com/pytorch/rl/pull/336
  • [Refactor] VecNorm inference API by @vmoens in https://github.com/pytorch/rl/pull/337
  • [BugFix] TransformedEnv sets added Transforms into eval mode by @alexanderlobov in https://github.com/pytorch/rl/pull/331
  • [Refactor] make to_tensordict() create a copy of the content by @nicolas-dufour in https://github.com/pytorch/rl/pull/334
  • [CircleCI] Fix dm_control rendering by @vmoens in https://github.com/pytorch/rl/pull/339
  • [BugFix]: joining processes when they're done by @vmoens in https://github.com/pytorch/rl/pull/311
  • [Test] pass the OS error in case the file isn't closed by @tongbaojia in https://github.com/pytorch/rl/pull/344
  • [Feature] Make default rollout tensordict contiguous by @vmoens in https://github.com/pytorch/rl/pull/343
  • [BugFix] Clone memmap tensors on regular tensors and other replay buffer improvements by @vmoens in https://github.com/pytorch/rl/pull/340
  • [CI] Using latest gym by @vmoens in https://github.com/pytorch/rl/pull/346
  • [Doc] Coding your first DDPG tutorial by @vmoens in https://github.com/pytorch/rl/pull/345
  • [Doc] Minor: typos in DDPG by @vmoens in https://github.com/pytorch/rl/pull/354
  • [Feature] Register lambda and gamma in buffers by @vmoens in https://github.com/pytorch/rl/pull/353
  • [Feature] Implement eq for TensorSpec by @omikad in https://github.com/pytorch/rl/pull/358
  • [Doc] Multi-tasking tutorial by @vmoens in https://github.com/pytorch/rl/pull/352
  • [Feature] Env refactoring for model based RL by @nicolas-dufour in https://github.com/pytorch/rl/pull/315
  • [Feature]: Added support for TensorDictSequence module subsampling by @nicolas-dufour in https://github.com/pytorch/rl/pull/332
  • [BugFix] Add lock to vec norm transform by @jaschmid-fb in https://github.com/pytorch/rl/pull/356
  • [Perf]: Improve PPO training performance by @vmoens in https://github.com/pytorch/rl/pull/297
  • [BugFix] Functorch-Tensordict bug fixes by @vmoens in https://github.com/pytorch/rl/pull/361
  • Revert "[BugFix] Functorch-Tensordict bug fixes" by @vmoens in https://github.com/pytorch/rl/pull/362
  • [BugFix] Functorch-Tensordict bug fixes by @vmoens in https://github.com/pytorch/rl/pull/363
  • [Feature] CSVLogger (ABANDONED) by @vmoens in https://github.com/pytorch/rl/pull/371
  • [Feature] Support tensor-based decay in TD-lambda by @tcbegley in https://github.com/pytorch/rl/pull/360
  • [Feature] CSVLogger by @vmoens in https://github.com/pytorch/rl/pull/372
  • [BugFix] Fewer env instantiations for better mujoco rendering by @vmoens in https://github.com/pytorch/rl/pull/378
  • [Feature] change imports of environment libraries (gym and dm_control) at lower levels by @guabao in https://github.com/pytorch/rl/pull/379
  • [BugFix] Representation of indexed nested tensordict by @vmoens in https://github.com/pytorch/rl/pull/370
  • [BugFix] In-place __setitem__ for SubTensorDict by @vmoens in https://github.com/pytorch/rl/pull/369
  • [Feature] Add ProbabilisticTensorDictModule dist key mapping support by @nicolas-dufour in https://github.com/pytorch/rl/pull/376
  • [Feature]: R3M integration by @vmoens in https://github.com/pytorch/rl/pull/321
  • [Feature] static_seed flag for envs, vectorized envs and collectors by @vmoens in https://github.com/pytorch/rl/pull/385
  • [Feature] AdditiveGaussian exploration strategy by @vmoens in https://github.com/pytorch/rl/pull/388
  • [Feature] Multi-images R3M by @vmoens in https://github.com/pytorch/rl/pull/389
  • [Feature] Flatten multi-images in R3M by @vmoens in https://github.com/pytorch/rl/pull/391
  • [Quality] Code cleanup for fbsync by @vmoens in https://github.com/pytorch/rl/pull/392
  • [Feature] In-house functional modules for TorchRL using TensorDict by @vmoens in https://github.com/pytorch/rl/pull/387
  • [Quality] Code cleanup for fbsync by @vmoens in https://github.com/pytorch/rl/pull/397
  • [Doc] Add charts to examples by @nicolas-dufour in https://github.com/pytorch/rl/pull/374
  • [Feature] Vectorized GAE by @vmoens in https://github.com/pytorch/rl/pull/365
  • [BugFix] Temporarily fix gym to 0.25.1 to fix CI by @vmoens in https://github.com/pytorch/rl/pull/411
  • [Feature] Create a Squeeze transform and update Unsqueeze transform by @reachsumit in https://github.com/pytorch/rl/pull/408
  • [Naming] Recurse kwarg to match pytorch by @matt-fff in https://github.com/pytorch/rl/pull/410
  • [Feature] Add all implemented loggers to the init of loggers by @flinder in https://github.com/pytorch/rl/pull/402
  • [BugFix] Fix gym 0.26 compatibility by @vmoens in https://github.com/pytorch/rl/pull/403
  • [BugFix] Remove submodules by @vmoens in https://github.com/pytorch/rl/pull/414
  • [Feature] lock tensordict when calling share_memory_() by @fdabek1 in https://github.com/pytorch/rl/pull/412
  • [BugFix] Updated TensorDict.expand to work as Tensor.expand by @AnshulSehgal in https://github.com/pytorch/rl/pull/409
  • [BugFix] Looser check for test_recorder assertion by @vmoens in https://github.com/pytorch/rl/pull/415
  • [Feature] Allow spec to be passed directly to exploration wrappers by @vmoens in https://github.com/pytorch/rl/pull/418
  • [BugFix] Collector revert to default exploration mode if empty string is passed by @vmoens in https://github.com/pytorch/rl/pull/421
  • [Naming] Rename _TargetNetUpdate to TargetNetUpdater, making it public by @yushiyangk in https://github.com/pytorch/rl/pull/422
  • [Doc] Re-run tutorials by @vmoens in https://github.com/pytorch/rl/pull/381
  • Revert "[Doc] Re-run tutorials" (colab links broken) by @vmoens in https://github.com/pytorch/rl/pull/423
  • [Feature] Switch back to latest gym by @vmoens in https://github.com/pytorch/rl/pull/425
  • [Feature] TensorDict without device by @tcbegley in https://github.com/pytorch/rl/pull/413
  • Updated the README.md file by @bashnick in https://github.com/pytorch/rl/pull/427
  • [Feature] Adding support for initialising TensorDicts from nested dicts by @zeenolife in https://github.com/pytorch/rl/pull/404
  • [Features] Make image_size a cfg param by @nicolas-dufour in https://github.com/pytorch/rl/pull/430
  • Make TensorDict.expand accept Sequence arguments by @nicolasgriffiths in https://github.com/pytorch/rl/pull/424
  • [Doc] Readme revamp for efficiency/modularity display by @vmoens in https://github.com/pytorch/rl/pull/382
  • [Feature] New biased_softplus semantic to allow for minimum scale setting by @nicolas-dufour in https://github.com/pytorch/rl/pull/428
  • [Tutorial] Re-run tutos by @vmoens in https://github.com/pytorch/rl/pull/434
  • [BugFix] mixed device_safe vs device by @vmoens in https://github.com/pytorch/rl/pull/429
  • [BugFix] Explicit params and buffers by @agrotov in https://github.com/pytorch/rl/pull/436
  • [BugFix] Fixed Additive noise by @nicolas-dufour in https://github.com/pytorch/rl/pull/441
  • [Tests] Test loggers video saving by @bashnick in https://github.com/pytorch/rl/pull/439
  • Revert "[BugFix] Fixed Additive noise" by @vmoens in https://github.com/pytorch/rl/pull/442
  • [Refactor] Rename TensorDictSequence to TensorDictSequential by @ronert in https://github.com/pytorch/rl/pull/440
  • [Refactor] Refactoring set*() methods for TensorDictBase class by @zeenolife in https://github.com/pytorch/rl/pull/438
  • [Cleanup] Removing gym-retro interface by @vmoens in https://github.com/pytorch/rl/pull/444
  • [BugFix]: Fix additive noise by @nicolas-dufour in https://github.com/pytorch/rl/pull/447
  • [BugFix] CatTensors: Prepended next_ to the out_key by @ggimler3 in https://github.com/pytorch/rl/pull/449
  • [BugFix] Fix AdditiveGaussian exploration tests by @vmoens in https://github.com/pytorch/rl/pull/450
  • [BugFix] Wrong call to device_safe in replay buffer code by @vmoens in https://github.com/pytorch/rl/pull/454
  • [BugFix] Add transform_observation_spec _R3MNet by @ymwdalex in https://github.com/pytorch/rl/pull/443
  • [Doc] Add a knowledge base by @shagunsodhani in https://github.com/pytorch/rl/pull/375
  • [Feature] Allow for actions and rewards to be in the reset tensordict by @vmoens in https://github.com/pytorch/rl/pull/458
  • [Doc] Readme for knowledge base by @vmoens in https://github.com/pytorch/rl/pull/459
  • [Feature] Added batch_lock attribute in EnvBase by @nicolas-dufour in https://github.com/pytorch/rl/pull/399
  • [BugFix] deepcopy specs before transforming by @vmoens in https://github.com/pytorch/rl/pull/461
  • [BugFix]: Fixed dm_control action type casting by @nicolas-dufour in https://github.com/pytorch/rl/pull/463
  • [Versioning] Version 0.0.2a0 by @vmoens in https://github.com/pytorch/rl/pull/465
  • [CI, Doc] Update functorch source installation command by @zou3519 in https://github.com/pytorch/rl/pull/446
  • [BugFix] TransformedEnv attributes inheritance by @vmoens in https://github.com/pytorch/rl/pull/467
  • [Feature] Cleanup mocking envs init and new by @vmoens in https://github.com/pytorch/rl/pull/469
  • [Tests] Adding tensordict __repr__ tests by @sladebot in https://github.com/pytorch/rl/pull/435
  • [Logging]: implement MLFlow logging integration by @rayanht in https://github.com/pytorch/rl/pull/432
  • [BugFix] MLFlow import fix by @vmoens in https://github.com/pytorch/rl/pull/473
  • [BugFix] Fixed pip install by @brandonsj in https://github.com/pytorch/rl/pull/475
  • [Features]: Changed _inplace_update cls parameter passing in __new__ by @nicolas-dufour in https://github.com/pytorch/rl/pull/464
  • [Feature]: ModelBased Envs by @nicolas-dufour in https://github.com/pytorch/rl/pull/333
  • [Feature] make ReplayBufferTrainer compatible with storing trajectories by @vmoens in https://github.com/pytorch/rl/pull/476
  • [Tutorial] DQN tutorial by @vmoens in https://github.com/pytorch/rl/pull/474
  • [Feature] reader hooks for GymLike by @vmoens in https://github.com/pytorch/rl/pull/478
  • [BugFix] TensorSpec.zero(None) failure fix by @vmoens in https://github.com/pytorch/rl/pull/483
  • [Feature]: Support for planners and CEM by @nicolas-dufour in https://github.com/pytorch/rl/pull/384
  • [Feature] Replaced device_safe() with device by @ordinskiy in https://github.com/pytorch/rl/pull/485
  • [Feature]: TensorDictPrimer transform by @nicolas-dufour in https://github.com/pytorch/rl/pull/456
  • [Feature]: erase() method for torchrl.timeit by @nicolas-dufour in https://github.com/pytorch/rl/pull/480
  • [Feature] Added support for single collector in syncasynccollector by @nicolas-dufour in https://github.com/pytorch/rl/pull/482
  • [BugFix] removing unwanted device_safe() by @vmoens in https://github.com/pytorch/rl/pull/486
  • [Refactoring] Refactored get_stats_random_rollout by @nicolas-dufour in https://github.com/pytorch/rl/pull/481
  • [Feature] VIP Integration by @JasonMa2016 in https://github.com/pytorch/rl/pull/487
  • [Refactoring] Minor tweaks to recorder and logger by @nicolas-dufour in https://github.com/pytorch/rl/pull/489
  • [Feature]: Deactivate typechecks in envs by @nicolas-dufour in https://github.com/pytorch/rl/pull/490
  • [BugFix] Vectorized td_lambda with gamma tensor does not match the serial version by @vmoens in https://github.com/pytorch/rl/pull/400
  • [BugFix] Fix TensorDictPrimer init by @vmoens in https://github.com/pytorch/rl/pull/491
  • [Feature] Optional auto-reset when done for collectors and batched envs by @vmoens in https://github.com/pytorch/rl/pull/492
  • [BugFix] Defaulting passing_devices to None by @himjohntang in https://github.com/pytorch/rl/pull/477
  • Revert "[BugFix] Defaulting passing_devices to None" by @vmoens in https://github.com/pytorch/rl/pull/494
  • [BugFix] Multi-agent fixes by @vmoens in https://github.com/pytorch/rl/pull/488
  • [BugFix] Defaulting passing_devices to None by @vmoens in https://github.com/pytorch/rl/pull/495
  • [Feature] Lazy initialization of CatTensors by @vmoens in https://github.com/pytorch/rl/pull/497
  • [Cleanup] Removing cuda 10.2 references by @vmoens in https://github.com/pytorch/rl/pull/498
  • [BugFix] Migration to pytorch org by @vmoens in https://github.com/pytorch/rl/pull/499
  • [Refactoring] Import at root to enable vmap monkey-patching by @vmoens in https://github.com/pytorch/rl/pull/500
  • [BugFix] python version for linting checks by @vmoens in https://github.com/pytorch/rl/pull/502
  • [Feature] Replay Buffers refactor by @bamaxw in https://github.com/pytorch/rl/pull/330
  • [Feature] Rename step_tensordict in step_mdp by @romainjln in https://github.com/pytorch/rl/pull/512
  • [Lint] re-instantiate F821 by @vmoens in https://github.com/pytorch/rl/pull/516
  • [BugFix] run_type_checks for TransformedEnvs by @vmoens in https://github.com/pytorch/rl/pull/513
  • [BugFix] making first_dim and last_dim negative in FlattenObservation when a parent is set by @vmoens in https://github.com/pytorch/rl/pull/511
  • [Feature] Add info dict key-spec pairs to observation_spec by @tcbegley in https://github.com/pytorch/rl/pull/504
  • [BugFix] Changing the dm_control import to fail if not installed by @zeenolife in https://github.com/pytorch/rl/pull/515
  • [CI] Add coverage with codecov by @silvestrebahi in https://github.com/pytorch/rl/pull/523
  • Revert "[CI] Add coverage with codecov" by @vmoens in https://github.com/pytorch/rl/pull/525
  • [Quality] Use relative imports for local c++ deps by @apbard in https://github.com/pytorch/rl/pull/526
  • [Feature] Nightly release by @vmoens in https://github.com/pytorch/rl/pull/519
  • [Feature] Add make_tensordict() function by @sicong-huang in https://github.com/pytorch/rl/pull/522
  • [Doc] Misc readme fixes by @GavinPHR in https://github.com/pytorch/rl/pull/532
  • [BugFix] Replacing inference_mode decorator with no_grad to fix state_dict loading error by @GavinPHR in https://github.com/pytorch/rl/pull/530
  • [BugFix] Transformed ParallelEnv meta data are broken when passing to device by @vmoens in https://github.com/pytorch/rl/pull/531
  • [Doc] Add coverage banner by @vmoens in https://github.com/pytorch/rl/pull/533
  • [BugFix] Fix colab link of coding_dqn.ipynb by @Benjamin-eecs in https://github.com/pytorch/rl/pull/543
  • [BugFix] Fix optional imports by @vmoens in https://github.com/pytorch/rl/pull/535
  • [BugFix] Restore missing keys in data collector output by @tcbegley in https://github.com/pytorch/rl/pull/521
  • [Lint] reorganize imports by @apbard in https://github.com/pytorch/rl/pull/545
  • [BugFix] Single-cpu compatibility by @vmoens in https://github.com/pytorch/rl/pull/548
  • [BugFix] vision install and other deps in optdeps by @vmoens in https://github.com/pytorch/rl/pull/552
  • [Feature] Implemented device argument for modules.models by @yushiyangk in https://github.com/pytorch/rl/pull/524
  • [BugFix] Fix ellipsis indexing of 2d TensorDicts by @vmoens in https://github.com/pytorch/rl/pull/559
  • [BugFix] Additive gaussian exploration spec fix by @vmoens in https://github.com/pytorch/rl/pull/560
  • [BugFix] Disabling video step for wandb by @vmoens in https://github.com/pytorch/rl/pull/561
  • [BugFix] Various device fix by @vmoens in https://github.com/pytorch/rl/pull/558
  • [Feature] Allow collectors to accept regular modules as policies by @tcbegley in https://github.com/pytorch/rl/pull/546
  • [BugFix] Fix push binary nightly action by @psolikov in https://github.com/pytorch/rl/pull/566
  • [BugFix] TensorDict comparison by @vmoens in https://github.com/pytorch/rl/pull/567
  • [BugFix] Fix SyncDataCollector reset by @jrobine in https://github.com/pytorch/rl/pull/571
  • [Doc] Banners on README.md by @vmoens in https://github.com/pytorch/rl/pull/572
  • [Feature] Log printing in alphabetical order when creating a replay buffer by @nikhlrao in https://github.com/pytorch/rl/pull/573
  • [BugFix] Add eps to reward normalization by @vmoens in https://github.com/pytorch/rl/pull/574
  • [BugFix] Fix argument for PPOLoss.get_entropy_bonus() by @vmoens in https://github.com/pytorch/rl/pull/578
  • [Feature] Restructure torchrl/objectives by @sgrigory in https://github.com/pytorch/rl/pull/580
  • [Docs] Documentation revamp by @vmoens in https://github.com/pytorch/rl/pull/581
  • [Doc] Publishing on pytorch.org by @vmoens in https://github.com/pytorch/rl/pull/582
  • Revert "[Doc] Publishing on pytorch.org" by @vmoens in https://github.com/pytorch/rl/pull/584
  • [Doc] Publishing on pytorch.org by @vmoens in https://github.com/pytorch/rl/pull/585
  • Revert "[Doc] Publishing on pytorch.org" by @vmoens in https://github.com/pytorch/rl/pull/586
  • [Doc] Publishing on pytorch.org by @vmoens in https://github.com/pytorch/rl/pull/587
  • [Feature] More restrictive tests on docstrings by @vmoens in https://github.com/pytorch/rl/pull/457
  • [BugFix] Wrong stack import in tests by @vmoens in https://github.com/pytorch/rl/pull/590
  • [Feature] Exclude "_" out_keys in tensordictmodel by @jlesuffleur in https://github.com/pytorch/rl/pull/589
  • [Feature]: Dreamer support by @nicolas-dufour in https://github.com/pytorch/rl/pull/341
  • [Doc] Missing doc for prototype RB by @vmoens in https://github.com/pytorch/rl/pull/595
  • [Feature] Update list of supported libraries by @vmoens in https://github.com/pytorch/rl/pull/594
  • [BugFix] Fix timeit count registration by @vmoens in https://github.com/pytorch/rl/pull/598
  • [Naming] Renaming ProbabilisticTensorDictModule keys by @vmoens in https://github.com/pytorch/rl/pull/603
  • [Feature] Categorical encoding for action space by @artkorenev in https://github.com/pytorch/rl/pull/593
  • [BugFix] ReplayBuffer's storage now signal back when changes happen by @paulomarciano in https://github.com/pytorch/rl/pull/614
  • [Doc] Typos in tensordict tutorial by @PaLeroy in https://github.com/pytorch/rl/pull/621
  • [Doc] Integrate knowledge base in docs by @hatala91 in https://github.com/pytorch/rl/pull/622
  • [Doc] Updating docs requirements by @vmoens in https://github.com/pytorch/rl/pull/624
  • [Feature] Make torchrl runnable without functorch and with gym==0.13 by @vmoens in https://github.com/pytorch/rl/pull/386
  • [Feature] Habitat integration by @vmoens in https://github.com/pytorch/rl/pull/514
  • [Feature] Checkpointing by @vmoens in https://github.com/pytorch/rl/pull/549
  • Add support for null dim argument in TensorDict.squeeze by @jgonik in https://github.com/pytorch/rl/pull/608
  • [Version] Updating to torch 1.13 by @vmoens in https://github.com/pytorch/rl/pull/627
  • [Feature] Sub-memmap tensors by @vmoens in https://github.com/pytorch/rl/pull/626
  • [BugFix] copy_ changes the index if the dest and source memmap tensors share the same file location by @vmoens in https://github.com/pytorch/rl/pull/631
  • [Feature] Unfold transforms for folded TransformedEnv by @alexanderlobov in https://github.com/pytorch/rl/pull/630
  • [BugFix] make TensorDictReplayBuffer.extend call super().extend with stacked_td by @vmoens in https://github.com/pytorch/rl/pull/634
  • [BugFix] correct the use of step_mdp method in data collector by @adityagandhamal in https://github.com/pytorch/rl/pull/637
  • [Feature] Added implement_for decorator by @ordinskiy in https://github.com/pytorch/rl/pull/618
  • [Feature] Make DQN compatible with nn.Module by @svarolgunes in https://github.com/pytorch/rl/pull/632
  • [Example] Distributed Replay Buffer Prototype Example Implementation by @adityagoel4512 in https://github.com/pytorch/rl/pull/615
  • [Feature] Benchmark storage types by @adityagoel4512 in https://github.com/pytorch/rl/pull/633
  • [Feature] Remove wild imports in the library by @sosmond in https://github.com/pytorch/rl/pull/642
  • [BugFix] Prevent transform parent from being reassigned by @jasonfkut in https://github.com/pytorch/rl/pull/641
  • [Feature] Too many deepcopy in transforms.py by @romainjln in https://github.com/pytorch/rl/pull/625
  • [Naming] Rename keys_in to in_keys in transforms.py and related modules by @sardaankita in https://github.com/pytorch/rl/pull/656
  • [Refactoring] Refactor dreamer helper in smaller pieces by @vmoens in https://github.com/pytorch/rl/pull/662
  • [Feature] VIPRewardTransform by @vmoens in https://github.com/pytorch/rl/pull/658
  • [BugFix] make_trainer possible bug for on-policy cases by @albertbou92 in https://github.com/pytorch/rl/pull/655
  • [Naming] Fixing key names by @vmoens in https://github.com/pytorch/rl/pull/668
  • [Test] Check dtypes of envs by @vmoens in https://github.com/pytorch/rl/pull/666
  • [Refactor] Relying on the standalone tensordict -- phase 1 by @vmoens in https://github.com/pytorch/rl/pull/650
  • [Doc] More doc on trainers by @vmoens in https://github.com/pytorch/rl/pull/663
  • [BugFix] PPO example GAE import by @albertbou92 in https://github.com/pytorch/rl/pull/671
  • [BugFix] Use GitHub for flake8 pre-commit hook by @vmoens in https://github.com/pytorch/rl/pull/679
  • [BugFix] Update to strict select by @vmoens in https://github.com/pytorch/rl/pull/675
  • [Feature] Auto-compute stats for ObservationNorm by @romainjln in https://github.com/pytorch/rl/pull/669
  • [Doc] makecollector helper function by @albertbou92 in https://github.com/pytorch/rl/pull/678
  • [Doc] BatchSubSampler class docstrings example by @albertbou92 in https://github.com/pytorch/rl/pull/677
  • [BugFix] PPO objective crashes if advantage_module is None by @albertbou92 in https://github.com/pytorch/rl/pull/676
  • [Refactor] Refactor 'next_' into nested tensordicts by @vmoens in https://github.com/pytorch/rl/pull/649
  • [Doc] More doc about environments by @vmoens in https://github.com/pytorch/rl/pull/683
  • [Doc] Fix missing tensordict install for doc by @vmoens in https://github.com/pytorch/rl/pull/685
  • [CI] Added CircleCI pipeline to test compatibility across supported gym versions by @ordinskiy in https://github.com/pytorch/rl/pull/645
  • [BugFix] ConvNet forward method with tensors of more than 4 dimensions by @albertbou92 in https://github.com/pytorch/rl/pull/686
  • [Feature] add standard_normal for RewardScaling by @adityagandhamal in https://github.com/pytorch/rl/pull/682
  • [Feature] Jumanji envs by @yingchenlin in https://github.com/pytorch/rl/pull/674
  • [Feature] Default collate_fn by @vmoens in https://github.com/pytorch/rl/pull/688
  • [BugFix] Fix Examples by @vmoens in https://github.com/pytorch/rl/pull/687
  • [Refactoring] Replace direct gym version checks with decorated functions (#) by @ordinskiy in https://github.com/pytorch/rl/pull/691

New Contributors

  • @ajhinsvark made their first contribution in https://github.com/pytorch/rl/pull/257
  • @ramonmedel made their first contribution in https://github.com/pytorch/rl/pull/296
  • @srikanthmg85 made their first contribution in https://github.com/pytorch/rl/pull/302
  • @rmartimov made their first contribution in https://github.com/pytorch/rl/pull/304
  • @nairbv made their first contribution in https://github.com/pytorch/rl/pull/306
  • @benoitdescamps made their first contribution in https://github.com/pytorch/rl/pull/299
  • @yoavnavon made their first contribution in https://github.com/pytorch/rl/pull/316
  • @bamaxw made their first contribution in https://github.com/pytorch/rl/pull/319
  • @alexanderlobov made their first contribution in https://github.com/pytorch/rl/pull/331
  • @tongbaojia made their first contribution in https://github.com/pytorch/rl/pull/344
  • @omikad made their first contribution in https://github.com/pytorch/rl/pull/358
  • @jaschmid-fb made their first contribution in https://github.com/pytorch/rl/pull/356
  • @guabao made their first contribution in https://github.com/pytorch/rl/pull/379
  • @reachsumit made their first contribution in https://github.com/pytorch/rl/pull/408
  • @matt-fff made their first contribution in https://github.com/pytorch/rl/pull/410
  • @flinder made their first contribution in https://github.com/pytorch/rl/pull/402
  • @fdabek1 made their first contribution in https://github.com/pytorch/rl/pull/412
  • @AnshulSehgal made their first contribution in https://github.com/pytorch/rl/pull/409
  • @yushiyangk made their first contribution in https://github.com/pytorch/rl/pull/422
  • @bashnick made their first contribution in https://github.com/pytorch/rl/pull/427
  • @zeenolife made their first contribution in https://github.com/pytorch/rl/pull/404
  • @nicolasgriffiths made their first contribution in https://github.com/pytorch/rl/pull/424
  • @agrotov made their first contribution in https://github.com/pytorch/rl/pull/436
  • @ronert made their first contribution in https://github.com/pytorch/rl/pull/440
  • @ggimler3 made their first contribution in https://github.com/pytorch/rl/pull/449
  • @ymwdalex made their first contribution in https://github.com/pytorch/rl/pull/443
  • @sladebot made their first contribution in https://github.com/pytorch/rl/pull/435
  • @rayanht made their first contribution in https://github.com/pytorch/rl/pull/432
  • @brandonsj made their first contribution in https://github.com/pytorch/rl/pull/475
  • @ordinskiy made their first contribution in https://github.com/pytorch/rl/pull/485
  • @JasonMa2016 made their first contribution in https://github.com/pytorch/rl/pull/487
  • @himjohntang made their first contribution in https://github.com/pytorch/rl/pull/477
  • @romainjln made their first contribution in https://github.com/pytorch/rl/pull/512
  • @apbard made their first contribution in https://github.com/pytorch/rl/pull/526
  • @sicong-huang made their first contribution in https://github.com/pytorch/rl/pull/522
  • @psolikov made their first contribution in https://github.com/pytorch/rl/pull/566
  • @jrobine made their first contribution in https://github.com/pytorch/rl/pull/571
  • @nikhlrao made their first contribution in https://github.com/pytorch/rl/pull/573
  • @sgrigory made their first contribution in https://github.com/pytorch/rl/pull/580
  • @jlesuffleur made their first contribution in https://github.com/pytorch/rl/pull/589
  • @artkorenev made their first contribution in https://github.com/pytorch/rl/pull/593
  • @paulomarciano made their first contribution in https://github.com/pytorch/rl/pull/614
  • @hatala91 made their first contribution in https://github.com/pytorch/rl/pull/622
  • @jgonik made their first contribution in https://github.com/pytorch/rl/pull/608
  • @adityagandhamal made their first contribution in https://github.com/pytorch/rl/pull/637
  • @svarolgunes made their first contribution in https://github.com/pytorch/rl/pull/632
  • @adityagoel4512 made their first contribution in https://github.com/pytorch/rl/pull/615
  • @jasonfkut made their first contribution in https://github.com/pytorch/rl/pull/641
  • @sardaankita made their first contribution in https://github.com/pytorch/rl/pull/656
  • @albertbou92 made their first contribution in https://github.com/pytorch/rl/pull/655
  • @yingchenlin made their first contribution in https://github.com/pytorch/rl/pull/674

Full Changelog: https://github.com/pytorch/rl/compare/v0.0.1...0.0.3

- Python
Published by vmoens over 3 years ago

torchrl - 0.0.2a

What's Changed

  • [BugFix] Fixed compose which ignored inv_transforms of child by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/328
  • [BugFix] functorch installation in CircleCI by @vmoens in https://github.com/facebookresearch/rl/pull/336
  • [Refactor] VecNorm inference API by @vmoens in https://github.com/facebookresearch/rl/pull/337
  • TransformedEnv sets added Transforms into eval mode by @alexanderlobov in https://github.com/facebookresearch/rl/pull/331
  • [Refactor] make to_tensordict() create a copy of the content by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/334
  • [CircleCI] Fix dm_control rendering by @vmoens in https://github.com/facebookresearch/rl/pull/339
  • [BugFix]: joining processes when they're done by @vmoens in https://github.com/facebookresearch/rl/pull/311
  • [Test] pass the OS error in case the file isn't closed by @tongbaojia in https://github.com/facebookresearch/rl/pull/344
  • [Feature] Make default rollout tensordict contiguous by @vmoens in https://github.com/facebookresearch/rl/pull/343
  • [BugFix] Clone memmap tensors on regular tensors and other replay buffer improvements by @vmoens in https://github.com/facebookresearch/rl/pull/340
  • [CI] Using latest gym by @vmoens in https://github.com/facebookresearch/rl/pull/346
  • [Doc] Coding your first DDPG tutorial by @vmoens in https://github.com/facebookresearch/rl/pull/345
  • [Doc] Minor: typos in DDPG by @vmoens in https://github.com/facebookresearch/rl/pull/354
  • [Feature] Register lambda and gamma in buffers by @vmoens in https://github.com/facebookresearch/rl/pull/353
  • [Feature] Implement eq for TensorSpec by @omikad in https://github.com/facebookresearch/rl/pull/358
  • [Doc] Multi-tasking tutorial by @vmoens in https://github.com/facebookresearch/rl/pull/352
  • [Feature] Env refactoring for model based RL by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/315
  • [Feature]: Added support for TensorDictSequence module subsampling by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/332
  • [BugFix] Add lock to vec norm transform by @jaschmid-fb in https://github.com/facebookresearch/rl/pull/356
  • [Perf]: Improve PPO training performance by @vmoens in https://github.com/facebookresearch/rl/pull/297
  • [BugFix] Functorch-Tensordict bug fixes by @vmoens in https://github.com/facebookresearch/rl/pull/361
  • Revert "[BugFix] Functorch-Tensordict bug fixes" by @vmoens in https://github.com/facebookresearch/rl/pull/362
  • [BugFix] Functorch-Tensordict bug fixes by @vmoens in https://github.com/facebookresearch/rl/pull/363
  • [Feature] CSVLogger (ABBANDONED) by @vmoens in https://github.com/facebookresearch/rl/pull/371
  • [Feature] Support tensor-based decay in TD-lambda by @tcbegley in https://github.com/facebookresearch/rl/pull/360
  • [Feature] CSVLogger by @vmoens in https://github.com/facebookresearch/rl/pull/372
  • [BugFix] Fewer env instantiations for better mujoco rendering by @vmoens in https://github.com/facebookresearch/rl/pull/378
  • [Feature] change imports of environment libraries (gym and dm_control) at lower levels by @guabao in https://github.com/facebookresearch/rl/pull/379
  • [BugFix] Representation of indexed nested tensordict by @vmoens in https://github.com/facebookresearch/rl/pull/370
  • [BugFix] In-place __setitem__ for SubTensorDict by @vmoens in https://github.com/facebookresearch/rl/pull/369
  • [Feature] Add ProbabilisticTensorDictModule dist key mapping support by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/376
  • [Feature]: R3M integration by @vmoens in https://github.com/facebookresearch/rl/pull/321
  • [Feature] static_seed flag for envs, vectorized envs and collectors by @vmoens in https://github.com/facebookresearch/rl/pull/385
  • [Feature] AdditiveGaussian exploration strategy by @vmoens in https://github.com/facebookresearch/rl/pull/388
  • [Feature] Multi-images R3M by @vmoens in https://github.com/facebookresearch/rl/pull/389
  • [Feature] Flatten multi-images in R3M by @vmoens in https://github.com/facebookresearch/rl/pull/391
  • [Quality] Code cleanup for fbsync by @vmoens in https://github.com/facebookresearch/rl/pull/392
  • [Feature] In-house functional modules for TorchRL using TensorDict by @vmoens in https://github.com/facebookresearch/rl/pull/387
  • [Quality] Code cleanup for fbsync by @vmoens in https://github.com/facebookresearch/rl/pull/397
  • [Doc] Add charts to examples by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/374
  • [Feature] Vectorized GAE by @vmoens in https://github.com/facebookresearch/rl/pull/365
  • [BugFix] Temporarily fix gym to 0.25.1 to fix CI by @vmoens in https://github.com/facebookresearch/rl/pull/411
  • [Feature] Create a Squeeze transform and update Unsqueeze transform by @reachsumit in https://github.com/facebookresearch/rl/pull/408
  • [Naming] Recurse kwarg to match pytorch by @matt-fff in https://github.com/facebookresearch/rl/pull/410
  • [Feature] Add all implemented loggers to the init of loggers by @flinder in https://github.com/facebookresearch/rl/pull/402
  • [BugFix] Fix gym 0.26 compatibility by @vmoens in https://github.com/facebookresearch/rl/pull/403
  • [BugFix] Remove submodules by @vmoens in https://github.com/facebookresearch/rl/pull/414
  • [Feature] lock tensordict when calling share_memory_() by @fdabek1 in https://github.com/facebookresearch/rl/pull/412
  • [BugFix] Updated TensorDict.expand to work as Tensor.expand by @AnshulSehgal in https://github.com/facebookresearch/rl/pull/409
  • [BugFix] Looser check for test_recorder assertion by @vmoens in https://github.com/facebookresearch/rl/pull/415
  • [Feature] Allow spec to be passed directly to exploration wrappers by @vmoens in https://github.com/facebookresearch/rl/pull/418
  • [BugFix] Collector revert to default exploration mode if empty string is passed by @vmoens in https://github.com/facebookresearch/rl/pull/421
  • [Naming] Rename _TargetNetUpdate to TargetNetUpdater, making it public by @yushiyangk in https://github.com/facebookresearch/rl/pull/422
  • [Doc] Re-run tutorials by @vmoens in https://github.com/facebookresearch/rl/pull/381
  • Revert "[Doc] Re-run tutorials" (colab links broken) by @vmoens in https://github.com/facebookresearch/rl/pull/423
  • [Feature] Switch back to latest gym by @vmoens in https://github.com/facebookresearch/rl/pull/425
  • [Feature] TensorDict without device by @tcbegley in https://github.com/facebookresearch/rl/pull/413
  • Updated the README.md file by @bashnick in https://github.com/facebookresearch/rl/pull/427
  • [Feature] Adding support for initialising TensorDicts from nested dicts by @zeenolife in https://github.com/facebookresearch/rl/pull/404
  • [Features] Make image_size a cfg param by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/430
  • Make TensorDict.expand accept Sequence arguments by @nicolasgriffiths in https://github.com/facebookresearch/rl/pull/424
  • [Doc] Readme revamp for efficiency/modularity display by @vmoens in https://github.com/facebookresearch/rl/pull/382
  • [Feature] New biased_softplus semantic to allow for minimum scale setting by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/428
  • [Tutorial] Re-run tutos by @vmoens in https://github.com/facebookresearch/rl/pull/434
  • [BugFix] mixed device_safe vs device by @vmoens in https://github.com/facebookresearch/rl/pull/429
  • [BugFix] Explicit params and buffers by @agrotov in https://github.com/facebookresearch/rl/pull/436
  • [BugFix] Fixed Additive noise by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/441
  • [Tests] Test loggers video saving by @bashnick in https://github.com/facebookresearch/rl/pull/439
  • Revert "[BugFix] Fixed Additive noise" by @vmoens in https://github.com/facebookresearch/rl/pull/442
  • [Refactor] Rename TensorDictSequence to TensorDictSequential by @ronert in https://github.com/facebookresearch/rl/pull/440
  • [Refactor] Refactoring set*() methods for TensorDictBase class by @zeenolife in https://github.com/facebookresearch/rl/pull/438
  • [Cleanup] Removing gym-retro interface by @vmoens in https://github.com/facebookresearch/rl/pull/444
  • [BugFix]: Fix additive noise by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/447
  • [BugFix] CatTensors: Prepended next_ to the out_key by @ggimler3 in https://github.com/facebookresearch/rl/pull/449
  • [BugFix] Fix AdditiveGaussian exploration tests by @vmoens in https://github.com/facebookresearch/rl/pull/450
  • [BugFix] Wrong call to device_safe in replay buffer code by @vmoens in https://github.com/facebookresearch/rl/pull/454
  • [BugFix] Add transform_observation_spec _R3MNet by @ymwdalex in https://github.com/facebookresearch/rl/pull/443
  • [Doc] Add a knowledge base by @shagunsodhani in https://github.com/facebookresearch/rl/pull/375
  • [Feature] Allow for actions and rewards to be in the reset tensordict by @vmoens in https://github.com/facebookresearch/rl/pull/458
  • [Doc] Readme for knowledge base by @vmoens in https://github.com/facebookresearch/rl/pull/459
  • [Feature] Added batch_lock attribute in EnvBase by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/399
  • [BugFix] deepcopy specs before transforming by @vmoens in https://github.com/facebookresearch/rl/pull/461
  • [BugFix]: Fixed dm_control action type casting by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/463
  • [Versioning] Version 0.0.2a0 by @vmoens in https://github.com/facebookresearch/rl/pull/465

New Contributors

  • @alexanderlobov made their first contribution in https://github.com/facebookresearch/rl/pull/331
  • @tongbaojia made their first contribution in https://github.com/facebookresearch/rl/pull/344
  • @omikad made their first contribution in https://github.com/facebookresearch/rl/pull/358
  • @jaschmid-fb made their first contribution in https://github.com/facebookresearch/rl/pull/356
  • @tcbegley made their first contribution in https://github.com/facebookresearch/rl/pull/360
  • @guabao made their first contribution in https://github.com/facebookresearch/rl/pull/379
  • @reachsumit made their first contribution in https://github.com/facebookresearch/rl/pull/408
  • @matt-fff made their first contribution in https://github.com/facebookresearch/rl/pull/410
  • @flinder made their first contribution in https://github.com/facebookresearch/rl/pull/402
  • @fdabek1 made their first contribution in https://github.com/facebookresearch/rl/pull/412
  • @AnshulSehgal made their first contribution in https://github.com/facebookresearch/rl/pull/409
  • @yushiyangk made their first contribution in https://github.com/facebookresearch/rl/pull/422
  • @bashnick made their first contribution in https://github.com/facebookresearch/rl/pull/427
  • @zeenolife made their first contribution in https://github.com/facebookresearch/rl/pull/404
  • @nicolasgriffiths made their first contribution in https://github.com/facebookresearch/rl/pull/424
  • @agrotov made their first contribution in https://github.com/facebookresearch/rl/pull/436
  • @ronert made their first contribution in https://github.com/facebookresearch/rl/pull/440
  • @ggimler3 made their first contribution in https://github.com/facebookresearch/rl/pull/449
  • @ymwdalex made their first contribution in https://github.com/facebookresearch/rl/pull/443

Full Changelog: https://github.com/facebookresearch/rl/compare/v0.0.1c...v0.0.2a

- Python
Published by vmoens over 3 years ago

torchrl - v0.0.1-gamma

What's Changed

  • Adding additional checks to TensorDict.view to remove unnecessary ViewedTensorDict object creation by @bamaxw in https://github.com/facebookresearch/rl/pull/319
  • [BugFix]: Safe state normalization when std=0 by @vmoens in https://github.com/facebookresearch/rl/pull/323
  • [BugFix]: gradient propagation in advantage estimates by @vmoens in https://github.com/facebookresearch/rl/pull/322
  • [BugFix]: make training example gracefully exit by @vmoens in https://github.com/facebookresearch/rl/pull/326
  • [Setup]: Exclude tutorials from wheels by @vmoens in https://github.com/facebookresearch/rl/pull/325
  • [BugFix]: Tensor map for subtensordict.set_ by @vmoens in https://github.com/facebookresearch/rl/pull/324
  • [Release]: Wheels v0.0.1c by @vmoens in https://github.com/facebookresearch/rl/pull/327

New Contributors

  • @bamaxw made their first contribution in https://github.com/facebookresearch/rl/pull/319

Full Changelog: https://github.com/facebookresearch/rl/compare/v0.0.1b...v0.0.1c

- Python
Published by vmoens almost 4 years ago

torchrl - v0.0.1-beta

Highlights

Supports nested tensordicts:

  • [Feature] Nested tensordicts by @vmoens in https://github.com/facebookresearch/rl/pull/256
  • [Feature]: Index nested tensordicts using tuples by @vmoens in https://github.com/facebookresearch/rl/pull/262
  • [Feature]: flatten nested tensordicts by @vmoens in https://github.com/facebookresearch/rl/pull/264

Padding for tensordicts:

  • [Feature] Implement padding for tensordicts by @ajhinsvark in https://github.com/facebookresearch/rl/pull/257

Speed improvements:

  • [Feature]: faster meta-tensor API for TensorDict by @vmoens in https://github.com/facebookresearch/rl/pull/272
  • [Feature]: faster safetanh transform via C++ bindings by @vmoens in https://github.com/facebookresearch/rl/pull/289
  • [Feature]: Improving training efficiency by @vmoens in https://github.com/facebookresearch/rl/pull/293

Logging capabilities:

  • [Feature]: Refactored logging to be able to support other loggers easily by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/270
  • [Feature] Wandb logger by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/274

Doc:

  • [Doc]: TorchRL demo by @vmoens in https://github.com/facebookresearch/rl/pull/284
  • [Doc] Added TensorDict tutorial by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/255
  • [Doc] Tensordictmodule tutorial by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/267

What's Changed

  • MacOs versioning and release bugfix by @vmoens in https://github.com/facebookresearch/rl/pull/247
  • Setup metadata by @vmoens in https://github.com/facebookresearch/rl/pull/248
  • Fix setup instructions by @vmoens in https://github.com/facebookresearch/rl/pull/250
  • Fix a bug when segment_tree size is exactly 2^N by @xiaomengy in https://github.com/facebookresearch/rl/pull/251
  • Added test for RewardRescale transform by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/252
  • Empty TensorDict population in loops by @vmoens in https://github.com/facebookresearch/rl/pull/253
  • Memmap del bugfix by @vmoens in https://github.com/facebookresearch/rl/pull/254
  • [BugFix]: recursion error when calling permute(...).to_tensordict() by @vmoens in https://github.com/facebookresearch/rl/pull/260
  • Differentiable PPOLoss for IRL by @vmoens in https://github.com/facebookresearch/rl/pull/240
  • [BugFix]: avoid deleting true in_keys in TensorDictSequence by @vmoens in https://github.com/facebookresearch/rl/pull/261
  • [Feature] Add issue and pull request template by @Benjamin-eecs in https://github.com/facebookresearch/rl/pull/263
  • [Test]: test nested CompositeSpec by @vmoens in https://github.com/facebookresearch/rl/pull/265
  • [Test]: test squeezed TensorDict by @vmoens in https://github.com/facebookresearch/rl/pull/269
  • [Test]: TensorDict: test tensordict created on cuda and sub-tensordict indexed along 2nd dimension by @vmoens in https://github.com/facebookresearch/rl/pull/268
  • Refactor the torch.stack with destination by @khmigor in https://github.com/facebookresearch/rl/pull/245
  • Small tweaks to make the replay buffer code more consistent by @shagunsodhani in https://github.com/facebookresearch/rl/pull/275
  • [BugFix]: Minor bugs in docstrings by @vmoens in https://github.com/facebookresearch/rl/pull/276
  • [BugFix]: update wrong links in issue and pull request template by @Benjamin-eecs in https://github.com/facebookresearch/rl/pull/286
  • [BugFix]: quickfix: force gym 0.24 installation until issue with rendering is resolved by @vmoens in https://github.com/facebookresearch/rl/pull/283
  • [Doc]: remove pip install from CONTRIBUTING.md by @vmoens in https://github.com/facebookresearch/rl/pull/288
  • [BugFix]: fix GLFW3 error when installing dm_control by @vmoens in https://github.com/facebookresearch/rl/pull/291
  • [BugFix]: Fix examples by @vmoens in https://github.com/facebookresearch/rl/pull/290
  • [Doc] Simplify PR template by @vmoens in https://github.com/facebookresearch/rl/pull/292
  • [BugFix]: Replay buffer bugfixes by @vmoens in https://github.com/facebookresearch/rl/pull/294
  • [Doc] MacOs M1 troubleshooting by @ramonmedel in https://github.com/facebookresearch/rl/pull/296
  • [QuickFix]: update issue and pr template by @Benjamin-eecs in https://github.com/facebookresearch/rl/pull/303
  • [Test] tests for BinarizeReward by @srikanthmg85 in https://github.com/facebookresearch/rl/pull/302
  • [BugFix]: L2-priority for PRB by @vmoens in https://github.com/facebookresearch/rl/pull/305
  • [Feature] Transforms: Compose.insert and TransformedEnv.insert_transform by @rmartimov in https://github.com/facebookresearch/rl/pull/304
  • [BugFix] Fix flaky test by waiting for procs instead of sleep by @nairbv in https://github.com/facebookresearch/rl/pull/306
  • [BugFix] Fix a build warning, setuptools/distutils import order by @nairbv in https://github.com/facebookresearch/rl/pull/307
  • ufmt issue if imports in order requested by distutils by @nairbv in https://github.com/facebookresearch/rl/pull/308
  • [BugFix]: Conda to pip for circleci by @vmoens in https://github.com/facebookresearch/rl/pull/310
  • [BugFix] Support list-based boolean masks for TensorDict by @benoitdescamps in https://github.com/facebookresearch/rl/pull/299
  • [Feature] Truly invertible tensordict permutation of dimensions by @ramonmedel in https://github.com/facebookresearch/rl/pull/295
  • [Feature] Rename _TensorDict into TensorDictBase by @yoavnavon in https://github.com/facebookresearch/rl/pull/316

New Contributors

  • @nicolas-dufour made their first contribution in https://github.com/facebookresearch/rl/pull/252
  • @ajhinsvark made their first contribution in https://github.com/facebookresearch/rl/pull/257
  • @ramonmedel made their first contribution in https://github.com/facebookresearch/rl/pull/296
  • @srikanthmg85 made their first contribution in https://github.com/facebookresearch/rl/pull/302
  • @rmartimov made their first contribution in https://github.com/facebookresearch/rl/pull/304
  • @nairbv made their first contribution in https://github.com/facebookresearch/rl/pull/306
  • @benoitdescamps made their first contribution in https://github.com/facebookresearch/rl/pull/299
  • @yoavnavon made their first contribution in https://github.com/facebookresearch/rl/pull/316

Full Changelog: https://github.com/facebookresearch/rl/compare/v0.0.1...v0.0.1b

- Python
Published by vmoens almost 4 years ago

torchrl - v0.0.1-alpha

TorchRL Initial Alpha Release

TorchRL is the soon-to-be official RL domain library for PyTorch. It contains primitives that are aimed at covering most of the modern RL research space.

Getting started with the library

Installation

The library can be installed with `pip install torchrl`. Currently, torchrl wheels are provided for Linux and macOS (not M1) machines. For other architectures or for the latest features, refer to the README.md and CONTRIBUTING.md files for advanced installation instructions.

Environments

TorchRL currently supports gym and dm_control out of the box. To create a gym-wrapped environment, simply use

```python
import gym
from torchrl.envs import GymEnv, GymWrapper

env = GymEnv("Pendulum-v1")
# similarly
env = GymWrapper(gym.make("Pendulum-v1"))
```

Environments can be transformed using the `torchrl.envs.transforms` module. See the [environment tutorial](tutorials/envs.ipynb) for more information. The [`ParallelEnv`](torchrl/envs/vec_env.py) class runs multiple environments in parallel.

Policy and modules

TorchRL modules interact using TensorDict, a new data carrier class. Although using it is not strictly necessary and workarounds exist, we advise using the TensorDictModule class to read tensordicts:

```python
from torch import nn
from torchrl.modules import TensorDictModule

# n_obs and n_act are the observation and action dimensions of the environment
policy_module = nn.Linear(n_obs, n_act)
policy = TensorDictModule(
    policy_module,
    in_keys=["observation"],  # keys to be read for the module input
    out_keys=["action"],  # keys to be written with the module output
)
tensordict = env.reset()
tensordict = policy(tensordict)
action = tensordict["action"]
```

By using TensorDict and TensorDictModule, you can make sure that your algorithm is robust to changes in configuration (e.g. using an RNN for the policy, exploration strategies, etc.). TensorDict instances can be reshaped in several ways, cast to device, updated, shared among processes, stacked, concatenated, etc.
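To illustrate why reading and writing by key keeps the calling code independent of the policy's internals, here is a minimal pure-Python sketch. `KeyedModule` is a hypothetical stand-in written for this note, not TorchRL's `TensorDictModule`, and plain dicts stand in for `TensorDict`.

```python
from typing import Callable, Dict, Sequence


class KeyedModule:
    """Hypothetical stand-in for a key-based module wrapper (not TorchRL's API)."""

    def __init__(self, fn: Callable, in_keys: Sequence[str], out_keys: Sequence[str]):
        self.fn = fn
        self.in_keys = list(in_keys)
        self.out_keys = list(out_keys)

    def __call__(self, data: Dict[str, object]) -> Dict[str, object]:
        # Read the inputs by key, call the wrapped function, write outputs by key.
        outputs = self.fn(*(data[key] for key in self.in_keys))
        if len(self.out_keys) == 1:
            outputs = (outputs,)
        data.update(zip(self.out_keys, outputs))
        return data


# A stateless policy reads only the observation.
simple_policy = KeyedModule(lambda obs: -obs, ["observation"], ["action"])

# A "recurrent" policy also reads and writes a hidden state.
recurrent_policy = KeyedModule(
    lambda obs, hidden: (-(obs + hidden), hidden + 1.0),
    ["observation", "hidden"],
    ["action", "hidden"],
)


def act(policy: KeyedModule, data: Dict[str, object]) -> object:
    # The calling code is identical for both policies.
    return policy(data)["action"]
```

Because `act` only looks up the `"action"` key, swapping the stateless policy for the recurrent one requires no change on the caller's side; the extra `"hidden"` entry simply travels along in the same container.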

Some specialized TensorDictModule classes are implemented for convenience: Actor, ProbabilisticActor, ValueOperator, ActorCriticOperator, ActorCriticWrapper and QValueActor, all of which can be found in actors.py.

Collecting data

DataCollectors are TorchRL's family of data-loading classes. We provide single-process, sync, and async multiprocess loaders. We also provide ReplayBuffers that can be stored in memory or on disk using the various storage options.
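As a rough picture of what an in-memory replay buffer does, the sketch below implements a fixed-capacity ring buffer with uniform sampling in plain Python. `RingReplayBuffer` and its methods are hypothetical names chosen for this illustration; they are not TorchRL's `ReplayBuffer` API, and the real classes store tensors rather than dicts.

```python
import random
from typing import Any, List


class RingReplayBuffer:
    """Hypothetical fixed-capacity buffer; not TorchRL's ReplayBuffer class."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._storage: List[Any] = []
        self._cursor = 0  # index of the next slot to write

    def __len__(self) -> int:
        return len(self._storage)

    def add(self, item: Any) -> None:
        if len(self._storage) < self.capacity:
            self._storage.append(item)
        else:
            # Buffer is full: overwrite the oldest entry.
            self._storage[self._cursor] = item
        self._cursor = (self._cursor + 1) % self.capacity

    def sample(self, batch_size: int, rng: random.Random) -> List[Any]:
        # Uniform sampling with replacement over the stored items.
        return [rng.choice(self._storage) for _ in range(batch_size)]


buffer = RingReplayBuffer(capacity=3)
for step in range(5):
    buffer.add({"step": step})  # steps 0 and 1 are overwritten by 3 and 4
batch = buffer.sample(batch_size=4, rng=random.Random(0))
```

Passing the random generator explicitly keeps sampling reproducible, which is convenient when debugging a training loop.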

Loss modules and advantage computation

Loss modules are provided for each algorithm class independently. They are accompanied by efficient implementations of value and advantage computation functions. TorchRL aims to be fully compatible with functorch, the functional programming library for PyTorch.
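For readers unfamiliar with advantage computation, the sketch below shows one common estimator, generalized advantage estimation (GAE), in plain Python. It is an illustration of the technique only, not TorchRL's implementation; the function name `gae` and its argument names are assumptions made for this example.

```python
from typing import List, Sequence


def gae(
    rewards: Sequence[float],
    values: Sequence[float],
    next_values: Sequence[float],
    dones: Sequence[bool],
    gamma: float = 0.99,
    lmbda: float = 0.95,
) -> List[float]:
    """Generalized advantage estimation over a single trajectory."""
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk the trajectory backwards so each step can reuse the next advantage.
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step temporal-difference error; no bootstrap past a terminal step.
        delta = rewards[t] + gamma * next_values[t] * not_done - values[t]
        running = delta + gamma * lmbda * not_done * running
        advantages[t] = running
    return advantages
```

With `lmbda=0` the result reduces to the one-step TD error, and with `gamma=1, lmbda=1` and zero value estimates it reduces to the undiscounted reward-to-go.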

Examples

A number of examples are provided as well. Check the examples directory to learn more about exploration strategies, loss modules, etc.

- Python
Published by vmoens almost 4 years ago