https://github.com/conglu1997/synther

Synthetic Experience Replay

Science Score: 46.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    1 of 1 committers (100.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.7%) to scientific vocabulary
Last synced: 7 months ago

Repository

Synthetic Experience Replay

Basic Info
  • Host: GitHub
  • Owner: conglu1997
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Size: 614 KB
Statistics
  • Stars: 92
  • Watchers: 3
  • Forks: 13
  • Open Issues: 0
  • Releases: 1
Created almost 3 years ago · Last pushed almost 2 years ago
Metadata Files
Readme License

README.md

Synthetic Experience Replay

Synthetic Experience Replay (SynthER) is a diffusion-based approach to arbitrarily upsample an RL agent's collected experience, leading to large gains in sample efficiency and scaling benefits. We integrate SynthER into a variety of offline and online algorithms in this codebase, including SAC, TD3+BC, IQL, EDAC, and CQL. For further details, please see the paper:
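
The core idea can be illustrated with a toy sketch (this is not the actual SynthER implementation): fit a generative model to the agent's real transitions, then draw an arbitrary number of synthetic transitions and mix them into the replay buffer. A multivariate Gaussian stands in for the diffusion model here.

```python
import numpy as np

# Toy stand-in for SynthER: a Gaussian plays the role of the diffusion model.
rng = np.random.default_rng(0)

# Real experience: rows of flattened (state, action, reward, next_state) features.
real_transitions = rng.normal(size=(1_000, 6))

# "Train" the stand-in generative model on the real data.
mu = real_transitions.mean(axis=0)
cov = np.cov(real_transitions, rowvar=False)

# Arbitrarily upsample: draw as many synthetic transitions as desired.
synthetic = rng.multivariate_normal(mu, cov, size=10_000)

# The agent then trains on the mixed (real + synthetic) buffer.
buffer = np.concatenate([real_transitions, synthetic], axis=0)
print(buffer.shape)  # (11000, 6)
```

In the real method, the Gaussian is replaced by a diffusion model trained on the transition distribution, which is what allows the synthetic samples to remain faithful to the collected experience.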

Synthetic Experience Replay. Cong Lu, Philip J. Ball, Yee Whye Teh, Jack Parker-Holder. Published at NeurIPS 2023.

View on arXiv

Setup

To install, clone the repository and run the following:

```bash
git submodule update --init --recursive
pip install -r requirements.txt
```

The code was tested on Python 3.8 and 3.9. If you don't have MuJoCo installed, follow the instructions here: https://github.com/openai/mujoco-py#install-mujoco.

Running Instructions

Offline RL

Diffusion model training (this automatically generates samples and saves them):

```bash
python3 synther/diffusion/train_diffuser.py --dataset halfcheetah-medium-replay-v2
```

Baseline without SynthER (e.g. on TD3+BC):

```bash
python3 synther/corl/algorithms/td3_bc.py --config synther/corl/yaml/td3_bc/halfcheetah/medium_replay_v2.yaml --checkpoints_path corl_logs/
```

Offline RL training with SynthER:

```bash
# Generating diffusion samples on the fly.
python3 synther/corl/algorithms/td3_bc.py --config synther/corl/yaml/td3_bc/halfcheetah/medium_replay_v2.yaml --checkpoints_path corl_logs/ --name SynthER --diffusion.path path/to/model-100000.pt

# Using saved diffusion samples.
python3 synther/corl/algorithms/td3_bc.py --config synther/corl/yaml/td3_bc/halfcheetah/medium_replay_v2.yaml --checkpoints_path corl_logs/ --name SynthER --diffusion.path path/to/samples.npz
```

Online RL

Baselines (SAC, REDQ):

```bash
# SAC.
python3 synther/online/online_exp.py --env quadruped-walk-v0 --results_folder online_logs/ --exp_name SAC --gin_config_files 'config/online/sac.gin'

# REDQ.
python3 synther/online/online_exp.py --env quadruped-walk-v0 --results_folder online_logs/ --exp_name REDQ --gin_config_files 'config/online/redq.gin'
```

SynthER (SAC):

```bash
# DMC environments.
python3 synther/online/online_exp.py --env quadruped-walk-v0 --results_folder online_logs/ --exp_name SynthER --gin_config_files 'config/online/sac_synther_dmc.gin' --gin_params 'redq_sac.utd_ratio = 20' 'redq_sac.num_samples = 1000000'

# OpenAI Gym environments (different gin config).
python3 synther/online/online_exp.py --env HalfCheetah-v2 --results_folder online_logs/ --exp_name SynthER --gin_config_files 'config/online/sac_synther_openai.gin' --gin_params 'redq_sac.utd_ratio = 20' 'redq_sac.num_samples = 1000000'
```

Thinking of adding SynthER to your own algorithm?

Our codebase has everything you need for diffusion with low-dimensional data along with example integrations with RL algorithms. For a custom use-case, we recommend starting from the training script and SimpleDiffusionGenerator class in synther/diffusion/train_diffuser.py. You can modify the hyperparameters specified in config/resmlp_denoiser.gin to suit your own needs.
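
To show where such an integration plugs in, here is a minimal sketch. The class and method names below (`ReplayBuffer`, `StubGenerator`, `sample_transitions`) are invented for illustration; the real interface is the SimpleDiffusionGenerator class in synther/diffusion/train_diffuser.py.

```python
import numpy as np

class ReplayBuffer:
    """Minimal buffer holding flat transition vectors (illustrative only)."""
    def __init__(self):
        self.data = []

    def add_batch(self, batch):
        self.data.extend(batch)

    def __len__(self):
        return len(self.data)

class StubGenerator:
    """Stand-in for a trained diffusion model over low-dimensional transitions."""
    def __init__(self, dim, seed=0):
        self.dim = dim
        self.rng = np.random.default_rng(seed)

    def sample_transitions(self, n):
        # A real generator would run reverse diffusion here; we sample noise.
        return list(self.rng.normal(size=(n, self.dim)))

# Periodically generate synthetic transitions and append them to the
# agent's replay buffer; the agent's update loop is unchanged.
buffer = ReplayBuffer()
generator = StubGenerator(dim=6)
buffer.add_batch(generator.sample_transitions(5_000))
print(len(buffer))  # 5000
```

The key design point is that the generator is decoupled from the RL algorithm: it only needs to emit transition vectors in the same format the buffer already stores.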

Additional Notes

  • Our codebase uses wandb for logging; you will need to set --wandb-entity across the repository.
  • Our pixel-based experiments are based on a modified version of the V-D4RL repository. The latent representations are derived from the trunks of the actor and critic.

Acknowledgements

SynthER builds upon many works and open-source codebases in both diffusion modelling and reinforcement learning. We would like to particularly thank the authors of:

Contact

Please contact Cong Lu or Philip Ball for any queries. We welcome any suggestions or contributions!

Owner

  • Name: Cong Lu
  • Login: conglu1997
  • Kind: user
  • Location: Oxford

Reinforcement Learning PhD Student in MLRG and OxCSML @ Oxford

GitHub Events

Total
  • Issues event: 6
  • Watch event: 31
  • Issue comment event: 6
  • Fork event: 5
Last Year
  • Issues event: 6
  • Watch event: 31
  • Issue comment event: 6
  • Fork event: 5

Committers

Last synced: over 1 year ago

All Time
  • Total Commits: 13
  • Total Committers: 1
  • Avg Commits per committer: 13.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 3
  • Committers: 1
  • Avg Commits per committer: 3.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Cong Lu c****u@b****k 13
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 3
  • Total pull requests: 1
  • Average time to close issues: about 1 hour
  • Average time to close pull requests: 8 months
  • Total issue authors: 3
  • Total pull request authors: 1
  • Average comments per issue: 1.0
  • Average comments per pull request: 2.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 0
  • Average time to close issues: about 1 hour
  • Average time to close pull requests: N/A
  • Issue authors: 2
  • Pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • Criswim (2)
  • fyqqyf (1)
  • leekwoon (1)
  • glass1720 (1)
Pull Request Authors
  • glass1720 (2)
Top Labels
Issue Labels
Pull Request Labels