https://github.com/conglu1997/synther

Synthetic Experience Replay

Science Score: 46.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    1 of 1 committers (100.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.7%) to scientific vocabulary
Last synced: 7 months ago

Repository

Synthetic Experience Replay

Basic Info
  • Host: GitHub
  • Owner: conglu1997
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Size: 614 KB
Statistics
  • Stars: 92
  • Watchers: 3
  • Forks: 13
  • Open Issues: 0
  • Releases: 1
Created almost 3 years ago · Last pushed almost 2 years ago
Metadata Files
Readme License

README.md

Synthetic Experience Replay

Synthetic Experience Replay (SynthER) is a diffusion-based approach to arbitrarily upsample an RL agent's collected experience, leading to large gains in sample efficiency and scaling benefits. We integrate SynthER into a variety of offline and online algorithms in this codebase, including SAC, TD3+BC, IQL, EDAC, and CQL. For further details, please see the paper:
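
The core idea can be illustrated with a toy sketch (this is not the actual SynthER implementation): fit a generative model to the agent's real transitions, then draw an arbitrary number of synthetic transitions and mix them into the replay buffer. A multivariate Gaussian stands in for the diffusion model here.

```python
import numpy as np

# Toy stand-in for SynthER: a Gaussian plays the role of the diffusion model.
rng = np.random.default_rng(0)

# Real experience: rows of flattened (state, action, reward, next_state) features.
real_transitions = rng.normal(size=(1_000, 6))

# "Train" the stand-in generative model on the real data.
mu = real_transitions.mean(axis=0)
cov = np.cov(real_transitions, rowvar=False)

# Arbitrarily upsample: draw as many synthetic transitions as desired.
synthetic = rng.multivariate_normal(mu, cov, size=10_000)

# The agent then trains on the mixed (real + synthetic) buffer.
buffer = np.concatenate([real_transitions, synthetic], axis=0)
print(buffer.shape)  # (11000, 6)
```

In the real method, the Gaussian is replaced by a diffusion model trained on the transition distribution, which is what allows the synthetic samples to remain faithful to the collected experience.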

Synthetic Experience Replay. Cong Lu, Philip J. Ball, Yee Whye Teh, Jack Parker-Holder. Published at NeurIPS 2023.

View on arXiv

Setup

To install, clone the repository and run the following:

```bash
git submodule update --init --recursive
pip install -r requirements.txt
```

The code was tested on Python 3.8 and 3.9. If you don't have MuJoCo installed, follow the instructions here: https://github.com/openai/mujoco-py#install-mujoco.

Running Instructions

Offline RL

Diffusion model training (this automatically generates samples and saves them):

```bash
python3 synther/diffusion/train_diffuser.py --dataset halfcheetah-medium-replay-v2
```

Baseline without SynthER (e.g. on TD3+BC):

```bash
python3 synther/corl/algorithms/td3_bc.py --config synther/corl/yaml/td3_bc/halfcheetah/medium_replay_v2.yaml --checkpoints_path corl_logs/
```

Offline RL training with SynthER:

```bash
# Generating diffusion samples on the fly.
python3 synther/corl/algorithms/td3_bc.py --config synther/corl/yaml/td3_bc/halfcheetah/medium_replay_v2.yaml --checkpoints_path corl_logs/ --name SynthER --diffusion.path path/to/model-100000.pt

# Using saved diffusion samples.
python3 synther/corl/algorithms/td3_bc.py --config synther/corl/yaml/td3_bc/halfcheetah/medium_replay_v2.yaml --checkpoints_path corl_logs/ --name SynthER --diffusion.path path/to/samples.npz
```

Online RL

Baselines (SAC, REDQ):

```bash
# SAC.
python3 synther/online/online_exp.py --env quadruped-walk-v0 --results_folder online_logs/ --exp_name SAC --gin_config_files 'config/online/sac.gin'

# REDQ.
python3 synther/online/online_exp.py --env quadruped-walk-v0 --results_folder online_logs/ --exp_name REDQ --gin_config_files 'config/online/redq.gin'
```

SynthER (SAC):

```bash
# DMC environments.
python3 synther/online/online_exp.py --env quadruped-walk-v0 --results_folder online_logs/ --exp_name SynthER --gin_config_files 'config/online/sac_synther_dmc.gin' --gin_params 'redq_sac.utd_ratio = 20' 'redq_sac.num_samples = 1000000'

# OpenAI Gym environments (different gin config).
python3 synther/online/online_exp.py --env HalfCheetah-v2 --results_folder online_logs/ --exp_name SynthER --gin_config_files 'config/online/sac_synther_openai.gin' --gin_params 'redq_sac.utd_ratio = 20' 'redq_sac.num_samples = 1000000'
```

Thinking of adding SynthER to your own algorithm?

Our codebase has everything you need for diffusion with low-dimensional data along with example integrations with RL algorithms. For a custom use-case, we recommend starting from the training script and SimpleDiffusionGenerator class in synther/diffusion/train_diffuser.py. You can modify the hyperparameters specified in config/resmlp_denoiser.gin to suit your own needs.
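
To show where such an integration plugs in, here is a minimal sketch. The class and method names below (`ReplayBuffer`, `StubGenerator`, `sample_transitions`) are invented for illustration; the real interface is the SimpleDiffusionGenerator class in synther/diffusion/train_diffuser.py.

```python
import numpy as np

class ReplayBuffer:
    """Minimal buffer holding flat transition vectors (illustrative only)."""
    def __init__(self):
        self.data = []

    def add_batch(self, batch):
        self.data.extend(batch)

    def __len__(self):
        return len(self.data)

class StubGenerator:
    """Stand-in for a trained diffusion model over low-dimensional transitions."""
    def __init__(self, dim, seed=0):
        self.dim = dim
        self.rng = np.random.default_rng(seed)

    def sample_transitions(self, n):
        # A real generator would run reverse diffusion here; we sample noise.
        return list(self.rng.normal(size=(n, self.dim)))

# Periodically generate synthetic transitions and append them to the
# agent's replay buffer; the agent's update loop is unchanged.
buffer = ReplayBuffer()
generator = StubGenerator(dim=6)
buffer.add_batch(generator.sample_transitions(5_000))
print(len(buffer))  # 5000
```

The key design point is that the generator is decoupled from the RL algorithm: it only needs to emit transition vectors in the same format the buffer already stores.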

Additional Notes

  • Our codebase uses wandb for logging; you will need to set --wandb-entity across the repository.
  • Our pixel-based experiments are based on a modified version of the V-D4RL repository. The latent representations are derived from the trunks of the actor and critic.

Acknowledgements

SynthER builds upon many works and open-source codebases in both diffusion modelling and reinforcement learning. We would like to particularly thank the authors of:

Contact

Please contact Cong Lu or Philip Ball for any queries. We welcome any suggestions or contributions!

Owner

  • Name: Cong Lu
  • Login: conglu1997
  • Kind: user
  • Location: Oxford

Reinforcement Learning PhD Student in MLRG and OxCSML @ Oxford

GitHub Events

Total
  • Issues event: 6
  • Watch event: 31
  • Issue comment event: 6
  • Fork event: 5
Last Year
  • Issues event: 6
  • Watch event: 31
  • Issue comment event: 6
  • Fork event: 5

Committers

Last synced: over 1 year ago

All Time
  • Total Commits: 13
  • Total Committers: 1
  • Avg Commits per committer: 13.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 3
  • Committers: 1
  • Avg Commits per committer: 3.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Cong Lu c****u@b****k 13
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 3
  • Total pull requests: 1
  • Average time to close issues: about 1 hour
  • Average time to close pull requests: 8 months
  • Total issue authors: 3
  • Total pull request authors: 1
  • Average comments per issue: 1.0
  • Average comments per pull request: 2.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 0
  • Average time to close issues: about 1 hour
  • Average time to close pull requests: N/A
  • Issue authors: 2
  • Pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • Criswim (2)
  • fyqqyf (1)
  • leekwoon (1)
  • glass1720 (1)
Pull Request Authors
  • glass1720 (2)
Top Labels
Issue Labels
Pull Request Labels