rlor
Reinforcement learning for operation research problems with OpenAI Gym and CleanRL
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ✓ Academic publication links: links to arxiv.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.1%) to scientific vocabulary
Keywords
Repository
Reinforcement learning for operation research problems with OpenAI Gym and CleanRL
Basic Info
- Host: GitHub
- Owner: cpwan
- License: other
- Language: Python
- Default Branch: main
- Homepage: https://cpwan.github.io/RLOR/
- Size: 5.09 MB
Statistics
- Stars: 98
- Watchers: 3
- Forks: 9
- Open Issues: 1
- Releases: 0
Topics
Metadata Files
README.md
RLOR: A Flexible Framework of Deep Reinforcement Learning for Operation Research
:one: First work to incorporate an end-to-end vehicle routing model in a modern RL platform (CleanRL)
:zap: Speeds up the training of the Attention Model by 8 times (25 hours $\to$ 3 hours)
:mag_right: A flexible framework for developing models, algorithms, environments, and search for operation research
News
- 13/04/2023: We release a web demo on Hugging Face 🤗!
- 24/03/2023: We release our paper on arXiv!
- 20/03/2023: We release a Jupyter Lab demo and pretrained checkpoints!
- 10/03/2023: We release our codebase!
Demo
We provide an inference demo on Colab notebooks:

| Environment | Search       | Demo           |
| ----------- | ------------ | -------------- |
| TSP         | Greedy       | Colab notebook |
| CVRP        | Multi-Greedy | Colab notebook |
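The two searches differ only in how many rollouts they take per instance: Greedy decodes a single tour, while Multi-Greedy (following POMO) decodes one greedy tour from every start node and keeps the best. A minimal numpy sketch of the idea, with a nearest-neighbor heuristic standing in for the repo's learned attention policy:

```python
import numpy as np

def greedy_tour(dist, start=0):
    """Roll out one greedy tour: from `start`, always move to the
    nearest unvisited node. Returns (tour, tour length)."""
    n = dist.shape[0]
    tour, visited = [start], {start}
    while len(tour) < n:
        cur = tour[-1]
        # mask visited nodes, then pick the nearest remaining one
        _, nxt = min((dist[cur, j], j) for j in range(n) if j not in visited)
        tour.append(nxt)
        visited.add(nxt)
    length = sum(dist[tour[i], tour[(i + 1) % n]] for i in range(n))
    return tour, length

coords = np.random.rand(20, 2)  # 20 cities in the unit square
dist = np.linalg.norm(coords[:, None] - coords[None], axis=-1)

_, greedy_len = greedy_tour(dist)  # Greedy: one rollout
# Multi-Greedy: one rollout per start node, keep the best tour
_, multi_len = min((greedy_tour(dist, s) for s in range(20)), key=lambda t: t[1])
print(f"greedy: {greedy_len:.3f}  multi-greedy: {multi_len:.3f}")
```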
Installation
Conda
```shell
conda env create -n <env-name> -f environment.yml
```

Creating the environment can take a few minutes. The environment.yml was generated from:

```shell
conda env export --no-builds > environment.yml
```
Optional dependency
wandb
Refer to their quick start guide for installation.
File structures
All the major implementations are under the `rlor` folder:

```shell
./rlor
├── envs
│   ├── tsp_data.py            # loads pre-generated data for evaluation
│   ├── tsp_vector_env.py      # defines the (vectorized) gym environment
│   ├── cvrp_data.py
│   └── cvrp_vector_env.py
├── models
│   ├── attention_model_wrapper.py  # wraps the refactored attention model for CleanRL
│   └── nets                        # contains the refactored attention model
└── ppo_or.py                       # implementation of PPO with the attention model for operation research problems
```
`ppo_or.py` was modified from cleanrl/ppo.py. To see what changed, use diff:

```shell
apt install diffutils
diff --color ppo.py ppo_or.py
```
Training OR model with PPO
TSP
```shell
python ppo_or.py --num-steps 51 --env-id tsp-v0 --env-entry-point envs.tsp_vector_env:TSPVectorEnv --problem tsp
```
CVRP
```shell
python ppo_or.py --num-steps 60 --env-id cvrp-v0 --env-entry-point envs.cvrp_vector_env:CVRPVectorEnv --problem cvrp
```
Enable WandB
Add the `--track` argument to enable tracking with WandB:

```shell
python ppo_or.py ... --track
```
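The `--env-id`/`--env-entry-point` pair suggests that `ppo_or.py` registers the custom environment with Gym at runtime before creating it. A minimal sketch of that mechanism with the gym 0.23 API (how the script actually wires this up is an assumption):

```python
import gym
from gym.envs.registration import register

# Register the environment class under an id, mirroring the
# --env-id / --env-entry-point flags above. Requires running from the
# repo root so that `envs.tsp_vector_env` is importable.
register(
    id="tsp-v0",
    entry_point="envs.tsp_vector_env:TSPVectorEnv",
)

env = gym.make("tsp-v0")  # Gym resolves the entry-point string to the class
obs = env.reset()
```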
Where is the TSP data?
It can be generated from the official repo of the attention-learn-to-route paper. You may modify ./envs/tsp_data.py to update the path to the data accordingly.
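For reference, the official repo of that paper samples node coordinates uniformly in the unit square and pickles instances as nested lists. A minimal sketch of generating a dataset in that style (the file name, instance size, and instance count here are hypothetical):

```python
import pickle
import numpy as np

# Hypothetical example: 10,000 TSP instances of 50 nodes each, with
# coordinates drawn uniformly from [0, 1]^2, pickled as nested lists.
rng = np.random.default_rng(seed=1234)
dataset = [rng.uniform(size=(50, 2)).tolist() for _ in range(10_000)]

with open("tsp50_data.pkl", "wb") as f:
    pickle.dump(dataset, f)
```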
Acknowledgements
The neural network model is refactored and developed from Attention, Learn to Solve Routing Problems!.
The idea of multiple-trajectory training/inference is from POMO: Policy Optimization with Multiple Optima for Reinforcement Learning.
The RL environments are defined with OpenAI Gym.
The PPO algorithm implementation is based on CleanRL.
Owner
- Login: cpwan
- Kind: user
- Website: https://cpwan.github.io/
- Repositories: 20
- Profile: https://github.com/cpwan
Citation (CITATION.cff)
```yaml
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "WAN"
  given-names: "Ching Pui"
  orcid: "https://orcid.org/0000-0002-6217-5418"
- family-names: "LI"
  given-names: "Tung"
- family-names: "WANG"
  given-names: "Jason Min"
title: "RLOR: A Flexible Framework of Deep Reinforcement Learning for Operation Research"
version: 1.0.0
doi: 10.5281/zenodo.1234
date-released: 2023-03-23
url: "https://github.com/cpwan/RLOR"
preferred-citation:
  type: misc
  authors:
  - family-names: "WAN"
    given-names: "Ching Pui"
    orcid: "https://orcid.org/0000-0002-6217-5418"
  - family-names: "LI"
    given-names: "Tung"
  - family-names: "WANG"
    given-names: "Jason Min"
  doi: 10.48550/arXiv.2303.13117
  title: "RLOR: A Flexible Framework of Deep Reinforcement Learning for Operation Research"
  year: 2023
  eprint: "arXiv:2303.13117"
  url: "http://arxiv.org/abs/2303.13117"
```
GitHub Events
Total
- Issues event: 2
- Watch event: 30
- Issue comment event: 1
- Fork event: 3
Last Year
- Issues event: 2
- Watch event: 30
- Issue comment event: 1
- Fork event: 3
Committers
Last synced: about 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| cpwan | c****n@c****k | 8 |
| Patrick WAN | c****5@l****m | 1 |
| TonyLiHK | 1****K | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: about 2 years ago
All Time
- Total issues: 0
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- WaitDumplings (1)
Pull Request Authors
- cpwan (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- absl-py ==1.3.0
- cachetools ==5.2.0
- certifi ==2022.9.24
- cfgv ==3.3.1
- charset-normalizer ==2.1.1
- cloudpickle ==2.2.0
- distlib ==0.3.6
- filelock ==3.8.0
- google-auth ==2.14.1
- google-auth-oauthlib ==0.4.6
- grpcio ==1.50.0
- gym ==0.23.1
- gym-notices ==0.0.8
- identify ==2.5.8
- idna ==3.4
- importlib-metadata ==5.0.0
- llvmlite ==0.39.1
- markdown ==3.4.1
- markupsafe ==2.1.1
- nodeenv ==1.7.0
- numba ==0.56.4
- numpy ==1.23.4
- nvidia-cublas-cu11 ==11.10.3.66
- nvidia-cuda-nvrtc-cu11 ==11.7.99
- nvidia-cuda-runtime-cu11 ==11.7.99
- nvidia-cudnn-cu11 ==8.5.0.96
- oauthlib ==3.2.2
- pillow ==9.3.0
- platformdirs ==2.5.3
- pre-commit ==2.20.0
- protobuf ==3.20.3
- pyasn1 ==0.4.8
- pyasn1-modules ==0.2.8
- pygame ==2.1.0
- pyyaml ==6.0
- requests ==2.28.1
- requests-oauthlib ==1.3.1
- rsa ==4.9
- tensorboard ==2.11.0
- tensorboard-data-server ==0.6.1
- tensorboard-plugin-wit ==1.8.1
- toml ==0.10.2
- torch ==1.13.0
- torchvision ==0.14.0
- typing-extensions ==4.4.0
- urllib3 ==1.26.12
- virtualenv ==20.16.6
- werkzeug ==2.2.2
- zipp ==3.10.0