sumo-rl-mobicharger
OpenAI-gym-like Reinforcement Learning environment for Dispatching of Mobile Chargers with SUMO. Compatible with Gym and popular RL libraries such as stable-baselines3.
Science Score: 41.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: links to scholar.google
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.8%) to scientific vocabulary
Statistics
- Stars: 14
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
SUMO-RL-MobiCharger
SUMO-RL-MobiCharger provides an OpenAI-gym-like environment for implementing RL-based mobile charger dispatching methods on the SUMO simulator. The features of this environment are four-fold:
- A simple and customizable interface for Reinforcement Learning research on dispatching mobile chargers over city-scale transportation networks with SUMO
- Compatibility with OpenAI-gym and popular RL libraries such as stable-baselines3 and RL Baselines3 Zoo
- Easy modification of state and reward functions for research on vehicle routing or scheduling problems
- Support for parallel training of multiple environments via `SubprocVecEnv` in stable-baselines3
*Demo: blue vehicles are mobile chargers, yellow vehicles are electric vehicles, a green highlight marks charging between a mobile charger and an EV, and a blue highlight marks charging between a mobile charger and a charging station.*
Install
Install SUMO >= 1.16.0:
Install SUMO as described in their doc.
Note that this environment uses Libsumo by default to speed up simulation, but sumo-gui does not work with Libsumo on Windows (more details). If you need to fall back to TraCI, uncomment `import traci` and modify the code in `reset()` of `SumoEnv`.
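A minimal sketch of the usual drop-in switch between the two backends (an illustration of the common pattern, not the exact code in `SumoEnv`):

```python
# Libsumo mirrors TraCI's Python API, so it is commonly imported under the
# same name. This try/except fallback is an assumption, not the repo's code.
try:
    import libsumo as traci  # faster, but sumo-gui is unavailable on Windows
except ImportError:
    import traci  # standard TraCI client; works with sumo-gui everywhere
```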
Install the Necessary Packages
Install the necessary packages listed in `requirements.txt` (e.g., with `pip install -r requirements.txt`).
Install SUMO-RL-MobiCharger
Clone the latest version and install it as a Gym environment:

```bash
git clone https://github.com/liyan2015/SUMO-RL-MobiCharger.git
cd SUMO-RL-MobiCharger/source
pip install -e .
```
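Once installed, the environment can be used like any other Gym environment. A minimal smoke test (a sketch only: the importable package name is an assumption, and the `gui_f`/`label` keyword arguments are taken from the registration snippet further down):

```python
import gym
import sumo_rl_mobicharger  # hypothetical package name; importing it should register SumoEnv-v0

env = gym.make('SumoEnv-v0', gui_f=False, label='demo')
obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # random policy, purely for illustration
    obs, reward, done, info = env.step(action)
env.close()
```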
Training & Testing
If the environment is not compatible with the latest rl-baselines3-zoo, use this old cloned copy for tuning.
Register SUMO-RL-MobiCharger in RL Baselines3 Zoo
The main class is SumoEnv. To train with RL Baselines3 Zoo, you need to register the environment as described in their doc.
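Registration itself follows the standard Gym pattern; a minimal sketch, assuming a placeholder `entry_point` module path (adjust it to wherever `SumoEnv` actually lives):

```python
from gym.envs.registration import register

# The id must match the 'SumoEnv-v0' string used throughout this README;
# the entry_point module path below is a placeholder.
register(
    id='SumoEnv-v0',
    entry_point='sumo_rl_mobicharger.envs:SumoEnv',
)
```

Then add the following code to exp_manager.py: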
```python
# On most env, SubprocVecEnv does not help and is quite memory hungry,
# therefore we use DummyVecEnv by default
if "SumoEnv" not in self.env_name.gym_id:
    env = make_vec_env(
        make_env,
        n_envs=n_envs,
        seed=self.seed,
        env_kwargs=self.env_kwargs,
        monitor_dir=log_dir,
        wrapper_class=self.env_wrapper,
        vec_env_cls=self.vec_env_class,
        vec_env_kwargs=self.vec_env_kwargs,
        monitor_kwargs=self.monitor_kwargs,
    )
else:
    def make_env(
        env_config={
            'gui_f': False,
            'label': 'evaluate'
        },
        rank: int = 0,
        seed: int = 0
    ):
        def _init():
            env = gym.make('SumoEnv-v0', **env_config)
            env = Monitor(env, log_dir)
            env.seed(seed + rank)
            env.action_space.seed(seed + rank)
            return env

        set_random_seed(seed)
        return _init

    if eval_env:
        if self.verbose > 0:
            print("Creating evaluate environment.")
        env = SubprocVecEnv([make_env() for i in range(n_envs)])
    else:
        env = SubprocVecEnv([make_env(
            {
                'gui_f': False,
                'label': 'train' + str(i + 1)
            }, rank=i * 2) for i in range(n_envs)])
```
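Note that `make_env` returns the inner `_init` closure rather than an environment instance: `SubprocVecEnv` expects a list of callables, each of which builds its environment inside its own worker process. The distinct per-rank `label` values presumably keep the parallel SUMO instances separate.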
Training
For training, use the following command line:
```bash
python train.py --algo ppo --env SumoEnv-v0 --num-threads 1 --progress --conf-file hyperparams/python/sumoenv_config.py --save-freq 500000 --log-folder /usr/data2/canaltrain_log/ --tensorboard-log /usr/data2/canaltrain_tensorboard/ --verbose 2 --eval-freq 2000000 --eval-episodes 10 --n-eval-envs 10 --vec-env subproc
```
Resume Training
To resume training with different EV route files, use the following command line or check the RL Baselines3 Zoo doc:

```bash
python train.py --algo ppo --env SumoEnv-v0 --num-threads 1 --progress --conf-file hyperparams/python/sumoenv_config.py --save-freq 500000 --log-folder /usr/data2/canaltrain_log/ --tensorboard-log /usr/data2/canaltrain_tensorboard/ --verbose 2 --eval-freq 2000000 --eval-episodes 10 --n-eval-envs 10 --vec-env subproc -i /usr/data2/canaltrain_log/ppo/SumoEnv-v0_16/rl_model_12999532_steps.zip
```
Testing
Change the `model_path` and `stats_path` in `canal_test.py` and run:

```bash
python canal_test.py
```
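For reference, a hedged sketch of what such a test script typically looks like with stable-baselines3 (the paths are placeholders, and the use of `VecNormalize` statistics is an assumption based on the `stats_path` name):

```python
import gym
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

model_path = "path/to/rl_model.zip"      # placeholder
stats_path = "path/to/vecnormalize.pkl"  # placeholder

# Wrap a single evaluation env and restore normalization statistics.
env = DummyVecEnv([lambda: gym.make('SumoEnv-v0', gui_f=True, label='evaluate')])
env = VecNormalize.load(stats_path, env)
env.training = False      # do not update running statistics at test time
env.norm_reward = False   # report raw rewards

model = PPO.load(model_path, env=env)
obs = env.reset()
done = False
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, done, info = env.step(action)
```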
MDP - Observation, Action and Reward
Observation
The default observation for the agent is a vector:
```python
obs = [SOC_state, charger_state, elig_act_state, dir_state, charge_station_state]
```
- `SOC_state` indicates the amount of SOC on the road network pending to be refilled by mobile chargers
- `charger_state` indicates the current road segment, staying time, charging_others bit, charge_self bit, SOC, distance to target vehicle, and neighbor_vehicle bit of each mobile charger
- `elig_act_state` indicates the eligible actions that each mobile charger can take at its current road segment
- `dir_state` indicates the best action of each mobile charger given its current road segment
- `charge_station_state` indicates the remaining SOCs that the mobile chargers will have if they go to the charging stations for a recharge
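A toy illustration of how such a flat observation vector could be assembled (all sizes are made up; the real dimensions depend on the road network and the charger fleet):

```python
import numpy as np

# Toy sizes, for illustration only.
n_segments, n_chargers, n_actions = 100, 3, 6

SOC_state = np.zeros(n_segments)                   # SOC pending per road segment
charger_state = np.zeros(n_chargers * 7)           # 7 status features per charger (see list above)
elig_act_state = np.zeros(n_chargers * n_actions)  # eligible-action mask per charger
dir_state = np.zeros(n_chargers)                   # best action per charger
charge_station_state = np.zeros(n_chargers)        # remaining SOC after a station recharge

obs = np.concatenate([SOC_state, charger_state, elig_act_state,
                      dir_state, charge_station_state])
print(obs.shape)  # (145,) with the toy sizes above
```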
Action
The action space is discrete. Each edge in the SUMO network is partitioned into several road segments, and the actions available to the agent depend on the segment it currently occupies. Throughout the road network, a mobile charger can take at most 6 actions: stay (0), charge vehicles (1), or go to one of up to four downstream road segments (2-5).
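A hypothetical encoding of this discrete action set (what indices 2-5 resolve to depends on the downstream connections of the current segment):

```python
# Hypothetical action table; the environment defines the real mapping.
ACTIONS = {
    0: "stay",
    1: "charge vehicle",
    2: "go to downstream segment 1",
    3: "go to downstream segment 2",
    4: "go to downstream segment 3",
    5: "go to downstream segment 4",
}
```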
Reward
The default reward function is defined as:
- `+2` if a mobile charger charges an EV with `step_charged_SOC`
- `+3 * charger.step_charged_SOC + 0.5 * (1 - before_SOC)` if a mobile charger charges itself with `step_charged_SOC`
- `+8e-2` if a mobile charger takes the best action
- `-8e-2` if a mobile charger takes an action different from the best one
- `-8e-1` if a mobile charger takes an ineligible action given its current road segment
- `-300` if a mobile charger exhausts its SOC
- `+250` if the agent succeeds in charging all the EVs and supporting the completion of their trips
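Putting the terms together, a sketch of the per-step reward under assumed flag names (the real computation lives inside `SumoEnv`):

```python
def step_reward(step_charged_SOC, before_SOC, charged_ev, charged_self,
                took_best_action, action_eligible, soc_exhausted,
                all_evs_completed):
    """Sketch of the default reward terms listed above.

    All argument names are hypothetical; the environment computes these
    quantities internally.
    """
    r = 0.0
    if charged_ev:                       # charged an EV this step
        r += 2
    if charged_self:                     # recharged at a charging station
        r += 3 * step_charged_SOC + 0.5 * (1 - before_SOC)
    r += 8e-2 if took_best_action else -8e-2
    if not action_eligible:              # action invalid at this segment
        r -= 8e-1
    if soc_exhausted:                    # charger ran out of SOC
        r -= 300
    if all_evs_completed:                # episode-level success bonus
        r += 250
    return r
```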
Citing
If you use this repository, please cite:
```bibtex
@article{yan2022mobicharger,
  title={MobiCharger: Optimal Scheduling for Cooperative EV-to-EV Dynamic Wireless Charging},
  author={Yan, Li and Shen, Haiying and Kang, Liuwang and Zhao, Juanjuan and Zhang, Zhe and Xu, Chengzhong},
  journal={IEEE Transactions on Mobile Computing},
  volume={22},
  number={12},
  pages={6889--6906},
  year={2023},
}
```
List of publications that cite this work: Google Scholar
Owner
- Login: liyan2015
- Kind: user
- Repositories: 2
- Profile: https://github.com/liyan2015
GitHub Events
Total
- Watch event: 3
- Push event: 3
Last Year
- Watch event: 3
- Push event: 3