reward4driving

Benchmarks for risk-aware reward shaping of autonomous driving

https://github.com/zhang-zengjie/reward4driving

Keywords

autonomous-driving reinforcement-learning reward-shaping risk-aware-planner

Repository

Basic Info

Host: GitHub
Owner: zhang-zengjie
License: bsd-3-clause
Language: Jupyter Notebook
Default Branch: main
Homepage: https://ieeexplore.ieee.org/abstract/document/10312462
Size: 277 MB

Statistics

Stars: 5
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created about 3 years ago · Last pushed almost 2 years ago

Risk-Aware Reward Shaping of Reinforcement Learning Agents for Autonomous Driving

Author: Lin-Chi Wu (l.wu@student.tue.nl)

This Git repository contains the programs for risk-aware reinforcement learning for autonomous driving. It supplements the following publication:

Wu, Lin-Chi, Zengjie Zhang, Sofie Haesaert, Zhiqiang Ma, and Zhiyong Sun. "Risk-aware reward shaping of reinforcement learning agents for autonomous driving." In IECON 2023-49th Annual Conference of the IEEE Industrial Electronics Society, pp. 1-6. IEEE, 2023.

Environment and setup

The programs run in a Python environment. The suggested Python version is 3.8.15.
For the best experience when observing the results, use Jupyter Notebook.
The Conda environment and the basic package versions are provided alongside the repository.
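
As a quick sanity check before opening the notebooks, the following sketch compares the running interpreter with the suggested version and reports which packages cannot be found. It is not part of the repository: the function name `check_environment` is hypothetical, and the mapping from distribution names (as listed in gym_requirements.txt) to import names is an assumption of this example.

```python
import importlib.util
import sys

# Suggested interpreter version stated in this README.
SUGGESTED_PYTHON = (3, 8, 15)

# Assumed import names for the distributions listed in gym_requirements.txt.
IMPORT_NAMES = {
    "Box2D": "Box2D",
    "numpy": "numpy",
    "opencv-python": "cv2",
    "pyglet": "pyglet",
    "requests": "requests",
    "scipy": "scipy",
    "six": "six",
    "tqdm": "tqdm",
}


def check_environment(import_names=IMPORT_NAMES):
    """Return (python_matches, missing_distributions) for this interpreter.

    Only the major and minor version are compared, so any 3.8.x counts
    as a match. This check is hypothetical and not part of the repository.
    """
    python_matches = sys.version_info[:2] == SUGGESTED_PYTHON[:2]
    missing = sorted(
        dist
        for dist, module in import_names.items()
        if importlib.util.find_spec(module) is None
    )
    return python_matches, missing


if __name__ == "__main__":
    ok, missing = check_environment()
    print("Python", sys.version.split()[0], "matches 3.8.x:", ok)
    print("Missing packages:", ", ".join(missing) or "none")
```

The check only confirms that a package can be located; it does not verify pinned versions such as pyglet 1.5.27.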

Training files

The training notebooks for the three algorithms are Train_DDPG.ipynb, Train_DQN.ipynb and Train_PPO.ipynb.

Demonstration file (testing the trained agent)

To observe the behavior of an agent trained for 1000 episodes, run LoadModelNTest.ipynb.

Out-of-track (O2T) and on-edge (OE) counts

According to the review, a fair standard for comparing how different agents perform in the same environment is still lacking. We propose a new evaluation method in which agents are evaluated in the same environment not only by their reward but also by how their behavior responds to the environment, i.e., their strategy.

We consider an agent to have a good strategy if it stays inside the track and survives longer in each round, that is, if the game is less likely to terminate because the agent leaves the track. In addition, a good agent drives smoothly, changing its steering angle less frequently while remaining on the track. We therefore evaluate an agent by counting the number of times it goes out of the track and the number of times it touches the track's edge. We regard an agent as driving unstably when it repeatedly changes direction near the edge to avoid leaving the track. The fewer the out-of-track events and the more stable the driving, the better the agent's strategy.

Two parameters to evaluate the agent's strategy

We want both of the following counts to be as low as possible.

The number of times the agent goes out of the track.
The number of times the agent approaches the edge zone.

Nearness of edge zone and out of track

The edge zone is the area between the green area and the yellow dashed line.
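
The two counts described above can be illustrated with a minimal sketch. It assumes each time step of an episode has already been labeled with the zone the agent occupies; the labels `ON_TRACK`, `EDGE` and `OUT`, the `StrategyCounts` class and the `count_events` function are hypothetical names for this example and are not taken from the notebooks. It also assumes an event is counted once per entry into a zone rather than once per time step, which may differ from how the repository counts.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical per-step zone labels (not taken from the repository).
ON_TRACK, EDGE, OUT = "track", "edge", "out"


@dataclass
class StrategyCounts:
    out_of_track: int = 0  # O2T count
    on_edge: int = 0       # OE count


def count_events(zones: Iterable[str]) -> StrategyCounts:
    """Count entries into the out-of-track zone and the edge zone.

    Assumption: an event is counted when the agent moves into a zone,
    so consecutive steps spent in the same zone count only once.
    """
    counts = StrategyCounts()
    previous = ON_TRACK  # the agent is assumed to start on the track
    for zone in zones:
        if zone != previous:
            if zone == OUT:
                counts.out_of_track += 1
            elif zone == EDGE:
                counts.on_edge += 1
        previous = zone
    return counts


# Example trace: two separate visits to the edge zone, one exit from the track.
trace = [ON_TRACK, EDGE, EDGE, ON_TRACK, EDGE, OUT, ON_TRACK]
print(count_events(trace))  # StrategyCounts(out_of_track=1, on_edge=2)
```

Under these assumptions, two agents run in the same environment can be compared by their O2T and OE counts, with lower values in both indicating the better strategy.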

Owner

Name: Zengjie Zhang
Login: zhang-zengjie
Kind: user
Location: Canada
Company: The University of British Columbia

Repositories: 3
Profile: https://github.com/zhang-zengjie

GitHub Events

Total

Watch event: 4

Last Year

Watch event: 4

Dependencies

gym_requirements.txt pypi

Box2D *
numpy >=1.10.4
opencv-python *
pyglet ==1.5.27
requests >=2.0
scipy >=0.17.1
six *
tqdm ==4.64.1
