tunnel-automation-with-reinforcement-learning-tunnrl-

Code repository for the paper "Reinforcement Learning based Process Optimization and Strategy Development in Conventional Tunneling" by G.H. Erharter, T.F. Hansen, Z. Liu and T. Marcher

https://github.com/geograz/tunnel-automation-with-reinforcement-learning-tunnrl-

Keywords

excavation-sequence machine-learning reinforcement-learning tunnel-excavation tunnelling

Repository

Basic Info

Host: GitHub
Owner: geograz
License: mit
Language: Python
Default Branch: master
Homepage:
Size: 193 KB

Statistics

Stars: 8
Watchers: 2
Forks: 4
Open Issues: 3
Releases: 1

Created almost 6 years ago · Last pushed almost 5 years ago

Tunnel automation with Reinforcement Learning - TunnRL

This repository contains the code for the paper:

Reinforcement learning based process optimization and strategy development in conventional tunneling

by Georg H. Erharter, Tom F. Hansen, Zhongqiang Liu and Thomas Marcher

published in Automation in Construction (Vol. 127; July 2021)

DOI: https://doi.org/10.1016/j.autcon.2021.103701

The paper was published as part of a collaboration on Machine Learning between the Institute of Rock Mechanics and Tunnelling (Graz University of Technology) and the Norwegian Geotechnical Institute (NGI) in Oslo.

Requirements and folder structure

Use the requirements.txt file to install the packages required to run the code (for example with pip install -r requirements.txt). We recommend using a package manager such as conda for this purpose.

Code and folder structure setup

The code framework depends on a certain folder structure. The Python files should be placed in the main directory. The setup should be done in the following way:

Reinforcement_Learning_for_Geotechnics
├── 02_plots
│   └── tmp
├── 04_checkpoints
├── 06_results
│   └── tmp
├── 00_main.py
├── 02_model_tester.py
├── 04_analyzer.py
├── A_utilities.py
├── B_generator.py
├── C_geotechnician.py
├── D_tunnel.py
└── E_plotter.py

Either set up the folder structure manually or, on Linux, run:

bash folder_structure.sh
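On platforms without bash, the same directories could be created from Python. The following is a minimal sketch under stated assumptions, not the content of folder_structure.sh: the function name create_folder_structure and its root argument are hypothetical and chosen only for illustration. The folder names are taken from the tree above.

```python
from pathlib import Path

# Sub-directories named in the folder tree of this README.
SUBFOLDERS = ("02_plots/tmp", "04_checkpoints", "06_results/tmp")


def create_folder_structure(root="."):
    """Create the expected sub-directories below root and return their paths.

    Hypothetical helper: existing directories are left untouched.
    """
    root = Path(root)
    created = []
    for sub in SUBFOLDERS:
        path = root / sub
        # parents=True also creates 02_plots and 06_results themselves.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling create_folder_structure() from the main directory (the one that holds 00_main.py) would prepare the folders in place. Because of exist_ok=True, running it a second time does no harm.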

Code description

00_main.py ... is the main executing file
02_model_tester.py ... file that runs and tests individual checkpoints of an already trained model for further analysis
04_analyzer.py ... file that analyzes and visualizes the performance of agents tested with 02_model_tester.py
A_utilities.py ... is a library containing useful functions that do not directly belong to the environment or the agent
B_generator.py ... part of the environment that generates a new geology for every episode
C_geotechnician.py ... part of the environment that evaluates the stability and also contains the RL agent itself
D_tunnel.py ... part of the environment that handles the rewards and updates the progress of the excavation
E_plotter.py ... plotting functionalities to visualize the training progress or render episodes

Pseudocode for the utilized DQN algorithm

(inspired by Deeplizard)

A. Initialize the replay memory with a fixed capacity (sampling from it "un-correlates" the otherwise sequentially correlated input)
B. Initialize the policy-ANN (which holds the current approximation of the optimal Q-function) with random weights
C. Clone the policy-ANN to a second target-ANN that is used for computing $Q^*$ in $Q^*(s,a) - Q(s,a) = loss$
D. For each episode:
  1. Initialize the starting state (without resetting the weights)
  2. For each time step:
    - Select an action following an epsilon-greedy strategy (exploitation or exploration)
    - Execute the selected action in an emulator
    - Observe the reward and the next state
    - Store the experience (a tuple of old-state, action, reward, new-state) in the replay memory
    - Sample a random batch from the replay memory
    - Preprocess all states (an array of values) from the batch
    - Pass the batch of preprocessed states and next-states to the policy-ANN and the target-ANN and predict Q-values with both ANNs
    - Calculate the loss between the output Q-values of the policy-ANN and the target-ANN
    - Standard gradient descent with backpropagation updates the weights of the policy-ANN to minimize the loss. Every xxx time steps, the weights of the target-ANN are updated with the weights of the policy-ANN
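
The steps above can be sketched in a few lines of plain Python. This is a toy illustration and not the code in C_geotechnician.py. Lookup tables stand in for the policy-ANN and the target-ANN, and a five-cell corridor stands in for the tunnel environment. All names and hyperparameters (step, epsilon_greedy, train, sync_every and so on) are assumptions chosen for the example. The target value is taken to be the usual one-step form, reward plus discounted maximum Q-value of the next state.

```python
import copy
import random
from collections import deque

N_STATES = 5      # toy corridor; the last cell stands in for the breakthrough
ACTIONS = (0, 1)  # hypothetical actions: 0 = wait, 1 = advance one cell


def step(state, action):
    """Toy emulator: return (reward, next_state, done)."""
    next_state = min(state + action, N_STATES - 1)
    done = next_state == N_STATES - 1
    return (10.0 if done else -1.0), next_state, done


def epsilon_greedy(q_values, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best Q-value."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_values[a])


def train(episodes=300, max_steps=50, capacity=500, batch_size=16,
          gamma=0.9, lr=0.5, epsilon=0.3, sync_every=20, seed=0):
    rng = random.Random(seed)
    memory = deque(maxlen=capacity)                    # A. replay memory
    policy_q = [[0.0, 0.0] for _ in range(N_STATES)]   # B. stand-in for policy-ANN
    target_q = copy.deepcopy(policy_q)                 # C. cloned stand-in for target-ANN
    total_steps = 0
    for _ in range(episodes):                          # D. for each episode
        state = 0                                      # 1. reset the state, keep the "weights"
        for _ in range(max_steps):                     # 2. for each time step
            action = epsilon_greedy(policy_q[state], epsilon, rng)
            reward, next_state, done = step(state, action)
            memory.append((state, action, reward, next_state, done))
            batch = rng.sample(list(memory), min(batch_size, len(memory)))
            for s, a, r, s2, d in batch:
                # Target from the target table; no future value at the terminal state.
                target = r + (0.0 if d else gamma * max(target_q[s2]))
                # Tabular stand-in for one gradient step that shrinks target - Q(s, a).
                policy_q[s][a] += lr * (target - policy_q[s][a])
            total_steps += 1
            if total_steps % sync_every == 0:
                target_q = copy.deepcopy(policy_q)     # periodic target update
            state = next_state
            if done:
                break
    return policy_q
```

In this toy setting, train() returns a table in which advancing has the higher Q-value in every non-terminal cell. A neural network replaces the table once the state space becomes too large to enumerate, which is the case for the generated geologies described in this repository.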

References

Besides other references given in the paper, we especially want to highlight the Reinforcement Learning with Python tutorial series of Sentdex which served as a basis for the agent in C_geotechnician.py.

Owner

Name: G. H. Erharter
Login: geograz
Kind: user
Location: Norway
Company: @norwegian-geotechnical-institute

Repositories: 1
Profile: https://github.com/geograz

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Erharter"
  given-names: "Georg H."
  orcid: "https://orcid.org/0000-0002-7793-9994"
- family-names: "Hansen"
  given-names: "Tom F."
title: "Code to the paper Reinforcement learning based process optimization and strategy development in conventional tunneling"
version: 1.0.0
date-released: 2020-12-17
url: "https://github.com/geograz/Tunnel-automation-with-Reinforcement-Learning-TunnRL-"
preferred-citation:
  type: article
  authors:
  - family-names: "Erharter"
    given-names: "Georg H."
    orcid: "https://orcid.org/0000-0002-7793-9994"
  - family-names: "Hansen"
    given-names: "Tom F."
  - family-names: "Liu"
    given-names: "Zhongqiang"
  - family-names: "Marcher"
    given-names: "Thomas"
  doi: "https://doi.org/10.1016/j.autcon.2021.103701"
  url: "https://www.sciencedirect.com/science/article/pii/S0926580521001527"
  journal: "Automation in Construction"
  issn: "0926-5805"
  month: 7
  title: "Reinforcement learning based process optimization and strategy development in conventional tunneling"
  volume: 127
  year: 2021
  keywords: "Conventional tunneling, Reinforcement learning, Tunnel excavation strategy, Machine learning, Excavation sequences"
  abstract: "Reinforcement learning (RL) - a branch of machine learning - refers to the process of an agent learning to achieve a certain goal by interaction with its environment. The process of conventional tunneling shows many similarities, where a geotechnician (agent) tries to achieve a breakthrough (goal) by excavating the rockmass (environment) in an optimum way. In this paper we present a novel RL based framework for strategy development for conventional tunneling. We developed a virtual environment with the goal of a tunnel breakthrough and with a deep Q-network as the agent's architecture. It can choose from different excavation sequences to reach that goal and learns to do so in an economical and safe way by getting feedback from a specially designed reward system. Result analyses show that the optimal policies have great similarities to current practices of sequential tunneling and the framework has the potential to discover new tunneling strategies."

GitHub Events

Total

Watch event: 2
Fork event: 1

Last Year

Watch event: 2
Fork event: 1

Dependencies

requirements.txt pypi

matplotlib ==3.3.1
numpy ==1.19.1
opencv_python ==4.3.0.36
pandas ==1.1.3
tensorflow ==1.14.0
