simulink_gym

Gym Interface Wrapper for Simulink Models

https://github.com/johbrust/simulink_gym

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    2 of 5 committers (40.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.9%) to scientific vocabulary

Keywords

gym-environments python reinforcement-learning simulink-python
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 16
  • Watchers: 1
  • Forks: 1
  • Open Issues: 2
  • Releases: 1
Created almost 5 years ago · Last pushed about 1 year ago
Metadata Files
Readme License Citation

Readme.md

Simulink Gym

A wrapper for using Simulink models as Gym environments

This wrapper establishes the Gymnasium environment interface for Simulink models by deriving a simulink_gym.SimulinkEnv subclass from gymnasium.Env.

This wrapper uses Gymnasium version 1.0.0 enabling easy usage with established RL libraries such as Stable-Baselines3 or rllib.

How it Works

This section gives a broad description of what happens under the hood. For detailed instructions on how to wrap a Simulink model, see below.

The wrapper is based on adding TCP/IP communication between a Simulink model running in a background instance of MATLAB Simulink and a Python wrapper class implementing the Gymnasium interface.

The TCP/IP communication is established via respective Simulink blocks and matching communication sockets. The Simulink blocks are provided by the Simulink block library included in this project. The input block receives the input action, which triggers the simulation of the next time step. At the end of this time step, the output block sends the output data (i.e., the observation) back to the wrapper. Check out the Readme of the Simulink block library for more information.

The wrapper provides the necessary methods to create this derived environment without the user having to implement the TCP/IP communication. Similar to the usual Gymnasium environment implementations, the user only has to define the action and observation/state space as well as the individual reset and step methods.
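The action/observation exchange described above can be illustrated with a small, self-contained sketch. It is not the wrapper's actual implementation: the stand-in model thread, the little-endian double encoding, the message sizes, and the use of a closed connection to represent the empty end-of-simulation message are all assumptions made for illustration.

```python
import socket
import struct
import threading


def recv_exact(sock, n_bytes):
    """Read exactly n_bytes, or return b"" if the peer closed the connection."""
    buffer = b""
    while len(buffer) < n_bytes:
        chunk = sock.recv(n_bytes - len(buffer))
        if not chunk:
            return b""
        buffer += chunk
    return buffer


def stand_in_model(conn, n_steps):
    """Hypothetical stand-in for the Simulink side: one observation per action."""
    state = 0.0
    with conn:
        for _ in range(n_steps):
            payload = recv_exact(conn, 8)  # one double = one action (assumed)
            if not payload:
                break
            (action,) = struct.unpack("<d", payload)
            state += action  # "simulate" one time step
            conn.sendall(struct.pack("<2d", state, 2.0 * state))
    # Leaving the with-block closes the socket, so the other side reads b"".


# A connected socket pair replaces the real TCP/IP connection in this sketch.
wrapper_side, model_side = socket.socketpair()
thread = threading.Thread(target=stand_in_model, args=(model_side, 3))
thread.start()

observations = []
with wrapper_side:
    for action in (1.0, 1.0, -0.5):
        wrapper_side.sendall(struct.pack("<d", action))  # triggers the next step
        observations.append(struct.unpack("<2d", recv_exact(wrapper_side, 16)))
    end_message = recv_exact(wrapper_side, 16)  # empty once the model stopped
thread.join()
```

Each sent action yields exactly one observation, and the read after the last step comes back empty, which is how the end of the simulation shows up on the Python side in this sketch.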

[!NOTE] Initializing an environment object takes a few seconds due to the starting of MATLAB in the background and the creation of the simulation object (SimulationInput object). Also, the first reset(...) takes substantially longer than any consecutive reset(...).

Setup

This package controls Simulink models in MATLAB from Python through the MATLAB engine for Python, which is a Python package provided by MathWorks. A MATLAB version newer than R2022b is recommended: up to and including R2022b, this Python package could not be installed from PyPI via pip. Instead, it had to be installed manually from source and could therefore not be added as a dependency for automatic installation (e.g., in requirements.txt or pyproject.toml). If you need instructions on how to install Simulink Gym and the MATLAB engine for Python for a version <= R2022b, see an older version of this project.

The provided Simulink blocks for connecting the Simulink model to this wrapper use the Simulink Instrument Control Toolbox. Therefore, this toolbox has to be installed in MATLAB/Simulink as a dependency.

Using uv, install Simulink Gym into your environment or project as follows.

```shell
# Install Simulink Gym into some environment:
uv pip install git+https://github.com/johbrust/simulink_gym.git

# Or add it as a dependency to a Python project:
uv add git+https://github.com/johbrust/simulink_gym.git
```

Extras

This package also provides example implementations using the Simulink wrapper (including example training scripts for DQN and PPO agents for the cart pole implementation in Simulink). To try them out, it is recommended to clone the repository and install from source by executing uv sync --all-extras to install the extra packages required by the examples.

Simulink Gym Block Library

This package ships with a custom Simulink block library for setting up the interface on the model side. Check out the respective Readme for more information about setup and usage.

How to Wrap a Simulink Model

To use a Simulink model with this wrapper, the model has to be prepared accordingly. This includes preparing the Simulink model file (.slx) to be wrapped and writing a wrapper class for the model with SimulinkEnv as its base class.

Prepare the Simulink Model File

For the communication with the wrapper, the TCP/IP blocks provided by the Simulink Gym block library have to be added and set up accordingly.

Parameter values of the model can be set through the wrapper in two different ways, and the choice affects how the model has to be built. The first option is to set block parameter values directly through SimulinkEnv.set_block_parameter(...). The block parameters can be set to any value and changed later through the wrapper. The second option is to define a variable in the model workspace and set the block parameter to this variable. The block parameter is then changed by changing the workspace variable through SimulinkEnv.set_workspace_variable(...).

[!NOTE] Model workspace variables are the recommended way to make general block settings, like step sizes, available for the wrapper. For creating a model workspace variable, you can use the Model Explorer, which can be opened with CTRL + H from the Simulink model editor.

Check Model Debugging for information on how to debug the Simulink model while using this wrapper.

Preparing the Environment File

The second part of the environment definition is to create an environment class derived from the SimulinkEnv base class.

This derived class has to define the action and observation space as well as the reset(...) and step(...) methods specific to the environment.

Action and Observation Space

While the action space is defined simply by, e.g., self.action_space = gymnasium.spaces.Discrete(2), the observation space definition needs additional information about the corresponding blocks or workspace variables in the Simulink model. This is due to the fact that the wrapper needs to be able to set these values, e.g., while resetting the environment. For this, the wrapper provides the Observation and Observations classes. For an example definition of an observation space, check the cart pole example implementations in Simulink and Simscape which set initial values directly through the block parameter values (Simulink implementation) or through workspace variables (Simscape implementation).

The Observations object of the environment is a list-like object with the order of its Observation entries matching the concatenation order of the observation signals in the Simulink model (e.g., through the mux block).

Since observation values are reset after an episode, information about the corresponding blocks or workspace variables has to be provided. For block parameters, the wrapper can access these through the path of the block value, which is given by the template <model name>/<subsystem 0>/.../<subsystem n>/<block name>/<parameter name> for a block buried in n subsystems.

[!WARNING] Block parameter names don't always match the description in the block mask! Therefore, get the correct parameter name from the Simulink documentation and not from the mask!
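As a small illustration of the path template above, the following sketch assembles such a path from its parts. The helper block_parameter_path and the example model, subsystem, and block names are hypothetical and not part of the wrapper. The parameter name still has to be taken from the Simulink documentation.

```python
def block_parameter_path(model, block, parameter, subsystems=()):
    """Join the parts into <model>/<subsystem 0>/.../<block>/<parameter>."""
    return "/".join([model, *subsystems, block, parameter])


# Hypothetical example: a block inside one subsystem of a model named "cartpole".
nested_path = block_parameter_path(
    "cartpole", "Integrator", "InitialCondition", subsystems=("Plant",)
)
# A block at the top level of the model has no subsystem entries.
top_level_path = block_parameter_path("cartpole", "Integrator", "InitialCondition")
```

Here nested_path is "cartpole/Plant/Integrator/InitialCondition" and top_level_path is "cartpole/Integrator/InitialCondition".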

Reset and Step Methods

The provided _reset() method is to be called in the reset() method of the derived environment class. It takes care of resetting the Simulink simulation. The derived class therefore only has to implement environment-specific reset behavior, such as resampling the initial state or parts of it. Again, see the cart pole example for an example usage.

The basic stepping functionality is provided by the wrapper's sim_step(...) method which should be called in the step(...) method of the derived environment definition class (see, e.g., step(...) method of the cart pole example).

Running the Simulink Model

After everything is set up, just use the defined environment like any other Gymnasium environment. See the notebook of the cart pole Simulink implementation for an example usage.
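The usual Gymnasium interaction loop looks roughly like the sketch below. To keep it runnable without MATLAB, it uses a hypothetical StandInEnv that only mimics the Gymnasium reset(...) and step(...) return signatures. With a real setup, an instance of your SimulinkEnv subclass would take its place, assuming it follows the same signatures.

```python
class StandInEnv:
    """Hypothetical stand-in following the Gymnasium reset/step signatures."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self, *, seed=None, options=None):
        self.t = 0
        return [0.0], {}  # (observation, info)

    def step(self, action):
        self.t += 1
        observation = [float(self.t)]
        reward = 1.0
        terminated = self.t >= self.horizon
        truncated = False
        return observation, reward, terminated, truncated, {}


def run_episode(env, policy):
    """Run one episode and return the accumulated reward."""
    observation, info = env.reset()
    total_reward = 0.0
    terminated = truncated = False
    while not (terminated or truncated):
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
    return total_reward


# A constant policy is enough to exercise the loop.
episode_return = run_episode(StandInEnv(horizon=5), policy=lambda obs: 0)
```

The same run_episode function should work unchanged with any environment that follows the Gymnasium interface, which is the point of wrapping the Simulink model.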

Model Debugging

For debugging the Simulink model in combination with the wrapper, the model_debug flag is provided. Set this to True in the super().__init__(...) call (example usage here) in your derived environment class and start your environment. This tells the wrapper to not start a thread with a MATLAB instance running the simulation in the background. Instead, you have to manually start the simulation model in the Simulink GUI once the environment object is instantiated and reset initially (executing state = env.reset() will cause the program to wait for the connection). You can then access the Simulink model's internal signals through the Simulink GUI for easy debugging.

The best way to stop the simulation is by executing env.stop_simulation().

End of Episode

An environment complying with the Gymnasium interface returns the terminated flag when the episode is finished. The Simulink simulation returns an empty TCP/IP message after the simulation has stopped (i.e., when the simulation has run for the defined duration). However, this message is only sent after the last simulation step (i.e., at time t_end + 1). Therefore, the termination of the simulation can only be detected one time step after the terminal state was already reached. Keep this in mind when using the data from the environment, since the terminal state will be present twice. As a workaround, simply drop the last data point from the trajectory.
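A minimal sketch of this workaround is shown below. The function trim_episode, its arguments, and the example data are hypothetical. The sketch only assumes that the recorded observations are kept in a list and that the last entry repeats the terminal state once the end of the simulation has been detected.

```python
def trim_episode(observations, episode_ended):
    """Drop the final entry, which repeats the terminal state, if the episode ended."""
    if episode_ended and observations:
        return observations[:-1]
    return list(observations)


# Hypothetical recorded episode: the terminal observation [0.2] appears twice.
recorded = [[0.0], [0.1], [0.2], [0.2]]
trimmed = trim_episode(recorded, episode_ended=True)
# An episode that is still running is returned unchanged.
unfinished = trim_episode([[0.0], [0.1]], episode_ended=False)
```

Returning a new list instead of modifying the input keeps the raw recording available in case the duplicate is needed for debugging.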

Known Issues

The exact cause of the known issues below is not understood, so they could not be fixed at the root. Each issue is listed together with a known workaround.

[!IMPORTANT] If you encounter issues not listed below, please create a new issue or even a pull request if you also already found the fix!

  • Sometimes, after a while, two sets of output data are received from the Simulink model although only one action was sent. It is assumed that this is caused by timing issues of the TCP/IP communication in combination with the update order of the model.

Fix: All occurrences of this issue could be mitigated by ensuring a certain block execution order of the Simulink model. There are different possibilities to achieve this:

  1. Set the priorities of the TCP/IP In and TCP/IP Out blocks to 1 and 2, respectively. Simulink then tries to come up with a block execution order according to these priorities. Unfortunately, setting these priorities does not guarantee that such a block execution order is possible.
  2. Introduce additional signals in the Simulink model to enforce a certain block execution order. E.g., route a signal of the incoming action to some (dummy) blocks just before the TCP/IP Out block.

Owner

  • Name: Johannes Brust
  • Login: johbrust
  • Kind: user
  • Location: Osnabrück
  • Company: DFKI

Researcher at the German Research Center for Artificial Intelligence (DFKI) interested in Reinforcement Learning, Control Theory and Robotics

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: Simulink Gym
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Johannes
    family-names: Brust
    email: johannes.brust@dfki.de
    affiliation: DFKI
    orcid: 'https://orcid.org/0000-0002-0615-6183'
repository-code: 'https://github.com/johbrust/simulink_gym'
abstract: A wrapper for using Simulink models as Gym environments.
keywords:
  - Simulink
  - Gym
  - Reinforcement Learning
  - MATLAB
  - Python
license: MIT

GitHub Events

Total
  • Create event: 2
  • Issues event: 3
  • Release event: 1
  • Watch event: 5
  • Delete event: 1
  • Issue comment event: 4
  • Push event: 7
  • Fork event: 1
Last Year
  • Create event: 2
  • Issues event: 3
  • Release event: 1
  • Watch event: 5
  • Delete event: 1
  • Issue comment event: 4
  • Push event: 7
  • Fork event: 1

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 181
  • Total Committers: 5
  • Avg Commits per committer: 36.2
  • Development Distribution Score (DDS): 0.492
Past Year
  • Commits: 132
  • Committers: 3
  • Avg Commits per committer: 44.0
  • Development Distribution Score (DDS): 0.348
Top Committers
Name Email Commits
Johannes Brust j****t@g****m 92
Johannes Brust j****t@d****e 47
johannes j****t@u****e 40
Johannes Brust 1****t 1
Johannes Brust j****t@J****i 1
Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 6
  • Total pull requests: 2
  • Average time to close issues: 5 months
  • Average time to close pull requests: 1 minute
  • Total issue authors: 3
  • Total pull request authors: 1
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • lk1983823 (3)
  • johbrust (1)
  • gearskill (1)
  • Mingzefei (1)
Pull Request Authors
  • johbrust (2)