gym-idsgame
An Abstract Cyber Security Simulation and Markov Game for OpenAI Gym
Science Score: 64.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ✓ Academic publication links: arxiv.org, ieee.org
- ✓ Committers with academic emails: 1 of 2 committers (50.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (9.1%) to scientific vocabulary
Repository
An Abstract Cyber Security Simulation and Markov Game for OpenAI Gym
Basic Info
Statistics
- Stars: 83
- Watchers: 3
- Forks: 24
- Open Issues: 11
- Releases: 0
Metadata Files
README.md
gym-idsgame: An Abstract Cyber Security Simulation and Markov Game for OpenAI Gym
gym-idsgame is a reinforcement learning environment for simulating attack and defense operations in an abstract network intrusion
game. The environment extends the abstract model described in (Elderman et al. 2017). The model constitutes a
two-player Markov game between an attacker agent and a defender agent that face each other in a simulated computer
network. The reinforcement learning environment exposes an interface to a partially observed Markov decision process
(POMDP) model of the Markov game. The interface can be used to train, simulate, and evaluate attack and defense policies against each other.
Moreover, the repository contains code to reproduce baseline results for various reinforcement learning algorithms, including:
- Tabular Q-learning
- Neural-fitted Q-learning using the DQN algorithm
- REINFORCE with baseline
- Actor-Critic REINFORCE
- PPO
Please use this BibTeX entry if you make use of this code in your publications (paper: https://arxiv.org/abs/2009.08120):
```bibtex
@INPROCEEDINGS{Hamm2011:Finding,
  AUTHOR="Kim Hammar and Rolf Stadler",
  TITLE="Finding Effective Security Strategies through Reinforcement Learning and
  {Self-Play}",
  BOOKTITLE="International Conference on Network and Service Management (CNSM 2020)",
  ADDRESS="Izmir, Turkey",
  DAYS=1,
  MONTH=nov,
  YEAR=2020,
  KEYWORDS="Network Security; Reinforcement Learning; Markov Security Games",
  ABSTRACT="We present a method to automatically find security strategies for the use
  case of intrusion prevention. Following this method, we model the
  interaction between an attacker and a defender as a Markov game and let
  attack and defense strategies evolve through reinforcement learning and
  self-play without human intervention. Using a simple infrastructure
  configuration, we demonstrate that effective security strategies can emerge
  from self-play. This shows that self-play, which has been applied in other
  domains with great success, can be effective in the context of network
  security. Inspection of the converged policies show that the emerged
  policies reflect common-sense knowledge and are similar to strategies of
  humans. Moreover, we address known challenges of reinforcement learning in
  this domain and present an approach that uses function approximation, an
  opponent pool, and an autoregressive policy representation. Through
  evaluations we show that our method is superior to two baseline methods but
  that policy convergence in self-play remains a challenge."
}
```
Publications
See also
Table of Contents
- Design
- Included Environments
- Requirements
- Installation
- Usage
- Manual Play
- Baseline Experiments
- Future Work
- Author & Maintainer
- Copyright and license
Design
Included Environments
A rich set of configurations of the Markov game are registered as OpenAI Gym environments.
The environments are specified and implemented in gym_idsgame/envs/idsgame_env.py; see also gym_idsgame/__init__.py.
minimal_defense
This is an environment where the agent is supposed to play the attacker in the Markov game and the defender is following the defend_minimal baseline defense policy.
The defend_minimal policy entails that the defender will always defend the attribute with the minimal value out of all of its neighbors.
Registered configurations:
idsgame-minimal_defense-v0 through idsgame-minimal_defense-v20 (21 configurations)
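The defend_minimal rule described above can be sketched as an argmin selection over defense attributes. The data layout and function below are hypothetical illustrations, not the environment's actual API:

```python
# Hypothetical sketch of the defend_minimal baseline: reinforce the
# (node, attribute) pair with the lowest current defense value.
# The dictionary layout is illustrative, not gym-idsgame's real state format.

def defend_minimal(defense_values):
    """defense_values: {node: [attribute values]} -> (node, attribute index) to defend."""
    best = None
    for node, attributes in defense_values.items():
        for idx, value in enumerate(attributes):
            if best is None or value < best[2]:
                best = (node, idx, value)
    return best[0], best[1]

# Example: "server1" attribute 2 has the weakest defense (value 1)
values = {"server1": [5, 3, 1], "server2": [2, 4, 6]}
print(defend_minimal(values))  # -> ('server1', 2)
```

The attack_maximal baseline described below is the mirror image: an argmax over attack values instead of an argmin over defense values.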
maximal_attack
This is an environment where the agent is supposed to play the defender and the attacker is following the attack_maximal baseline attack policy.
The attack_maximal policy entails that the attacker will always attack the attribute with the maximum value out of all of its neighbors.
Registered configurations:
idsgame-maximal_attack-v0 through idsgame-maximal_attack-v20 (21 configurations)
random_attack
This is an environment where the agent is supposed to play as the defender and the attacker is following a random baseline attack policy.
Registered configurations:
idsgame-random_attack-v0 through idsgame-random_attack-v20 (21 configurations)
random_defense
An environment where the agent is supposed to play as the attacker and the defender is following a random baseline defense policy.
Registered configurations:
idsgame-random_defense-v0 through idsgame-random_defense-v20 (21 configurations)
two_agents
This is an environment where neither the attacker nor the defender is part of the environment, i.e. it is intended for two-agent simulations or RL training. In the experiments folder you can see examples of using this environment to train a PPO attacker vs a PPO defender, a DQN attacker vs a REINFORCE defender, etc.
Registered configurations:
idsgame-v0 through idsgame-v20 (21 configurations)
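A two-agent rollout in these configurations can be sketched as a loop that queries both policies each step. The (attacker_action, defender_action) joint-action tuple and the Gymnasium-style 5-tuple step() return are assumptions about the API, not confirmed by this README:

```python
# Sketch of an episode in a "two_agents" configuration, where the environment
# contains neither player and both actions must be supplied externally.
# Assumptions (hypothetical): step() takes an (attacker, defender) action
# tuple and returns the Gymnasium-style 5-tuple.

def run_two_agent_episode(env, attacker_policy, defender_policy, max_steps=1000):
    """Roll out one episode with externally supplied attacker and defender policies."""
    obs, info = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        attack = attacker_policy(obs)
        defend = defender_policy(obs)
        obs, reward, terminated, truncated, info = env.step((attack, defend))
        total_reward += reward
        if terminated or truncated:
            break
    return total_reward
```

With gym-idsgame installed, this could be driven by e.g. `run_two_agent_episode(gym.make("idsgame-v0"), attacker_policy, defender_policy)`, where the two policy callables are whatever trained or scripted agents you want to pit against each other.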
Requirements
- Python 3.5+
- OpenAI Gym
- NumPy
- Pyglet (OpenGL 3D graphics)
- GPU for 3D graphics acceleration (optional)
- jsonpickle (for configuration files)
- torch (for baseline algorithms)
Installation & Tests
```bash
# install from pip
pip install gym-idsgame==1.0.12

# git clone and install from source
git clone https://github.com/Limmen/gym-idsgame
cd gym-idsgame
pip3 install -e .

# local install from source
pip install -e gym-idsgame

# force upgrade deps
pip install -e gym-idsgame --upgrade

# run unit tests
pytest

# run integration tests
cd experiments
make tests
```
Usage
The environment can be accessed like any other OpenAI Gym environment with gym.make.
Once the environment has been created, the API functions
step(), reset(), render(), and close() can be used to train any RL algorithm of
your preference.
```python
import gymnasium as gym
from gym_idsgame.envs import IdsGameEnv

env_name = "idsgame-maximal_attack-v3"
env = gym.make(env_name)
```
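The step()/reset()/close() loop mentioned above can be sketched as a random-action rollout. The Gymnasium-style (obs, info) reset and 5-tuple step return are assumed here; classic gym environments return 4-tuples instead:

```python
# Sketch: one episode with uniformly sampled actions, exercising the
# reset()/step()/close() API. Assumes Gymnasium-style return values.

def random_rollout(env, max_steps=100):
    """Run one episode with random actions and return the total reward."""
    obs, info = env.reset()
    episode_reward = 0.0
    for _ in range(max_steps):
        action = env.action_space.sample()
        obs, reward, terminated, truncated, info = env.step(action)
        episode_reward += reward
        if terminated or truncated:
            break
    env.close()
    return episode_reward
```

A random rollout like this is a useful sanity check before plugging in a learning agent, since it confirms the environment resets and terminates as expected.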
The environment ships with implementations of several baseline algorithms, e.g. the tabular Q(0) algorithm; see the example code below.
```python
import gymnasium as gym
from gym_idsgame.agents.training_agents.q_learning.q_agent_config import QAgentConfig
from gym_idsgame.agents.training_agents.q_learning.tabular_q_learning.tabular_q_agent import TabularQAgent

# util and default_output_dir() are helper utilities from the repository's
# experiment scripts; they set up the output directories for results.
random_seed = 0
util.create_artefact_dirs(default_output_dir(), random_seed)
q_agent_config = QAgentConfig(gamma=0.999, alpha=0.0005, epsilon=1, render=False, eval_sleep=0.9,
                              min_epsilon=0.01, eval_episodes=100, train_log_frequency=100,
                              epsilon_decay=0.9999, video=True, eval_log_frequency=1,
                              video_fps=5, video_dir=default_output_dir() + "/results/videos/" + str(random_seed),
                              num_episodes=20001,
                              eval_render=False, gifs=True,
                              gif_dir=default_output_dir() + "/results/gifs/" + str(random_seed),
                              eval_frequency=1000, attacker=True, defender=False, video_frequency=101,
                              save_dir=default_output_dir() + "/results/data/" + str(random_seed))
env_name = "idsgame-minimal_defense-v2"
env = gym.make(env_name, save_dir=default_output_dir() + "/results/data/" + str(random_seed))
attacker_agent = TabularQAgent(env, q_agent_config)
attacker_agent.train()
train_result = attacker_agent.train_result
eval_result = attacker_agent.eval_result
```
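The exploration schedule implied by epsilon=1, epsilon_decay=0.9999, and min_epsilon=0.01 in the configuration above can be illustrated as multiplicative decay with a floor; whether the agent applies the decay per step or per episode is an assumption here:

```python
# Illustration of the epsilon schedule implied by the QAgentConfig values
# above (epsilon=1, epsilon_decay=0.9999, min_epsilon=0.01). The decay
# granularity (per step vs per episode) is an assumption.

def decayed_epsilon(t, epsilon=1.0, epsilon_decay=0.9999, min_epsilon=0.01):
    """Exploration rate after t decay applications: multiplicative decay with a floor."""
    return max(min_epsilon, epsilon * epsilon_decay ** t)

print(decayed_epsilon(0))        # 1.0 (fully exploratory at the start)
print(decayed_epsilon(20000))    # roughly 0.135 after 20000 decays
print(decayed_epsilon(1000000))  # clamped at the 0.01 floor
```

With these values, exploration is still substantial after the 20001 configured episodes, so per-step decay (many more decay applications per run) would reach the floor much earlier.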
Manual Play
You can run the environment in a mode of "manual control" as well:
```python
import gymnasium as gym
from gym_idsgame.agents.manual_agents.manual_defense_agent import ManualDefenseAgent

random_seed = 0
env_name = "idsgame-random_attack-v2"
env = gym.make(env_name)
ManualDefenseAgent(env.idsgame_config)
```
Baseline Experiments
The experiments folder contains results, hyperparameters and code to reproduce reported results using this environment.
For more information about each individual experiment, see this README.
Clean All Experiment Results
```bash
cd experiments # cd into experiments folder
make clean
```
Run All Experiment Results (Takes a long time)
```bash
cd experiments # cd into experiments folder
make all
```
Run All Experiments for a specific environment (Takes a long time)
```bash
cd experiments # cd into experiments folder
make v0
```
Run a specific experiment
```bash
cd experiments/training/v0/random_defense/tabular_q_learning/ # cd into the experiment folder
make run
```
Clean a specific experiment
```bash
cd experiments/training/v0/random_defense/tabular_q_learning/ # cd into the experiment folder
make clean
```
Start tensorboard for a specific experiment
```bash
cd experiments/training/v0/random_defense/tabular_q_learning/ # cd into the experiment folder
make tensorboard
```
Fetch Baseline Experiment Results
By default, the experiment results are not included when cloning the repo. To fetch them, install and set up git-lfs, then run:
```bash
git lfs fetch --all
git lfs pull
```
Author & Maintainer
Kim Hammar kimham@kth.se
Copyright and license
MIT
(C) 2020, Kim Hammar
Owner
- Name: Kim Hammar
- Login: Limmen
- Kind: user
- Location: Stockholm
- Company: KTH
- Website: limmen.dev
- Twitter: KimHammar1
- Repositories: 18
- Profile: https://github.com/Limmen
PhD @KTH
Citation (CITATION.cff)
```yaml
authors:
  - affiliation: "KTH Royal Institute of Technology"
    family-names: Hammar
    given-names: Kim
  - affiliation: "KTH Royal Institute of Technology"
    family-names: Stadler
    given-names: Rolf
cff-version: 1.2.0
keywords:
  - "reinforcement learning"
  - "cyber security"
  - "intrusion prevention"
message: "If you use this software, please cite it using these metadata."
repository-code: "https://github.com/Limmen/gym-idsgame"
title: "Finding Effective Security Strategies through Reinforcement Learning and Self-Play"
```
GitHub Events
Total
- Issues event: 2
- Watch event: 11
- Issue comment event: 4
- Push event: 1
- Fork event: 6
Last Year
- Issues event: 2
- Watch event: 11
- Issue comment event: 4
- Push event: 1
- Fork event: 6
Committers
Last synced: almost 3 years ago
All Time
- Total Commits: 479
- Total Committers: 2
- Avg Commits per committer: 239.5
- Development Distribution Score (DDS): 0.002
Top Committers
| Name | Email | Commits |
|---|---|---|
| Kim Hammar | k****m@k****e | 478 |
| Frederico Nesti | 4****i@u****m | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 15
- Total pull requests: 1
- Average time to close issues: 4 days
- Average time to close pull requests: about 14 hours
- Total issue authors: 13
- Total pull request authors: 1
- Average comments per issue: 1.6
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 2
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 2
- Pull request authors: 0
- Average comments per issue: 2.5
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- mmcmanus1 (3)
- Roker301 (1)
- Gabriel0402 (1)
- shigens (1)
- Zhiyuan19 (1)
- fearnoone-bai (1)
- JIN-Owen (1)
- ammohamedds (1)
- renjunzhenshuai (1)
- wuTongLeaves (1)
- Peter-of-Astora (1)
- ianbryant2 (1)
- bbokka123 (1)
Pull Request Authors
- FredericoNesti (1)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads: 82 last month (pypi)
- Total dependent packages: 0
- Total dependent repositories: 1
- Total versions: 14
- Total maintainers: 1
pypi.org: gym-idsgame
An Abstract Cyber Security Simulation and Markov Game for OpenAI Gym
- Homepage: https://github.com/Limmen/gym-idsgame
- Documentation: https://gym-idsgame.readthedocs.io/
- License: MIT License
- Latest release: 1.0.13 (published over 3 years ago)
Rankings
Maintainers (1)
Dependencies
- gym *
- imageio *
- jsonpickle *
- matplotlib *
- numpy *
- opencv-python *
- pyglet *
- seaborn *
- sklearn *
- stable_baselines3 *
- tensorboard *
- torch *
- torchvision *
- tqdm *