ipal_ids_framework

Industrial Intrusion Detection - A framework for protocol-independent industrial intrusion detection on top of IPAL.

https://github.com/fkie-cad/ipal_ids_framework

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 40 DOI reference(s) in README
  • Academic publication links
    Links to: ieee.org, acm.org, zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.0%) to scientific vocabulary

Keywords

anomaly-detection cps ids industrial intrusion-detection ipal

Repository


Basic Info
  • Host: GitHub
  • Owner: fkie-cad
  • License: mit
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 1.02 MB
Statistics
  • Stars: 23
  • Watchers: 5
  • Forks: 13
  • Open Issues: 1
  • Releases: 0
Topics
anomaly-detection cps ids industrial intrusion-detection ipal
Created over 4 years ago · Last pushed about 1 year ago
Metadata Files
Readme License Citation

README.md

IPAL - Industrial Intrusion Detection Framework


This repository is part of IPAL - an Industrial Protocol Abstraction Layer. IPAL aims to establish an abstract representation of industrial network traffic for subsequent unified and protocol-independent industrial intrusion detection. IPAL consists of a transcriber to automatically translate industrial traffic into the IPAL representation, an IDS Framework implementing various industrial intrusion detection systems (IIDSs), and a collection of evaluation datasets. For details about IPAL, please refer to our publications listed below.

The ever-increasing digitization in industries enables the automation of complex physical processes and, with progressive integration into the Internet, also large-scale distributed systems. Due to both trends, well-known cyber-security problems are inherited, which have already led to severe attacks, e.g., the attack on the Ukrainian power grid in 2015. Supplementing proactive measures, Industrial Intrusion Detection Systems (IIDSs) promise to detect such attacks in a timely manner by monitoring the communication between automation devices or accessing the processes' physical state. Researchers have proposed many IIDS solutions to date. However, due to a lack of standard interfaces and diverse communication protocols across industrial domains, great effort is required to adapt existing IIDSs to new domains and communication protocols. To overcome this issue, we propose IPAL - a common message format that decouples IIDSs from domain-specific communication protocols. This representation applies to most IIDSs, as all their input data requirements are covered. Moreover, the required data is extractable across multiple industrial protocols due to inherent similarities in their communication patterns.

This repository contains the ipal-iids framework together with implementations of several IIDSs based on the IPAL message and state format generated by our second project, the ipal-transcriber. As shown in the overview figure below, the IIDS framework consists of two phases. In the training phase, the IIDSs learn an internal model based on a training dataset and a configuration file with IIDS-specific parameters. During the live phase, the IIDSs load the trained models and search for anomalies in live data.

Overview Figure


Implemented IIDSs

The IIDS framework contains implementations of the following IIDSs. Note that we distinguish between IIDSs operating on the IPAL message format (on a per-network packet basis) or on the IPAL state format (a summary of all industrial process values for a given point in time).

| IDSs | Type | Publication/Source Code | Description |
|------|------|-------------------------|-------------|
| Autoregression (Deprecated) | State | Paper, Paper, Code | Process prediction (not reproduced) |
| BLSTM | Message/State | Paper, Code | Machine Learning - Bidirectional Long Short Term Memory |
| Decision Trees | Message/State | Paper, Code | (not reproduced) |
| Dummy | Message/State | -- | Implements a Dummy IDS that alerts always, never, or randomly. |
| DTMC* | Message | Paper, Code | Packet Sequences - Discrete-time Markov Chains |
| Extra Trees | Message/State | Paper, Code | (not reproduced) |
| Inter-arrival time | Message | Paper | Packet Inter-arrival time |
| Invariant Rules* | State | Paper, Code | Compares states against invariant rules generated from training dataset |
| Isolation Forest | Message/State | Paper, Code | (not reproduced) |
| Kitsune | Message | Paper, Code | |
| GeCo | State | Implementation of "GeCos Replacing Experts: Generalizable and Comprehensible Industrial Intrusion Detection" USENIX Security 2025 | Learns a state-space model of the physical process. |
| Naive Bayes | Message/State | Paper | (not reproduced) |
| Optimal | Message/State | -- | Implements an "Oracle" that always classifies correctly (or always incorrectly if desired). |
| PASAD* | State | Paper, Code, Code | Process prediction - Process-Aware Stealthy Attack Detector |
| Random Forest | Message/State | Paper, Code | Machine Learning - Random Forest |
| Seq2SeqNN* | State | Paper, Code | Process Prediction - Sequence-to-Sequence Neural Networks |
| SIMPLE-DecimalPlaces | Message/State | -- | Alerts if process values with fewer/more decimal places occur than during training. |
| SIMPLE-Exists | Message/State | -- | Alerts if too many process values that have never been seen before occur over a longer period of time. |
| SIMPLE-Histogram | Message/State | Paper | Histogram of a sensor over time. |
| SIMPLE-MinMax | Message/State | Paper | Minimum and Maximum of a value plus threshold |
| SIMPLE-Steadytime | Message/State | Paper | Compares longest or shortest time in a single state of a sensor. |
| Support Vector Machine | Message/State | Paper, Code | Machine Learning - Support Vector Machine |
| TABOR* | State | Paper | Process Sequences - Time Automata and Bayesian netwORk |

Note: IDSs marked with * are not publicly available, but can be obtained on request.
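To give a feel for how simple some of the listed approaches are, here is a minimal sketch of the idea behind SIMPLE-MinMax as the table describes it (minimum and maximum of a value plus a threshold). It is not the framework's implementation: the class name `MinMaxSketch`, its methods, and the use of an absolute tolerance are assumptions made for illustration only.

```python
class MinMaxSketch:
    """Toy min/max detector; NOT the framework's SIMPLE-MinMax implementation."""

    def __init__(self, threshold=0.0):
        self.threshold = threshold  # hypothetical absolute tolerance
        self.low = None
        self.high = None

    def train(self, values):
        # Learn the observed value range from the training data.
        self.low = min(values)
        self.high = max(values)

    def is_anomalous(self, value):
        # Alert when the value leaves the learned range by more than the tolerance.
        return (
            value < self.low - self.threshold
            or value > self.high + self.threshold
        )


detector = MinMaxSketch(threshold=0.5)
detector.train([20.0, 21.5, 19.8, 22.1])
print(detector.is_anomalous(22.4))  # False: within the learned maximum plus tolerance
print(detector.is_anomalous(25.0))  # True: far above the learned maximum
```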

Publications
  • Konrad Wolsing, Eric Wagner, Antoine Saillard, and Martin Henze. 2022. IPAL: Breaking up Silos of Protocol-dependent and Domain-specific Industrial Intrusion Detection Systems. In 25th International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2022), October 26–28, 2022, Limassol, Cyprus. ACM, New York, NY, USA, 17 pages. https://doi.org/10.1145/3545948.3545968
  • Wolsing, Konrad, Eric Wagner, and Martin Henze. "Poster: Facilitating Protocol-independent Industrial Intrusion Detection Systems." Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security. 2020 https://doi.org/10.1145/3372297.3420019

Getting started

If you are new to IPAL and want to learn about the general idea or try out our tutorials, please refer to IPAL's main repository: https://github.com/fkie-cad/ipal.

Prerequisites
  • ipal-iids requires libgsl (or libgsl-dev) to be installed. See https://www.gnu.org/software/gsl/doc/html/index.html for further information.
  • The Autoregression IIDS requires the ar package. Please make sure that python-dev or the corresponding version (e.g., python3.9-dev) is installed on your system.
Installation (pip)

Use python3 -m pip install . to install the scripts and dependencies system-wide using the pip python package installer. This will install dependencies and the iids modules to the local site packages and add the ipal-iids, ipal-visualize-model and ipal-extend-alarms scripts to the PATH. The scripts can then be invoked system-wide (e.g. ipal-iids -h).

Installation (venv)

Alternatively, the project's dependencies can be installed locally in a virtual environment using the misc/install.sh script or manually with:

```bash
python3 -m venv venv
source venv/bin/activate

python3 -m pip install numpy
python3 -m pip install -r requirements.txt
```

The scripts can then be invoked after activating the virtual environment from the root of the project repository, e.g.:

```bash
source venv/bin/activate
./ipal-iids -h
deactivate
```

Installation (docker)

Use docker build -t ipal-ids-framework:latest . to build a Docker image with a pip installation of the project and development dependencies. The scripts can then be used within containers using the built image, e.g.:

```bash
docker run -it ipal-ids-framework:latest /bin/bash
ipal-iids -h
```

Usage

Usage IIDS Framework

The ipal-iids tool runs in two phases. During training, the parameters --train.ipal or --train.state have to be provided together with a configuration file via --config. Afterwards, the live detection phase starts. For this phase, the parameters --live.ipal or --live.state have to be provided, and --output defines the location to which the annotated IIDS output is written.

Each IIDS has its own options which can be retrieved by ipal-iids --default.config [ids-name].

```bash
usage: ipal-iids [--train.ipal FILE] [--train.state FILE] [--train.combiner FILE]
                 [--live.ipal FILE] [--live.state FILE] [--output FILE] [--alerts FILE]
                 [--alerts.update] [--config FILE] [--combiner.config FILE]
                 [--extra.config FILE] [--default.config IDS]
                 [--combiner.default.config Combiner] [--retrain] [--hostname]
                 [--log STR] [--logfile FILE] [--compresslevel INT] [--version] [-h]
                 [--live.batch INT]

This program contains the ipal-iids framework together with implementations of several IIDSs based on the IPAL message and state format.

options:
  --train.ipal FILE
        input file of IPAL messages to train the IDS on ('-' stdin, '.gz' compressed).
  --train.state FILE
        input file of IPAL state messages to train the IDS on ('-' stdin, '.gz' compressed).
  --train.combiner FILE
        input file of IPAL or state messages to train the combiner on ('-' stdin, '.gz' compressed).
  --live.ipal FILE
        input file of IPAL messages to perform the live detection on ('-' stdin, '.gz' compressed).
  --live.state FILE
        input file of IPAL state messages to perform the live detection on ('-' stdin, '.gz' compressed).
  --output FILE
        output file to write the annotated IDS output to (Default:none, '-' stdout, ',gz' compress).
  --alerts FILE
        output file to write the verbose IDS alerts to (Default:none, '-' stdout, ',gz' compress).
  --alerts.update
        output also updates on ongoing IDS alerts (Default:False).
  --config FILE
        load IDS configuration and parameters from the specified file ('.gz' compressed).
  --combiner.config FILE
        load Combiner configuration and parameters from the specified file ('*.gz' compressed).
  --extra.config FILE
        load IDSs and Combiners residing outside of IPAL
  --default.config IDS
        dump the default configuration for the specified IDS to stdout and exit, can be used as a basis for writing IDS config files.
        Available IIDSs are: BLSTM,GeCo,DecimalPlaces,DecisionTree,Dtmc,DummyIDS,ExistsIDS,ExtraTrees,Histogram,InterArrivalTimeMean,InterArrivalTimeRange,InvariantRules,IsolationForest,Kitsune,MinMax,NaiveBayes,OptimalIDS,Pasad,RandomForest,SVM,Seq2SeqNN,SteadyTime,TABOR,DummyBatchIDS
  --combiner.default.config Combiner
        dump the default configuration for the specified Combiner to stdout and exit, can be used as a basis for writing Combiner config files.
        Available Combiners are: Any,Matrix,Gurobi,Heuristic,LogisticRegression,MLP,SVM,LSTM
  --retrain
        retrain regardless of a trained model file being present.
  --hostname
        Add the hostname to the output.
  --log STR
        define logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) (Default: WARNING).
  --logfile FILE
        file to log to (Default: stderr).
  --compresslevel INT
        set the gzip compress level. 0 no compress, 1 fast/large, ..., 9 slow/tiny. (Default: 6)
  --version
        show program's version number and exit
  -h, --help
        Show this help message and exit.
  --live.batch INT
        commands to use batching and defines the batch size
```
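When the tool is driven from a script, the train and live flags listed in the usage text can be assembled programmatically. The following is a minimal sketch; the helper `build_iids_command` and all file names are hypothetical, and only flags that appear in the usage text are used.

```python
import shlex


def build_iids_command(config, train=None, live=None, output=None, state=False):
    """Assemble an ipal-iids argument list (hypothetical helper, not part of IPAL)."""
    kind = "state" if state else "ipal"
    cmd = ["ipal-iids", "--config", config]
    if train is not None:
        cmd += [f"--train.{kind}", train]
    if live is not None:
        cmd += [f"--live.{kind}", live]
    if output is not None:
        cmd += ["--output", output]
    return cmd


# All file names below are placeholders.
cmd = build_iids_command(
    "config.json",
    train="train.ipal.gz",
    live="live.ipal.gz",
    output="out.ipal.gz",
)
print(shlex.join(cmd))

# To actually run it (requires ipal-iids on the PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building a list instead of a single string avoids shell-quoting problems when paths contain spaces.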

Usage configuration files

The configuration file determines the parameters for each IIDS. A default configuration for each IIDS can be obtained with ipal-iids --default.config [IIDS name]:

```bash
ipal-iids --default.config inter-arrival-mean
{
    "inter-arrival-mean": {
        "_type": "inter-arrival-mean",
        "model-file": "./model",
        "N": 4,
        "W": 5
    }
}
```

The IIDS framework allows for using multiple IIDSs in parallel. Each entry in the configuration file can have a different name, e.g., one IIDS for each sensor of a physical system. Currently, the output of multiple IIDSs is combined with 'or' - meaning an alert is emitted if at least one IIDS detected an anomaly.
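A configuration with several named entries can also be generated by a script. The sketch below reuses the keys from the default configuration shown above and creates one entry per sensor; the entry names, the sensor names, and the choice of a separate model file per entry are assumptions for illustration.

```python
import json


def make_config(sensors):
    """Build one inter-arrival-mean entry per sensor (names are hypothetical)."""
    config = {}
    for sensor in sensors:
        config[f"inter-arrival-mean-{sensor}"] = {
            "_type": "inter-arrival-mean",
            # Assumption: each entry stores its model in its own file.
            "model-file": f"./model-{sensor}",
            "N": 4,
            "W": 5,
        }
    return config


config = make_config(["pump", "valve"])
print(json.dumps(config, indent=4))
```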

Usage Combiner

If multiple IIDSs are used in parallel, it is possible to specify a combiner that fuses the results of the individual approaches into a unified alert. Different strategies are available, as listed in the following table:

| Combiner | (Un-)Supervised | Time-Aware | Description |
|----------|-----------------|------------|-------------|
| Any | unsup. | no | Alerts if any IDS emits an alert. |
| Gurobi | sup. | no | Solves an optimization problem with Gurobi to find optimal weights for IDSs. This combiner may require a Gurobi license. |
| Heuristic | sup. | no | This combiner implements a heuristic that minimizes the number of misclassifications, which maximizes accuracy. |
| LSTM | sup. | yes | Time-aware LSTM over a window of recent IIDS alerts. |
| LogisticRegression | sup. | no | Learns a logistic regression combiner. |
| SVM | sup. | no | Learns an SVM combiner. |
| Matrix | unsup. | yes | Each IDS gets assigned a dedicated weight, or multiple for each timestep. The combiner alerts if a weighted sum of alerts/scores is greater than a threshold. |
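The two unsupervised strategies can be illustrated in a few lines. This is a simplified sketch of the ideas described in the table, not the framework's combiner code: it uses a single weight per IDS, ignores the time-aware aspect of the Matrix combiner, and all function names, IDS names, and weights are made up.

```python
def any_alert(alerts):
    """'Any' strategy: alert as soon as one IDS alerts."""
    return any(alerts.values())


def weighted_alert(alerts, weights, threshold):
    """Weighted-sum strategy: alert if the weighted sum of alerts exceeds the threshold."""
    total = sum(weights[name] * float(alert) for name, alert in alerts.items())
    return total > threshold


# Hypothetical per-IDS results for a single message or state.
alerts = {"MinMax": True, "Histogram": False, "SteadyTime": True}
weights = {"MinMax": 0.5, "Histogram": 1.0, "SteadyTime": 0.25}

print(any_alert(alerts))                               # True
print(weighted_alert(alerts, weights, threshold=1.0))  # False: 0.75 is not above 1.0
print(weighted_alert(alerts, weights, threshold=0.5))  # True: 0.75 is above 0.5
```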

To utilize a combiner, the IIDS framework requires a dedicated configuration file. A default configuration for each combiner can be obtained with ipal-iids --combiner.default.config [Combiner name]:

```bash
ipal-iids --combiner.default.config Matrix
{
    "_type": "Matrix",
    "model-file": null,
    "matrix": [],
    "threshold": 0,
    "use_scores": false,
    "keys": [],
    "lookahead": 0
}
```

This configuration file must be provided in addition to the regular IIDS configuration file. An exemplary command would be:

```bash
ipal-iids \
    --config [IDS config file] --train.state [IDS training file] \
    --combiner.config [combiner config file] --train.combiner [Combiner training file] \
    --live.state [live file] --output [output file]
```

Note that some combiners require dedicated training files. It is recommended to use a separate training file for the combiner.

Usage Preprocessor

Preprocessors are useful for IIDSs that require a certain input format. For example, some machine-learning IIDSs work best if their data is scaled between 0 and 1. Only IIDSs inheriting from the FeatureIDS can use the preprocessors. Initially, the preprocessors are fitted to the training data. Currently, the following preprocessors are implemented:

| Preprocessor | Description |
|--------------|-------------|
| aggregate | Aggregates multiple feature vectors into a single vector |
| categorical | Encode, usually strings, as an array of binary indicators |
| gradient | Calculates the derivative of a process value |
| indicate-none | Extend each feature with a binary value indicating whether the feature is none or not |
| label | Encode, usually strings, as numeric labels |
| mean | Subtract mean and scale by the standard deviation |
| minmax | Scale by minimum and maximum from 0 to 1 |
| pca | Performs a principal component analysis on the input vector |

Multiple preprocessors can be used in series. The following example shows how preprocessors are defined in the configuration file:

```json
{
    "SVM Preprocessor Example" : {
        "_type": "SVM",
        ...
        "features" : ["src", "type", "state;4:PID Setpoint", "length"],
        "preprocessors": [
            {"method" : "Mean", "features" : ["state;4:PID Setpoint", "length"]},
            {"method" : "Categorical", "features" : ["type"]}
        ],
        ...
    }
}
```
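To make the effect of such a configuration concrete, the sketch below applies a mean preprocessor (subtract mean, scale by standard deviation) to a `length` feature and a categorical preprocessor (binary indicators) to a `type` feature, following the descriptions in the table above. It is a standalone illustration and not the framework's code: the helper functions, the sample data, the use of the population standard deviation, and the all-zero encoding for unseen categories are assumptions.

```python
from statistics import mean, pstdev


def fit_mean(values):
    """Return (mean, std); std falls back to 1.0 for constant data."""
    std = pstdev(values)
    return mean(values), (std if std > 0 else 1.0)


def fit_categorical(values):
    """Return the sorted list of categories seen during fitting."""
    return sorted(set(values))


def transform(row, mean_model, categories):
    """Standardize 'length' and encode 'type' as binary indicators."""
    mu, std = mean_model
    scaled = (row["length"] - mu) / std
    # Assumption: categories not seen during fitting map to all zeros.
    one_hot = [1 if row["type"] == c else 0 for c in categories]
    return [scaled] + one_hot


# Made-up training samples with the two features used below.
training = [
    {"type": "request", "length": 10},
    {"type": "response", "length": 14},
    {"type": "request", "length": 12},
]
mean_model = fit_mean([r["length"] for r in training])
categories = fit_categorical([r["type"] for r in training])

vector = transform({"type": "response", "length": 12}, mean_model, categories)
print(vector)  # [0.0, 0, 1]
```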

Usage Alerts

By default, each IDS adds the ids key to an IPAL message indicating whether the combined IDS system (all individual IDSs and the combiner) emits an alert (True or False). This is helpful for research, e.g., when calculating the detection performance. Yet, for practical use, this type of alert does not provide any context for users. To this end, some IDSs implement context-enriched alerts. These can be enabled and saved to a dedicated file with the --alerts option. These alerts are in JSON format and of the following structure:

```json
{
    "status": "new",        # Can be new, closed, or updated
    "ids": "Steadytime",
    "point": "switch",      # Source of the alert if identifiable
    "reason": "Values of switch is unknown",
    "start": 1751361777,
    "end": 1751361779,
    "description": "switch=1.0 has never been seen before.",
    "count": 1,             # Counts the occurrences of this alert
    "id": "30b29eba-b140-4de6-9f59-b8cc1f015356",
}
```

Note that not all IDSs support this feature yet.

Note that an update is not emitted unless the --alerts.update option is set.
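Alerts with this structure are easy to post-process. The following sketch keeps the most recent record per alert id and reports the alerts that have not been closed. It assumes that the alerts file contains one JSON object per line and that status changes reuse the same `id`; neither is stated above, so check this against your own output. The sample records are made up.

```python
import json


def open_alerts(lines):
    """Return alerts whose latest record (per id) is not 'closed'."""
    latest = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        alert = json.loads(line)
        latest[alert["id"]] = alert  # later records replace earlier ones
    return [a for a in latest.values() if a["status"] != "closed"]


# Made-up sample records using the fields from the structure above.
lines = [
    '{"status": "new", "ids": "Steadytime", "point": "switch", '
    '"start": 1751361777, "end": 1751361779, "count": 1, "id": "a1"}',
    '{"status": "new", "ids": "MinMax", "point": "pump", '
    '"start": 1751361780, "end": 1751361785, "count": 1, "id": "b2"}',
    '{"status": "closed", "ids": "Steadytime", "point": "switch", '
    '"start": 1751361777, "end": 1751361779, "count": 1, "id": "a1"}',
]

for alert in open_alerts(lines):
    duration = alert["end"] - alert["start"]
    print(alert["ids"], alert["point"], duration)  # MinMax pump 5
```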

Usage ipal-visualize-model

This tool allows for visualizing the trained models for an IIDS configuration. To plot a specific model use ipal-visualize-model [path-to-config-file].

```bash
ipal-visualize-model -h
usage: ipal-visualize-model [-h] [--log STR] [--logfile FILE] [--version] FILE

positional arguments:
  FILE            load the IDS configuration of the trained model ('*.gz' compressed).

options:
  -h, --help      show this help message and exit
  --log STR       define logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL). Default is WARNING.
  --logfile FILE  File to log to. Default is stderr.
  --version       show program's version number and exit
```

Usage ipal-extend-alarms

The ipal-iids tool works as an online tool, meaning IIDSs have to decide live whether they emit an alert. Alerts therefore cannot be emitted retroactively, which is sometimes needed for evaluation. Because a few IIDSs may need to emit alerts retroactively, the ipal-extend-alarms script post-processes the IIDS output afterward. IIDSs that support ipal-extend-alarms need the parameter adjust: true to be set in their configuration files.

```bash
./ipal-extend-alarms -h
usage: ipal-extend-alarms [-h] [--log STR] [--logfile FILE] [--version] FILE [FILE ...]

positional arguments:
  FILE            files to extend alarms ('*.gz' compressed).

options:
  -h, --help      show this help message and exit
  --log STR       define logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL). Default is WARNING.
  --logfile FILE  File to log to. Default is stderr.
  --version       show program's version number and exit
```

Note that the ipal-extend-alarms tool does not implement combiners and simply combines the results with OR.

Development

Tooling

The set of tools used for development, code formatting, style checking, and testing can be installed with the following command:

```bash
python3 -m pip install -r requirements-dev.txt
```

All tools can be executed manually with the following commands and report errors if encountered:

```bash
black .
flake8
python3 -m pytest
```

A black and flake8 check of modified files before any commit can also be forced using Git's pre-commit hook functionality:

```bash
pre-commit install
```

More information on the black and flake8 setup can be found at https://ljvmiranda921.github.io/notebook/2018/06/21/precommits-using-black-and-flake8/

Add an IIDS

Currently, there are two ways to add support for a new IIDS.

First, you can create a new IIDS and make it part of the IPAL code:

  1. Add a new folder and IIDS module in ids/[ids name]/[ids name].py
  2. Create a new IIDS class inheriting the MetaIDS class (see ids/ids.py) or inheriting the FeatureIDS class (see ipal_iids/ids/featureids.py) for preprocessor support. The IIDS class may implement:
    • train: given some training data, the IIDS should learn its internal model
    • new_ipal_msg: given a new IPAL message, return whether the IIDS detected an anomaly
    • new_state_msg: given a new IPAL state message, return whether the IIDS detected an anomaly
    • save_trained_model: save the trained model to disk
    • load_trained_model: load a trained model from disk
    • visualize_model: create a Matplotlib visualization of the model for debugging purposes
  3. Add the new IIDS to the list in ids/utils.py
  4. Add the new IIDS to the list in tests/conftest.py
  5. Add the new IIDS to the implemented IIDSs table above

Note: The name of an IIDS may not begin with "_"!
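The methods listed in step 2 can be pictured with a small standalone skeleton. It deliberately does not inherit from MetaIDS or FeatureIDS, so it runs without the framework installed; the class `ThresholdIDS`, the `value` key, and all signatures and return types are assumptions, and the actual base classes in the repository should be consulted for the real interface.

```python
import json
import os
import tempfile


class ThresholdIDS:
    """Hypothetical IIDS skeleton; a real IIDS would inherit MetaIDS or FeatureIDS."""

    def __init__(self, key="value"):
        self.key = key  # hypothetical message field holding the monitored value
        self.maximum = None

    def train(self, messages):
        # Learn the internal model: here, the largest value seen in training.
        self.maximum = max(msg[self.key] for msg in messages)

    def new_ipal_msg(self, msg):
        # Return whether this message is considered anomalous.
        return msg[self.key] > self.maximum

    def save_trained_model(self, path):
        with open(path, "w") as f:
            json.dump({"maximum": self.maximum}, f)

    def load_trained_model(self, path):
        with open(path) as f:
            self.maximum = json.load(f)["maximum"]


# Training phase: learn and store the model.
ids = ThresholdIDS()
ids.train([{"value": 1.0}, {"value": 3.5}])
path = os.path.join(tempfile.mkdtemp(), "model.json")
ids.save_trained_model(path)

# Live phase: load the stored model and classify new messages.
restored = ThresholdIDS()
restored.load_trained_model(path)
print(restored.new_ipal_msg({"value": 4.0}))  # True
print(restored.new_ipal_msg({"value": 2.0}))  # False
```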

Second, you can create a config file specifying the file from which the IIDS should be loaded:

  1. Create a so-called "extra" config file specifying the names and locations of the new IIDSs (or combiners and preprocessors). Paths are either absolute or relative to the position of the config file. The config file is passed to IPAL via the --extra.config argument. An example config file could look like this:

```
{
    "IDS": [
        { "name": "TestIDS", "path": "./testids.py" }
    ],

    "Combiner": [
        { "name": "Test", "path": "./testcombiner.py" }
    ],

    "Preprocessor": [
        { "name": "Test", "path": "./testpreprocessor.py" }
    ]
}
```

Note: The names of the IIDSs should be equal to the names of the corresponding Python classes. The names of the combiners should be equal to the class names without the suffix "Combiner". For example, if your combiner class is called "AwesomeCombiner", write "Awesome" as the name for your combiner in your "extra" config file.
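The naming rule for combiners is easy to get wrong by hand, so an "extra" config file can also be generated. The sketch below follows the structure and naming rule described above; the helper functions, class names, and paths are hypothetical.

```python
import json


def combiner_name(class_name):
    """Strip the 'Combiner' suffix from a class name, per the naming rule above."""
    suffix = "Combiner"
    if class_name.endswith(suffix):
        return class_name[: -len(suffix)]
    return class_name


def make_extra_config(ids_classes, combiner_classes):
    """Map class names to module paths in the 'extra' config structure."""
    return {
        "IDS": [{"name": n, "path": p} for n, p in ids_classes.items()],
        "Combiner": [
            {"name": combiner_name(n), "path": p}
            for n, p in combiner_classes.items()
        ],
    }


# Class names and paths are placeholders.
extra = make_extra_config(
    {"TestIDS": "./testids.py"},
    {"AwesomeCombiner": "./awesomecombiner.py"},
)
print(json.dumps(extra, indent=4))
```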

Add a preprocessor

The process for adding a new preprocessor is the following:

  1. Add a new preprocessor module in preprocessors/
  2. Create a new preprocessor class inheriting the Preprocessor class (see preprocessors/preprocessor.py). The preprocessor class may implement:
    • fit: given a set of training data, train the preprocessor on it
    • transform: preprocess a given data sample based on the fitted model
    • reset: reset the preprocessor between individual datasets
    • get_fitted_model: return a representation of the fitted model, which can be saved to disk
    • from_fitted_model: return an initialized preprocessor based on a previously saved model
  3. Add the new preprocessor to the list in preprocessors/utils.py
  4. Add the new preprocessor to the preprocessor list table above
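As an illustration of the five methods listed in step 2, here is a standalone sketch of a preprocessor that clips values to the range seen during fitting. It does not inherit from the framework's Preprocessor class, so the class name `ClipPreprocessor` and all signatures are assumptions; the real base class in preprocessors/preprocessor.py defines the actual interface.

```python
class ClipPreprocessor:
    """Hypothetical preprocessor that clips values to the fitted range."""

    def __init__(self):
        self.low = None
        self.high = None

    def fit(self, values):
        # Train on the training data: remember the observed range.
        self.low, self.high = min(values), max(values)

    def transform(self, value):
        # Preprocess one sample based on the fitted model.
        return min(max(value, self.low), self.high)

    def reset(self):
        # This sketch keeps no per-dataset state, so there is nothing to clear.
        pass

    def get_fitted_model(self):
        # Plain dict, so the model can be serialized (e.g., as JSON).
        return {"low": self.low, "high": self.high}

    @classmethod
    def from_fitted_model(cls, model):
        pre = cls()
        pre.low, pre.high = model["low"], model["high"]
        return pre


pre = ClipPreprocessor()
pre.fit([2.0, 5.0, 3.5])
copy = ClipPreprocessor.from_fitted_model(pre.get_fitted_model())
print(copy.transform(7.0))  # 5.0
print(copy.transform(1.0))  # 2.0
print(copy.transform(4.0))  # 4.0
```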
Add a combiner

Adding a combiner is analogous to adding a new preprocessor.

Contributors

  • Antoine Saillard (RWTH Aachen University & Fraunhofer FKIE)
  • David Valero Ribes (RWTH Aachen University)
  • Dominik Kus (RWTH Aachen University)
  • Eric Wagner (Fraunhofer FKIE & RWTH Aachen University)
  • Frederik Basels (RWTH Aachen University)
  • Jonas Lohmann (RWTH Aachen University)
  • Konrad Wolsing (Fraunhofer FKIE & RWTH Aachen University)
  • Lea Thiemt (RWTH Aachen University)
  • Sven Zemanek (Fraunhofer FKIE)

License

MIT License. See LICENSE for details.

Owner

  • Name: FKIE-CAD
  • Login: fkie-cad
  • Kind: organization

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Wolsing"
  given-names: "Konrad"
  orcid: "https://orcid.org/0000-0002-7571-0555"
- family-names: "Wagner"
  given-names: "Eric"
  orcid: "https://orcid.org/0000-0003-3211-1015"
- family-names: "Saillard"
  given-names: "Antoine"
  orcid: "https://orcid.org/0000-0002-8376-2726"
- family-names: "Henze"
  given-names: "Martin"
  orcid: "https://orcid.org/0000-0001-8717-2523"
title: "IPAL - Industrial Intrusion Detection Framework"
version: 1.5.3
doi: 10.1145/3545948.3545968
date-released: 2022-04-20
url: "https://github.com/fkie-cad/ipal_ids_framework"
preferred-citation:
  type: conference-paper
  authors:
  - family-names: "Wolsing"
    given-names: "Konrad"
    orcid: "https://orcid.org/0000-0002-7571-0555"
  - family-names: "Wagner"
    given-names: "Eric"
    orcid: "https://orcid.org/0000-0003-3211-1015"
  - family-names: "Saillard"
    given-names: "Antoine"
    orcid: "https://orcid.org/0000-0002-8376-2726"
  - family-names: "Henze"
    given-names: "Martin"
    orcid: "https://orcid.org/0000-0001-8717-2523"
  doi: 10.1145/3545948.3545968
  journal: In 25th International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2022)
  month: 10
  title: "IPAL: Breaking up Silos of Protocol-dependent and Domain-specific Industrial Intrusion Detection Systems"
  year: 2022

GitHub Events

Total
  • Watch event: 6
  • Push event: 2
  • Fork event: 3
  • Create event: 2
Last Year
  • Watch event: 6
  • Push event: 2
  • Fork event: 3
  • Create event: 2

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 0
  • Total pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: about 15 hours
  • Total issue authors: 0
  • Total pull request authors: 3
  • Average comments per issue: 0
  • Average comments per pull request: 0.33
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • LasseMoench (1)
  • dominikks (1)
  • loeschzwerg (1)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

requirements-dev.txt pypi
  • black * development
  • flake8 * development
  • pre-commit * development
  • pytest * development
requirements.txt pypi
  • graphviz *
  • keras *
  • matplotlib *
  • numpy *
  • pomegranate *
  • sklearn *
  • tensorflow *
  • torch *
setup.py pypi
  • ar *
  • graphviz *
  • keras *
  • matplotlib *
  • numpy *
  • pomegranate *
  • sklearn *
  • tensorflow *
  • torch *
Dockerfile docker
  • ubuntu 22.04 build
pyproject.toml pypi