dataset_tools

A tool set to work with our Stratosphere Laboratory cybersecurity datasets.

https://github.com/stratosphereips/dataset_tools

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.5%) to scientific vocabulary

Keywords

cybersecurity data-science datasets netflow network-security zeek

Repository

Basic Info
  • Host: GitHub
  • Owner: stratosphereips
  • License: gpl-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 46.9 KB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Topics
cybersecurity data-science datasets netflow network-security zeek
Created over 8 years ago · Last pushed over 3 years ago
Metadata Files
Readme Contributing License Citation

README.md

Stratosphere Datasets Tools


A set of tools to work with the Stratosphere datasets:

  • zeek-histograms.py: create histograms based on Zeek log files.
  • merge-zeek-files.py: merge two Zeek log files.

Zeek Histogram Creator

The tool zeek-histograms.py creates histograms from any Zeek flow file. It supports bin sizes in hours, minutes, and seconds (e.g., 1h, 1m, or 1s). The flows do not have to be sorted beforehand: the tool reads each flow's timestamp and places the flow in the proper bin.

Example:

```bash
$ python3 zeek-histograms.py -b 10m -f dataset/001-zeek-scenario-malicious/conn.log

Zeek logs histogram creator
Histogram of flows in the zeek file dataset/001-zeek-scenario-malicious/conn.log. Bin size:10m

Current time zone in this system is: CET.
All flows
1970-01-01 00:50:19.981745 - 1970-01-01 01:00:19.981745: 1
1970-01-01 01:00:19.981745 - 1970-01-01 01:10:19.981745: 318 ****************************************************************************************************
1970-01-01 01:10:19.981745 - 1970-01-01 01:20:19.981745: 166 ****************************************************
1970-01-01 01:20:19.981745 - 1970-01-01 01:30:19.981745: 152 ***********************************************
1970-01-01 01:30:19.981745 - 1970-01-01 01:40:19.981745: 152 ***********************************************
1970-01-01 01:40:19.981745 - 1970-01-01 01:50:19.981745: 160 **************************************************
1970-01-01 01:50:19.981745 - 1970-01-01 02:00:19.981745: 3
```
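The binning idea can be illustrated with a minimal Python sketch. This is not the code of zeek-histograms.py. It assumes a tab-separated Zeek log whose first column is the epoch timestamp (`ts`) and whose header lines start with `#`, and it anchors the bins at the earliest timestamp seen. The names `parse_bin_size`, `histogram`, and `render` are hypothetical helpers introduced only for this illustration.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Seconds per supported unit suffix (h, m, s).
_UNITS = {"h": 3600, "m": 60, "s": 1}


def parse_bin_size(text):
    """Convert a bin size such as '10m' into a number of seconds."""
    match = re.fullmatch(r"(\d+)([hms])", text)
    if match is None:
        raise ValueError(f"unsupported bin size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def histogram(lines, bin_seconds):
    """Return (bin_start_epoch, flow_count) pairs; input order does not matter."""
    timestamps = []
    for line in lines:
        # Skip blank lines and Zeek header lines (assumed to start with '#').
        if not line.strip() or line.startswith("#"):
            continue
        # Assumption: the first tab-separated field is the epoch timestamp.
        timestamps.append(float(line.split("\t", 1)[0]))
    if not timestamps:
        return []
    start = min(timestamps)
    # Each flow lands in the bin given by its offset from the earliest flow.
    counts = Counter(int((ts - start) // bin_seconds) for ts in timestamps)
    return [(start + i * bin_seconds, counts.get(i, 0))
            for i in range(max(counts) + 1)]


def render(bins, bin_seconds, width=100):
    """Format each bin as 'start - end: count ***', scaling bars to the peak."""
    peak = max(count for _, count in bins)
    rows = []
    for begin, count in bins:
        first = datetime.fromtimestamp(begin, tz=timezone.utc)
        last = datetime.fromtimestamp(begin + bin_seconds, tz=timezone.utc)
        stars = "*" * (count * width // peak)
        rows.append(f"{first:%Y-%m-%d %H:%M:%S.%f} - "
                    f"{last:%Y-%m-%d %H:%M:%S.%f}: {count} {stars}".rstrip())
    return rows


# Unsorted sample flows: the binning does not depend on line order.
sample = ["#fields\tts\tuid", "700.0\tC2", "100.0\tC1", "130.0\tC3"]
sample_bins = histogram(sample, parse_bin_size("10m"))
sample_rows = render(sample_bins, parse_bin_size("10m"))
```

Because each flow is assigned to a bin from its own timestamp, no sorting pass is needed. The sketch formats times in UTC, whereas the tool's output above reports the local system time zone.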

Docker Image

To test that the datatoolset image is working correctly, run the following command. It creates a new container and runs the zeek-histograms tool on a Zeek testing dataset:

```bash
docker run --rm -it --name stratosphere_datatoolset stratosphereips/datatoolset:latest python3 zeek-histograms.py -b 10m -f dataset/001-zeek-scenario-malicious/conn.log
```

Use the latest version of the public Docker image to run the tools directly in the container:

```bash
docker run -v /full/path/to/logs/:/datasetstool/testing-datasets --name stratosphere_datatoolset --rm -it stratosphereips/datatoolset:latest /bin/bash
```

Owner

  • Name: Stratosphere IPS
  • Login: stratosphereips
  • Kind: organization
  • Location: Prague

Cybersecurity Research Laboratory at the Czech Technical University in Prague. Creators of Slips, a free software machine learning-based behavioral IDS/IPS.

Citation (CITATION.cff)

cff-version: 1.2.0
title: >-
  Dataset Tools: A tool set to work with our Stratosphere Laboratory cybersecurity datasets
message: 'If you use this software, please cite it as below.'
type: software
authors:
  - given-names: Sebastian
    family-names: Garcia
    email: sebastian.garcia@agents.fel.cvut.cz
    affiliation: >-
      Stratosphere Laboratory, AIC, FEL, Czech
      Technical University in Prague
    orcid: 'https://orcid.org/0000-0001-6238-9910'
