zeeklog2pandas

Read Zeek/Bro log and log.gz files (even broken ones) into a Pandas DataFrame.

https://github.com/stratosphereips/zeeklog2pandas

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.1%) to scientific vocabulary
Last synced: 7 months ago

Repository

Read Zeek/Bro log and log.gz files (even broken ones) into a Pandas DataFrame.

Basic Info
  • Host: GitHub
  • Owner: stratosphereips
  • License: gpl-2.0
  • Language: Python
  • Default Branch: main
  • Size: 67.4 KB
Statistics
  • Stars: 5
  • Watchers: 5
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created over 3 years ago · Last pushed over 2 years ago
Metadata Files
Readme Changelog Contributing License Citation Authors

README.md

zeeklog2pandas


Read Zeek/Bro log and log.gz (even broken ones) into a Pandas Dataframe.

Installation

With pip

To install zeeklog2pandas, run this command in your terminal:

    $ pip install zeeklog2pandas

This is the preferred method to install zeeklog2pandas, as it will always install the most recent stable release.

If you don't have pip installed, this Python installation guide can guide you through the process.

From sources

The sources for zeeklog2pandas can be downloaded from the GitHub repo.

You can either clone the public repository:

    $ git clone https://github.com/stratosphereips/zeeklog2pandas

Or download the tarball:

    $ curl -OJL https://github.com/stratosphereips/zeeklog2pandas/tarball/main

Once you have a copy of the source, you can install it with:

    $ python setup.py install

Usage

Reading a zeek log into a pandas DataFrame

To read a file, import the library and use the read_zeek function.

```python
import pandas as pd
from zeeklog2pandas import read_zeek

df = read_zeek('conn.log')  # or conn.log.gz, it will be handled transparently
```

Mapping column types.

The read_zeek function only parses the Zeek log files, without doing any explicit type conversion. You will get a warning if the Zeek columns have mixed types. You will also need to convert the ts column to datetimes, which you can do easily with the pandas to_datetime function.

    In [1]: pd.to_datetime(df.ts, unit='s')
    Out[1]:
    0     2022-07-05 12:11:45.286374144
    1     2022-07-05 12:11:45.286611968
    2     2022-07-05 12:11:45.286622976
    3     2022-07-05 12:11:45.286629888
    4     2022-07-05 12:11:45.286637056
                       ...
    195   2022-07-05 12:11:45.865832192
    196   2022-07-05 12:11:45.865839104
    197   2022-07-05 12:11:45.866068992
    198   2022-07-05 12:11:45.866075904
    199   2022-07-05 12:11:45.866082816
    Name: ts, Length: 200, dtype: datetime64[ns]
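The conversion can also be written back into the DataFrame, together with explicit numeric conversion for columns that came out with mixed types. The following is a minimal sketch on a small hand-built DataFrame that stands in for the output of read_zeek. The column values are made up, and it is an assumption (not confirmed by this README) that unset Zeek fields may appear as the string "-" in the parsed result.

```python
import pandas as pd

# Hypothetical stand-in for df = read_zeek('conn.log'); values are invented.
df = pd.DataFrame({
    "ts": [1657023105.286374, 1657023105.286612],
    "id.orig_h": ["10.0.0.1", "10.0.0.2"],
    # Assumption: an unset duration shows up as the string "-".
    "duration": ["0.5", "-"],
})

# Epoch seconds -> pandas datetimes, stored back in the same column.
df["ts"] = pd.to_datetime(df["ts"], unit="s")

# Mixed string/number column -> floats; anything unparsable becomes NaN.
df["duration"] = pd.to_numeric(df["duration"], errors="coerce")

print(df.dtypes)
```

Using errors="coerce" silently turns every unparsable value into NaN, so it is worth checking how many values were lost before relying on the converted column.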

Merging rotated logs

The read_zeek function does not merge rotated log files. If you have a set of hourly rotated Zeek logs, you can merge them into a single DataFrame with something like

```python
from os import scandir

dfs = []
s = scandir('2022-07-05/')
for f in s:
    if f.name.startswith('ssh.'):
        dfs.append(read_zeek(f.path))
df = pd.concat(dfs)
```

This will merge all the ssh logs. You can replace ssh. with conn. to get one DataFrame with all the conn logs of that day.
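The same idea can be wrapped in a small helper. This is only a sketch: merge_rotated is a hypothetical name that is not part of zeeklog2pandas, and the reader is passed in as an argument so that read_zeek (or any other function that takes a path and returns a DataFrame) can be used. Compared with the loop above, it reads the files in sorted name order and returns an empty DataFrame instead of raising when no file matches.

```python
from pathlib import Path

import pandas as pd


def merge_rotated(directory, prefix, reader):
    """Concatenate every file in `directory` whose name starts with `prefix`.

    `reader` is any callable taking a path string and returning a DataFrame,
    for example read_zeek. This helper is hypothetical, not library API.
    """
    paths = sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.name.startswith(prefix)
    )
    if not paths:
        # pd.concat raises ValueError on an empty list, so return early.
        return pd.DataFrame()
    return pd.concat([reader(str(p)) for p in paths], ignore_index=True)
```

With the library installed, a call such as merge_rotated('2022-07-05/', 'conn.', read_zeek) would be the equivalent of the loop above for the conn logs.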

You need to make sure that the DataFrames always have the same columns. Otherwise you can use another pandas method such as merge or join, but you will need to take care of how missing values in some columns are handled.
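To illustrate what happens when the columns differ, here is a small sketch with two hand-built DataFrames (the values and the client column are invented for the example). By default pd.concat keeps the union of the columns and fills the gaps with NaN; passing join="inner" keeps only the columns present in every DataFrame.

```python
import pandas as pd

# Two invented frames standing in for rotated logs with different columns.
a = pd.DataFrame({"ts": [2.0, 1.0], "uid": ["C1", "C2"], "client": ["OpenSSH", "PuTTY"]})
b = pd.DataFrame({"ts": [3.0], "uid": ["C3"]})

# Union of columns: rows coming from `b` get NaN in the `client` column.
outer = pd.concat([a, b], ignore_index=True)

# Intersection of columns: the `client` column is dropped entirely.
inner = pd.concat([a, b], join="inner", ignore_index=True)

# Rotated files are not guaranteed to arrive in time order, so sort by ts.
outer = outer.sort_values("ts").reset_index(drop=True)

print(outer)
print(inner)
```

Which of the two behaviours is appropriate depends on whether the missing columns carry information you need later; the outer form keeps everything at the cost of NaN values to handle.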

Owner

  • Name: Stratosphere IPS
  • Login: stratosphereips
  • Kind: organization
  • Location: Prague

Cybersecurity Research Laboratory at the Czech Technical University in Prague. Creators of Slips, a free software machine learning-based behavioral IDS/IPS.

Citation (CITATION.cff)

cff-version: 1.2.0
title: >-
  zeeklog2pandas: a tool to convert Zeek logs to Pandas dataframe
message: "If you use this software, please cite it as below."
authors:
- family-names: Bogado
  given-names: Joaquin
  email: joaquin.bogado@aic.fel.cvut.cz
  affiliation: >-
      Stratosphere Laboratory, AIC, FEL, Czech
      Technical University in Prague
  orcid: "0000-0001-9491-5698"
version: 1.0.2
date-released: 2022-08-02
url: "https://github.com/stratosphereips/zeeklog2pandas"

Committers

Last synced: about 3 years ago

All Time
  • Total Commits: 42
  • Total Committers: 3
  • Avg Commits per committer: 14.0
  • Development Distribution Score (DDS): 0.214
Top Committers
Name Email Commits
joaquinbogado j****o@g****m 33
Veronica Valeros v****s@g****m 5
Joaquin Bogado j****o@d****m 4

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 41 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 7
  • Total maintainers: 1
pypi.org: zeeklog2pandas

Read Zeek/Bro log and log.gz files (even broken ones) into a Pandas DataFrame.

  • Versions: 7
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 41 Last month
Rankings
Dependent packages count: 6.6%
Average: 25.0%
Forks count: 30.5%
Dependent repos count: 30.6%
Stargazers count: 32.3%
Maintainers (1)
Last synced: 9 months ago

Dependencies

requirements.txt pypi
  • pandas *
requirements_dev.txt pypi
  • Sphinx ==1.8.5 development
  • bump2version ==0.5.11 development
  • coverage ==4.5.4 development
  • flake8 ==3.7.8 development
  • pip ==19.2.3 development
  • tox ==3.14.0 development
  • twine ==1.14.0 development
  • watchdog ==0.9.0 development
  • wheel ==0.33.6 development
.github/workflows/publish-to-test.yml actions
  • 8398a7/action-slack v3 composite
  • actions/checkout master composite
  • actions/setup-python v3 composite
  • pypa/gh-action-pypi-publish release/v1 composite
pyproject.toml pypi
setup.py pypi