deep-log-unstructured

Unstructured log analysis with transformers

https://github.com/superskyyy/deep-log-unstructured

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: ieee.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.6%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: Superskyyy
  • License: gpl-3.0
  • Language: Jupyter Notebook
  • Default Branch: master
  • Homepage:
  • Size: 117 KB
Statistics
  • Stars: 3
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created over 4 years ago · Last pushed over 4 years ago

README.md

Self-attentive classification-based anomaly detection in unstructured logs

This repository is an unofficial implementation of the paper Self-attentive classification-based anomaly detection in unstructured logs.

📋 A demo Colab notebook is available in the src folder at the project root.

Requirements

To run the notebook locally, install the dependencies listed in requirements.txt:

pip install -r requirements.txt
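After installing, a quick import check can confirm that the packages are visible to the interpreter. The sketch below is not part of the repository; the helper name missing_packages is hypothetical, and the package list is taken from the requirements.txt entries shown on this page (keras, numpy, pandas, tensorflow, tqdm).

```python
from importlib.util import find_spec

# Top-level package names listed in requirements.txt (per this page).
REQUIRED = ("keras", "numpy", "pandas", "tensorflow", "tqdm")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be located on the import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only looks for the packages on the import path; it does not import them, so it will not catch a broken TensorFlow install.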

To use our implementation demo, import the notebook at src/model/anomaly_detection.ipynb and change the folder path to point to your datasets.

Baselines: We implemented two baselines used in the paper, PCA and DeepLog. Please refer to the corresponding notebooks for details.

Training

To train the model(s) from the paper, import the notebook and enable the TPU runtime with the parallel execution strategy. At batch size 512, each epoch takes less than 2 minutes for the first 5 million rows of data.
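As a rough sanity check on those figures, the arithmetic below works out the steps per epoch and the minimum throughput they imply. It is only a sketch: it treats 2 minutes as the upper bound stated above and assumes the final partial batch is kept (hence the ceiling).

```python
import math

ROWS = 5_000_000        # first 5 million rows, as stated above
BATCH_SIZE = 512        # batch size stated above
EPOCH_SECONDS = 120     # "less than 2 mins", used here as an upper bound

# Number of optimizer steps per epoch (assumes the last partial batch is kept).
steps_per_epoch = math.ceil(ROWS / BATCH_SIZE)

# Minimum throughput needed to finish an epoch within the bound.
min_rows_per_second = ROWS / EPOCH_SECONDS
min_steps_per_second = steps_per_epoch / EPOCH_SECONDS

print(f"steps per epoch:      {steps_per_epoch}")
print(f"min rows per second:  {min_rows_per_second:.0f}")
print(f"min steps per second: {min_steps_per_second:.1f}")
```

If a local or GPU run falls well short of roughly 80 steps per second, epoch times will be correspondingly longer than the figure quoted for the TPU runtime.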

Evaluation

The results are evaluated with the F1-score, recall, precision, and accuracy. The detection threshold is derived automatically by iterating over candidates, and the process can be observed.
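The sketch below illustrates one way such an evaluation can work. It is not the notebook's code: the function names, the convention that a score at or above the threshold means "anomaly", and the choice to select the threshold by best F1 are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Sequence


class Metrics(NamedTuple):
    threshold: float
    precision: float
    recall: float
    f1: float
    accuracy: float


def evaluate_at(scores: Sequence[float], labels: Sequence[int], threshold: float) -> Metrics:
    """Compute metrics when scores >= threshold are flagged as anomalies (label 1)."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_anomaly = score >= threshold
        if predicted_anomaly and label == 1:
            tp += 1
        elif predicted_anomaly and label == 0:
            fp += 1
        elif not predicted_anomaly and label == 1:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    return Metrics(threshold, precision, recall, f1, accuracy)


def sweep_thresholds(scores: Sequence[float], labels: Sequence[int]) -> List[Metrics]:
    """Evaluate every distinct score as a candidate threshold."""
    return [evaluate_at(scores, labels, t) for t in sorted(set(scores))]


def best_by_f1(scores: Sequence[float], labels: Sequence[int]) -> Metrics:
    """Pick the candidate threshold with the highest F1-score."""
    return max(sweep_thresholds(scores, labels), key=lambda m: m.f1)


# Toy data: hypothetical anomaly scores and ground-truth labels (1 = anomaly).
scores = [0.1, 0.4, 0.35, 0.8, 0.9]
labels = [0, 0, 1, 1, 1]

for m in sweep_thresholds(scores, labels):
    print(f"t={m.threshold:.2f} P={m.precision:.3f} R={m.recall:.3f} "
          f"F1={m.f1:.3f} Acc={m.accuracy:.3f}")
best = best_by_f1(scores, labels)
print("best threshold by F1:", best.threshold)
```

Note that selecting the threshold on the same data used to report the metrics tends to inflate them; a held-out split for threshold selection avoids that.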

Results

Detailed results are in our project report [NOT DISCLOSED FOR NOW].

In general, our experiments indicate that the paper's results are reproducible and that the model surpasses the previous state of the art, DeepLog, although the evaluation may have some flaws.

Reproducing Baselines

If you want to run PCA yourself, please:

cd baselines/PCA/code

python main.py

If you want to run DeepLog:

cd baselines/Deeplog/code

python main.py

To cite the original paper

@article{nedelkoski2020self,
  title={Self-Attentive Classification-Based Anomaly Detection in Unstructured Logs},
  author={Nedelkoski, Sasho and Bogatinovski, Jasmin and Acker, Alexander and Cardoso, Jorge and Kao, Odej},
  journal={arXiv preprint arXiv:2008.09340},
  year={2020}
}

To cite our reproduced work

Please click the Cite this repository button below the repo description. A BibTeX entry will be generated for your convenience.

License and contributions

This code is released under the GPLv3 license.

Pull requests and issues are welcome to enhance the implementation.

Owner

  • Name: Superskyyy (AWAY - OFFLINE)
  • Login: Superskyyy
  • Kind: user
  • Location: Canada
  • Company: Queen's University

Apache SkyWalking PMC · Master's student · I'm Chinese

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Chen"
  given-names: "Yihao"
- family-names: "Guo"
  given-names: "Gary"
title: "deep-log-unstructured"
version: 0.1.0
date-released: 2021-12-10
url: "https://github.com/Superskyyy/deep-log-unstructured"

Committers

Last synced: over 1 year ago

All Time
  • Total Commits: 21
  • Total Committers: 3
  • Avg Commits per committer: 7.0
  • Development Distribution Score (DDS): 0.571
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Superskyyy S****y@o****m 9
Superskyyy s****y@o****m 9
Zifeng Guo q****u@g****m 3

Issues and Pull Requests

Last synced: 12 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0

Dependencies

requirements.txt pypi
  • keras *
  • numpy *
  • pandas *
  • tensorflow *
  • tqdm *