https://github.com/sintel-dev/orion

Unsupervised time series anomaly detection library

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, acm.org
  • Committers with academic emails
  • Institutional organization owner
    Organization sintel-dev has institutional domain (dai.lids.mit.edu)
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.7%) to scientific vocabulary

Keywords

anomaly-detection benchmarking data-science deep-learning generative-adversarial-network machine-learning orion signals time-series unsupervised-learning

Keywords from Contributors

gan data-generation generative-ai generative-model generativeai multi-table relational-datasets sdv synthetic-data synthetic-data-generation
Last synced: 5 months ago

Repository

Unsupervised time series anomaly detection library

Basic Info
  • Host: GitHub
  • Owner: sintel-dev
  • License: mit
  • Language: Python
  • Default Branch: master
  • Homepage: https://sintel.dev/Orion/
  • Size: 34.4 MB
Statistics
  • Stars: 1,298
  • Watchers: 32
  • Forks: 191
  • Open Issues: 54
  • Releases: 22
Topics
anomaly-detection benchmarking data-science deep-learning generative-adversarial-network machine-learning orion signals time-series unsupervised-learning
Created over 7 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog Contributing License Authors

README.md

An open source project from the Data to AI Lab at MIT.

Orion

A machine learning library for unsupervised time series anomaly detection.

| Important Links | |
| --------------------- | -------------------------------------------------------------------- |
| :computer: Website | Check out the Sintel Website for more information about the project. |
| :book: Documentation | Quickstarts, User and Development Guides, and API Reference. |
| :star: Tutorials | Check out our notebooks. |
| :octocat: Repository | The link to the GitHub repository of this library. |
| :scroll: License | The repository is published under the MIT License. |
| Community | Join our Slack Workspace for announcements and discussions. |

Overview

Orion is a machine learning library built for unsupervised time series anomaly detection. Given time series data, we provide a number of “verified” ML pipelines (a.k.a. Orion pipelines) that identify rare patterns and flag them for expert review.

The library makes use of a number of automated machine learning tools developed at the Data to AI Lab at MIT.

Read about using an Orion pipeline on the NYC taxi dataset in a blog series:

| Part 1: Learn about unsupervised time series anomaly detection | Part 2: Learn how we use GANs to solve the problem | Part 3: How does one evaluate anomaly detection pipelines? |
|:--------------------------------------:|:---------------------------------------------:|:--------------------------------------------:|

Notebooks: Discover Orion through colab by launching our notebooks!

Quickstart

Install with pip

The easiest and recommended way to install Orion is using pip:

```bash
pip install orion-ml
```

This will pull and install the latest stable release from PyPI.

In the following example we show how to use one of the Orion Pipelines.

Fit an Orion pipeline

We will load demo data for this example:

```python3
from orion.data import load_signal

train_data = load_signal('S-1-train')
train_data.head()
```

which should show a signal with `timestamp` and `value` columns.

```
    timestamp     value
0  1222819200 -0.366359
1  1222840800 -0.394108
2  1222862400  0.403625
3  1222884000 -0.362759
4  1222905600 -0.370746
```
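Before fitting, it can help to confirm how the signal is sampled. The sketch below is not part of Orion: it uses only the five timestamps printed above, assumes they are Unix epoch seconds, and the helper name `sampling_intervals` is made up for illustration.

```python
from datetime import timedelta

# Timestamps copied from the train_data.head() output shown above.
timestamps = [1222819200, 1222840800, 1222862400, 1222884000, 1222905600]


def sampling_intervals(ts):
    """Return the gaps (in seconds) between consecutive timestamps."""
    return [later - earlier for earlier, later in zip(ts, ts[1:])]


gaps = sampling_intervals(timestamps)

# A single distinct gap means these rows are equally spaced.
is_regular = len(set(gaps)) == 1

print(gaps)
print(timedelta(seconds=gaps[0]))
```

For these five rows every gap is 21600 seconds, that is, one reading every six hours.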

In this example we use the `aer` pipeline and set some hyperparameters (in this case, 5 training epochs).

```python3
from orion import Orion

hyperparameters = {
    'orion.primitives.aer.AER#1': {
        'epochs': 5,
        'verbose': True
    }
}

orion = Orion(
    pipeline='aer',
    hyperparameters=hyperparameters
)

orion.fit(train_data)
```

Detect anomalies using the fitted pipeline

Once it is fitted, we are ready to use it to detect anomalies in our incoming time series:

```python3
new_data = load_signal('S-1-new')
anomalies = orion.detect(new_data)
```

:warning: Depending on your system and the exact versions you have installed, some WARNINGS may be printed. These can be safely ignored, as they do not interfere with the proper behavior of the pipeline.

The output of the previous command will be a pandas.DataFrame containing a table of detected anomalies:

```
        start         end  severity
0  1402012800  1403870400  0.122539
```
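The `start` and `end` values look like Unix timestamps in seconds. Assuming that is the case, the following sketch turns the detected interval into readable dates. It uses only the Python standard library and the row printed above; `describe_interval` is a hypothetical helper, not an Orion function.

```python
from datetime import datetime, timezone

# The single anomaly row from the output above: (start, end, severity).
anomaly = (1402012800, 1403870400, 0.122539)


def describe_interval(start, end):
    """Convert epoch seconds to UTC datetimes and compute the duration."""
    start_dt = datetime.fromtimestamp(start, tz=timezone.utc)
    end_dt = datetime.fromtimestamp(end, tz=timezone.utc)
    return start_dt, end_dt, end_dt - start_dt


start_dt, end_dt, duration = describe_interval(anomaly[0], anomaly[1])
print(start_dt.isoformat(), end_dt.isoformat(), duration)
```

Under that assumption, the flagged interval starts in early June 2014 and lasts about three weeks, which gives an expert a concrete window to review.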

Leaderboard

In every release, we run the Orion benchmark. We maintain an up-to-date leaderboard with the current scoring of the verified pipelines according to the benchmarking procedure.

We run the benchmark on 12 datasets with known ground truth and record the score of each pipeline on each dataset. To compute the leaderboard table, we report the number of wins each pipeline has over the ARIMA pipeline.

| Pipeline                  | Outperforms ARIMA  |
|---------------------------|--------------------|
| AER                       | 12                 |
| TadGAN                    | 7                  |
| LSTM Dynamic Thresholding | 9                  |
| LSTM Autoencoder          | 7                  |
| Dense Autoencoder         | 7                  |
| VAE                       | 6                  |
| AnomalyTransformer        | 2                  |
| LNN                       | 7                  |
| Matrix Profile            | 5                  |
| UniTS                     | 6                  |
| TimesFM                   | 7                  |
| GANF                      | 5                  |
| Azure                     | 0                  |
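The idea of counting wins over ARIMA can be illustrated with a short sketch. The scores below are invented purely for illustration (they are not benchmark results), the pipeline and dataset names are placeholders, and the assumptions that a higher score is better and that ties do not count as wins are ours rather than taken from the benchmarking procedure.

```python
# Hypothetical per-dataset scores; NOT real benchmark numbers.
scores = {
    "arima":      {"dataset_a": 0.50, "dataset_b": 0.40, "dataset_c": 0.70},
    "pipeline_x": {"dataset_a": 0.60, "dataset_b": 0.40, "dataset_c": 0.80},
    "pipeline_y": {"dataset_a": 0.30, "dataset_b": 0.55, "dataset_c": 0.65},
}


def wins_over_baseline(scores, baseline="arima"):
    """Count, per pipeline, the datasets where it strictly beats the baseline."""
    base = scores[baseline]
    return {
        name: sum(per_dataset[d] > base[d] for d in base)
        for name, per_dataset in scores.items()
        if name != baseline
    }


leaderboard = wins_over_baseline(scores)
print(leaderboard)
```

With 12 datasets, a pipeline that beats the baseline everywhere would score 12, which matches the maximum shown in the table.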

You can find the scores of each pipeline on every signal recorded in the details Google Sheets document. The summarized results can also be browsed in the following summary Google Sheets document.

Resources

Additional resources that might be of interest:

* Learn about benchmarking pipelines.
* Read about pipeline evaluation.
* Find out more about TadGAN.

Citation

If you use AER for your research, please consider citing the following paper:

Lawrence Wong, Dongyu Liu, Laure Berti-Equille, Sarah Alnegheimish, Kalyan Veeramachaneni. AER: Auto-Encoder with Regression for Time Series Anomaly Detection.

```
@inproceedings{wong2022aer,
  title={AER: Auto-Encoder with Regression for Time Series Anomaly Detection},
  author={Wong, Lawrence and Liu, Dongyu and Berti-Equille, Laure and Alnegheimish, Sarah and Veeramachaneni, Kalyan},
  booktitle={2022 IEEE International Conference on Big Data (IEEE BigData)},
  pages={1152-1161},
  doi={10.1109/BigData55660.2022.10020857},
  organization={IEEE},
  year={2022}
}
```

If you use TadGAN for your research, please consider citing the following paper:

Alexander Geiger, Dongyu Liu, Sarah Alnegheimish, Alfredo Cuesta-Infante, Kalyan Veeramachaneni. TadGAN - Time Series Anomaly Detection Using Generative Adversarial Networks.

```
@inproceedings{geiger2020tadgan,
  title={TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks},
  author={Geiger, Alexander and Liu, Dongyu and Alnegheimish, Sarah and Cuesta-Infante, Alfredo and Veeramachaneni, Kalyan},
  booktitle={2020 IEEE International Conference on Big Data (IEEE BigData)},
  pages={33-43},
  doi={10.1109/BigData50022.2020.9378139},
  organization={IEEE},
  year={2020}
}
```

If you use Orion, which is part of the Sintel ecosystem, for your research, please consider citing the following paper:

Sarah Alnegheimish, Dongyu Liu, Carles Sala, Laure Berti-Equille, Kalyan Veeramachaneni. Sintel: A Machine Learning Framework to Extract Insights from Signals.

```
@inproceedings{alnegheimish2022sintel,
  title={Sintel: A Machine Learning Framework to Extract Insights from Signals},
  author={Alnegheimish, Sarah and Liu, Dongyu and Sala, Carles and Berti-Equille, Laure and Veeramachaneni, Kalyan},
  booktitle={Proceedings of the 2022 International Conference on Management of Data},
  pages={1855–1865},
  numpages={11},
  publisher={Association for Computing Machinery},
  doi={10.1145/3514221.3517910},
  series={SIGMOD '22},
  year={2022}
}
```

Owner

  • Name: The Signal Intelligence Project
  • Login: sintel-dev
  • Kind: organization
  • Email: dai-lab@mit.edu

Systems and tools to design, develop and deploy AI applications on top of signals.

GitHub Events

Total
  • Create event: 58
  • Release event: 2
  • Issues event: 26
  • Watch event: 214
  • Delete event: 48
  • Issue comment event: 35
  • Push event: 168
  • Pull request review comment event: 21
  • Pull request event: 102
  • Pull request review event: 45
  • Fork event: 20
Last Year
  • Create event: 58
  • Release event: 2
  • Issues event: 26
  • Watch event: 214
  • Delete event: 48
  • Issue comment event: 35
  • Push event: 168
  • Pull request review comment event: 21
  • Pull request event: 102
  • Pull request review event: 45
  • Fork event: 20

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 576
  • Total Committers: 15
  • Avg Commits per committer: 38.4
  • Development Distribution Score (DDS): 0.538
Past Year
  • Commits: 77
  • Committers: 3
  • Avg Commits per committer: 25.667
  • Development Distribution Score (DDS): 0.494
Top Committers
Name Email Commits
sarahmish s****h@g****m 266
Carles Sala c****s@p****m 113
DAILab-bot 1****t 110
Hector Dominguez h****2@h****m 35
Alexander Geiger a****5@g****e 14
dyuliu u****u@g****m 13
Manuel Alvarez m****l@p****m 8
Kalyan Veeramachaneni k****u@g****m 4
Lawrence Wong 3****8 3
Linh Nguyen 7****k 3
Hramir 5****r 2
Plamen Valentinov Kolev 4****r 2
Micah Smith m****h@g****m 1
Na Rae Baek 9****U 1
kronerte 3****e 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 125
  • Total pull requests: 234
  • Average time to close issues: 4 months
  • Average time to close pull requests: 13 days
  • Total issue authors: 73
  • Total pull request authors: 11
  • Average comments per issue: 2.46
  • Average comments per pull request: 0.08
  • Merged pull requests: 203
  • Bot issues: 0
  • Bot pull requests: 1
Past Year
  • Issues: 20
  • Pull requests: 88
  • Average time to close issues: 11 days
  • Average time to close pull requests: 4 days
  • Issue authors: 10
  • Pull request authors: 3
  • Average comments per issue: 1.5
  • Average comments per pull request: 0.01
  • Merged pull requests: 75
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • sarahmish (25)
  • ajthomas1949 (6)
  • LuSchnitt (4)
  • dxiaosa (4)
  • nunobv (4)
  • mohammadx0098 (3)
  • shaojunjie0912 (3)
  • shimonymittal (2)
  • eliashyatt (2)
  • NIckyVNYC (2)
  • fdion (2)
  • ali-Eskandarian (2)
  • SebiChesh (2)
  • mshariful (1)
  • erioe-m (1)
Pull Request Authors
  • DAILab-bot (191)
  • sarahmish (85)
  • Linh-nk (8)
  • dyuliu (3)
  • MihirT906 (2)
  • lcwong0928 (1)
  • BAN2ARU (1)
  • gsheni (1)
  • dependabot[bot] (1)
  • dxiaosa (1)
Top Labels
Issue Labels
question (62) enhancement (11) maintenance (7) bug (4) new feature (4) internal improvements (4) documentation (1) under discussion (1) datasets (1)
Pull Request Labels
enhancement (7) dependencies (6) maintenance (5) documentation (4) new feature (1) bug (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 574 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 3
  • Total versions: 47
  • Total maintainers: 4
pypi.org: orion-ml

Orion is a machine learning library built for unsupervised time series anomaly detection.

  • Versions: 47
  • Dependent Packages: 0
  • Dependent Repositories: 3
  • Downloads: 574 Last month
Rankings
Stargazers count: 2.1%
Forks count: 4.0%
Average: 7.0%
Dependent repos count: 8.9%
Downloads: 9.7%
Dependent packages count: 10.1%
Maintainers (4)
Last synced: 6 months ago

Dependencies

.github/workflows/docs.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v1 composite
  • peaceiris/actions-gh-pages v3 composite
.github/workflows/latest-dependencies.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • peter-evans/create-pull-request v3 composite
.github/workflows/tests.yml actions
  • actions/checkout v1 composite
  • actions/setup-python v2 composite