https://github.com/zillow/luminaire

Luminaire is a Python package that provides ML-driven solutions for monitoring time series data.

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.0%) to scientific vocabulary

Keywords

anomaly-detection automl forecasting outlier-detection time-series
Last synced: 5 months ago

Repository

Luminaire is a Python package that provides ML-driven solutions for monitoring time series data.

Basic Info
Statistics
  • Stars: 786
  • Watchers: 21
  • Forks: 65
  • Open Issues: 29
  • Releases: 32
Topics
anomaly-detection automl forecasting outlier-detection time-series
Created over 5 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License Code of conduct

README.md

Luminaire

A hands-off Anomaly Detection Library

What is Luminaire

Luminaire is a Python package that provides ML-driven solutions for monitoring time series data. Luminaire provides several anomaly detection and forecasting capabilities that incorporate correlational and seasonal patterns as well as uncontrollable variations in the data over time.

Quick Start

Install Luminaire from PyPI using pip:

`pip install luminaire`

Import the luminaire module in Python:

`import luminaire`

See Examples to get started. Also, refer to the Luminaire documentation for a detailed description of methods and usage.

Time Series Outlier Detection Workflow

Luminaire Flow

The Luminaire outlier detection workflow can be divided into three major components:

Data Preprocessing and Profiling Component

This component can be called to prepare a time series prior to training an anomaly detection model on it. This step applies a number of methods that make anomaly detection more accurate and reliable, including missing data imputation, identifying and removing recent outliers from training data, necessary mathematical transformations, and data truncation based on recent change points. It also generates profiling information (historical change points, trend changes, etc.) that is considered in the training process.

Profiling information for time series data can be used to monitor data drift and irregular long-term swings.
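Two of the preprocessing ideas named above, missing data imputation and removal of recent outliers, can be illustrated with a minimal standard-library sketch. This is not Luminaire's implementation: the function names `impute_missing` and `clip_recent_outliers`, the linear interpolation, and the median-absolute-deviation rule with its `threshold` and `recent` parameters are all assumptions chosen for illustration.

```python
from statistics import median


def impute_missing(values):
    """Fill None gaps by linear interpolation between the nearest observed neighbours."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:            # gap at the start: copy the first observed value
            filled[i] = values[right]
        elif right is None:         # gap at the end: copy the last observed value
            filled[i] = values[left]
        else:                       # interior gap: interpolate linearly
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def clip_recent_outliers(values, recent=5, threshold=3.5):
    """Replace extreme points among the last `recent` observations with the series median."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return list(values)
    cleaned = list(values)
    for i in range(max(0, len(values) - recent), len(values)):
        # 0.6745 rescales the MAD so the score is comparable to a z-score for normal data
        if abs(0.6745 * (values[i] - med) / mad) > threshold:
            cleaned[i] = med
    return cleaned


raw = [10.0, 11.0, None, 13.0, 12.0, 11.0, 95.0, 12.0]
prepared = clip_recent_outliers(impute_missing(raw))
print(prepared)
```

In this toy series the gap is filled from its neighbours and the recent spike of 95.0 is replaced by the median, so the training data no longer carries either artifact.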

Modeling Component

This component performs time series model training based on either a user-specified configuration or an optimized configuration (see Luminaire hyperparameter optimization). Luminaire model training is integrated with different structural time series models as well as filtering-based models. See Luminaire outlier detection for more information.

Call the Luminaire modeling step after the data preprocessing and profiling step, so that the necessary data preparation is done before training.
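To give a feel for what a filtering-based detector does, here is a minimal sketch that tracks a series with an exponentially weighted filter and scores each new point by its standardized forecast error. It is a generic illustration, not one of Luminaire's models: the function name `ewma_anomaly_scores` and the `alpha` and `warmup` parameters are assumptions.

```python
from math import sqrt


def ewma_anomaly_scores(values, alpha=0.3, warmup=3):
    """Return one (forecast, z_score) pair for each of values[1:].

    The forecast for a point is the filter level before that point is seen;
    the z-score is the forecast error divided by a running error deviation.
    """
    level = values[0]
    variance = 0.0
    scores = []
    for t, x in enumerate(values[1:], start=1):
        residual = x - level
        # Scores are suppressed until the variance estimate has seen a few points
        z = residual / sqrt(variance) if t > warmup and variance > 0 else 0.0
        scores.append((level, z))
        # Exponentially weighted updates of the error variance and the level
        variance = (1 - alpha) * (variance + alpha * residual * residual)
        level += alpha * residual
    return scores


series = [10.0, 10.5, 9.5, 10.2, 9.8, 10.1, 20.0, 10.0]
scores = ewma_anomaly_scores(series)
# Entry i of `scores` describes series[i + 1], hence start=1
flagged = [t for t, (_, z) in enumerate(scores, start=1) if abs(z) > 3]
print(flagged)
```

Only the point at index 6 (the value 20.0) exceeds the threshold here; the point after it is scored against a variance that has already absorbed the spike, so it is not flagged.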

Configuration Optimization Component

Luminaire's integration with configuration optimization enables a hands-off anomaly detection process where the user needs to provide only minimal configuration for monitoring any type of time series data. This step can be combined with the preprocessing and modeling steps for any auto-configured anomaly detection use case. See fully automatic outlier detection for a detailed walkthrough.
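The idea of choosing a configuration from the data rather than by hand can be sketched with a plain grid search. The example below picks the smoothing factor of a simple exponential smoother by minimizing one-step-ahead forecast error. The search strategy, the single `alpha` hyperparameter, and the function names are assumptions for illustration and do not describe how Luminaire's optimizer works internally.

```python
def one_step_mse(values, alpha):
    """Mean squared one-step-ahead error of simple exponential smoothing."""
    level = values[0]
    total = 0.0
    for x in values[1:]:
        error = x - level
        total += error * error
        level += alpha * error
    return total / (len(values) - 1)


def pick_alpha(values, candidates=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Return the candidate smoothing factor with the lowest forecast error."""
    return min(candidates, key=lambda a: one_step_mse(values, a))


trending = [float(i) for i in range(20)]   # steadily increasing series
noisy_flat = [10, 12, 9, 11, 10, 8, 12, 10, 9, 11, 10, 12, 8, 10, 11, 9]

print(pick_alpha(trending), pick_alpha(noisy_flat))
```

The trending series favours a fast-reacting smoother while the noisy flat series favours a slow one, which is the kind of per-series decision an automated configuration step takes off the user's hands.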

Anomaly Detection for High Frequency Time Series

Luminaire can also monitor a set of data points over windows of time instead of tracking individual data points. This approach is well-suited for streaming use cases where sustained fluctuations are of greater concern than individual fluctuations. See anomaly detection for streaming data for detailed information.
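The difference between scoring windows and scoring individual points can be shown with a small sketch. Here each window is summarized by its median and compared with the medians of earlier baseline windows, so a lone spike barely moves the score while a sustained shift does. The median summary, the z-score comparison, and the names `split_windows` and `window_score` are illustrative assumptions, not the method Luminaire uses.

```python
from statistics import mean, median, stdev


def split_windows(values, size):
    """Cut a series into consecutive, non-overlapping windows of `size` points."""
    return [values[i:i + size] for i in range(0, len(values) - size + 1, size)]


def window_score(baseline_windows, window):
    """Z-score of a window's median against the medians of the baseline windows."""
    medians = [median(w) for w in baseline_windows]
    spread = stdev(medians)
    if spread == 0:
        return 0.0
    return (median(window) - mean(medians)) / spread


history = [10, 11, 9, 10, 12, 10, 9, 11, 10, 10, 11, 9, 10, 11, 10, 9]
baseline = split_windows(history, 4)

single_spike = [10, 25, 10, 10]      # one extreme point inside the window
sustained_shift = [14, 15, 14, 15]   # the whole window sits at a new level

print(window_score(baseline, single_spike))
print(window_score(baseline, sustained_shift))
```

With a threshold of 3 on the absolute score, only the sustained shift would be reported.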

Examples

Batch Time Series Monitoring

```python
import pandas as pd
from luminaire.optimization.hyperparameter_optimization import HyperparameterOptimization
from luminaire.exploration.data_exploration import DataExploration

data = pd.read_csv('Path to input time series data')
# Input data should have a time column set as the index column of the dataframe
# and a value column named as 'raw'

# Optimization
hopt_obj = HyperparameterOptimization(freq='D')
opt_config = hopt_obj.run(data=data)

# Profiling
de_obj = DataExploration(freq='D', **opt_config)
training_data, pre_prc = de_obj.profile(data)

# Identify Model
model_class_name = opt_config['LuminaireModel']
module = __import__('luminaire.model', fromlist=[''])
model_class = getattr(module, model_class_name)

# Training
model_object = model_class(hyperparams=opt_config, freq='D')
success, model_date, trained_model = model_object.train(data=training_data, **pre_prc)

# Scoring
trained_model.score(100, '2021-01-01')
```
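The "Identify Model" step in the batch example resolves a class from a name held as a string. The sketch below shows the same pattern with a standard-library package standing in for `luminaire.model`, so it runs without Luminaire installed. The helper name `load_class` is hypothetical, and `importlib.import_module` is shown as a common alternative to calling `__import__` directly.

```python
import importlib


def load_class(module_path, class_name):
    """Resolve a class from a dotted module path and a class name given as strings."""
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# Without a fromlist, __import__ returns the top-level package of a dotted name
top_level = __import__("xml.dom")
# With a non-empty fromlist, it returns the named submodule itself
submodule = __import__("xml.dom", fromlist=[""])

# Stand-in for a lookup such as opt_config['LuminaireModel']
config = {"module": "xml.dom", "class_name": "Node"}
node_cls = load_class(config["module"], config["class_name"])

print(top_level.__name__, submodule.__name__, node_cls.__name__)
```

This is why the example passes `fromlist=['']`: without it, `getattr` would be applied to the top-level package rather than to the module that holds the model classes.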

Streaming Time Series Monitoring

```python
import pandas as pd
from luminaire.model.window_density import WindowDensityHyperParams, WindowDensityModel
from luminaire.exploration.data_exploration import DataExploration

data = pd.read_csv('Path to input time series data')
# Input data should have a time column set as the index column of the dataframe
# and a value column named as 'raw'

# Configuration Specs and Profiling
config = WindowDensityHyperParams().params
de_obj = DataExploration(**config)
data, pre_prc = de_obj.stream_profile(df=data)
config.update(pre_prc)

# Training
wdm_obj = WindowDensityModel(hyperparams=config)
success, training_end, model = wdm_obj.train(data=data)

# Scoring
# scoring_data is data over a time-window instead of a datapoint
score, scored_window = model.score(scoring_data)
```
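The streaming example scores `scoring_data`, a window of observations, without showing where that window comes from. As a rough standard-library illustration, the sketch below selects the trailing hour from a list of timestamped records. The helper name `latest_window`, the record layout, and the one-hour window are assumptions; with Luminaire the window would be a dataframe in the input format described above.

```python
from datetime import datetime, timedelta


def latest_window(records, window):
    """Return the (timestamp, value) records that fall in the trailing `window`."""
    end = max(ts for ts, _ in records)
    start = end - window
    return [(ts, value) for ts, value in records if start < ts <= end]


first = datetime(2021, 1, 1)
# Three hours of observations at 10-minute intervals
records = [(first + timedelta(minutes=10 * i), float(i)) for i in range(18)]

scoring_records = latest_window(records, timedelta(hours=1))
print(len(scoring_records), scoring_records[0])
```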

Contributing

Want to help improve Luminaire? Check out our contributing documentation.

Citing

Please cite the following article if Luminaire is used for any research purpose or scientific publication:

Chakraborty, S., Shah, S., Soltani, K., Swigart, A., Yang, L., & Buckingham, K. (2020, December). Building an Automated and Self-Aware Anomaly Detection System. In 2020 IEEE International Conference on Big Data (Big Data) (pp. 1465-1475). IEEE. (arxiv link)

Other Useful Resources

  • Chakraborty, S., Shah, S., Soltani, K., & Swigart, A. (2019, December). Root Cause Detection Among Anomalous Time Series Using Temporal State Alignment. In 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA) (pp. 523-528). IEEE. (arxiv link)

Development Team

Luminaire is developed and maintained by Sayan Chakraborty, Smit Shah, Kiumars Soltani, Luyao Yang, Anna Swigart, Kyle Buckingham and many other contributors from the Zillow Group A.I. team.

Owner

  • Name: Zillow
  • Login: zillow
  • Kind: organization
  • Location: United States

GitHub Events

Total
  • Issues event: 2
  • Watch event: 29
  • Issue comment event: 15
  • Push event: 1
  • Pull request event: 3
  • Fork event: 8
Last Year
  • Issues event: 2
  • Watch event: 29
  • Issue comment event: 15
  • Push event: 1
  • Pull request event: 3
  • Fork event: 8

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 170
  • Total Committers: 9
  • Avg Commits per committer: 18.889
  • Development Distribution Score (DDS): 0.429
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
| Name | Email | Commits |
|---|---|---|
| sayanc | s****c@z****m | 97 |
| Smit Shah | s****s@z****m | 53 |
| Luyao Yang | 1****x | 8 |
| Artem Bashev | a****v@g****m | 5 |
| pdurham2 | p****8@g****m | 2 |
| earthgecko | 9****o | 2 |
| markzxu | 3****u | 1 |
| Panagiotis Papaemmanouil | p****n@g****m | 1 |
| Anna Swigart | a****t@g****m | 1 |
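The all-time committer figures can be reproduced from the Top Committers counts with simple arithmetic. The sketch below assumes the Development Distribution Score is one minus the top committer's share of all commits. That definition is an assumption, though it does reproduce the reported 0.429.

```python
# Commit counts copied from the Top Committers list
commits = {
    "sayanc": 97,
    "Smit Shah": 53,
    "Luyao Yang": 8,
    "Artem Bashev": 5,
    "pdurham2": 2,
    "earthgecko": 2,
    "markzxu": 1,
    "Panagiotis Papaemmanouil": 1,
    "Anna Swigart": 1,
}

total_commits = sum(commits.values())
avg_per_committer = total_commits / len(commits)
# Assumed definition: 1 - (commits by the top committer / total commits)
dds = 1 - max(commits.values()) / total_commits

print(total_commits, round(avg_per_committer, 3), round(dds, 3))
```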
Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 50
  • Total pull requests: 56
  • Average time to close issues: 3 months
  • Average time to close pull requests: 15 days
  • Total issue authors: 20
  • Total pull request authors: 9
  • Average comments per issue: 1.12
  • Average comments per pull request: 1.07
  • Merged pull requests: 48
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 3
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 3
  • Pull request authors: 0
  • Average comments per issue: 1.33
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • sayanchk (15)
  • snazzyfox (9)
  • shahsmit14 (7)
  • Aristarhys (3)
  • Haraprasad987654321 (1)
  • grechasneak (1)
  • webbug2005 (1)
  • paulochf (1)
  • sebapehl (1)
  • kylebuckingham (1)
  • fuhao009 (1)
  • 9race (1)
  • vincentlin2 (1)
  • Frenz86 (1)
  • eaglewarrior (1)
Pull Request Authors
  • sayanchk (29)
  • shahsmit14 (8)
  • Aristarhys (7)
  • snazzyfox (6)
  • papaemman (4)
  • ferozed (3)
  • markzxu (1)
  • pdurham2 (1)
  • Phrrancis (1)
  • earthgecko (1)
Top Labels
Issue Labels
documentation (10) meta (7) bug (6) feature (4) help wanted (3) question (3) wontfix (1)
Pull Request Labels
meta (3) documentation (3) feature (2) bug (1)

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 395 last-month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 1
    (may contain duplicates)
  • Total versions: 36
  • Total maintainers: 1
proxy.golang.org: github.com/zillow/luminaire
  • Versions: 15
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 6.4%
Average: 6.6%
Dependent repos count: 6.8%
Last synced: 6 months ago
pypi.org: luminaire

Luminaire is a Python package that provides ML-driven solutions for monitoring time series data.

  • Versions: 21
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 395 Last month
Rankings
Stargazers count: 2.3%
Forks count: 5.6%
Downloads: 7.9%
Average: 9.5%
Dependent packages count: 10.1%
Dependent repos count: 21.6%
Maintainers (1)
Last synced: 6 months ago

Dependencies

docs/sphinx_requirements.txt pypi
  • sphinx *
  • sphinx-material *
requirements.txt pypi
  • changepy >=0.3.1
  • decorator >=5.1.0
  • hyperopt >=0.1.2
  • numpy >=1.17.5
  • pandas >=0.25.3
  • py4j <=0.10.9.3
  • pykalman >=0.9.5
  • scikit-learn >=0.24.2
  • scipy >=1.6.0
  • statsmodels >=0.13.0
.github/workflows/github-pages.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • peaceiris/actions-gh-pages v3 composite
.github/workflows/python-app.yml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/python-publish.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
setup.py pypi