TimeSeriesClustering

TimeSeriesClustering: An extensible framework in Julia - Published in JOSS (2019)

https://github.com/holgerteichgraeber/timeseriesclustering.jl

Keywords

clustering energy-systems hierarchical-clustering julia k-means-clustering k-medoids-clustering optimization representative-days time-series-aggregation

Keywords from Contributors

jump california capacity-expansion-planning energy-optimization-model germany pdes

Last synced: 10 months ago

Repository

Julia implementation of unsupervised learning methods for time series datasets. It provides functionality for clustering and aggregating, detecting motifs, and quantifying similarity between time series datasets.

Basic Info

Host: GitHub
Owner: holgerteichgraeber
License: mit
Language: Julia
Default Branch: master
Homepage:
Size: 171 MB

Statistics

Stars: 83
Watchers: 6
Forks: 23
Open Issues: 18
Releases: 15

Topics

clustering energy-systems hierarchical-clustering julia k-means-clustering k-medoids-clustering optimization representative-days time-series-aggregation

Created almost 8 years ago · Last pushed over 5 years ago

Metadata Files

Readme Changelog Contributing License

README.md

TimeSeriesClustering is a Julia implementation of unsupervised learning methods for time series datasets. It provides functionality for clustering and aggregating, detecting motifs, and quantifying similarity between time series datasets. The software includes a type system for temporal data and implements the most commonly used clustering methods and extreme value selection methods for temporal data. It integrates multi-dimensional time-series data (e.g. multiple attributes such as wind availability, solar availability, and electricity demand) in a single aggregation process. The software is applicable to general time series datasets and lends itself well to a multitude of application areas within the field of time series data mining.

The TimeSeriesClustering package was originally developed to perform time series aggregation for energy systems optimization problems. Using representative periods reduces the number of time steps in the optimization model, which significantly reduces the computational complexity of these problems. The package was previously known as ClustForOpt.jl.

The package has three main purposes:

1) Provide a simple process of finding representative periods (reducing the number of observations) for time-series input data, with implementations of the most commonly used clustering methods and extreme value selection methods.
2) Provide an interface between representative period data and application (e.g. optimization problem) by having representative period data stored in a generalized type system.
3) Provide a generalized import feature for time series, where variable names, attributes, and node names are automatically stored and can then be used later when the reduced time series is used in the application at hand (e.g. in the definition of sets of the optimization problem).
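To make the idea of extreme value selection concrete, here is a minimal sketch that uses only base Julia and synthetic data. It picks the period that contains the single largest value, which is one simple notion of an extreme period. The variable names and the data are assumptions for illustration; this is not the package's own selection routine, whose options may differ.

```julia
# Synthetic demand: 24 hours × 10 days (columns are periods); day 7 gets an artificial peak.
demand = fill(1.0, 24, 10)
demand[18, 7] = 5.0

# Absolute extreme: the period containing the largest single value.
peak_per_period = vec(maximum(demand, dims=1))
extreme_idx = argmax(peak_per_period)

println(extreme_idx)   # 7
```

A period selected this way could be kept alongside the clustered periods so that rare but important conditions are not averaged away.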

In the domain of energy systems optimization, an example problem that uses TimeSeriesClustering for its input data is the package CapacityExpansion, which implements a scalable generation and transmission capacity expansion problem.

The TimeSeriesClustering package follows the clustering framework presented in Teichgraeber and Brandt, 2019. The package is actively developed, and new features are continuously added. For a reproducible version of the methods and data of the original paper by Teichgraeber and Brandt, 2019, please refer to v0.1 (including shape-based methods such as k-shape and dynamic time warping barycenter averaging).

This package is developed by Holger Teichgraeber @holgerteichgraeber and Elias Kuepper @YoungFaithful.

Installation

This package runs under Julia v1.0 and higher. Install using:

    import Pkg
    Pkg.add("TimeSeriesClustering")

Documentation

Documentation (Stable): Please refer to this documentation for details on how to use the current version of TimeSeriesClustering. This is the documentation of the default version of the package. The default version is on the master branch.

Documentation (Development): If you would like to try the development version of TimeSeriesClustering, please refer to this documentation. The development version is on the dev branch.

See NEWS for significant breaking changes when updating from one version of TimeSeriesClustering to another.

Citing TimeSeriesClustering

If you find TimeSeriesClustering useful in your work, we kindly request that you cite the following paper:

    @article{Teichgraeber2019joss,
      author = {Teichgraeber, Holger and Kuepper, Lucas Elias and Brandt, Adam R},
      doi = {https://doi.org/10.21105/joss.01573},
      journal = {Journal of Open Source Software},
      number = {41},
      pages = {1573},
      title = {TimeSeriesClustering : An extensible framework in Julia},
      volume = {4},
      year = {2019}
    }

If you find this package useful, our paper on comparing clustering methods for energy systems optimization problems may additionally be of interest.

Quick Start Guide

This quick start guide introduces the main concepts of using TimeSeriesClustering. The examples are taken from problems in the domain of scenario reduction for energy systems optimization. For more detail on the different functionalities that TimeSeriesClustering provides, please refer to the subsequent chapters of the documentation or the examples in the examples folder, specifically workflow_introduction.jl.

Generally, the workflow consists of three steps:

- load data
- find representative periods (clustering + extreme period selection)
- optimization

Example Workflow

After TimeSeriesClustering is installed, you can use it by saying:

    using TimeSeriesClustering

The first step is to load the data. The following example loads hourly wind, solar, and demand data for Germany (1 region) for one year.

    ts_input_data = load_timeseries_data(:CEP_GER1)

The output ts_input_data is a ClustData data struct that contains the data and additional information about the data.

    ts_input_data.data # a dictionary with the data
    ts_input_data.data["wind-germany"] # the wind data (choose solar, el_demand as other options in this example)
    ts_input_data.K # number of periods
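As a rough illustration of what "number of periods" means for hourly data, the sketch below reshapes one synthetic year of hourly values into daily periods using only base Julia. It assumes each attribute is arranged as a matrix of time steps by periods; the variable names and the data are made up for the example and do not come from the package.

```julia
# Synthetic stand-in for one year of hourly data (8760 values); not the package's dataset.
hours_per_period = 24                             # time steps per period
hourly = [0.5 + 0.5 * sin(2π * h / 24) for h in 0:8759]

# Column k holds the 24 hourly values of day k.
n_periods = length(hourly) ÷ hours_per_period     # number of periods
periods = reshape(hourly, hours_per_period, n_periods)

println(size(periods))   # (24, 365)
```

With this layout, each column is one candidate period for the clustering step.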

The second step is to cluster the data into representative periods. Here, we use k-means clustering and get 5 representative periods.

    clust_res = run_clust(ts_input_data;method="kmeans",n_clust=5)
    ts_clust_data = clust_res.clust_data

The ts_clust_data is a ClustData data struct, this time with clustered data (i.e. fewer periods, each of them representative).

    ts_clust_data.data # the clustered data
    ts_clust_data.data["wind-germany"] # the wind data. Note the dimensions compared to ts_input_data
    ts_clust_data.K # number of periods
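For readers who want to see the idea behind the clustering step without installing the package, here is a minimal, self-contained sketch of plain Lloyd's k-means over daily profiles. The helper kmeans_periods and the synthetic data are hypothetical and serve only as illustration; this is not how run_clust is implemented, and it skips steps such as normalization that a real workflow may need.

```julia
using Random, Statistics

# Plain Lloyd's k-means over the columns of X (each column is one period).
# Hypothetical helper for illustration; not the package's run_clust implementation.
function kmeans_periods(X::AbstractMatrix{<:Real}, n_clust::Int;
                        iters::Int=50, rng=MersenneTwister(1))
    T, K = size(X)
    # Initialize the centers from randomly chosen periods.
    centers = float.(X[:, randperm(rng, K)[1:n_clust]])
    assignments = zeros(Int, K)
    for _ in 1:iters
        # Assignment step: nearest center by squared Euclidean distance.
        for k in 1:K
            dists = [sum(abs2, X[:, k] .- centers[:, c]) for c in 1:n_clust]
            assignments[k] = argmin(dists)
        end
        # Update step: each center becomes the mean of its assigned periods.
        for c in 1:n_clust
            members = findall(==(c), assignments)
            isempty(members) || (centers[:, c] = vec(mean(X[:, members], dims=2)))
        end
    end
    # Weight of a cluster: how many original periods it represents.
    weights = [count(==(c), assignments) for c in 1:n_clust]
    return centers, assignments, weights
end

# Synthetic data: 365 daily profiles of 24 hours each.
X = rand(MersenneTwister(42), 24, 365)
centers, assignments, weights = kmeans_periods(X, 5)

println(size(centers))   # (24, 5)
println(sum(weights))    # 365
```

The centers play the role of the representative periods, and the weights record how many original periods each one stands for.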

If this package is used in the domain of energy systems optimization, the clustered input data can be used as input to an optimization problem. The optimization problem formulated in the package CapacityExpansion can be used with the data clustered in this example.
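The sketch below shows one common way representative periods enter an objective function: each period's cost is scaled by the number of original periods it represents. All numbers, the constant price, and the variable names are assumptions for illustration; they are not taken from the package or from CapacityExpansion.

```julia
# Hypothetical representative periods: 3 periods of 4 time steps each (columns are periods).
rep_demand = [1.0 2.0 3.0;
              1.0 2.0 3.0;
              2.0 3.0 4.0;
              2.0 3.0 4.0]
weights = [200, 100, 65]   # assumed number of original periods each one represents
price = 0.25               # assumed constant cost per unit of demand

# Cost of each representative period, scaled by how many original periods it stands for.
period_cost = price .* vec(sum(rep_demand, dims=1))
annual_cost = sum(weights .* period_cost)

println(annual_cost)   # 777.5
```

Because the weights sum to 365, the result approximates a full-year total while the model only has to handle three periods.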

Owner

Name: Holger Teichgraeber
Login: holgerteichgraeber
Kind: user
Company: Stanford University

Website: www.hteich.com
Repositories: 14
Profile: https://github.com/holgerteichgraeber

Ph.D. student at Stanford. Interested in machine learning and optimization.

JOSS Publication

TimeSeriesClustering: An extensible framework in Julia

Published

September 08, 2019

DOI

10.21105/joss.01573

Volume 4, Issue 41, Page 1573

Authors

Holger Teichgraeber

Department of Energy Resources Engineering, Stanford University

Lucas Elias Kuepper

Department of Energy Resources Engineering, Stanford University

Adam R. Brandt

Department of Energy Resources Engineering, Stanford University

Editor

Daniel S. Katz

GitHub Events

Total

Issues event: 1
Watch event: 3

Last Year

Issues event: 1
Watch event: 3

Committers

Last synced: 11 months ago

All Time

Total Commits: 444
Total Committers: 10
Avg Commits per committer: 44.4
Development Distribution Score (DDS): 0.446

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
holgerteichgraeber	h**r@t**e	246
elias.kuepper	e**r@r**e	155
Holger Teichgraeber	h**h@c**u	20
Holger Teichgraeber	H****r	14
Holger Teichgraeber	h**h@c**u	3
Daniel S. Katz	d**z@i**g	2
Alan Sill	a**l@t**u	1
Julia TagBot	5****t	1
Niklas Haag	n**g@r**e	1
arbrandt	4****t	1

Committer Domains (Top 20 + Academic)

relexsolutions.de: 1
ttu.edu: 1
ieee.org: 1
cees-tool-8.stanford.edu: 1
cees-mazama.stanford.edu: 1
rwth-aachen.de: 1
teichgr.de: 1

Issues and Pull Requests

Last synced: 11 months ago

All Time

Total issues: 54
Total pull requests: 80
Average time to close issues: about 1 month
Average time to close pull requests: 17 days
Total issue authors: 8
Total pull request authors: 8
Average comments per issue: 2.74
Average comments per pull request: 1.4
Merged pull requests: 70
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 1
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 1
Pull request authors: 0
Average comments per issue: 0.0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

holgerteichgraeber (32)
YoungFaithful (16)
junglegobs (1)
SariKerckhove (1)
Deepakgthomas (1)
jarupas (1)
1256ABCDE (1)
dsambor10 (1)

Pull Request Authors

holgerteichgraeber (39)
YoungFaithful (34)
danielskatz (2)
alansill (1)
niklashaa (1)
arbrandt (1)
leonardgoeke (1)
JuliaTagBot (1)

Top Labels

Issue Labels

enhancement (21)
bug (10)

Pull Request Labels

bug (8)
enhancement (5)
wontfix (1)

Packages

Total packages: 1
Total downloads: unknown

Total dependent packages: 1
Total dependent repositories: 0
Total versions: 4

juliahub.com: TimeSeriesClustering

Julia implementation of unsupervised learning methods for time series datasets. It provides functionality for clustering and aggregating, detecting motifs, and quantifying similarity between time series datasets.

Documentation: https://docs.juliahub.com/General/TimeSeriesClustering/stable/
License: MIT
Latest release: 0.5.3
published almost 7 years ago

Versions: 4
Dependent Packages: 1
Dependent Repositories: 0

Rankings

Forks count: 8.4%

Dependent repos count: 9.9%

Stargazers count: 10.1%

Average: 12.9%

Dependent packages count: 23.0%

Last synced: 10 months ago

TimeSeriesClustering

Science Score: 95.0%
