tsfresh

Automatic extraction of relevant features from time series:

https://github.com/blue-yonder/tsfresh

Science Score: 59.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 21 DOI reference(s) in README
✓ Academic publication links: Links to arxiv.org
✓ Committers with academic emails: 5 of 99 committers (5.1%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (18.3%) to scientific vocabulary

Keywords

data-science feature-extraction time-series

Keywords from Contributors

alignment flexible hyperparameter-optimization neuroimaging distributed

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: blue-yonder
License: mit
Language: Jupyter Notebook
Default Branch: main
Homepage: http://tsfresh.readthedocs.io
Size: 7.97 MB

Statistics

Stars: 8,940
Watchers: 169
Forks: 1,257
Open Issues: 70
Releases: 17

Topics

data-science feature-extraction time-series

Created over 9 years ago · Last pushed 6 months ago

Metadata Files

Readme Changelog License Authors

tsfresh

This repository contains the TSFRESH Python package. The abbreviation stands for

"Time Series Feature extraction based on scalable hypothesis tests".

The package provides systematic time-series feature extraction by combining established algorithms from statistics, time-series analysis, signal processing, and nonlinear dynamics with a robust feature selection algorithm. In this context, the term time-series is interpreted in the broadest possible sense, such that any type of sampled data or even event sequences can be characterised.

Spend less time on feature engineering

Data scientists often spend most of their time either cleaning data or building features. While we cannot change the former, the latter can be automated. TSFRESH frees up the time you would spend building features by extracting them automatically. Hence, you have more time to study the newest deep learning paper, read Hacker News, or build better models.

Automatic extraction of 100s of features

TSFRESH automatically extracts hundreds of features from time series. Those features describe basic characteristics of the time series, such as the number of peaks and the average or maximal value, as well as more complex characteristics such as the time reversal symmetry statistic.
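To make the kinds of features named above concrete, here is a minimal plain-Python sketch. It is not tsfresh's implementation: the function names are hypothetical, the peak count uses a simplified "strict local maximum" notion, and the asymmetry formula follows the commonly cited definition of the time reversal asymmetry statistic, so tsfresh's own calculators may differ in detail.

```python
from statistics import mean


def number_of_peaks(x):
    """Count strict local maxima (a simplified notion of a peak)."""
    return sum(1 for i in range(1, len(x) - 1) if x[i - 1] < x[i] > x[i + 1])


def time_reversal_asymmetry(x, lag):
    """Mean of x[i+2*lag]**2 * x[i+lag] - x[i+lag] * x[i]**2 over all valid i."""
    n = len(x) - 2 * lag
    if n <= 0:
        return 0.0  # series too short for this lag
    return mean(
        x[i + 2 * lag] ** 2 * x[i + lag] - x[i + lag] * x[i] ** 2
        for i in range(n)
    )


# Invented example series, for illustration only.
series = [1.0, 3.0, 2.0, 5.0, 4.0, 4.5, 1.0]

features = {
    "mean": mean(series),
    "maximum": max(series),
    "number_of_peaks": number_of_peaks(series),
    "time_reversal_asymmetry_lag_1": time_reversal_asymmetry(series, lag=1),
}
print(features)
```

Each time series is reduced to one such dictionary of named numbers; a real extraction run produces many more entries per series.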

The features extracted from an exemplary time series

The set of features can then be used to construct statistical or machine learning models on the time series, for example for regression or classification tasks.

Forget irrelevant features

Time series often contain noise, redundancies or irrelevant information. As a result, most of the extracted features will not be useful for the machine learning task at hand.

To avoid extracting irrelevant features, the TSFRESH package has a built-in filtering procedure. This filtering procedure evaluates the explanatory power and importance of each characteristic for the regression or classification task at hand.

It is based on the well-developed theory of hypothesis testing and uses a multiple testing procedure. As a result, the filtering process mathematically controls the percentage of irrelevant extracted features.
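The idea of controlling the share of irrelevant features with a multiple testing procedure can be sketched as follows. The text above does not name the procedure, so this example assumes the Benjamini-Yekutieli procedure as one well-known choice; the function name, the feature names and the p-values are invented for illustration, and this is not tsfresh's code.

```python
def benjamini_yekutieli(p_values, fdr_level=0.05):
    """Return the names of features kept at the given false discovery rate.

    p_values maps a feature name to the p-value of its relevance test.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Correction factor that makes the bound hold under arbitrary dependence.
    c_m = sum(1.0 / i for i in range(1, m + 1))
    ranked = sorted(p_values.items(), key=lambda item: item[1])
    cutoff = 0
    for k, (_, p) in enumerate(ranked, start=1):
        # Keep the largest rank k whose p-value is below its threshold.
        if p <= k * fdr_level / (m * c_m):
            cutoff = k
    return [name for name, _ in ranked[:cutoff]]


# Invented p-values, one per extracted feature.
p_values = {"mean": 0.001, "maximum": 0.01, "number_of_peaks": 0.03, "length": 0.5}
selected = benjamini_yekutieli(p_values, fdr_level=0.05)
print(selected)
```

Every feature gets its own hypothesis test against the target, and the procedure then decides jointly which p-values are small enough, rather than applying a fixed cut-off to each feature in isolation.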

The TSFRESH package is described in the following open access paper:

Christ, M., Braun, N., Neuffer, J., and Kempa-Liehr, A.W. (2018). Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests (tsfresh -- A Python package). Neurocomputing 307, p. 72-77, doi: 10.1016/j.neucom.2018.03.067.

The FRESH algorithm is described in the following whitepaper:

Christ, M., Kempa-Liehr, A.W., and Feindt, M. (2017). Distributed and parallel time series feature extraction for industrial big data applications. ArXiv e-print 1610.07717, https://arxiv.org/abs/1610.07717.

Systematic time-series feature extraction even works for unsupervised problems:

Teh, H.Y., Wang, K.I-K., Kempa-Liehr, A.W. (2021). Expect the Unexpected: Unsupervised feature selection for automated sensor anomaly detection. IEEE Sensors Journal 15.16, p. 18033-18046, doi: 10.1109/JSEN.2021.3084970.

Because tsfresh provides time-series feature extraction essentially for free, you can concentrate on engineering new time series, such as differences of signals from synchronous measurements, which provide even better time-series features:

Kempa-Liehr, A.W., Oram, J., Wong, A., Finch, M., Besier, T. (2020). Feature engineering workflow for activity recognition from synchronized inertial measurement units. In: Pattern Recognition. ACPR 2019. Ed. by M. Cree et al. Vol. 1180. Communications in Computer and Information Science (CCIS). Singapore: Springer, p. 223–231. doi: 10.1007/978-981-15-3651-9_20.
Simmons, S., Jarvis, L., Dempsey, D., Kempa-Liehr, A.W. (2021). Data Mining on Extremely Long Time-Series. In: 2021 International Conference on Data Mining Workshops (ICDMW). Ed. by B. Xue et al. Los Alamitos: IEEE, p. 1057-1066. doi: 10.1109/ICDMW53433.2021.00137.
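A minimal sketch of engineering a new time series from two synchronous measurements is shown below. The function name and the sensor readings are invented for illustration; the cited papers describe their own, more elaborate workflows.

```python
def difference_series(a, b):
    """Pointwise difference of two signals sampled at the same time points."""
    if len(a) != len(b):
        raise ValueError("signals must be sampled at the same time points")
    return [x - y for x, y in zip(a, b)]


# Invented readings from two synchronized sensors.
sensor_a = [1.0, 2.5, 4.0]
sensor_b = [0.5, 2.0, 1.0]
derived = difference_series(sensor_a, sensor_b)
print(derived)
```

The derived series can then go through feature extraction exactly like the original signals.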

Systematic time-series feature engineering makes it possible to work with time-series samples of different lengths, because every time series is projected into a well-defined feature space. This approach allows the design of robust machine learning algorithms in applications with missing data.

Kennedy, A., Gemma, N., Rattenbury, N., Kempa-Liehr, A.W. (2021). Modelling the projected separation of microlensing events using systematic time-series feature engineering. Astronomy and Computing 35.100460, p. 1–14, doi: 10.1016/j.ascom.2021.100460.
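The projection of series with different lengths into one feature space can be illustrated with a small plain-Python sketch. The chosen features and the sample data are assumptions made for this example only.

```python
from statistics import mean, pstdev


def to_feature_vector(x):
    """Map a series of any length to a fixed-length vector of five features."""
    return [len(x), mean(x), pstdev(x), min(x), max(x)]


# Invented samples with different numbers of observations.
samples = {
    "short": [1.0, 2.0, 3.0],
    "long": [2.0, 2.0, 4.0, 4.0, 3.0, 3.0],
}
matrix = {name: to_feature_vector(x) for name, x in samples.items()}
print(matrix)
```

Both rows have the same number of columns even though the inputs differ in length, which is what lets a standard model consume them side by side.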

Is your time-series classification problem imbalanced? There is a good chance that undersampling of time-series feature matrices might solve your problem:

Dempsey, D.E., Cronin, S.J., Mei, S., Kempa-Liehr, A.W. (2020). Automatic precursor recognition and real-time forecasting of sudden explosive volcanic eruptions at Whakaari, New Zealand. Nature Communications 11.3562, p. 1-8, doi: 10.1038/s41467-020-17375-2.
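Random undersampling of a feature matrix can be sketched as below: every class is reduced to the size of the smallest class. This is one simple variant, with a hypothetical function name and invented data, and not necessarily the scheme used in the cited paper.

```python
import random


def undersample(rows, labels, seed=0):
    """Randomly drop rows so that every class has as many rows as the smallest one."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for row in rng.sample(members, n_min):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Invented feature matrix: 8 rows of class 0 and 2 rows of class 1.
rows = [[float(i), float(i) * 2.0] for i in range(10)]
labels = [0] * 8 + [1] * 2
balanced_rows, balanced_labels = undersample(rows, labels)
print(balanced_labels)
```

Because the rows are already fixed-length feature vectors, the resampling needs no knowledge of the underlying time series.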

Natural language processing of written texts is an example of applying systematic time-series feature engineering to event sequences, which is described in the following open access paper:

Tang, Y., Blincoe, K., Kempa-Liehr, A.W. (2020). Enriching Feature Engineering for Short Text Samples by Language Time Series Analysis. EPJ Data Science 9.26, p. 1–59, doi: 10.1140/epjds/s13688-020-00244-9.

Advantages of tsfresh

TSFRESH has several selling points, for example:

it is field tested
it is unit tested
the filtering process is statistically/mathematically correct
it has comprehensive documentation
it is compatible with sklearn, pandas and numpy
it allows anyone to easily add their favorite features
it runs both on your local machine and on a cluster

Next steps

If you are interested in the technical workings, see our comprehensive Read the Docs documentation at http://tsfresh.readthedocs.io.

The algorithm, especially the filtering part, is also described in the paper mentioned above.

We appreciate any contributions. If you are interested in helping us make TSFRESH the biggest archive of feature extraction methods in Python, just head over to our How-To-Contribute instructions.

If you want to try out tsfresh quickly or if you want to integrate it into your workflow, we also have a Docker image available:

docker pull nbraun/tsfresh

Backwards compatibility

If you need to reproduce or update time-series features that were computed with the matrixprofile feature calculators, you need to create a Python 3.8 environment:

conda create --name tsfresh__py_3.8 python=3.8
conda activate tsfresh__py_3.8
pip install tsfresh[matrixprofile]

Acknowledgements

The research and development of TSFRESH was funded in part by the German Federal Ministry of Education and Research under grant number 01IS14004 (project iPRODICT).

Owner

Name: Blue Yonder GmbH
Login: blue-yonder
Kind: organization
Location: Karlsruhe, Germany

Website: https://www.blueyonder.com
Repositories: 32
Profile: https://github.com/blue-yonder

GitHub Events

Total

Create event: 6
Release event: 2
Issues event: 5
Watch event: 475
Delete event: 4
Issue comment event: 30
Push event: 16
Pull request event: 23
Pull request review event: 29
Pull request review comment event: 25
Fork event: 51

Last Year

Create event: 6
Release event: 2
Issues event: 5
Watch event: 475
Delete event: 4
Issue comment event: 30
Push event: 16
Pull request event: 23
Pull request review event: 29
Pull request review comment event: 25
Fork event: 51

Committers

Last synced: 9 months ago

All Time

Total Commits: 554
Total Committers: 99
Avg Commits per committer: 5.596
Development Distribution Score (DDS): 0.679

Past Year

Commits: 16
Committers: 5
Avg Commits per committer: 3.2
Development Distribution Score (DDS): 0.563

Top Committers

Name	Email	Commits
Nils Braun	n****n	178
Maximilian Christ	m**t@m**m	151
akem134@elan	a**r@a**z	48
earthgecko	g**n@o**k	12
Julius Neuffer	j****f	11
Denis Barbier	b**r@i**r	11
Marx	M**s@f**e	8
Maximilian Christ	m**t@m**l	8
Niklas Haas	n**a@m**g	6
Scott-Simmons	5****s	5
Delyan	d**v@g**m	4
Nigel Bosch	p**b@g**m	4
Stephan Müller	m**l@s**u	4
flyingdutchman23	f**n@p**u	3
moritzgelb	m****b	3
Oli	o**s@t**e	3
Thibault de Boissiere	t**e@s**m	3
Julius Neuffer	j**r@b**m	3
Brian Sang	s**i@g**m	2
Derrick	d****s	2
Timo Klerx	t**k@m**e	2
Sean M. Law	7****w	2
Sarius2009	4****9	2
Mario Kahlhofer	m**r@g**m	2
Evans Doe Ocansey	5****e	2
vin tang	v**g@g**m	2
Dimitris Spathis	d**s@g**m	1
Dominic White	g**b@d**m	1
Emanuele Fumagalli	e****f	1
Erlend Aune	e**3@g**m	1
and 69 more...

Committer Domains (Top 20 + Academic)

me.com: 2
auckland.ac.nz: 1
of-networks.co.uk: 1
imacs.polytechnique.fr: 1
frey1.de: 1
mailbox.org: 1
stephanmueller.eu: 1
posteo.eu: 1
t-online.de: 1
seeingmachines.com: 1
blue-yonder.com: 1
mail.upb.de: 1
domwhite.com: 1
ardentresearch.com: 1
pm.me: 1
126.com: 1
donders.ru.nl: 1
uni-muenster.de: 1
reply.de: 1
case.edu: 1
umich.edu: 1
alumni.ubc.ca: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 99
Total pull requests: 74
Average time to close issues: 6 months
Average time to close pull requests: about 2 months
Total issue authors: 89
Total pull request authors: 42
Average comments per issue: 5.28
Average comments per pull request: 2.04
Merged pull requests: 45
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 4
Pull requests: 25
Average time to close issues: 24 days
Average time to close pull requests: 8 days
Issue authors: 4
Pull request authors: 6
Average comments per issue: 2.0
Average comments per pull request: 0.48
Merged pull requests: 16
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

b-y-f (7)
maxdembovsky (2)
mendel5 (2)
heib6xinyu (2)
NoamGit (1)
Yasslight90 (1)
liorshk (1)
kempa-liehr (1)
loginofdeath (1)
leonanto (1)
papaschlumpf1972 (1)
jtlz2 (1)
momijiame (1)
LoganCarvalho (1)
seb-koch (1)

Pull Request Authors

nils-braun (19)
Scott-Simmons (13)
Sarius2009 (4)
igor-pechersky (2)
Apsylem (2)
aioneersSZ (2)
heib6xinyu (2)
dilwong (2)
evansdoe (2)
YamaByte (2)
mendel5 (2)
dom-white (2)
abigsmall (1)
HolgerPeters (1)
NAThompson (1)

Top Labels

Issue Labels

bug (68)
enhancement (7)
help wanted (3)

Pull Request Labels

Packages

Total packages: 7
Total downloads:
- pypi 318,975 last-month
Total docker downloads: 23,172

Total dependent packages: 37
(may contain duplicates)
Total dependent repositories: 258
(may contain duplicates)
Total versions: 87
Total maintainers: 6

pypi.org: tsfresh

tsfresh extracts relevant characteristics from time series

Homepage: https://github.com/blue-yonder/tsfresh
Documentation: https://tsfresh.readthedocs.io/
License: MIT
Latest release: 0.21.1
published 6 months ago

Versions: 33
Dependent Packages: 35
Dependent Repositories: 246
Downloads: 318,942 Last month
Docker Downloads: 23,172

Rankings

Stargazers count: 0.3%

Dependent packages count: 0.4%

Downloads: 0.7%

Average: 0.8%

Dependent repos count: 1.0%

Forks count: 1.2%

Docker downloads count: 1.4%

Maintainers (4)

jneuff liehr MaxChrist nbraun

Last synced: 6 months ago

proxy.golang.org: github.com/blue-yonder/tsfresh

Documentation: https://pkg.go.dev/github.com/blue-yonder/tsfresh#section-documentation
License: mit
Latest release: v0.21.1
published 6 months ago

Versions: 35
Dependent Packages: 0
Dependent Repositories: 1

Rankings

Stargazers count: 0.8%

Forks count: 0.8%

Average: 3.7%

Dependent repos count: 4.8%

Dependent packages count: 8.5%

Last synced: 6 months ago

conda-forge.org: tsfresh

Homepage: http://github.com/blue-yonder/tsfresh
License: MIT
Latest release: 0.19.0
published about 4 years ago

Versions: 11
Dependent Packages: 2
Dependent Repositories: 5

Rankings

Stargazers count: 3.7%

Forks count: 4.4%

Average: 10.6%

Dependent repos count: 14.7%

Dependent packages count: 19.6%

Last synced: 6 months ago

pypi.org: rtm-tsfresh

tsfresh extracts relevant characteristics from time series

Homepage: https://github.com/blue-yonder/tsfresh
Documentation: https://rtm-tsfresh.readthedocs.io/
License: MIT
Latest release: 1.1.102
published over 3 years ago

Versions: 3
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 13 Last month

Rankings

Stargazers count: 0.3%

Forks count: 1.2%

Dependent packages count: 6.6%

Average: 15.4%

Dependent repos count: 30.6%

Downloads: 38.1%

Maintainers (1)

rtmvitalsigns

Last synced: 6 months ago

pypi.org: peritus-test-tsfresh

tsfresh extracts relevant characteristics from time series

Homepage: https://github.com/blue-yonder/tsfresh
Documentation: https://peritus-test-tsfresh.readthedocs.io/
License: MIT
Latest release: 1.1.0
published over 3 years ago

Versions: 1
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 12 Last month

Rankings

Stargazers count: 0.3%

Forks count: 1.2%

Dependent packages count: 6.6%

Average: 17.7%

Dependent repos count: 30.6%

Downloads: 49.9%

Maintainers (1)

aliacovella

Last synced: 6 months ago

pypi.org: peritus-tsfresh

tsfresh extracts relevant characteristics from time series

Homepage: https://github.com/blue-yonder/tsfresh
Documentation: https://peritus-tsfresh.readthedocs.io/
License: MIT
Latest release: 1.0.0
published over 3 years ago

Versions: 1
Dependent Packages: 0
Dependent Repositories: 1
Downloads: 8 Last month

Rankings

Stargazers count: 0.3%

Forks count: 1.2%

Dependent packages count: 7.4%

Dependent repos count: 22.2%

Average: 22.3%

Downloads: 80.2%

Maintainers (1)

aliacovella

Last synced: 6 months ago

anaconda.org: tsfresh

The package provides systematic time-series feature extraction by combining established algorithms from statistics, time-series analysis, signal processing, and nonlinear dynamics with a robust feature selection algorithm. In this context, the term time-series is interpreted in the broadest possible sense, such that any type of sampled data or even event sequences can be characterized.

Homepage: https://github.com/blue-yonder/tsfresh
License: MIT
Latest release: 0.21.1
published 6 months ago

Versions: 3
Dependent Packages: 0
Dependent Repositories: 5

Rankings

Stargazers count: 9.6%

Forks count: 10.6%

Average: 26.0%

Dependent packages count: 41.0%

Dependent repos count: 42.9%

Last synced: 6 months ago

Dependencies

.github/workflows/benchmark_default_branch.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite
actions/upload-artifact v2 composite

.github/workflows/deploy.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite

.github/workflows/stylecheck.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite
pre-commit/action v2.0.0 composite

.github/workflows/test.yml actions

actions/cache v3 composite
actions/checkout v3 composite
actions/setup-python v3 composite
actions/upload-artifact v1 composite
codecov/codecov-action v3 composite

.github/workflows/test_all.yml actions

actions/cache v2 composite
actions/checkout v2 composite
actions/setup-python v2 composite
actions/upload-artifact v1 composite

Dockerfile docker

base latest build
python 3.8 build
python 3.8-slim build

binder/requirements.txt pypi

setup.py pypi
