https://github.com/predict-idlab/tsflex
Flexible time series feature extraction & processing
Science Score: 23.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: links to sciencedirect.com
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (13.9%) to scientific vocabulary
Repository
Flexible time series feature extraction & processing
Basic Info
- Host: GitHub
- Owner: predict-idlab
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://predict-idlab.github.io/tsflex/
- Size: 27.9 MB
Statistics
- Stars: 422
- Watchers: 8
- Forks: 26
- Open Issues: 32
- Releases: 5
Metadata Files
README.md
tsflex is a toolkit for flexible time series processing & feature extraction that is efficient and makes few assumptions about sequence data.
Installation
| | command |
| :--------------------------------------------------- | :------------------------------------ |
| pip | pip install tsflex |
| conda | conda install -c conda-forge tsflex |
Usage
tsflex is built to be intuitive, so we encourage you to copy-paste this code and toy with some parameters!
Feature extraction
```python
import pandas as pd
import numpy as np
import scipy.stats as ss
from tsflex.features import MultipleFeatureDescriptors, FeatureCollection
from tsflex.utils.data import load_empatica_data

# 1. Load sequence-indexed data (in this case a time-index)
df_tmp, df_acc, df_ibi = load_empatica_data(['tmp', 'acc', 'ibi'])

# 2. Construct your feature extraction configuration
fc = FeatureCollection(
    MultipleFeatureDescriptors(
        functions=[np.min, np.mean, np.std, ss.skew, ss.kurtosis],
        series_names=["TMP", "ACC_x", "ACC_y", "IBI"],
        windows=["15min", "30min"],
        strides="15min",
    )
)

# 3. Extract features
fc.calculate(data=[df_tmp, df_acc, df_ibi], approve_sparsity=True)
```
Note that the feature extraction is performed on multivariate data with varying sample rates.

| signal | columns | sample rate |
|:-------|:-------|------------------:|
| df_tmp | ["TMP"] | 4Hz |
| df_acc | ["ACC_x", "ACC_y", "ACC_z"] | 32Hz |
| df_ibi | ["IBI"] | irregularly sampled |
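To make the `windows` and `strides` arguments more concrete, here is a minimal pandas-only sketch of strided-window feature extraction on a time-indexed series. It is not how tsflex implements this internally; the helper `strided_window_features`, the synthetic 4 Hz signal, and the `name__function` column format are assumptions made purely for illustration.

```python
import numpy as np
import pandas as pd


def strided_window_features(series, window, stride, funcs):
    """Apply each function in `funcs` to time windows that advance by `stride`."""
    window, stride = pd.Timedelta(window), pd.Timedelta(stride)
    rows = {}
    start = series.index[0]
    # Emit full windows only; label each row with the window's end time.
    while start + window <= series.index[-1]:
        end = start + window
        # Half-open selection [start, end): no fixed sample rate is assumed.
        values = series[(series.index >= start) & (series.index < end)].to_numpy()
        rows[end] = {
            f"{series.name}__{name}": func(values) if len(values) else np.nan
            for name, func in funcs.items()
        }
        start += stride
    return pd.DataFrame.from_dict(rows, orient="index")


# Synthetic 4 Hz signal covering one hour (hypothetical data).
index = pd.date_range("2021-01-01", periods=4 * 3600 + 1, freq="250ms")
tmp = pd.Series(np.arange(len(index), dtype=float), index=index, name="TMP")

features = strided_window_features(
    tmp, window="30min", stride="15min", funcs={"min": np.min, "mean": np.mean}
)
print(features)
```

Because windows are selected by timestamp rather than by sample count, the same helper would also accept an irregularly sampled series; windows that contain no samples yield `NaN`.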
Processing
Why tsflex? ✨
Flexible:
- handles multivariate/multimodal time series
- versatile function support => integrates with many packages for:
  - processing (e.g., scipy.signal, statsmodels.tsa)
  - feature extraction (e.g., numpy, scipy.stats, antropy, nolds, seglearn¹, tsfresh¹, tsfel¹)
- feature extraction handles multiple strides & window sizes
Efficient:
- view-based operations for processing & feature extraction => extremely low memory peak & fast execution time
- see: feature extraction benchmark visualization
Intuitive:
- maintains the sequence-index of the data
- feature extraction constructs interpretable output column names
- intuitive API
Few assumptions about the sequence data:
- no assumptions about sampling rate
- able to deal with multivariate asynchronous data, i.e., data with small time-offsets between the modalities
Advanced functionalities:
- apply FeatureCollection.reduce after feature selection for faster inference
- use function execution time logging to discover processing and feature extraction bottlenecks
- embedded SeriesPipeline & FeatureCollection serialization
- time series chunking
¹ These integrations are shown in integration-example notebooks.
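The "view-based operations" point can be illustrated with plain NumPy. The sketch below is an assumption about the general technique, not a reproduction of tsflex internals: it builds strided windows over a signal as views, so no sample is copied until a feature function reduces each window.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Small hypothetical signal: 0.0, 1.0, ..., 9.0
signal = np.arange(10, dtype=float)

# All length-4 windows as a read-only view; no sample is copied.
windows = sliding_window_view(signal, window_shape=4)

# Keep every 2nd window, i.e. a stride of 2 samples (still a view).
strided = windows[::2]

# The windows share memory with the original signal.
shares_memory = np.shares_memory(signal, strided)

# Only the reduction allocates new memory: one value per window.
window_means = strided.mean(axis=1)
print(strided.shape, shares_memory, window_means)
```

Since overlapping windows reference the same underlying buffer, peak memory stays close to the size of the input plus the feature output, which is the property the list above attributes to view-based processing.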
Future work 🔨
- scikit-learn integration for both processing and feature extraction
  note: this is actively being developed on the sklearn integration branch.
- Support time series segmentation (exposing the under-the-hood strided-rolling functionality) - see this issue
- Support for multi-indexed dataframes
=> Also see the enhancement issues
Contributing 👪
We are thrilled to see your contributions to further enhance tsflex.
See this guide for more instructions on how to contribute.
Referencing our package
If you use tsflex in a scientific publication, we would highly appreciate it if you cite us as:
```bibtex
@article{vanderdonckt2021tsflex,
    author = {Van Der Donckt, Jonas and Van Der Donckt, Jeroen and Deprost, Emiel and Van Hoecke, Sofie},
    title = {tsflex: flexible time series processing \& feature extraction},
    journal = {SoftwareX},
    year = {2021},
    url = {https://github.com/predict-idlab/tsflex},
    publisher = {Elsevier}
}
```
Link to the paper: https://www.sciencedirect.com/science/article/pii/S2352711021001904
👤 Jonas Van Der Donckt, Jeroen Van Der Donckt, Emiel Deprost
Owner
- Name: PreDiCT.IDLab
- Login: predict-idlab
- Kind: organization
- Location: Ghent - Belgium
- Website: http://predict.idlab.ugent.be/
- Repositories: 55
- Profile: https://github.com/predict-idlab
Repositories of the IDLab PreDiCT research group
GitHub Events
Total
- Issues event: 2
- Watch event: 24
- Issue comment event: 2
Last Year
- Issues event: 2
- Watch event: 24
- Issue comment event: 2
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| jervdrdo | j****t@u****e | 403 |
| jonas | j****t@u****e | 235 |
| epdprost | e****t@u****e | 23 |
| Niels Praet | n****1@g****m | 12 |
| Mathieu De Meue | m****u@d****e | 2 |
| Jeroen Boeye | j****e@g****m | 1 |
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 51
- Total pull requests: 78
- Average time to close issues: 2 months
- Average time to close pull requests: 12 days
- Total issue authors: 14
- Total pull request authors: 6
- Average comments per issue: 1.16
- Average comments per pull request: 1.77
- Merged pull requests: 66
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 1
- Average time to close issues: 26 days
- Average time to close pull requests: about 2 hours
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 9.0
- Average comments per pull request: 1.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- jvdd (28)
- jonasvdd (9)
- emield12 (2)
- GillesVandewiele (2)
- riedel (1)
- EXJUSTICE (1)
- arturdaraujo (1)
- windischbauer (1)
- IKetchup (1)
- deepkachhawa7 (1)
- cameron-hobbs (1)
- saheel1115 (1)
Pull Request Authors
- jvdd (65)
- jonasvdd (14)
- NielsPraet (4)
- emield12 (1)
- jeroenboeye (1)
- matdemeue (1)
Dependencies
- 165 dependencies
- Sphinx ^3.5.2 develop
- black ^20.8b1 develop
- catch22 ^0.2.0 develop
- fastparquet 0.8.0 develop
- flake8 ^3.9.0 develop
- jupyterlab ^3.2.9 develop
- memory-profiler ^0.58.0 develop
- pdoc3 ^0.9.2 develop
- pydocstyle ^5.1.1 develop
- pytest ^6.2.3 develop
- pytest-cov ^2.12.1 develop
- scipy ^1.7.3 develop
- seglearn ^1.2.3 develop
- statsmodels 0.12.2 develop
- tsfel ^0.1.4 develop
- tsfresh ^0.18.0 develop
- dill ^0.3.4
- multiprocess ^0.70.12
- numpy ^1.21.5
- pandas ^1.3.5
- python >=3.7.1,<3.11
- tqdm ^4.62.3
- actions/cache v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- snok/install-poetry v1 composite
- actions/cache v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/upload-artifact v2 composite
- codecov/codecov-action v1 composite
- snok/install-poetry v1.3.1 composite
- actions/cache v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- snok/install-poetry v1 composite
- actions/checkout v3 composite
- github/codeql-action/analyze v2 composite
- github/codeql-action/init v2 composite
- CodSpeedHQ/action v2 composite
- actions/cache v2 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- snok/install-poetry v1 composite
- scikit-learn *
