https://github.com/predict-idlab/tsdownsample

High-performance time series downsampling algorithms for visualization

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, acm.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.4%) to scientific vocabulary

Keywords

aggregation downsampling fast fpcs lttb m4 minmax performance python simd time-series visualization
Last synced: 5 months ago

Repository

High-performance time series downsampling algorithms for visualization

Basic Info
  • Host: GitHub
  • Owner: predict-idlab
  • License: MIT
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage:
  • Size: 641 KB
Statistics
  • Stars: 193
  • Watchers: 9
  • Forks: 18
  • Open Issues: 17
  • Releases: 5
Topics
aggregation downsampling fast fpcs lttb m4 minmax performance python simd time-series visualization
Created about 3 years ago · Last pushed 12 months ago
Metadata Files
Readme Contributing Funding License

README.md

tsdownsample


Extremely fast time series downsampling 📈 for visualization, written in Rust.

Features ✨

  • Fast: written in Rust with PyO3 bindings
    • leverages optimized argminmax - which is SIMD accelerated with runtime feature detection
    • scales linearly with the number of data points
    • multithreaded with Rayon (in Rust)
      Why not use Python multiprocessing? Citing the PyO3 docs on parallelism:
      CPython has the infamous Global Interpreter Lock, which prevents several threads from executing Python bytecode in parallel. This makes threading in Python a bad fit for CPU-bound tasks and often forces developers to accept the overhead of multiprocessing.
      In Rust - which is a compiled language - there is no GIL, so CPU-bound tasks can be parallelized (with Rayon) with little to no overhead.
  • Efficient: memory efficient
    • works on views of the data (no copies)
    • no intermediate data structures are created
  • Flexible: works on any type of data
    • supported datatypes are
    • for x: f32, f64, i16, i32, i64, u16, u32, u64, datetime64, timedelta64
    • for y: f16, f32, f64, i8, i16, i32, i64, u8, u16, u32, u64, datetime64, timedelta64, bool
      🚀 f16 argminmax is 200-300x faster than numpy. In contrast with all other data types above, f16 is not hardware supported by most modern CPUs (i.e., there are no native f16 instructions)!
      🐌 Programming languages typically support this datatype by either (i) upcasting to f32 or (ii) using a software implementation.
      💡 For argminmax only comparisons are needed (no arithmetic operations), so a symmetrical ordinal mapping from f16 to i16 is sufficient. This mapping allows the hardware-supported scalar and SIMD i16 instructions to be used, without any memory overhead 🎉 (see the sketch after this list).
      More details are described in argminmax PR #1.
  • Easy to use: simple & flexible API
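
For intuition, here is a minimal NumPy sketch of that symmetrical ordinal mapping; the helper name f16_ordinal_i16 is hypothetical and this is not the library's Rust/SIMD implementation (NaNs are not handled here):

```python
import numpy as np

def f16_ordinal_i16(a: np.ndarray) -> np.ndarray:
    """Map float16 values to int16 keys that preserve their ordering."""
    bits = a.view(np.int16)
    # Positive floats keep their bit pattern; for negative floats the lower
    # 15 bits are flipped so that "more negative" maps to a smaller int16.
    return np.where(bits < 0, bits ^ np.int16(0x7FFF), bits)

y = np.array([-2.5, -0.5, 0.0, 1.5, 1000.0], dtype=np.float16)
keys = f16_ordinal_i16(y)
# Comparisons on the i16 keys give the same argmin/argmax as on the f16 data.
assert np.argmin(keys) == np.argmin(y) and np.argmax(keys) == np.argmax(y)
```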

Install

```bash
pip install tsdownsample
```

Usage

```python
from tsdownsample import MinMaxLTTBDownsampler
import numpy as np

# Create a time series
y = np.random.randn(10_000_000)
x = np.arange(len(y))

# Downsample to 1000 points (assuming constant sampling rate)
s_ds = MinMaxLTTBDownsampler().downsample(y, n_out=1000)

# Select downsampled data
downsampled_y = y[s_ds]

# Downsample to 1000 points using the (possibly irregularly spaced) x-data
s_ds = MinMaxLTTBDownsampler().downsample(x, y, n_out=1000)

# Select downsampled data
downsampled_x = x[s_ds]
downsampled_y = y[s_ds]
```

Downsampling algorithms & API

Downsampling API 📑

Each downsampling algorithm is implemented as a class that exposes a downsample method with the following signature:

downsample([x], y, n_out, **kwargs) -> ndarray[uint64]

Arguments:

  • x is optional
  • x and y are both positional arguments
  • n_out is a mandatory keyword argument that defines the number of output values*
  • **kwargs are optional keyword arguments (see table below):
    • parallel: whether to use multi-threading (default: False)
      ❗ The max number of threads can be configured with the TSDOWNSAMPLE_MAX_THREADS ENV var (e.g. os.environ["TSDOWNSAMPLE_MAX_THREADS"] = "4"); see the sketch below.
    • ...

Returns: a ndarray[uint64] of indices that can be used to index the original data.

*When there are gaps in the time series, fewer than n_out indices may be returned.
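
For illustration, a small usage sketch of the parallel keyword and the TSDOWNSAMPLE_MAX_THREADS ENV var described above (data size and thread count are arbitrary):

```python
import os

# Optionally cap the number of threads used by the Rust core (documented ENV var).
os.environ["TSDOWNSAMPLE_MAX_THREADS"] = "4"

import numpy as np
from tsdownsample import MinMaxDownsampler

y = np.random.randn(5_000_000)
# parallel=True enables multithreading (Rayon) in the Rust core.
s_ds = MinMaxDownsampler().downsample(y, n_out=1_000, parallel=True)

assert s_ds.dtype == np.uint64  # indices into the original data
assert len(s_ds) <= 1_000       # fewer indices are possible when the series has gaps
```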

Downsampling algorithms 📈

The following downsampling algorithms (classes) are implemented:

| Downsampler | Description | **kwargs |
| ---: | --- | --- |
| MinMaxDownsampler | selects the min and max value in each bin | parallel |
| M4Downsampler | selects the min, max, first and last value in each bin | parallel |
| LTTBDownsampler | performs the Largest Triangle Three Buckets algorithm | parallel |
| MinMaxLTTBDownsampler | (new two-step algorithm 🎉) first selects n_out * minmax_ratio min and max values, then further reduces these to n_out values using the Largest Triangle Three Buckets algorithm | parallel, minmax_ratio* |

*The default value for minmax_ratio is 4, which has been empirically shown to be a good default. More details here: https://arxiv.org/abs/2305.00332
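
As an illustrative sketch (synthetic data) of the two-step procedure and its minmax_ratio keyword:

```python
import numpy as np
from tsdownsample import MinMaxLTTBDownsampler

y = np.random.randn(1_000_000)

# Step 1: preselect n_out * minmax_ratio (= 4_000) min/max candidates.
# Step 2: reduce these candidates to n_out (= 1_000) indices with LTTB.
s_ds = MinMaxLTTBDownsampler().downsample(y, n_out=1_000, minmax_ratio=4)
```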

Handling NaNs

This library supports two NaN policies:

  1. Omit NaNs: NaNs are ignored during downsampling.
  2. Return NaNs: if a bin contains at least one NaN, the index of the bin's first NaN is returned.

| Omit NaNs | Return NaNs |
| ---: | :--- |
| MinMaxDownsampler | NaNMinMaxDownsampler |
| M4Downsampler | NaNM4Downsampler |
| MinMaxLTTBDownsampler | NaNMinMaxLTTBDownsampler |
| LTTBDownsampler | |

Note that NaNs are not supported for x-data.
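
A small sketch contrasting the two policies on synthetic data; the expected output reflects the behavior described in the table above:

```python
import numpy as np
from tsdownsample import MinMaxDownsampler, NaNMinMaxDownsampler

y = np.random.randn(100_000)
y[::1_000] = np.nan  # sprinkle NaNs into the y-data

# Policy 1 (omit): NaNs are ignored when selecting the min/max per bin.
s_omit = MinMaxDownsampler().downsample(y, n_out=200)

# Policy 2 (return): a bin containing a NaN yields the index of its first NaN.
s_nan = NaNMinMaxDownsampler().downsample(y, n_out=200)
print(np.isnan(y[s_nan]).any())  # expected: True
```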

Limitations & assumptions 🚨

Assumes:

  1. x-data is (non-strictly) monotonically increasing (i.e., sorted); if not, sort it first (see the sketch below)
  2. no NaNs in x-data
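
If the x-data is not already sorted, a stable argsort before downsampling satisfies assumption 1 (a minimal sketch with synthetic data):

```python
import numpy as np
from tsdownsample import MinMaxLTTBDownsampler

x = np.random.rand(100_000)   # unsorted timestamps (violates assumption 1)
y = np.random.randn(100_000)

order = np.argsort(x, kind="stable")  # sort x, and reorder y along with it
s_ds = MinMaxLTTBDownsampler().downsample(x[order], y[order], n_out=1_000)
```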

👤 Jeroen Van Der Donckt

Owner

  • Name: PreDiCT.IDLab
  • Login: predict-idlab
  • Kind: organization
  • Location: Ghent - Belgium

Repositories of the IDLab PreDiCT research group

GitHub Events

Total
  • Create event: 7
  • Release event: 1
  • Issues event: 4
  • Watch event: 38
  • Delete event: 1
  • Issue comment event: 14
  • Push event: 31
  • Pull request review event: 2
  • Pull request event: 10
  • Fork event: 5
Last Year
  • Create event: 7
  • Release event: 1
  • Issues event: 4
  • Watch event: 38
  • Delete event: 1
  • Issue comment event: 14
  • Push event: 31
  • Pull request review event: 2
  • Pull request event: 10
  • Fork event: 5

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 22
  • Total Committers: 5
  • Avg Commits per committer: 4.4
  • Development Distribution Score (DDS): 0.545
Top Committers
Name Email Commits
Jeroen Van Der Donckt 1****d@u****m 10
Jeroen Van Der Donckt b****d@g****m 9
Saveliy Yusufov s****v@g****m 1
jayceslesar j****r@b****m 1
Jayce Slesar 4****r@u****m 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 27
  • Total pull requests: 64
  • Average time to close issues: 4 months
  • Average time to close pull requests: 11 days
  • Total issue authors: 11
  • Total pull request authors: 9
  • Average comments per issue: 2.41
  • Average comments per pull request: 1.11
  • Merged pull requests: 42
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 15
  • Average time to close issues: 3 months
  • Average time to close pull requests: 28 days
  • Issue authors: 1
  • Pull request authors: 5
  • Average comments per issue: 1.0
  • Average comments per pull request: 1.07
  • Merged pull requests: 6
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • jvdd (11)
  • jonasvdd (4)
  • jayceslesar (3)
  • lcs-crr (2)
  • NielsPraet (1)
  • my1e5 (1)
  • LeiRui (1)
  • daveah (1)
  • mike-iqmo (1)
  • Hoxbro (1)
Pull Request Authors
  • jvdd (48)
  • NielsPraet (6)
  • jonasvdd (4)
  • jayceslesar (3)
  • diliop (2)
  • my1e5 (2)
  • leviska (2)
  • TomaSajt (1)
  • smu160 (1)
Top Labels
Issue Labels
enhancement (8) bug (4) documentation (1) help wanted (1) unsure (1)
Pull Request Labels
enhancement (2)

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 488,926 last-month
  • Total docker downloads: 733
  • Total dependent packages: 3
    (may contain duplicates)
  • Total dependent repositories: 39
    (may contain duplicates)
  • Total versions: 23
  • Total maintainers: 2
pypi.org: tsdownsample

Time series downsampling in rust

  • Versions: 19
  • Dependent Packages: 3
  • Dependent Repositories: 39
  • Downloads: 488,926 Last month
  • Docker Downloads: 733
Rankings
Downloads: 0.6%
Dependent repos count: 2.3%
Dependent packages count: 3.1%
Docker downloads count: 3.8%
Average: 4.8%
Stargazers count: 7.1%
Forks count: 11.9%
Maintainers (2)
Last synced: 6 months ago
proxy.golang.org: github.com/predict-idlab/tsdownsample
  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 6.4%
Average: 6.6%
Dependent repos count: 6.8%
Last synced: 6 months ago

Dependencies

downsample_rs/Cargo.toml cargo
  • criterion 0.3.0 development
  • argminmax 0.2
  • half 2.1
  • ndarray 0.15.6
pyproject.toml pypi
  • numpy >=1.21
  • pandas >=1.3
  • python ^3.7.1
.github/workflows/ci-downsample_rs.yml actions
  • Swatinem/rust-cache v1 composite
  • actions-rs/toolchain v1 composite
  • actions/checkout v2 composite
.github/workflows/ci-tsdownsample.yml actions
  • PyO3/maturin-action v1 composite
  • Swatinem/rust-cache v2 composite
  • actions-rs/toolchain v1 composite
  • actions/checkout v3 composite
  • actions/download-artifact v3 composite
  • actions/setup-python v4 composite
  • actions/upload-artifact v3 composite
  • codecov/codecov-action v3 composite
tests/requirements-linting.txt pypi
  • black * test
  • isort * test
  • mypy * test
  • ruff * test
tests/requirements.txt pypi
  • pytest * test
  • pytest-cov * test
.github/workflows/codeql.yml actions
  • actions/checkout v3 composite
  • github/codeql-action/analyze v2 composite
  • github/codeql-action/init v2 composite
.github/workflows/codspeed.yml actions
  • CodSpeedHQ/action v1 composite
  • Swatinem/rust-cache v2 composite
  • actions-rs/toolchain v1 composite
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
Cargo.toml cargo
downsample_rs/dev_utils/Cargo.toml cargo
notebooks/requirements.txt pypi
  • numpy *
  • pandas *
  • tsdownsample *