timeseriesflattener

timeseriesflattener: A Python package for summarizing features from (medical) time series - Published in JOSS (2023)

https://github.com/aarhus-psychiatry-research/timeseriesflattener

Science Score: 98.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓
CITATION.cff file
Found CITATION.cff file
✓
codemeta.json file
Found codemeta.json file
✓
.zenodo.json file
Found .zenodo.json file
✓
DOI references
Found 1 DOI reference(s) in JOSS metadata
✓
Academic publication links
Links to: joss.theoj.org
○
Committers with academic emails
○
Institutional organization owner
✓
JOSS paper metadata
Published in Journal of Open Source Software

Keywords

electronic-healthcare-data irregular-time-series machine-learning python python3 time-series-analysis

Keywords from Contributors

dependency-distance descriptive-statistics readability readability-scores spacy spacy-extension syntactic-analysis climate-science dimensionality-reduction pca

Scientific Fields

Mathematics Computer Science - 84% confidence

Last synced: 6 months ago

Repository

Converting irregularly spaced time series, such as electronic health records, into dataframes for tabular classification.

Basic Info

Host: GitHub
Owner: Aarhus-Psychiatry-Research
License: mit
Language: Python
Default Branch: main
Homepage: https://Aarhus-Psychiatry-Research.github.io/timeseriesflattener
Size: 114 MB

Statistics

Stars: 19
Watchers: 1
Forks: 2
Open Issues: 0
Releases: 100

Topics

electronic-healthcare-data irregular-time-series machine-learning python python3 time-series-analysis

Created about 3 years ago · Last pushed 8 months ago

Metadata Files

Readme Changelog Contributing License Code of conduct Citation Zenodo

Timeseriesflattener

Time series from e.g. electronic health records often have a large number of variables, are sampled at irregular intervals and tend to have a large number of missing values. Before this type of data can be used for prediction modelling with machine learning methods such as logistic regression or XGBoost, the data needs to be reshaped.

In essence, the time series need to be flattened so that each prediction time is represented by a set of predictor values and an outcome value. These predictor values can be constructed by aggregating the preceding values in the time series within a certain time window.
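The flattening step described above can be illustrated without the library. The following is a minimal sketch in plain Python, not timeseriesflattener's implementation: the function name `flatten_lookbehind`, the tuple layout, and the choice of a window that includes both endpoints are assumptions made for illustration.

```python
import datetime as dt
import math
from statistics import mean


def flatten_lookbehind(prediction_times, values, lookbehind, aggregate, fallback=math.nan):
    """Return one row per prediction time with an aggregated predictor value.

    prediction_times: list of (entity_id, timestamp)
    values: list of (entity_id, timestamp, value)
    lookbehind: datetime.timedelta giving the window length
    aggregate: function that reduces a non-empty list of values to one number
    fallback: value used when no observations fall inside the window
    """
    rows = []
    for entity_id, pred_time in prediction_times:
        window_start = pred_time - lookbehind
        # Keep only this entity's values inside the (assumed closed) window.
        in_window = [
            value
            for value_id, value_time, value in values
            if value_id == entity_id and window_start <= value_time <= pred_time
        ]
        rows.append((entity_id, pred_time, aggregate(in_window) if in_window else fallback))
    return rows


# Hypothetical toy data: entity 1 has two recent values and one old value,
# entity 2 has no values at all.
prediction_times = [(1, dt.datetime(2021, 6, 1)), (2, dt.datetime(2021, 6, 1))]
values = [
    (1, dt.datetime(2021, 5, 20), 5.0),
    (1, dt.datetime(2021, 5, 30), 7.0),
    (1, dt.datetime(2021, 1, 1), 100.0),  # older than 30 days, ignored
]

rows = flatten_lookbehind(prediction_times, values, dt.timedelta(days=30), mean)
for row in rows:
    print(row)
```

Each prediction time becomes exactly one row. Entity 2 has no observations in the window, so it receives the fallback value, which is how missing data is handled in this sketch.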

timeseriesflattener aims to simplify this process by providing an easy-to-use and fully-specified pipeline for flattening complex time series.

🔧 Installation

To get started, install timeseriesflattener with pip by running the following line in your terminal:

pip install timeseriesflattener

⚡ Quick start

```py
import datetime as dt

import numpy as np
import polars as pl

# Load a dataframe with times you wish to make a prediction
prediction_times_df = pl.DataFrame(
    {"id": [1, 1, 2], "date": ["2020-01-01", "2020-02-01", "2020-02-01"]}
)

# Load a dataframe with raw values you wish to aggregate as predictors
predictor_df = pl.DataFrame(
    {
        "id": [1, 1, 1, 2],
        "date": ["2020-01-15", "2019-12-10", "2019-12-15", "2020-01-02"],
        "predictor_value": [1, 2, 3, 4],
    }
)

# Load a dataframe specifying when the outcome occurs
outcome_df = pl.DataFrame({"id": [1], "date": ["2020-03-01"], "outcome_value": [1]})

# Specify how to aggregate the predictors and define the outcome
from timeseriesflattener import (
    MaxAggregator,
    MinAggregator,
    OutcomeSpec,
    PredictionTimeFrame,
    PredictorSpec,
    ValueFrame,
)

predictor_spec = PredictorSpec(
    value_frame=ValueFrame(
        init_df=predictor_df, entity_id_col_name="id", value_timestamp_col_name="date"
    ),
    lookbehind_distances=[dt.timedelta(days=1)],
    aggregators=[MaxAggregator(), MinAggregator()],
    fallback=np.nan,
    column_prefix="pred",
)

outcome_spec = OutcomeSpec(
    value_frame=ValueFrame(
        init_df=outcome_df, entity_id_col_name="id", value_timestamp_col_name="date"
    ),
    lookahead_distances=[dt.timedelta(days=1)],
    aggregators=[MaxAggregator(), MinAggregator()],
    fallback=np.nan,
    column_prefix="outc",
)

# Instantiate TimeseriesFlattener and add the specifications
from timeseriesflattener import Flattener

result = Flattener(
    predictiontime_frame=PredictionTimeFrame(
        init_df=prediction_times_df, entity_id_col_name="id", timestamp_col_name="date"
    )
).aggregate_timeseries(specs=[predictor_spec, outcome_spec])
result.df
```

Output:

| | id | date | prediction_time_uuid | pred_test_feature_within_30_days_mean_fallback_nan | outc_test_outcome_within_31_days_maximum_fallback_0_dichotomous |
| --: | --: | :------------------ | :-------------------- | -------------------------------------------------: | --------------------------------------------------------------: |
| 0 | 1 | 2020-01-01 00:00:00 | 1-2020-01-01-00-00-00 | 2.5 | 0 |
| 1 | 1 | 2020-02-01 00:00:00 | 1-2020-02-01-00-00-00 | 1 | 1 |
| 2 | 2 | 2020-02-01 00:00:00 | 2-2020-02-01-00-00-00 | 4 | 0 |
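The values in the output table can be checked by hand. The sketch below does not call timeseriesflattener. Going by the column names in the table, it assumes a 30-day lookbehind mean with a NaN fallback and a 31-day lookahead maximum with a fallback of 0, which differs from the 1-day windows used in the quick-start code. It also assumes windows that include both endpoints, and the helper names `parse` and `values_between` are hypothetical.

```python
import datetime as dt
from statistics import mean


def parse(date_string):
    """Parse an ISO date string such as '2020-01-01' into a datetime."""
    return dt.datetime.strptime(date_string, "%Y-%m-%d")


# Same input data as in the quick start.
prediction_times = [(1, "2020-01-01"), (1, "2020-02-01"), (2, "2020-02-01")]
predictors = [
    (1, "2020-01-15", 1),
    (1, "2019-12-10", 2),
    (1, "2019-12-15", 3),
    (2, "2020-01-02", 4),
]
outcomes = [(1, "2020-03-01", 1)]


def values_between(records, entity_id, start, end):
    """Values for one entity whose timestamp lies in [start, end] (assumed closed)."""
    return [
        value
        for record_id, date, value in records
        if record_id == entity_id and start <= parse(date) <= end
    ]


table = []
for entity_id, date in prediction_times:
    t = parse(date)
    behind = values_between(predictors, entity_id, t - dt.timedelta(days=30), t)
    ahead = values_between(outcomes, entity_id, t, t + dt.timedelta(days=31))
    predictor = mean(behind) if behind else float("nan")  # fallback NaN
    outcome = max(ahead) if ahead else 0  # fallback 0
    table.append((entity_id, date, predictor, outcome))

for row in table:
    print(row)
```

Under these assumptions the predictor column comes out as 2.5, 1 and 4 and the outcome column as 0, 1 and 0, matching the table. The value for id 2 lies exactly 30 days before its prediction time, so it only matches when the lower bound of the window is inclusive.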

📖 Tutorial

🎓 Tutorials
📖 General docs

💬 Where to ask questions

| Type | |
| ------------------------------- | ---------------------- |
| 🚨 Bug Reports | GitHub Issue Tracker |
| 🎁 Feature Requests & Ideas | GitHub Issue Tracker |
| 👩‍💻 Usage Questions | GitHub Discussions |
| 🗯 General Discussion | GitHub Discussions |

🎓 Projects

PSYCOP projects use timeseriesflattener; see the monorepo for more.

Owner

Name: PSYCOP, Aarhus Psychiatry Research
Login: Aarhus-Psychiatry-Research
Kind: organization
Location: Aarhus, Denmark

Website: https://www.psykiatrien.rm.dk/afdelinger/auhpsykiatrien/afdeling-for-depression-og-angst/About/
Repositories: 4
Profile: https://github.com/Aarhus-Psychiatry-Research

PSYchiatric Clinical Outcome Prediction (PSYCOP). Aarhus University Hospital - Psychiatry

JOSS Publication

timeseriesflattener: A Python package for summarizing features from (medical) time series

Published

March 29, 2023

DOI

10.21105/joss.05197

Volume 8, Issue 83, Page 5197

Authors

Martin Bernstorff

Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark, Department of Clinical Medicine, Aarhus University, Aarhus, Denmark, Center for Humanities Computing, Aarhus University, Aarhus, Denmark

Kenneth Enevoldsen

Department of Clinical Medicine, Aarhus University, Aarhus, Denmark, Center for Humanities Computing, Aarhus University, Aarhus, Denmark

Jakob Damgaard

Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark, Psychosis Research Unit, Aarhus University Hospital - Psychiatry, Denmark

Andreas Danielsen

Department of Clinical Medicine, Aarhus University, Aarhus, Denmark, Psychosis Research Unit, Aarhus University Hospital - Psychiatry, Denmark

Lasse Hansen

Editor

Marcel Stimberg

Citation (citation.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Bernstorff"
    given-names: "Martin"
    orcid: "https://orcid.org/0000-0002-0234-5390"
  - family-names: "Enevoldsen"
    given-names: "Kenneth"
    orcid: "https://orcid.org/0000-0001-8733-0966"
  - family-names: "Damgaard"
    given-names: "Jakob Grøhn"
    orcid: "https://orcid.org/0000-0001-7092-2391"
  - family-names: "Danielsen"
    given-names: "Andreas"
    orcid: "https://orcid.org/0000-0002-6585-3616"
  - family-names: "Hansen"
    given-names: "Lasse"
    orcid: "https://orcid.org/0000-0003-1113-4779"
message: If you use this software, please cite our article in Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: "Bernstorff"
    given-names: "Martin"
    orcid: "https://orcid.org/0000-0002-0234-5390"
  - family-names: "Enevoldsen"
    given-names: "Kenneth"
    orcid: "https://orcid.org/0000-0001-8733-0966"
  - family-names: "Damgaard"
    given-names: "Jakob Grøhn"
    orcid: "https://orcid.org/0000-0001-7092-2391"
  - family-names: "Danielsen"
    given-names: "Andreas"
    orcid: "https://orcid.org/0000-0002-6585-3616"
  - family-names: "Hansen"
    given-names: "Lasse"
    orcid: "https://orcid.org/0000-0003-1113-4779"
  date-published: 2023-01-26
  doi: 10.21105/joss.05197
  issn: 2475-9066
  issue: 83
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 5197
  title: "timeseriesflattener: A Python package for summarizing features from (medical) time series"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.05197"
  volume: 8
title: "timeseriesflattener: A Python package for summarizing features from (medical) time series"

GitHub Events

Total

Create event: 25
Release event: 3
Issues event: 3
Watch event: 3
Delete event: 23
Member event: 1
Issue comment event: 37
Push event: 17
Pull request review event: 20
Pull request event: 38

Last Year

Create event: 25
Release event: 3
Issues event: 3
Watch event: 3
Delete event: 23
Member event: 1
Issue comment event: 37
Push event: 17
Pull request review event: 20
Pull request event: 38

Committers

Last synced: 7 months ago

All Time

Total Commits: 1,557
Total Committers: 13
Avg Commits per committer: 119.769
Development Distribution Score (DDS): 0.449

Past Year

Commits: 61
Committers: 6
Avg Commits per committer: 10.167
Development Distribution Score (DDS): 0.311

Top Committers

Name	Email	Commits
Martin Bernstorff	m**f@g**m	858
Lasse	l**0@g**m	240
dependabot[bot]	4****]	117
github-actions	g**s@g**m	105
bokajgd	b**d@g**m	83
sarakolding	s**g@l**k	54
github-actions	a**n@g**m	40
Kenneth Enevoldsen	k**n@g**m	34
semantic-release	s****e	9
frillecode	f**5@g**m	7
Yaroslav Halchenko	d**n@o**m	4
erikperfalk	e**k@g**m	4
signekb	s**k@g**m	2
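The all-time summary figures can be recomputed from the commit counts in the table above. This is a small sketch, assuming the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is not stated on this page.

```python
# Commit counts per committer, copied from the Top Committers table.
commit_counts = [858, 240, 117, 105, 83, 54, 40, 34, 9, 7, 4, 4, 2]

total_commits = sum(commit_counts)
average_per_committer = total_commits / len(commit_counts)

# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1 - max(commit_counts) / total_commits

print(f"Total commits: {total_commits}")
print(f"Avg commits per committer: {average_per_committer:.3f}")
print(f"DDS: {dds:.3f}")
```

With these inputs the results agree with the reported totals: 1,557 commits, an average of 119.769 and a DDS of 0.449.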

Committer Domains (Top 20 + Academic)

github.com: 2
onerussian.com: 1
live.dk: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 18
Total pull requests: 83
Average time to close issues: 25 days
Average time to close pull requests: 12 days
Total issue authors: 4
Total pull request authors: 5
Average comments per issue: 2.44
Average comments per pull request: 1.65
Merged pull requests: 16
Bot issues: 0
Bot pull requests: 68

Past Year

Issues: 2
Pull requests: 46
Average time to close issues: 13 days
Average time to close pull requests: 14 days
Issue authors: 1
Pull request authors: 4
Average comments per issue: 1.0
Average comments per pull request: 1.43
Merged pull requests: 7
Bot issues: 0
Bot pull requests: 40

Top Authors

Issue Authors

MartinBernstorff (63)
HLasse (23)
dependabot[bot] (3)
bokajgd (2)
sarakolding (1)
XiaoJia849 (1)

Pull Request Authors

dependabot[bot] (153)
MartinBernstorff (69)
HLasse (35)
sarakolding (12)
bokajgd (4)

Top Labels

Issue Labels

in-progress (45)
Stale (9)
dependencies (3)

Pull Request Labels

dependencies (155)
Stale (7)
closed-by-stalebot (1)

Packages

Total packages: 1
Total downloads:
- pypi 284 last-month

Total dependent packages: 1
Total dependent repositories: 1
Total versions: 102
Total maintainers: 1

pypi.org: timeseriesflattener

A package for converting time series data from e.g. electronic health records into wide format data.

Documentation: https://timeseriesflattener.readthedocs.io/
License: MIT License Copyright (c) 2022 PSYCOP Group, Aarhus University Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Latest release: 2.5.2
published 8 months ago

Versions: 102
Dependent Packages: 1
Dependent Repositories: 1
Downloads: 284 Last month

Rankings

Dependent packages count: 4.7%

Downloads: 5.7%

Average: 10.7%

Dependent repos count: 21.7%

Maintainers (1)

ryqiem

Last synced: 6 months ago

Dependencies

.github/workflows/generate_paper_pdf.yml actions

actions/checkout v2 composite
actions/upload-artifact v1 composite
openjournals/openjournals-draft-action master composite

.github/actions/test/action.yml actions

MartinBernstorff/cache-poetry-and-venv latest composite
actions/setup-python v4 composite
snok/install-poetry v1 composite

.github/actions/test_tutorials/action.yml actions

MartinBernstorff/cache-poetry-and-venv latest composite
actions/setup-python v4 composite
snok/install-poetry v1 composite

.github/workflows/dependabot_automerge.yml actions

hmarr/auto-approve-action v2 composite

.github/workflows/documentation.yml actions

JamesIves/github-pages-deploy-action 4.1.4 composite
MartinBernstorff/cache-poetry-and-venv latest composite
actions/checkout v3 composite
actions/setup-python v4 composite
snok/install-poetry v1 composite

.github/workflows/main_test_and_release.yml actions

./.github/actions/test * composite
./.github/actions/test_tutorials * composite
actions/checkout v3 composite
actions/checkout v2 composite
relekang/python-semantic-release v7.32.0 composite

.github/workflows/pre-commit.yml actions

actions/checkout v3 composite
actions/setup-python v4 composite
pre-commit/action v3.0.0 composite

pyproject.toml pypi

black >=22.8.0,<22.10.1 develop
docformatter >=1.5.0, <1.5.2 develop
flake8 >=5.0.0, <5.0.6 develop
furo 2022.9.29 develop
mypy >=0.971,<0.992 develop
myst-parser >=0.18.1,<0.18.2 develop
pre-commit >=2.20.0, <2.20.2 develop
pylint >=2.15.5,<2.16.0 develop
pytest >=7.1.3, <7.1.5 develop
pytest-cov >=3.0.0,<4.0.1 develop
pytest-xdist >=2.4.0, <2.5.2 develop
sphinx >=5.3.0,<5.4.0 develop
sphinx-copybutton >=0.5.1,<0.5.2 develop
sphinx_design >=0.3.0,<0.3.1 develop
sphinxext-opengraph >=0.7.3,<0.7.4 develop
SQLAlchemy >=1.4.41, <1.5.42
catalogue >=2.0.0, <2.1.0
coloredlogs >14.0.0,<15.1.0
dask >=2022.9.0,<2022.12.0
deepchecks >=0.8.0,<0.10.0
dill >=0.3.0, <0.3.6
frozendict >=2.3.4,<2.4.0
jupyter >=1.0.0,<1.1.0
numpy >=1.23.3,<1.23.6
pandas >=1.4.0,<1.6.0
protobuf <=3.20.3
psutil >=5.9.1, <6.0.0
psycopmlutils >=0.2.4, <0.3.0
pyarrow >=9.0.0,<9.1.0
pydantic >=1.9.0, <1.10.0
pyodbc >=4.0.34, <4.0.36
python >=3.9, <3.11
scikit-learn >=1.1.2, <1.1.3
scipy >=1.8.0,<1.9.4
skimpy >=0.0.7,<0.1.0
srsly >=2.4.4,<2.4.6
wandb >=0.12.0,<0.13.5
wasabi >=0.9.1,<0.10.2
