timeseriesflattener
timeseriesflattener: A Python package for summarizing features from (medical) time series - Published in JOSS (2023)
https://github.com/aarhus-psychiatry-research/timeseriesflattener
Science Score: 98.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 1 DOI reference(s) in JOSS metadata
- ✓ Academic publication links: Links to joss.theoj.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ✓ JOSS paper metadata: Published in Journal of Open Source Software
Repository
Converting irregularly spaced time series, such as electronic health records, into dataframes for tabular classification.
Basic Info
- Host: GitHub
- Owner: Aarhus-Psychiatry-Research
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://Aarhus-Psychiatry-Research.github.io/timeseriesflattener
- Size: 114 MB
Statistics
- Stars: 19
- Watchers: 1
- Forks: 2
- Open Issues: 0
- Releases: 100
Metadata Files
README.md
Timeseriesflattener
Time series from e.g. electronic health records often have a large number of variables, are sampled at irregular intervals and tend to have a large number of missing values. Before this type of data can be used for prediction modelling with machine learning methods such as logistic regression or XGBoost, the data needs to be reshaped.
In essence, the time series need to be flattened so that each prediction time is represented by a set of predictor values and an outcome value. These predictor values can be constructed by aggregating the preceding values in the time series within a certain time window.
timeseriesflattener aims to simplify this process by providing an easy-to-use and fully-specified pipeline for flattening complex time series.
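The flattening step described above can be sketched in plain Python. This is an illustration of the idea only, not the package's implementation: the function name `flatten_lookbehind`, the tuple layout, and the inclusive window boundaries are all assumptions made for this sketch.

```python
import datetime as dt
import math
import statistics


def flatten_lookbehind(prediction_times, values, lookbehind, aggregate, fallback=math.nan):
    """Return one (entity_id, time, aggregated_value) row per prediction time.

    Hypothetical helper: aggregates the values of the same entity that fall
    inside the window [time - lookbehind, time] (both ends inclusive, which is
    an assumption of this sketch). An empty window yields `fallback`.
    """
    rows = []
    for entity_id, time in prediction_times:
        window = [
            value
            for value_id, value_time, value in values
            if value_id == entity_id and time - lookbehind <= value_time <= time
        ]
        rows.append((entity_id, time, aggregate(window) if window else fallback))
    return rows


# Same toy data as the quick start, with dates parsed into datetimes
prediction_times = [
    (1, dt.datetime(2020, 1, 1)),
    (1, dt.datetime(2020, 2, 1)),
    (2, dt.datetime(2020, 2, 1)),
]
values = [
    (1, dt.datetime(2020, 1, 15), 1),
    (1, dt.datetime(2019, 12, 10), 2),
    (1, dt.datetime(2019, 12, 15), 3),
    (2, dt.datetime(2020, 1, 2), 4),
]

flat = flatten_lookbehind(
    prediction_times, values, lookbehind=dt.timedelta(days=30), aggregate=statistics.mean
)
for row in flat:
    print(row)
```

Each prediction time becomes exactly one row, regardless of how many raw observations precede it, which is what makes the result usable as input to a tabular model.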
🔧 Installation
To get started with timeseriesflattener, install it with pip by running the following line in your terminal:
pip install timeseriesflattener
⚡ Quick start
```py
import datetime as dt

import numpy as np
import polars as pl

# Load a dataframe with times you wish to make a prediction
prediction_times_df = pl.DataFrame(
    {"id": [1, 1, 2], "date": ["2020-01-01", "2020-02-01", "2020-02-01"]}
)

# Load a dataframe with raw values you wish to aggregate as predictors
predictor_df = pl.DataFrame(
    {
        "id": [1, 1, 1, 2],
        "date": ["2020-01-15", "2019-12-10", "2019-12-15", "2020-01-02"],
        "predictor_value": [1, 2, 3, 4],
    }
)

# Load a dataframe specifying when the outcome occurs
outcome_df = pl.DataFrame({"id": [1], "date": ["2020-03-01"], "outcome_value": [1]})

# Specify how to aggregate the predictors and define the outcome
from timeseriesflattener import (
    MaxAggregator,
    MinAggregator,
    OutcomeSpec,
    PredictionTimeFrame,
    PredictorSpec,
    ValueFrame,
)

predictor_spec = PredictorSpec(
    value_frame=ValueFrame(
        init_df=predictor_df, entity_id_col_name="id", value_timestamp_col_name="date"
    ),
    lookbehind_distances=[dt.timedelta(days=1)],
    aggregators=[MaxAggregator(), MinAggregator()],
    fallback=np.nan,
    column_prefix="pred",
)

outcome_spec = OutcomeSpec(
    value_frame=ValueFrame(
        init_df=outcome_df, entity_id_col_name="id", value_timestamp_col_name="date"
    ),
    lookahead_distances=[dt.timedelta(days=1)],
    aggregators=[MaxAggregator(), MinAggregator()],
    fallback=np.nan,
    column_prefix="outc",
)

# Instantiate TimeseriesFlattener and add the specifications
from timeseriesflattener import Flattener

result = Flattener(
    predictiontime_frame=PredictionTimeFrame(
        init_df=prediction_times_df, entity_id_col_name="id", timestamp_col_name="date"
    )
).aggregate_timeseries(specs=[predictor_spec, outcome_spec])
result.df
```
Output:
| | id | date | prediction_time_uuid | pred_test_feature_within_30_days_mean_fallback_nan | outc_test_outcome_within_31_days_maximum_fallback_0_dichotomous |
| --: | --: | :------------------ | :-------------------- | -------------------------------------------------: | --------------------------------------------------------------: |
| 0 | 1 | 2020-01-01 00:00:00 | 1-2020-01-01-00-00-00 | 2.5 | 0 |
| 1 | 1 | 2020-02-01 00:00:00 | 1-2020-02-01-00-00-00 | 1 | 1 |
| 2 | 2 | 2020-02-01 00:00:00 | 2-2020-02-01-00-00-00 | 4 | 0 |
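The outcome column in the table above can be read as "did the outcome occur within 31 days after the prediction time, otherwise 0". The following is a minimal sketch of that lookahead logic in plain Python, not the package's own code: `outcome_within` and its arguments are hypothetical names, and including both window endpoints is an assumption.

```python
import datetime as dt


def outcome_within(prediction_times, outcomes, lookahead, fallback=0):
    """Return one label per prediction time.

    Hypothetical helper: takes the maximum outcome value of the same entity
    inside [time, time + lookahead] (inclusive ends are an assumption) and
    uses `fallback` when no outcome falls in the window.
    """
    labels = []
    for entity_id, time in prediction_times:
        window = [
            value
            for outcome_id, outcome_time, value in outcomes
            if outcome_id == entity_id and time <= outcome_time <= time + lookahead
        ]
        labels.append(max(window) if window else fallback)
    return labels


# Prediction times and the single outcome event from the quick start data
prediction_times = [
    (1, dt.datetime(2020, 1, 1)),
    (1, dt.datetime(2020, 2, 1)),
    (2, dt.datetime(2020, 2, 1)),
]
outcomes = [(1, dt.datetime(2020, 3, 1), 1)]

labels = outcome_within(prediction_times, outcomes, lookahead=dt.timedelta(days=31))
print(labels)
```

Only the second prediction time for entity 1 has the outcome inside its 31-day window; the other two rows receive the fallback value.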
📖 Tutorial
💬 Where to ask questions
| Type | |
| ------------------------------- | ---------------------- |
| 🚨 Bug Reports | GitHub Issue Tracker |
| 🎁 Feature Requests & Ideas | GitHub Issue Tracker |
| 👩‍💻 Usage Questions | GitHub Discussions |
| 🗯 General Discussion | GitHub Discussions |
🎓 Projects
PSYCOP projects use timeseriesflattener; see the monorepo for more.
Owner
- Name: PSYCOP, Aarhus Psychiatry Research
- Login: Aarhus-Psychiatry-Research
- Kind: organization
- Location: Aarhus, Denmark
- Website: https://www.psykiatrien.rm.dk/afdelinger/auhpsykiatrien/afdeling-for-depression-og-angst/About/
- Repositories: 4
- Profile: https://github.com/Aarhus-Psychiatry-Research
PSYchiatric Clinical Outcome Prediction (PSYCOP). Aarhus University Hospital - Psychiatry
JOSS Publication
timeseriesflattener: A Python package for summarizing features from (medical) time series
Authors
Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark, Department of Clinical Medicine, Aarhus University, Aarhus, Denmark, Center for Humanities Computing, Aarhus University, Aarhus, Denmark
Department of Clinical Medicine, Aarhus University, Aarhus, Denmark, Center for Humanities Computing, Aarhus University, Aarhus, Denmark
Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark, Psychosis Research Unit, Aarhus University Hospital - Psychiatry, Denmark
Tags
time series, electronic health records, medical time series, feature extraction
Citation (citation.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Bernstorff"
    given-names: "Martin"
    orcid: "https://orcid.org/0000-0002-0234-5390"
  - family-names: "Enevoldsen"
    given-names: "Kenneth"
    orcid: "https://orcid.org/0000-0001-8733-0966"
  - family-names: "Damgaard"
    given-names: "Jakob Grøhn"
    orcid: "https://orcid.org/0000-0001-7092-2391"
  - family-names: "Danielsen"
    given-names: "Andreas"
    orcid: "https://orcid.org/0000-0002-6585-3616"
  - family-names: "Hansen"
    given-names: "Lasse"
    orcid: "https://orcid.org/0000-0003-1113-4779"
message: If you use this software, please cite our article in Journal of Open Source Software.
preferred-citation:
  authors:
    - family-names: "Bernstorff"
      given-names: "Martin"
      orcid: "https://orcid.org/0000-0002-0234-5390"
    - family-names: "Enevoldsen"
      given-names: "Kenneth"
      orcid: "https://orcid.org/0000-0001-8733-0966"
    - family-names: "Damgaard"
      given-names: "Jakob Grøhn"
      orcid: "https://orcid.org/0000-0001-7092-2391"
    - family-names: "Danielsen"
      given-names: "Andreas"
      orcid: "https://orcid.org/0000-0002-6585-3616"
    - family-names: "Hansen"
      given-names: "Lasse"
      orcid: "https://orcid.org/0000-0003-1113-4779"
  date-published: 2023-01-26
  doi: 10.21105/joss.05197
  issn: 2475-9066
  issue: 83
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 5197
  title: "timeseriesflattener: A Python package for summarizing features from (medical) time series"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.05197"
  volume: 8
title: "timeseriesflattener: A Python package for summarizing features from (medical) time series"
GitHub Events
Total
- Create event: 25
- Release event: 3
- Issues event: 3
- Watch event: 3
- Delete event: 23
- Member event: 1
- Issue comment event: 37
- Push event: 17
- Pull request review event: 20
- Pull request event: 38
Last Year
- Create event: 25
- Release event: 3
- Issues event: 3
- Watch event: 3
- Delete event: 23
- Member event: 1
- Issue comment event: 37
- Push event: 17
- Pull request review event: 20
- Pull request event: 38
Committers
Last synced: 5 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Martin Bernstorff | m****f@g****m | 858 |
| Lasse | l****0@g****m | 240 |
| dependabot[bot] | 4****] | 117 |
| github-actions | g****s@g****m | 105 |
| bokajgd | b****d@g****m | 83 |
| sarakolding | s****g@l****k | 54 |
| github-actions | a****n@g****m | 40 |
| Kenneth Enevoldsen | k****n@g****m | 34 |
| semantic-release | s****e | 9 |
| frillecode | f****5@g****m | 7 |
| Yaroslav Halchenko | d****n@o****m | 4 |
| erikperfalk | e****k@g****m | 4 |
| signekb | s****k@g****m | 2 |
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 18
- Total pull requests: 83
- Average time to close issues: 25 days
- Average time to close pull requests: 12 days
- Total issue authors: 4
- Total pull request authors: 5
- Average comments per issue: 2.44
- Average comments per pull request: 1.65
- Merged pull requests: 16
- Bot issues: 0
- Bot pull requests: 68
Past Year
- Issues: 2
- Pull requests: 46
- Average time to close issues: 13 days
- Average time to close pull requests: 14 days
- Issue authors: 1
- Pull request authors: 4
- Average comments per issue: 1.0
- Average comments per pull request: 1.43
- Merged pull requests: 7
- Bot issues: 0
- Bot pull requests: 40
Top Authors
Issue Authors
- MartinBernstorff (63)
- HLasse (23)
- dependabot[bot] (3)
- bokajgd (2)
- sarakolding (1)
- XiaoJia849 (1)
Pull Request Authors
- dependabot[bot] (153)
- MartinBernstorff (69)
- HLasse (35)
- sarakolding (12)
- bokajgd (4)
Packages
- Total packages: 1
- Total downloads:
  - pypi: 284 last month
- Total dependent packages: 1
- Total dependent repositories: 1
- Total versions: 102
- Total maintainers: 1
pypi.org: timeseriesflattener
A package for converting time series data from e.g. electronic health records into wide format data.
- Documentation: https://timeseriesflattener.readthedocs.io/
- License: MIT License Copyright (c) 2022 PSYCOP Group, Aarhus University Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- Latest release: 2.5.2 (published 6 months ago)
Dependencies
- actions/checkout v2 composite
- actions/upload-artifact v1 composite
- openjournals/openjournals-draft-action master composite
- MartinBernstorff/cache-poetry-and-venv latest composite
- actions/setup-python v4 composite
- snok/install-poetry v1 composite
- MartinBernstorff/cache-poetry-and-venv latest composite
- actions/setup-python v4 composite
- snok/install-poetry v1 composite
- hmarr/auto-approve-action v2 composite
- JamesIves/github-pages-deploy-action 4.1.4 composite
- MartinBernstorff/cache-poetry-and-venv latest composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- snok/install-poetry v1 composite
- ./.github/actions/test * composite
- ./.github/actions/test_tutorials * composite
- actions/checkout v3 composite
- actions/checkout v2 composite
- relekang/python-semantic-release v7.32.0 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- pre-commit/action v3.0.0 composite
- black >=22.8.0,<22.10.1 develop
- docformatter >=1.5.0, <1.5.2 develop
- flake8 >=5.0.0, <5.0.6 develop
- furo 2022.9.29 develop
- mypy >=0.971,<0.992 develop
- myst-parser >=0.18.1,<0.18.2 develop
- pre-commit >=2.20.0, <2.20.2 develop
- pylint >=2.15.5,<2.16.0 develop
- pytest >=7.1.3, <7.1.5 develop
- pytest-cov >=3.0.0,<4.0.1 develop
- pytest-xdist >=2.4.0, <2.5.2 develop
- sphinx >=5.3.0,<5.4.0 develop
- sphinx-copybutton >=0.5.1,<0.5.2 develop
- sphinx_design >=0.3.0,<0.3.1 develop
- sphinxext-opengraph >=0.7.3,<0.7.4 develop
- SQLAlchemy >=1.4.41, <1.5.42
- catalogue >=2.0.0, <2.1.0
- coloredlogs >14.0.0,<15.1.0
- dask >=2022.9.0,<2022.12.0
- deepchecks >=0.8.0,<0.10.0
- dill >=0.3.0, <0.3.6
- frozendict >=2.3.4,<2.4.0
- jupyter >=1.0.0,<1.1.0
- numpy >=1.23.3,<1.23.6
- pandas >=1.4.0,<1.6.0
- protobuf <=3.20.3
- psutil >=5.9.1, <6.0.0
- psycopmlutils >=0.2.4, <0.3.0
- pyarrow >=9.0.0,<9.1.0
- pydantic >=1.9.0, <1.10.0
- pyodbc >=4.0.34, <4.0.36
- python >=3.9, <3.11
- scikit-learn >=1.1.2, <1.1.3
- scipy >=1.8.0,<1.9.4
- skimpy >=0.0.7,<0.1.0
- srsly >=2.4.4,<2.4.6
- wandb >=0.12.0,<0.13.5
- wasabi >=0.9.1,<0.10.2