skpro

A unified framework for tabular probabilistic regression, time-to-event prediction, and probability distributions in python

https://github.com/sktime/skpro

Science Score: 77.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 1 DOI reference(s) in README
✓ Academic publication links: Links to zenodo.org
✓ Committers with academic emails: 1 of 22 committers (4.5%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (14.9%) to scientific vocabulary

Keywords

ai data-science distributional-regression distributions failure-prediction framework machine-learning prediction probability-distributions python regression sklearn sktime survival-analysis survival-models survival-prediction time-to-event

Keywords from Contributors

interactive ode pde mesh interpretability profiles transformers sequences generic projection

Last synced: 6 months ago

Repository

A unified framework for tabular probabilistic regression, time-to-event prediction, and probability distributions in python

Basic Info

Host: GitHub
Owner: sktime
License: bsd-3-clause
Language: Python
Default Branch: main
Homepage: https://skpro.readthedocs.io/en/latest
Size: 11.8 MB

Statistics

Stars: 278
Watchers: 10
Forks: 57
Open Issues: 83
Releases: 26

Topics

Created over 8 years ago · Last pushed 6 months ago

Metadata Files

Readme Contributing License Code of conduct Citation Codeowners Authors

README.md

:rocket: Version 2.9.3 out now! Read the release notes here.

skpro is a library for supervised probabilistic prediction in python. It provides scikit-learn-like, scikit-base compatible interfaces to:

- tabular supervised regressors for probabilistic prediction - interval, quantile and distribution predictions
- tabular probabilistic time-to-event and survival prediction - instance-individual survival distributions
- metrics to evaluate probabilistic predictions, e.g., pinball loss, empirical coverage, CRPS, survival losses
- reductions to turn scikit-learn regressors into probabilistic skpro regressors, such as bootstrap or conformal
- building pipelines and composite models, including tuning via probabilistic performance metrics
- symbolic probability distributions with value domain of pandas.DataFrame-s and pandas-like interface
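
To make two of the metrics named in the list concrete, here is a minimal pure-Python sketch of pinball loss and empirical coverage. The function names and toy data are illustrative assumptions and are not skpro's API; skpro's own metric classes operate on pandas objects and offer more options.

```python
def pinball_loss(y_true, y_quantile, alpha):
    """Average pinball (quantile) loss of quantile predictions at level alpha."""
    total = 0.0
    for y, q in zip(y_true, y_quantile):
        diff = y - q
        # under-prediction is weighted by alpha, over-prediction by (1 - alpha)
        total += alpha * diff if diff >= 0 else (alpha - 1) * diff
    return total / len(y_true)


def empirical_coverage(y_true, lower, upper):
    """Fraction of observations that fall inside their predicted interval."""
    hits = sum(lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper))
    return hits / len(y_true)


# toy data, made up for illustration only
y_true = [1.0, 2.0, 3.0, 4.0]
q90 = [2.0, 2.0, 2.0, 2.0]      # predicted 0.9-quantiles
lower = [0.0, 0.0, 0.0, 0.0]    # lower interval bounds
upper = [2.5, 2.5, 2.5, 2.5]    # upper interval bounds

print(pinball_loss(y_true, q90, alpha=0.9))
print(empirical_coverage(y_true, lower, upper))
```

A lower pinball loss indicates better quantile predictions, while empirical coverage should be close to the nominal coverage of the interval.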


:books: Documentation

| Documentation | |
| --- | --- |
| :star: Tutorials | New to skpro? Here's everything you need to know! |
| :clipboard: Binder Notebooks | Example notebooks to play with in your browser. |
| :woman_technologist: User Guides | How to use skpro and its features. |
| :scissors: Extension Templates | How to build your own estimator using skpro's API. |
| :control_knobs: API Reference | The detailed reference for skpro's API. |
| :hammer_and_wrench: Changelog | Changes and version history. |
| :deciduous_tree: Roadmap | skpro's software and community development plan. |
| :pencil: Related Software | A list of related software. |

:speech_balloon: Where to ask questions

Questions and feedback are extremely welcome! We strongly believe in the value of sharing help publicly, as it allows a wider audience to benefit from it.

skpro is maintained by the sktime community, and we use the same social channels.

| Type | Platforms |
| --- | --- |
| :bug: Bug Reports | GitHub Issue Tracker |
| :sparkles: Feature Requests & Ideas | GitHub Issue Tracker |
| :woman_technologist: Usage Questions | GitHub Discussions · Stack Overflow |
| :speech_balloon: General Discussion | GitHub Discussions |
| :factory: Contribution & Development | dev-chat channel · Discord |
| :globe_with_meridians: Community collaboration session | Discord - Fridays 13 UTC, dev/meet-ups channel |

:dizzy: Features

Our objective is to enhance the interoperability and usability of the AI model ecosystem:

- skpro is compatible with scikit-learn and sktime, e.g., an sktime probabilistic forecaster can be built with an skpro probabilistic regressor, which is an sklearn regressor with a probabilistic mode added by skpro
- skpro provides a mini-package management framework for first-party implementations, and for interfacing popular second- and third-party components, such as the cyclic-boosting, MAPIE, or ngboost packages.

skpro curates libraries of components of the following types:

| Module | Status | Links |
| --- | --- | --- |
| Probabilistic tabular regression | maturing | Tutorial · API Reference · Extension Template |
| Time-to-event (survival) prediction | maturing | Tutorial · API Reference · Extension Template |
| Performance metrics | maturing | API Reference |
| Probability distributions | maturing | Tutorial · API Reference · Extension Template |

:hourglass_flowing_sand: Installing `skpro`

To install skpro, use pip:

`pip install skpro`

or, with maximum dependencies,

`pip install skpro[all_extras]`

Releases are available as source packages and binary wheels. You can see all available wheels here.
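
After installing, one way to confirm which version is present is to query the package metadata from Python. The sketch below uses only the standard library; the helper name `installed_version` is a hypothetical choice for illustration and is not part of skpro.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# prints the version string if skpro is installed, otherwise None
print("skpro version:", installed_version("skpro"))
```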

:zap: Quickstart

Making probabilistic predictions

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

from skpro.regression.residual import ResidualDouble

# step 1: data specification
X, y = load_diabetes(return_X_y=True, as_frame=True)
# y_test holds the actuals for X_new; it is needed later for evaluation
X_train, X_new, y_train, y_test = train_test_split(X, y)

# step 2: specifying the regressor - any compatible regressor is valid!
# example - "squaring residuals" regressor
# random forest for mean prediction
# linear regression for variance prediction
reg_mean = RandomForestRegressor()
reg_resid = LinearRegression()
reg_proba = ResidualDouble(reg_mean, reg_resid)

# step 3: fitting the model to training data
reg_proba.fit(X_train, y_train)

# step 4: predicting labels on new data
# probabilistic prediction modes - pick any or multiple

# full distribution prediction
y_pred_proba = reg_proba.predict_proba(X_new)

# interval prediction
y_pred_interval = reg_proba.predict_interval(X_new, coverage=0.9)

# quantile prediction
y_pred_quantiles = reg_proba.predict_quantiles(X_new, alpha=[0.05, 0.5, 0.95])

# variance prediction
y_pred_var = reg_proba.predict_var(X_new)

# mean prediction is same as "classical" sklearn predict, also available
y_pred_mean = reg_proba.predict(X_new)
```
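
The "squaring residuals" idea mentioned in step 2 can be sketched without any dependencies: fit one model for the mean, fit a second model to the squared residuals of the first, and combine both into a predictive normal distribution. The code below is a simplified illustration under these assumptions (single feature, hand-written least squares, synthetic data, normal predictive distribution); it is not the implementation of `ResidualDouble`, and all function names are hypothetical.

```python
import random
from statistics import NormalDist


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with a single feature."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_residual_model(x, y):
    """Fit a mean model, then a second model on the squared residuals."""
    a, b = fit_line(x, y)
    sq_resid = [(yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)]
    c, d = fit_line(x, sq_resid)
    return a, b, c, d


def predict_dist(params, x_new, min_var=1e-6):
    """Predict a normal distribution for one new x."""
    a, b, c, d = params
    mean = a + b * x_new
    var = max(c + d * x_new, min_var)  # guard against non-positive variance
    return NormalDist(mu=mean, sigma=var ** 0.5)


# synthetic data whose noise grows with x (an assumption for illustration)
rng = random.Random(0)
x = [i / 10 for i in range(1, 201)]
y = [2.0 * xi + rng.gauss(0, 0.2 * xi) for xi in x]

params = fit_residual_model(x, y)
dist = predict_dist(params, x_new=10.0)

# central 90% prediction interval from the predictive distribution
lower, upper = dist.inv_cdf(0.05), dist.inv_cdf(0.95)
print(dist.mean, dist.variance, (lower, upper))
```

Once a full predictive distribution is available, intervals, quantiles, and variances all follow from it, which is why a single probabilistic regressor can serve several prediction modes.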

Evaluating predictions

```python
# step 5: specifying evaluation metric
from skpro.metrics import CRPS

metric = CRPS()  # continuous ranked probability score - any skpro metric works!

# step 6: evaluate metric, compare predictions to actuals
# y_test: held-out actuals for X_new (fourth output of train_test_split)
metric(y_test, y_pred_proba)
# example output: 32.19
```
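
For intuition about what CRPS measures, the score has a well-known closed form when the predictive distribution is normal. The sketch below implements that closed form with the standard library only; it assumes a normal predictive distribution and is not skpro's `CRPS` implementation. The function names are illustrative.

```python
from math import pi, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def crps_normal(mu, sigma, y):
    """Closed-form CRPS of a normal prediction N(mu, sigma^2) at observation y."""
    z = (y - mu) / sigma
    return sigma * (
        z * (2 * _STD_NORMAL.cdf(z) - 1)
        + 2 * _STD_NORMAL.pdf(z)
        - 1 / sqrt(pi)
    )


def mean_crps_normal(mus, sigmas, ys):
    """Average CRPS over several predictions, lower is better."""
    return sum(crps_normal(m, s, y) for m, s, y in zip(mus, sigmas, ys)) / len(ys)


# the score grows as the observation moves away from the predicted mean
print(crps_normal(0.0, 1.0, 0.0))
print(crps_normal(0.0, 1.0, 3.0))
```

CRPS is expressed in the units of the target and rewards predictions that are both well centred and sharp: a narrower distribution scores better only when the observation actually falls near its mean.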

:wave: How to get involved

There are many ways to get involved with development of skpro, which is developed by the sktime community. We follow the all-contributors specification: all kinds of contributions are welcome - not just code.

| Documentation | |
| --- | --- |
| :gift_heart: Contribute | How to contribute to skpro. |
| :school_satchel: Mentoring | New to open source? Apply to our mentoring program! |
| :date: Meetings | Join our discussions, tutorials, workshops, and sprints! |
| :woman_mechanic: Developer Guides | How to further develop the skpro code base. |
| :medal_sports: Contributors | A list of all contributors. |
| :raising_hand: Roles | An overview of our core community roles. |
| :money_with_wings: Donate | Fund sktime and skpro maintenance and development. |
| :classical_building: Governance | How and by whom decisions are made in the sktime community. |

:wave: Citation

To cite skpro in a scientific publication, see citations.

Owner

Name: sktime
Login: sktime
Kind: organization
Email: sktime.toolbox@gmail.com

Website: https://www.sktime.net
Repositories: 29
Profile: https://github.com/sktime

A unified framework for machine learning with time series

Citation (CITATION.rst)

Gressmann, F., Király, F. J., Mateen, B., & Oberhauser, H. (2018). Probabilistic supervised learning. ArXiv:1801.00753 [Cs, Math, Stat]. Retrieved from http://arxiv.org/abs/1801.00753 ::

    @article{skpro,
      archivePrefix = {arXiv},
      eprinttype = {arxiv},
      eprint = {1801.00753},
      primaryClass = {cs, math, stat},
      title = {Probabilistic Supervised Learning},
      url = {http://arxiv.org/abs/1801.00753},
      urldate = {2018-01-03},
      date = {2018-01-02},
      author = {Gressmann, Frithjof and Kir{\'a}ly, Franz J. and Mateen, Bilal and Oberhauser, Harald}
    }

GitHub Events

Total

Create event: 69
Release event: 11
Issues event: 16
Watch event: 46
Delete event: 31
Issue comment event: 133
Push event: 139
Pull request review comment event: 29
Pull request review event: 71
Pull request event: 158
Fork event: 16

Last Year

Create event: 69
Release event: 11
Issues event: 16
Watch event: 46
Delete event: 31
Issue comment event: 133
Push event: 139
Pull request review comment event: 29
Pull request review event: 71
Pull request event: 158
Fork event: 16

Committers

Last synced: 8 months ago

All Time

Total Commits: 502
Total Committers: 22
Avg Commits per committer: 22.818
Development Distribution Score (DDS): 0.418

Past Year

Commits: 94
Committers: 8
Avg Commits per committer: 11.75
Development Distribution Score (DDS): 0.415

Top Committers

Name	Email	Commits
Franz Király	f**y@u**k	292
Frithjof Gressmann	h**o@n**e	101
dependabot[bot]	4****]	50
ShreeshaM07	1****7	12
Julian Fong	4****g	8
Vitaly Davydov	1**0@g**m	7
Sai Revanth	1****5	5
Meraldo Antonio	3****o	4
Malik Akbar Hashemi Rafsanjani	7****n	4
setoguchi-naoki	1****i	3
Alex-JG3	4****3	3
Viktor Szépe	v**r@s**t	2
Sukrit Jindal	1****t	2
Jof	j**y@p**k	1
Anirban Ray	3****a	1
Anshuman Dangwal	1****5	1
Mridul Jain	1****l	1
Nilesh Das	n**8@g**m	1
RUPESH-KUMAR01	1****1	1
Ramon Bussing	4****B	1
bhavikar04	7****4	1
duydl	5****l	1
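
The all-time summary figures can be reproduced from the committer table. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is an assumption, although it is consistent with the numbers shown here.

```python
commits_by_top_committer = 292  # Franz Király, from the Top Committers table
total_commits = 502
total_committers = 22

# average number of commits per committer
avg_commits = total_commits / total_committers

# assumed definition: DDS = 1 - (top committer's commits / total commits)
dds = 1 - commits_by_top_committer / total_commits

print(round(avg_commits, 3))  # 22.818
print(round(dds, 3))          # 0.418
```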

Committer Domains (Top 20 + Academic)

posteo.co.uk: 1
szepe.net: 1
nocio.de: 1
ucl.ac.uk: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 110
Total pull requests: 728
Average time to close issues: 4 months
Average time to close pull requests: 7 days
Total issue authors: 17
Total pull request authors: 27
Average comments per issue: 1.79
Average comments per pull request: 1.33
Merged pull requests: 606
Bot issues: 1
Bot pull requests: 107

Past Year

Issues: 16
Pull requests: 187
Average time to close issues: 9 days
Average time to close pull requests: 6 days
Issue authors: 5
Pull request authors: 12
Average comments per issue: 0.75
Average comments per pull request: 0.67
Merged pull requests: 132
Bot issues: 1
Bot pull requests: 54


Top Authors

Issue Authors

fkiraly (86)
ShreeshaM07 (5)
julian-fong (3)
Alex-JG3 (2)
meraldoantonio (2)
meh2135 (1)
honestee (1)
satya-pattnaik (1)
joshdunnlime (1)
malikrafsan (1)
dependabot[bot] (1)
gmgeorg (1)
bhavikar04 (1)
Ram0nB (1)
fsaforo1 (1)

Pull Request Authors

fkiraly (472)
dependabot[bot] (107)
ShreeshaM07 (27)
julian-fong (24)
meraldoantonio (16)
SaiRevanth25 (14)
malikrafsan (10)
tingiskhan (6)
VascoSch92 (6)
setoguchi-naoki (5)
bhavikar04 (5)
szepeviktor (4)
sukjingitsit (4)
JahnaviDhanaSri (3)
RUPESH-KUMAR01 (3)

Top Labels

Issue Labels

feature request (46)
module:probability&simulation (36)
enhancement (34)
module:regression (32)
bug (25)
good first issue (17)
implementing algorithms (16)
maintenance (13)
API design (12)
interfacing algorithms (10)
module:survival&time-to-event (7)
implementing framework (6)
module:metrics&benchmarking (5)
documentation (4)
module:tests (4)
module:base-framework (3)
module:datatypes (3)
math&theory (2)
module:transformations (1)

Pull Request Labels

maintenance (259)
enhancement (236)
module:probability&simulation (161)
module:regression (107)
documentation (92)
bug (76)
release (63)
implementing algorithms (43)
module:tests (35)
module:survival&time-to-event (35)
interfacing algorithms (33)
module:base-framework (24)
module:metrics&benchmarking (20)
module:datatypes (16)
implementing framework (9)

Packages

Total packages: 2
Total downloads:
- pypi 32,194 last-month

Total dependent packages: 1
(may contain duplicates)
Total dependent repositories: 1
(may contain duplicates)
Total versions: 53
Total maintainers: 2

proxy.golang.org: github.com/sktime/skpro

Documentation: https://pkg.go.dev/github.com/sktime/skpro#section-documentation
License: bsd-3-clause
Latest release: v2.9.3+incompatible
published 7 months ago

Versions: 27
Dependent Packages: 0
Dependent Repositories: 0

Rankings

Dependent packages count: 6.5%

Average: 6.7%

Dependent repos count: 6.9%

Last synced: 6 months ago

pypi.org: skpro

A unified framework for tabular probabilistic regression, time-to-event prediction, and probability distributions in python

Homepage: https://github.com/sktime/skpro
Documentation: https://github.com/sktime/skpro
License: BSD License
Latest release: 2.9.3
published 7 months ago

Versions: 26
Dependent Packages: 1
Dependent Repositories: 1
Downloads: 32,194 Last month

Rankings

Stargazers count: 6.6%

Dependent packages count: 7.3%

Forks count: 8.6%

Average: 11.4%

Downloads: 12.5%

Dependent repos count: 22.1%

Maintainers (2)

frthjf fkiraly

Last synced: 6 months ago

Dependencies

.github/workflows/cancel.yml actions

styfle/cancel-workflow-action 0.9.1 composite

.github/workflows/dependency-review.yml actions

actions/checkout v3 composite
actions/dependency-review-action v1 composite

.github/workflows/test.yml actions

actions/checkout v3 composite
actions/setup-python v4 composite
codecov/codecov-action v3 composite
pre-commit/action v3.0.0 composite
trilom/file-changes-action v1.2.4 composite

.github/workflows/wheels.yml actions

actions/checkout v3 composite
actions/download-artifact v2 composite
actions/setup-python v4 composite
actions/upload-artifact v2 composite
conda-incubator/setup-miniconda v2 composite
pypa/gh-action-pypi-publish release/v1 composite

pyproject.toml pypi

numpy >=1.21.0,<1.25
packaging *
pandas >=1.1.0
scikit-base >=0.4.3
scikit-learn >=0.24.0,<1.2.0
scipy <2.0.0,>=1.2.0
