skpro

A unified framework for tabular probabilistic regression, time-to-event prediction, and probability distributions in python

https://github.com/sktime/skpro

Science Score: 77.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
    1 of 22 committers (4.5%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.9%) to scientific vocabulary

Keywords

ai data-science distributional-regression distributions failure-prediction framework machine-learning prediction probability-distributions python regression sklearn sktime survival-analysis survival-models survival-prediction time-to-event

Keywords from Contributors

interactive ode pde mesh interpretability profiles transformers sequences generic projection
Last synced: 4 months ago

Repository

Statistics
  • Stars: 278
  • Watchers: 10
  • Forks: 57
  • Open Issues: 83
  • Releases: 26
Created over 8 years ago · Last pushed 5 months ago
Metadata Files
Readme Contributing License Code of conduct Citation Codeowners Authors

README.md

:rocket: Version 2.9.3 out now! Read the release notes here.

skpro is a library for supervised probabilistic prediction in python. It provides scikit-learn-like, scikit-base compatible interfaces to:

  • tabular supervised regressors for probabilistic prediction - interval, quantile and distribution predictions
  • tabular probabilistic time-to-event and survival prediction - instance-individual survival distributions
  • metrics to evaluate probabilistic predictions, e.g., pinball loss, empirical coverage, CRPS, survival losses
  • reductions to turn scikit-learn regressors into probabilistic skpro regressors, such as bootstrap or conformal
  • building pipelines and composite models, including tuning via probabilistic performance metrics
  • symbolic probability distributions with value domain of pandas.DataFrame-s and pandas-like interface
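Two of the metrics named in the list above, pinball loss and empirical coverage, can be sketched in a few lines of plain Python. This is only an illustration of the definitions, not skpro's implementation; the function names and the toy numbers are assumptions made for this example.

```python
def pinball_loss(y_true, y_quantile, alpha):
    """Mean pinball (quantile) loss of quantile predictions at level alpha."""
    total = 0.0
    for y, q in zip(y_true, y_quantile):
        diff = y - q
        # under-prediction is weighted by alpha, over-prediction by (1 - alpha)
        total += alpha * diff if diff >= 0 else (alpha - 1) * diff
    return total / len(y_true)


def empirical_coverage(y_true, lower, upper):
    """Fraction of observations that fall inside their predicted interval."""
    hits = sum(lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper))
    return hits / len(y_true)


# hypothetical observations and predictions, for illustration only
y_true = [1.0, 2.0, 3.0, 4.0]
q90 = [2.0, 2.5, 2.5, 5.0]      # predicted 0.9-quantiles
lower = [0.0, 1.5, 3.5, 3.0]    # predicted interval lower bounds
upper = [2.0, 2.5, 4.5, 5.0]    # predicted interval upper bounds

loss = pinball_loss(y_true, q90, alpha=0.9)
coverage = empirical_coverage(y_true, lower, upper)
print(loss, coverage)
```

A well-calibrated 90% interval predictor should show an empirical coverage close to 0.9 on enough test data; the four toy points here are far too few to judge that.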

| Overview | |
|---|---|
| Open Source | BSD 3-clause |
| Tutorials | Binder · YouTube |
| Community | Discord · Slack |
| CI/CD | github-actions · codecov · readthedocs · platform |
| Code | PyPI · conda · python-versions · black |
| Downloads | PyPI - Downloads |
| Citation | DOI |

:books: Documentation

| Documentation | |
|---|---|
| :star: Tutorials | New to skpro? Here's everything you need to know! |
| :clipboard: Binder Notebooks | Example notebooks to play with in your browser. |
| :woman_technologist: User Guides | How to use skpro and its features. |
| :scissors: Extension Templates | How to build your own estimator using skpro's API. |
| :control_knobs: API Reference | The detailed reference for skpro's API. |
| :hammer_and_wrench: Changelog | Changes and version history. |
| :deciduous_tree: Roadmap | skpro's software and community development plan. |
| :pencil: Related Software | A list of related software. |

:speech_balloon: Where to ask questions

Questions and feedback are extremely welcome! We strongly believe in the value of sharing help publicly, as it allows a wider audience to benefit from it.

skpro is maintained by the sktime community, and we use the same social channels.

| Type | Platforms |
|---|---|
| :bug: Bug Reports | GitHub Issue Tracker |
| :sparkles: Feature Requests & Ideas | GitHub Issue Tracker |
| :woman_technologist: Usage Questions | GitHub Discussions · Stack Overflow |
| :speech_balloon: General Discussion | GitHub Discussions |
| :factory: Contribution & Development | dev-chat channel · Discord |
| :globe_with_meridians: Community collaboration session | Discord - Fridays 13 UTC, dev/meet-ups channel |

:dizzy: Features

Our objective is to enhance the interoperability and usability of the AI model ecosystem:

  • skpro is compatible with scikit-learn and sktime, e.g., an sktime probabilistic forecaster can be built with an skpro probabilistic regressor, which is an sklearn regressor with a probabilistic mode added by skpro

  • skpro provides a mini-package management framework for first-party implementations, and for interfacing popular second- and third-party components, such as cyclic-boosting, MAPIE, or ngboost packages.
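To illustrate what "adding a probabilistic mode" to a point regressor can mean, here is a minimal sketch of split conformal prediction in plain Python. It is not skpro's or MAPIE's implementation; the function `split_conformal_interval`, the toy `predict` function, and the data are all hypothetical names and values chosen for this example.

```python
import math


def split_conformal_interval(predict, X_calib, y_calib, x_new, coverage=0.9):
    """Turn a fitted point predictor into an interval predictor.

    `predict` is any already-fitted point prediction function.
    """
    # absolute residuals on a held-out calibration set, in ascending order
    scores = sorted(abs(y - predict(x)) for x, y in zip(X_calib, y_calib))
    n = len(scores)
    # finite-sample corrected rank: ceil((n + 1) * coverage)
    rank = math.ceil((n + 1) * coverage)
    if rank > n:
        raise ValueError("calibration set too small for this coverage")
    radius = scores[rank - 1]
    center = predict(x_new)
    return center - radius, center + radius


def predict(x):
    """Toy point predictor standing in for a fitted regressor."""
    return 2.0 * x


# 19 calibration points with alternating-sign noise of growing size
X_calib = [float(i) for i in range(1, 20)]
noise = [(-1) ** i * 0.1 * i for i in range(1, 20)]
y_calib = [2.0 * x + e for x, e in zip(X_calib, noise)]

lower, upper = split_conformal_interval(predict, X_calib, y_calib, 10.0)
print(lower, upper)
```

The interval has the same width for every new point because the radius is a single quantile of the calibration residuals; more refined reductions adapt the width to the input.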

skpro curates libraries of components of the following types:

| Module | Status | Links |
|---|---|---|
| Probabilistic tabular regression | maturing | Tutorial · API Reference · Extension Template |
| Time-to-event (survival) prediction | maturing | Tutorial · API Reference · Extension Template |
| Performance metrics | maturing | API Reference |
| Probability distributions | maturing | Tutorial · API Reference · Extension Template |

:hourglass_flowing_sand: Installing skpro

To install skpro, use pip:

```bash
pip install skpro
```

or, with maximum dependencies,

```bash
pip install skpro[all_extras]
```

Releases are available as source packages and binary wheels. You can see all available wheels here.
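After installing, one way to confirm which version is present is to query the package metadata from Python. The sketch below uses only the standard library; the helper name `installed_version` is an assumption made for this example and is not part of skpro.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


skpro_version = installed_version("skpro")
print("skpro not installed" if skpro_version is None else f"skpro {skpro_version}")
```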

:zap: Quickstart

Making probabilistic predictions

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

from skpro.regression.residual import ResidualDouble

# step 1: data specification
X, y = load_diabetes(return_X_y=True, as_frame=True)
# y_test is kept so that predictions can be evaluated against actuals later
X_train, X_new, y_train, y_test = train_test_split(X, y)

# step 2: specifying the regressor - any compatible regressor is valid!
# example - "squaring residuals" regressor
# random forest for mean prediction
# linear regression for variance prediction
reg_mean = RandomForestRegressor()
reg_resid = LinearRegression()
reg_proba = ResidualDouble(reg_mean, reg_resid)

# step 3: fitting the model to training data
reg_proba.fit(X_train, y_train)

# step 4: predicting labels on new data

# probabilistic prediction modes - pick any or multiple

# full distribution prediction
y_pred_proba = reg_proba.predict_proba(X_new)

# interval prediction
y_pred_interval = reg_proba.predict_interval(X_new, coverage=0.9)

# quantile prediction
y_pred_quantiles = reg_proba.predict_quantiles(X_new, alpha=[0.05, 0.5, 0.95])

# variance prediction
y_pred_var = reg_proba.predict_var(X_new)

# mean prediction is same as "classical" sklearn predict, also available
y_pred_mean = reg_proba.predict(X_new)
```
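The two-regressor idea in step 2 (one model for the mean, a second model fitted on the residuals of the first) can be sketched without any dependencies. This is a toy illustration under strong simplifying assumptions, namely one feature, straight-line fits, and a normal predictive distribution; it is not how `ResidualDouble` is implemented, and `fit_line` and `fit_two_stage` are hypothetical helpers.

```python
from statistics import NormalDist, fmean


def fit_line(x, y):
    """Ordinary least squares for y ~ intercept + slope * x (one feature)."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def fit_two_stage(x, y, min_scale=1e-6):
    """Stage 1 fits the mean; stage 2 fits the scale on absolute residuals."""
    b0, b1 = fit_line(x, y)
    abs_resid = [abs(yi - (b0 + b1 * xi)) for xi, yi in zip(x, y)]
    s0, s1 = fit_line(x, abs_resid)

    def predict_proba(x_new):
        mu = b0 + b1 * x_new
        # crude scale estimate: the predicted absolute residual is used
        # directly as the standard deviation, floored to stay positive
        sigma = max(s0 + s1 * x_new, min_scale)
        return NormalDist(mu, sigma)

    return predict_proba


# toy data whose noise grows with x
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.8, 3.3, 3.6, 5.5, 5.4]

predict_proba = fit_two_stage(x, y)
dist = predict_proba(4.0)
lower, upper = dist.inv_cdf(0.05), dist.inv_cdf(0.95)  # central 90% interval
print(dist.mean, dist.stdev, lower, upper)
```

Once a full predictive distribution is available, interval, quantile, variance, and mean predictions all follow from it, which mirrors the prediction modes listed in step 4.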

Evaluating predictions

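```python
# step 5: specifying evaluation metric
from skpro.metrics import CRPS

metric = CRPS()  # continuous ranked probability score - any skpro metric works!

# step 6: evaluate metric, compare predictions to actuals
# y_test holds the held-out targets that correspond to X_new
metric(y_test, y_pred_proba)
# 32.19 (example output; the value varies with the random train/test split)
```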
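For intuition about what the CRPS measures, the score has a well-known closed form when the predictive distribution is normal. The sketch below implements that formula with the standard library only; it is an illustration of the metric, not skpro's `CRPS` class, and the function names are assumptions made for this example.

```python
import math
from statistics import NormalDist


def crps_normal(mu, sigma, y):
    """Closed-form CRPS of a Normal(mu, sigma) prediction for observation y."""
    std_normal = NormalDist()
    z = (y - mu) / sigma
    return sigma * (
        z * (2 * std_normal.cdf(z) - 1)
        + 2 * std_normal.pdf(z)
        - 1 / math.sqrt(math.pi)
    )


def mean_crps(mus, sigmas, ys):
    """Average CRPS over a set of predictions; lower is better."""
    return sum(crps_normal(m, s, y) for m, s, y in zip(mus, sigmas, ys)) / len(ys)


sharp = crps_normal(mu=0.0, sigma=1.0, y=0.0)   # observation at the centre
vague = crps_normal(mu=0.0, sigma=5.0, y=0.0)   # same centre, wider spread
missed = crps_normal(mu=0.0, sigma=1.0, y=3.0)  # observation far from the centre
print(sharp, vague, missed)
```

The score penalises both an unnecessarily wide distribution (`vague`) and a confident distribution that misses the observation (`missed`), and it is expressed in the same units as the target variable.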

:wave: How to get involved

There are many ways to get involved with development of skpro, which is developed by the sktime community. We follow the all-contributors specification: all kinds of contributions are welcome - not just code.

| Documentation | |
|---|---|
| :gift_heart: Contribute | How to contribute to skpro. |
| :school_satchel: Mentoring | New to open source? Apply to our mentoring program! |
| :date: Meetings | Join our discussions, tutorials, workshops, and sprints! |
| :woman_mechanic: Developer Guides | How to further develop the skpro code base. |
| :medal_sports: Contributors | A list of all contributors. |
| :raising_hand: Roles | An overview of our core community roles. |
| :money_with_wings: Donate | Fund sktime and skpro maintenance and development. |
| :classical_building: Governance | How and by whom decisions are made in the sktime community. |

:wave: Citation

To cite skpro in a scientific publication, see citations.

Owner

  • Name: sktime
  • Login: sktime
  • Kind: organization
  • Email: sktime.toolbox@gmail.com

A unified framework for machine learning with time series

Citation (CITATION.rst)

Gressmann, F., Király, F. J., Mateen, B., & Oberhauser, H. (2018). Probabilistic supervised learning. ArXiv:1801.00753 [Cs, Math, Stat]. Retrieved from http://arxiv.org/abs/1801.00753 ::

    @article{skpro,
      archivePrefix = {arXiv},
      eprinttype = {arxiv},
      eprint = {1801.00753},
      primaryClass = {cs, math, stat},
      title = {Probabilistic Supervised Learning},
      url = {http://arxiv.org/abs/1801.00753},
      urldate = {2018-01-03},
      date = {2018-01-02},
      author = {Gressmann, Frithjof and Kir{\'a}ly, Franz J. and Mateen, Bilal and Oberhauser, Harald}
    }

GitHub Events

Total
  • Create event: 69
  • Release event: 11
  • Issues event: 16
  • Watch event: 46
  • Delete event: 31
  • Issue comment event: 133
  • Push event: 139
  • Pull request review comment event: 29
  • Pull request review event: 71
  • Pull request event: 158
  • Fork event: 16
Last Year
  • Create event: 69
  • Release event: 11
  • Issues event: 16
  • Watch event: 46
  • Delete event: 31
  • Issue comment event: 133
  • Push event: 139
  • Pull request review comment event: 29
  • Pull request review event: 71
  • Pull request event: 158
  • Fork event: 16

Committers

Last synced: 6 months ago

All Time
  • Total Commits: 502
  • Total Committers: 22
  • Avg Commits per committer: 22.818
  • Development Distribution Score (DDS): 0.418
Past Year
  • Commits: 94
  • Committers: 8
  • Avg Commits per committer: 11.75
  • Development Distribution Score (DDS): 0.415
Top Committers
Name Email Commits
Franz Király f****y@u****k 292
Frithjof Gressmann h****o@n****e 101
dependabot[bot] 4****] 50
ShreeshaM07 1****7 12
Julian Fong 4****g 8
Vitaly Davydov 1****0@g****m 7
Sai Revanth 1****5 5
Meraldo Antonio 3****o 4
Malik Akbar Hashemi Rafsanjani 7****n 4
setoguchi-naoki 1****i 3
Alex-JG3 4****3 3
Viktor Szépe v****r@s****t 2
Sukrit Jindal 1****t 2
Jof j****y@p****k 1
Anirban Ray 3****a 1
Anshuman Dangwal 1****5 1
Mridul Jain 1****l 1
Nilesh Das n****8@g****m 1
RUPESH-KUMAR01 1****1 1
Ramon Bussing 4****B 1
bhavikar04 7****4 1
duydl 5****l 1
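The all-time summary figures can be reproduced from the commit counts in the table above. The sketch assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is not stated on this page, but it is consistent with the numbers listed.

```python
# commit counts per committer, copied from the Top Committers table
commits = [292, 101, 50, 12, 8, 7, 5, 4, 4, 3, 3, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1]

total = sum(commits)
avg_per_committer = total / len(commits)
# assumed definition: 1 - (commits of the top committer / total commits)
dds = 1 - max(commits) / total

print(total, len(commits), round(avg_per_committer, 3), round(dds, 3))
```

Under this definition, a score near 0 means one person wrote almost all commits, while a score near 1 means the work is spread across many committers.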

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 110
  • Total pull requests: 728
  • Average time to close issues: 4 months
  • Average time to close pull requests: 7 days
  • Total issue authors: 17
  • Total pull request authors: 27
  • Average comments per issue: 1.79
  • Average comments per pull request: 1.33
  • Merged pull requests: 606
  • Bot issues: 1
  • Bot pull requests: 107
Past Year
  • Issues: 16
  • Pull requests: 187
  • Average time to close issues: 9 days
  • Average time to close pull requests: 6 days
  • Issue authors: 5
  • Pull request authors: 12
  • Average comments per issue: 0.75
  • Average comments per pull request: 0.67
  • Merged pull requests: 132
  • Bot issues: 1
  • Bot pull requests: 54
Top Authors
Issue Authors
  • fkiraly (86)
  • ShreeshaM07 (5)
  • julian-fong (3)
  • Alex-JG3 (2)
  • meraldoantonio (2)
  • meh2135 (1)
  • honestee (1)
  • satya-pattnaik (1)
  • joshdunnlime (1)
  • malikrafsan (1)
  • dependabot[bot] (1)
  • gmgeorg (1)
  • bhavikar04 (1)
  • Ram0nB (1)
  • fsaforo1 (1)
Pull Request Authors
  • fkiraly (472)
  • dependabot[bot] (107)
  • ShreeshaM07 (27)
  • julian-fong (24)
  • meraldoantonio (16)
  • SaiRevanth25 (14)
  • malikrafsan (10)
  • tingiskhan (6)
  • VascoSch92 (6)
  • setoguchi-naoki (5)
  • bhavikar04 (5)
  • szepeviktor (4)
  • sukjingitsit (4)
  • JahnaviDhanaSri (3)
  • RUPESH-KUMAR01 (3)
Top Labels
Issue Labels
  • feature request (46)
  • module:probability&simulation (36)
  • enhancement (34)
  • module:regression (32)
  • bug (25)
  • good first issue (17)
  • implementing algorithms (16)
  • maintenance (13)
  • API design (12)
  • interfacing algorithms (10)
  • module:survival&time-to-event (7)
  • implementing framework (6)
  • module:metrics&benchmarking (5)
  • documentation (4)
  • module:tests (4)
  • module:base-framework (3)
  • module:datatypes (3)
  • math&theory (2)
  • module:transformations (1)
Pull Request Labels
  • maintenance (259)
  • enhancement (236)
  • module:probability&simulation (161)
  • module:regression (107)
  • documentation (92)
  • bug (76)
  • release (63)
  • implementing algorithms (43)
  • module:tests (35)
  • module:survival&time-to-event (35)
  • interfacing algorithms (33)
  • module:base-framework (24)
  • module:metrics&benchmarking (20)
  • module:datatypes (16)
  • implementing framework (9)

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 32,194 last-month
  • Total dependent packages: 1
    (may contain duplicates)
  • Total dependent repositories: 1
    (may contain duplicates)
  • Total versions: 53
  • Total maintainers: 2
proxy.golang.org: github.com/sktime/skpro
  • Versions: 27
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 6.5%
Average: 6.7%
Dependent repos count: 6.9%
Last synced: 4 months ago
pypi.org: skpro

A unified framework for tabular probabilistic regression, time-to-event prediction, and probability distributions in python

  • Versions: 26
  • Dependent Packages: 1
  • Dependent Repositories: 1
  • Downloads: 32,194 Last month
Rankings
Stargazers count: 6.6%
Dependent packages count: 7.3%
Forks count: 8.6%
Average: 11.4%
Downloads: 12.5%
Dependent repos count: 22.1%
Maintainers (2)
Last synced: 4 months ago

Dependencies

.github/workflows/cancel.yml actions
  • styfle/cancel-workflow-action 0.9.1 composite
.github/workflows/dependency-review.yml actions
  • actions/checkout v3 composite
  • actions/dependency-review-action v1 composite
.github/workflows/test.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • codecov/codecov-action v3 composite
  • pre-commit/action v3.0.0 composite
  • trilom/file-changes-action v1.2.4 composite
.github/workflows/wheels.yml actions
  • actions/checkout v3 composite
  • actions/download-artifact v2 composite
  • actions/setup-python v4 composite
  • actions/upload-artifact v2 composite
  • conda-incubator/setup-miniconda v2 composite
  • pypa/gh-action-pypi-publish release/v1 composite
pyproject.toml pypi
  • numpy >=1.21.0,<1.25
  • packaging *
  • pandas >=1.1.0
  • scikit-base >=0.4.3
  • scikit-learn >=0.24.0,<1.2.0
  • scipy <2.0.0,>=1.2.0