SurPyval

SurPyval: Survival Analysis with Python - Published in JOSS (2021)

https://github.com/derrynknife/surpyval

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

actuarial-science churn-prediction non-parametric parametric-distribution parametric-methods probability-plot reliability reliability-analysis reliability-engineering risk-analysis survival-analysis weibull

Scientific Fields

Medicine Life Sciences - 63% confidence
Earth and Environmental Sciences Physical Sciences - 40% confidence
Engineering Computer Science - 40% confidence
Last synced: 4 months ago

Repository

A Python package for survival analysis. The most flexible survival analysis package available. SurPyval can work with arbitrary combinations of observed, censored, and truncated data. SurPyval can also fit distributions with 'offsets' with ease, for example the three parameter Weibull distribution.

Basic Info
Statistics
  • Stars: 49
  • Watchers: 3
  • Forks: 5
  • Open Issues: 19
  • Releases: 0
Created about 7 years ago · Last pushed about 1 year ago
Metadata Files
Readme Contributing License Support

README.md


SurPyval - Survival Analysis in Python


Yet another Python survival analysis tool.

This is another pure Python survival analysis tool, so why was it needed? The intent of this package is to mimic the scipy API as closely as possible, with a simple .fit() method for any type of distribution (parametric or non-parametric); other survival analysis packages don't completely mimic that API. Further, at the time of writing, there is no other package that can take an arbitrary combination of observed, censored, and truncated data. Finally, surpyval is unique in that it can be used with multiple parametric estimation methods. This allows an analyst to estimate the parameters of a distribution with a different method if one method fails. The parametric methods available are Maximum Likelihood Estimation (MLE), Probability Plotting (MPP), Mean Square Error (MSE), Method of Moments (MOM), and Maximum Product of Spacing (MPS). SurPyval can, for each type of estimator, take the following types of input data:

| Method | Para/Non-Para | Observed | Censored | Truncated |
| ------ | ------------- | -------- | -------- | --------- |
| MLE | Parametric | Yes | Yes | Yes |
| MPP | Parametric | Yes | Yes | Limited |
| MSE | Parametric | Yes | Yes | Limited |
| MOM | Parametric | Yes | No | No |
| MPS | Parametric | Yes | Yes | No |
| Kaplan-Meier | Non-Parametric | Yes | Right only | Left only |
| Nelson-Aalen | Non-Parametric | Yes | Right only | Left only |
| Fleming-Harrington | Non-Parametric | Yes | Right only | Left only |
| Turnbull | Non-Parametric | Yes | Yes | Yes |
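To make the observed, censored, and truncated columns in the table above concrete, here is a minimal sketch of how each kind of data point could contribute to a Weibull log-likelihood. It is an illustration in plain Python, not SurPyval's implementation. The function names and the flag convention (`0` for observed, `1` for right censored, a left-truncation time of `0.0` meaning "not truncated") are assumptions made for this example.

```python
import math


def weibull_sf(x, alpha, beta):
    """Survival function of a two-parameter Weibull (scale alpha, shape beta)."""
    return math.exp(-((x / alpha) ** beta))


def weibull_pdf(x, alpha, beta):
    """Density of a two-parameter Weibull."""
    return (beta / alpha) * (x / alpha) ** (beta - 1) * weibull_sf(x, alpha, beta)


def log_likelihood(x, c, tl, alpha, beta):
    """Log-likelihood for observed, right-censored, and left-truncated data.

    x  : event or censoring times
    c  : 0 if the time was observed, 1 if it was right censored (assumed convention)
    tl : left-truncation time for each point, 0.0 if not truncated
    """
    total = 0.0
    for xi, ci, ti in zip(x, c, tl):
        if ci == 0:
            # An observed failure contributes the density at xi.
            total += math.log(weibull_pdf(xi, alpha, beta))
        else:
            # A right-censored point only tells us the unit survived past xi.
            total += math.log(weibull_sf(xi, alpha, beta))
        # Left truncation conditions on having survived to ti.
        total -= math.log(weibull_sf(ti, alpha, beta))
    return total
```

Maximising a function of this shape over `alpha` and `beta` is the general idea behind MLE with mixed data; the other estimators in the table use different objective functions, which is why their support for censoring and truncation differs.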

SurPyval also offers many different distributions for users, and because of the flexible implementation, adding new distributions is easy. Further, the power of SurPyval lies in its robust parameter estimation; as such, some distributions, those that are supported on the half real line, can be offset to make a three- or four-parameter version. The currently available distributions are:

| Distribution | Offsetable |
| ------------ | ---------- |
| Weibull | Yes |
| Normal | No |
| LogNormal | Yes |
| Gamma | Yes |
| Beta | No |
| Uniform | No |
| Exponential | Yes |
| Exponentiated Weibull | Yes |
| Gumbel | No |
| Logistic | No |
| LogLogistic | Yes |
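As a rough illustration of what "offsetable" means, the sketch below writes out the survival function of a three-parameter Weibull, where a location parameter shifts the start of the support away from zero. The function name and the parameter names `alpha` (scale), `beta` (shape), and `gamma` (offset) are assumptions for this example and are not taken from the SurPyval source.

```python
import math


def weibull3_sf(x, alpha, beta, gamma):
    """Survival function of a Weibull distribution offset by gamma.

    No failures can occur at or before x = gamma, so survival is 1 there.
    For x > gamma it is the ordinary Weibull evaluated at x - gamma.
    """
    if x <= gamma:
        return 1.0
    return math.exp(-(((x - gamma) / alpha) ** beta))
```

With `gamma = 0.0` this reduces to the usual two-parameter Weibull, which is why only distributions supported on the half real line are candidates for an offset.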

This project spawned from a reliability engineering project. Because reliability engineers have a long history of estimating parameters from a probability plot, SurPyval continues this tradition and ensures that any parametric distribution can have its estimate plotted on a probability plot. These visualisations enable an analyst to get a sense of the goodness of fit of the parametric distribution against the non-parametric estimate.
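The idea behind estimating parameters from a probability plot can be sketched for the Weibull case. Taking logs of the Weibull CDF twice gives a straight line, ln(-ln(1 - F)) = beta * ln(x) - beta * ln(alpha), so a least-squares line through the transformed points yields both parameters. The sketch below assumes complete (uncensored) data and uses Bernard's approximation for the plotting positions; the function name is hypothetical and this is not how SurPyval computes its MPP estimates.

```python
import math


def fit_weibull_by_probability_plot(x):
    """Estimate Weibull (alpha, beta) from a straight line on a probability plot."""
    xs = sorted(x)
    n = len(xs)
    # Bernard's approximation to the median ranks, used as plotting positions.
    F = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    # Transform so that Weibull data falls on a straight line.
    u = [math.log(xi) for xi in xs]
    v = [math.log(-math.log(1.0 - fi)) for fi in F]
    # Ordinary least squares of v on u.
    u_mean = sum(u) / n
    v_mean = sum(v) / n
    sxy = sum((ui - u_mean) * (vi - v_mean) for ui, vi in zip(u, v))
    sxx = sum((ui - u_mean) ** 2 for ui in u)
    slope = sxy / sxx
    intercept = v_mean - slope * u_mean
    # Slope is the shape; the intercept equals -beta * ln(alpha).
    beta = slope
    alpha = math.exp(-intercept / beta)
    return alpha, beta
```

Points that fall close to the fitted line on such a plot suggest the parametric model is a reasonable description of the data, which is the visual goodness-of-fit check described above.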

Install and Quick Intro

SurPyval can be installed via pip using the PyPI repository

```bash
pip install surpyval
```

If you're familiar with survival analysis, and Weibull plotting, the following is a quick start.

```python
from surpyval import Weibull
from surpyval.datasets import BoforsSteel

# Fetch some data that comes with SurPyval
data = BoforsSteel.df

x = data['x']
n = data['n']

model = Weibull.fit(x=x, n=n, offset=True)
model.plot();
```

Weibull Data and Distribution

Documentation

SurPyval is well documented, and improving, at the main documentation.

Development

Dependencies

pip install -r requirements_dev.txt

Testing

Run the testing suite by simply executing:

```bash
pytest
```

or use coverage to get a coverage report:

```bash
coverage run -m pytest  # Run pytest under coverage's watch
coverage report         # Print coverage report
coverage html           # Make a html coverage report (really useful), open htmlcov/index.html
```

Pre-commit

  • Install pre-commit with pip (it's already in requirements_dev.txt)
  • Run pre-commit install which sets up the git hook scripts
  • If you'd like, run pre-commit run --all-files to run the hooks on all files
  • When you go to commit, it will only proceed after all the hooks succeed

Contact

Email derryn if you want any features or to see how SurPyval can be used for you.

JOSS Publication

SurPyval: Survival Analysis with Python
Published
August 11, 2021
Volume 6, Issue 64, Page 3484
Authors
Derryn Knife
Independent researcher
Editor
Dan Foreman-Mackey ORCID
Tags
survival analysis, parameter estimation, censored data, truncated data, maximum likelihood, product spacing estimation, method of moments, mean square error, probability plotting

GitHub Events

Total
  • Issues event: 1
  • Watch event: 4
  • Issue comment event: 1
  • Push event: 7
  • Pull request event: 1
  • Create event: 2
Last Year
  • Issues event: 1
  • Watch event: 4
  • Issue comment event: 1
  • Push event: 7
  • Pull request event: 1
  • Create event: 2

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 668
  • Total Committers: 7
  • Avg Commits per committer: 95.429
  • Development Distribution Score (DDS): 0.355
Past Year
  • Commits: 5
  • Committers: 1
  • Avg Commits per committer: 5.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Derryn Knife d****e@g****m 431
Derryn Knife d****e@D****l 187
Anthony Carbone a****e@g****m 40
Knife d****e@b****m 3
Derryn Knife d****e@d****n 3
Dan Foreman-Mackey f****y@g****m 2
Derryn Knife d****e@D****y 2

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 36
  • Total pull requests: 15
  • Average time to close issues: 4 months
  • Average time to close pull requests: about 6 hours
  • Total issue authors: 9
  • Total pull request authors: 3
  • Average comments per issue: 1.69
  • Average comments per pull request: 0.0
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • derrynknife (16)
  • CamDavidsonPilon (10)
  • lisandrojim (4)
  • scottkds (1)
  • alaskamike (1)
  • Scipiock (1)
  • leester1690 (1)
  • subha000git (1)
  • jspobst (1)
Pull Request Authors
  • derrynknife (17)
  • anthonycarbone (3)
  • dfm (1)
Top Labels
Issue Labels
enhancement (7) good first issue (2) help wanted (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 4,150 last-month
  • Total dependent packages: 1
  • Total dependent repositories: 1
  • Total versions: 24
  • Total maintainers: 1
pypi.org: surpyval

A python package for survival analysis

  • Versions: 24
  • Dependent Packages: 1
  • Dependent Repositories: 1
  • Downloads: 4,150 Last month
  • Docker Downloads: 0
Rankings
Dependent packages count: 3.2%
Stargazers count: 10.4%
Downloads: 12.3%
Average: 12.7%
Forks count: 15.4%
Dependent repos count: 22.1%
Maintainers (1)
Last synced: 4 months ago

Dependencies

.github/workflows/actions.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • actions/upload-artifact v3 composite
requirements.txt pypi
  • lifelines ==0.27.4
  • numba ==0.56.4
  • numpy-indexed ==0.3.5
  • reliability ==0.8.6
requirements_dev.txt pypi
  • black * development
  • coverage * development
  • flake8 * development
  • flake8-pyproject * development
  • mypy * development
  • pre-commit * development
  • pytest * development
setup.py pypi
  • autograd *
  • autograd_gamma *
  • formulaic *
  • matplotlib *
  • numba *
  • numpy *
  • numpy_indexed *
  • pandas *
  • scipy *
pyproject.toml pypi