distr6

R6 object-oriented interface for probability distributions.

https://github.com/xoopr/distr6

Science Score: 51.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
    4 of 23 committers (17.4%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.7%) to scientific vocabulary

Keywords from Contributors

standardization

Repository


Basic Info
Statistics
  • Stars: 102
  • Watchers: 7
  • Forks: 23
  • Open Issues: 1
  • Releases: 44
Created about 7 years ago · Last pushed over 1 year ago

README.md

distr6


What is distr6?

distr6 is a unified and clean interface that organises the probability distributions implemented in R into one R6 object-oriented package, and adds distributions not yet implemented in R. Currently we have 42 probability distributions as well as 11 kernels. Building the package from the ground up and making use of tried and tested design patterns (as per Gamma et al. 1994), distr6 aims to make probability distributions easy to use, understand and analyse.

distr6 extends the work of Peter Ruckdeschel, Matthias Kohl et al., who created the first object-oriented (OO) interface for distributions using S4. Their distr package is currently the gold standard in R for OO distribution handling. Using R6 we aim to take this even further and to create a scalable interface that can continue to grow with the community. Full details of the API and class structure can be found on the distr6 website.

Main Features

distr6 is not intended to replace the base R distribution functions but instead to give an alternative that focuses on distributions as objects that can be manipulated and accessed as required. The main features therefore centre on OOP practices, design patterns and API design. Of particular note:

All distributions in base R introduced as objects with methods for common statistical functions including pdf, cdf, inverse cdf, simulation, mean, variance, skewness and kurtosis

``` r
B <- Binomial$new(prob = 0.5, size = 10)
B$pdf(1:10)
#> [1] 0.0097656250 0.0439453125 0.1171875000 0.2050781250 0.2460937500
#> [6] 0.2050781250 0.1171875000 0.0439453125 0.0097656250 0.0009765625
B$kurtosis()
#> [1] -0.2
B$rand(5)
#> [1] 7 7 4 7 6
summary(B)
#> Binomial Probability Distribution.
#> Parameterised with:
#>
#>       Id Support Value            Tags
#> 1:  prob   [0,1]   0.5 linked,required
#> 2: qprob   [0,1]       linked,required
#> 3:  size      ℕ+    10        required
#>
#>
#> Quick Statistics
#>  Mean:         5
#>  Variance:     2.5
#>  Skewness:     0
#>  Ex. Kurtosis: -0.2
#>
#> Support: {0, 1,...,9, 10}    Scientific Type: ℕ0
#>
#> Traits:     discrete; univariate
#> Properties: symmetric; platykurtic; no skew
```
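As a rough cross-check of the values above, the same quantities can be reproduced with base R alone. This is only a sketch: it uses `dbinom` from the stats package and the textbook closed-form moments of the Binomial distribution, and the variable names (`pdf_vals`, `binom_ex_kurt`, and so on) are illustrative, not part of distr6.

``` r
# Base R cross-check for Binomial(size = 10, prob = 0.5).
# No distr6 objects are used; names below are illustrative only.
size <- 10
prob <- 0.5
q <- 1 - prob

# Point masses at 1..10, comparable with B$pdf(1:10)
pdf_vals <- dbinom(1:10, size = size, prob = prob)

# Textbook closed-form moments of the Binomial distribution
binom_mean     <- size * prob
binom_variance <- size * prob * q
binom_skewness <- (q - prob) / sqrt(size * prob * q)
binom_ex_kurt  <- (1 - 6 * prob * q) / (size * prob * q)

print(pdf_vals)
print(c(mean = binom_mean, variance = binom_variance,
        skewness = binom_skewness, ex_kurtosis = binom_ex_kurt))
```

The printed moments should agree with the "Quick Statistics" block of `summary(B)`: mean 5, variance 2.5, skewness 0 and excess kurtosis -0.2.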

Flexible construction of distributions for common parameterisations

``` r
Exponential$new(rate = 2)
#> Exp(rate = 2)
Exponential$new(scale = 2)
#> Exp(scale = 2)
Normal$new(mean = 0, prec = 2)
#> Norm(mean = 0, prec = 2)
Normal$new(mean = 0, sd = 3)$parameters()
#>      Id Support Value            Tags
#> 1: mean       ℝ     0        required
#> 2: prec      ℝ+       linked,required
#> 3:   sd      ℝ+     3 linked,required
#> 4:  var      ℝ+       linked,required
```
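The "linked" tag in the parameter table indicates that those parameters describe the same quantity in different forms. A minimal base R sketch of what such links usually mean is given below. It assumes the standard definitions (`scale = 1 / rate` for the Exponential; `var = sd^2` and `prec = 1 / var` for the Normal), which are not taken from the distr6 source, and `normal_params` is a hypothetical helper, not a distr6 function.

``` r
# Sketch of linked parameterisations using base R only.
# Assumed (standard) relationships, not read from distr6 itself:
#   Exponential: scale = 1 / rate
#   Normal:      var = sd^2, prec = 1 / var

# Hypothetical helper: supply exactly one of sd, var or prec
# and recover all three.
normal_params <- function(sd = NULL, var = NULL, prec = NULL) {
  supplied <- c(!is.null(sd), !is.null(var), !is.null(prec))
  stopifnot(sum(supplied) == 1)
  if (!is.null(var)) sd <- sqrt(var)
  if (!is.null(prec)) sd <- sqrt(1 / prec)
  c(sd = sd, var = sd^2, prec = 1 / sd^2)
}

x <- c(0.5, 1, 2)

# Exponential with scale = 0.5 is the same density as rate = 2
exp_by_rate  <- dexp(x, rate = 2)
exp_by_scale <- dexp(x, rate = 1 / 0.5)

# Normal with prec = 2 is the same density as sd = sqrt(1 / 2)
p <- normal_params(prec = 2)
norm_by_prec <- dnorm(x, mean = 0, sd = p[["sd"]])
norm_by_sd   <- dnorm(x, mean = 0, sd = sqrt(0.5))

print(normal_params(sd = 3))
print(all.equal(exp_by_rate, exp_by_scale))
print(all.equal(norm_by_prec, norm_by_sd))
```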

Decorators for extending functionality of distributions to more complex modelling methods

``` r
B <- Binomial$new()
decorate(B, "ExoticStatistics")
#> Binomial is now decorated with ExoticStatistics
#> Binom(prob = 0.5, size = 10)
B$survival(2)
#> [1] 0.9453125
decorate(B, "CoreStatistics")
#> Binomial is now decorated with CoreStatistics
#> Binom(prob = 0.5, size = 10)
B$kthmoment(6)
#> Results from numeric calculations are approximate only. Better results may be available.
#> [1] 190
```
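The two decorated results above can be checked by hand in base R. The sketch below assumes that `survival(x)` is `1 - cdf(x)` and that `kthmoment(6)` here is the sixth central moment; both readings are consistent with the printed values but are interpretations, not statements about the distr6 implementation.

``` r
# Base R check of the decorator outputs for Binomial(size = 10, prob = 0.5).
# Assumptions: survival(x) = 1 - cdf(x); kthmoment(6) = sixth central moment.
size <- 10
prob <- 0.5
support <- 0:size
pmf <- dbinom(support, size = size, prob = prob)

# Survival function at 2: P(X > 2)
survival_2 <- pbinom(2, size = size, prob = prob, lower.tail = FALSE)

# Sixth central moment, summed exactly over the finite support
mu <- sum(support * pmf)
central_moment_6 <- sum((support - mu)^6 * pmf)

print(survival_2)
print(central_moment_6)
```

Because the Binomial support is finite, the sum is exact up to floating-point error, whereas the decorator reports its numeric result as approximate.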

Wrappers including truncation, huberization and product distributions for manipulation and composition of distributions.

``` r
B <- Binomial$new()
TruncatedDistribution$new(B, lower = 2, upper = 5) # Or: truncate(B, 2, 5)
#> TruncBinom(Binomprob = 0.5, Binomsize = 10, trunclower = 2, truncupper = 5)
N <- Normal$new()
MixtureDistribution$new(list(B, N), weights = c(0.1, 0.9))
#> Binom wX Norm
ProductDistribution$new(list(B, N))
#> Binom X Norm
```
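To illustrate what these wrappers compute, here is a small base R sketch of the usual definitions. It makes several assumptions that should be checked against the distr6 documentation: truncation is taken over the half-open interval `(lower, upper]`, the mixture density is the weighted sum of the component densities, and the product density treats the components as independent. The functions `trunc_binom_pdf`, `mixture_pdf` and `product_pdf` are hypothetical helpers, not distr6 functions.

``` r
# Sketch of truncation, mixture and product densities in base R.
# Assumed conventions (verify against distr6 before relying on them):
#   truncation interval is (lower, upper]
#   mixture density  = weighted sum of component densities
#   product density  = product of independent component densities

# Truncated Binomial: renormalise the mass inside (lower, upper]
trunc_binom_pdf <- function(x, size, prob, lower, upper) {
  mass <- pbinom(upper, size, prob) - pbinom(lower, size, prob)
  inside <- x > lower & x <= upper
  ifelse(inside, dbinom(x, size, prob) / mass, 0)
}

# Mixture of Binomial(10, 0.5) and a standard Normal (integer x only)
mixture_pdf <- function(x, weights = c(0.1, 0.9)) {
  weights[1] * dbinom(x, 10, 0.5) + weights[2] * dnorm(x)
}

# Product of independent Binomial(10, 0.5) and standard Normal
product_pdf <- function(x, y) {
  dbinom(x, 10, 0.5) * dnorm(y)
}

trunc_vals <- trunc_binom_pdf(0:10, size = 10, prob = 0.5, lower = 2, upper = 5)
print(trunc_vals)
print(sum(trunc_vals))   # truncated masses should sum to 1
print(mixture_pdf(0))
print(product_pdf(5, 0))
```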

Additionally set6 is used for symbolic representation of sets for Distribution typing

``` r
Binomial$new()$traits$type
#> ℕ0
Binomial$new()$properties$support
#> {0, 1,...,9, 10}
```

Usage

distr6 has three primary use-cases:

  1. Upgrading base: Extending the base R distribution functions to classes so that each distribution additionally has basic statistical methods, including expectation and variance, and properties/traits such as discrete/continuous and univariate/multivariate.
  2. Statistics: Implementing decorators and adaptors to manipulate distributions, including distribution composition, as well as functionality for numeric calculations based on any arbitrary distribution.
  3. Modelling: Probabilistic modelling using distr6 objects as the modelling targets. Objects as targets is an understood ML paradigm, and introducing distributions as classes is the first step to implementing probabilistic modelling.

Installation

distr6 can be installed from R-Universe

``` r
# Enable repository from raphaels1
options(repos = c(
  raphaels1 = 'https://raphaels1.r-universe.dev',
  CRAN = 'https://cloud.r-project.org'))

# Download and install distr6 in R
install.packages('distr6')
```

And GitHub

``` r
remotes::install_github("xoopR/distr6")
```

distr6 will not be on CRAN.

Future Plans

Our plans for the next update include

  • A generalised qqplot for comparing any distributions
  • A finalised FunctionImputation decorator with different imputation strategies
  • Discrete distribution subtraction (negative convolution)
  • A wrapper for scaling distributions to a given mean and variance
  • More probability distributions
  • Any other good suggestions made between now and then!

Package Development and Contributing

distr6 is released under the MIT licence with acknowledgements to the LGPL-3 licence of distr. Therefore any contributions to distr6 will also be accepted under the MIT licence. We welcome all bug reports, issues, questions and suggestions which can be raised here but please read through our contributing guidelines for details including our code of conduct.

Acknowledgements

distr6 is the result of a collaboration between many people, universities and institutions across the world, without whom the speed and performance of the package would not be up to the standard it is. Firstly, we acknowledge all the work of Prof. Dr. Peter Ruckdeschel and Prof. Dr. Matthias Kohl in developing the original distr family of packages. Secondly, we acknowledge their significant contributions to the planning and design of distr6, including the distribution and probability family class structures. A team of undergraduates at University College London implemented many of the probability distributions and designed the plotting interface. The team consists of Shen Chen (@ShenSeanChen), Jordan Deenichin (@jdeenichin), Chengyang Gao (@garoc371), Chloe Zhaoyuan Gu (@gzy823), Yunjie He (@RoyaHe), Xiaowen Huang (@w090613), Shuhan Liu (@shliu99), Runlong Yu (@Edwinyrl), Chijing Zeng (@britneyzeng) and Qian Zhou (@yumizhou47). We also want to thank Prof. Dr. Bernd Bischl for discussions about design choices and useful features, particularly advice on the ParameterSet class. Finally, we thank University College London and The Alan Turing Institute for hosting workshops and meetings, and for providing coffee whenever needed.

Owner

  • Name: xoop
  • Login: xoopR
  • Kind: organization
  • Location: London, UK

xoop is a universe of packages for class object-oriented programming in R.

Citation (CITATION)

@article{RJ-2021-055,
  author = {Raphael Sonabend and Franz J. Király},
  title = {{distr6: R6 Object-Oriented Probability Distributions
          Interface in R}},
  year = {2021},
  journal = {{The R Journal}},
  doi = {10.32614/RJ-2021-055},
  url = {https://doi.org/10.32614/RJ-2021-055},
  pages = {444--466},
  volume = {13},
  number = {1}
}

GitHub Events

Total
  • Watch event: 2
Last Year
  • Watch event: 2

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 1,583
  • Total Committers: 23
  • Avg Commits per committer: 68.826
  • Development Distribution Score (DDS): 0.241
Past Year
  • Commits: 109
  • Committers: 4
  • Avg Commits per committer: 27.25
  • Development Distribution Score (DDS): 0.514
Top Committers
| Name | Email | Commits |
|---|---|---|
| Raphael | r****5@u****k | 1,201 |
| RaphaelS1 | r****d@g****m | 178 |
| Ain Toha | n****5@u****k | 41 |
| Michal Lauer | m****5@g****m | 36 |
| jdeenichin | 4****n | 20 |
| john | b****n@g****m | 19 |
| shliu99 | 4****9 | 16 |
| Chijing Zeng | 4****g | 15 |
| RoyaHe | 5****e | 14 |
| Xiaowen Huang | 4****3 | 13 |
| Chengyang Gao | g****1@g****m | 9 |
| gzy823 | g****n@o****m | 4 |
| garoc371 | 4****1 | 4 |
| Shuhan Liu | s****9@g****m | 2 |
| heyunjie | 3****e | 2 |
| github-actions | 4****] | 2 |
| yumizhou47 | 5****7 | 1 |
| Rich FitzJohn | r****n@i****k | 1 |
| Edwinyrl | 3****l | 1 |
| IlyaZar | z****n@w****e | 1 |
| Michael Chirico | c****m@g****m | 1 |
| Edwinyrl | e****l@h****m | 1 |
| ShenSeanChen | 3****n | 1 |

Issues and Pull Requests

Last synced: about 2 years ago

All Time
  • Total issues: 53
  • Total pull requests: 55
  • Average time to close issues: 8 months
  • Average time to close pull requests: 17 days
  • Total issue authors: 9
  • Total pull request authors: 7
  • Average comments per issue: 3.85
  • Average comments per pull request: 1.47
  • Merged pull requests: 51
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 4
  • Pull requests: 10
  • Average time to close issues: 12 days
  • Average time to close pull requests: 6 days
  • Issue authors: 2
  • Pull request authors: 4
  • Average comments per issue: 2.25
  • Average comments per pull request: 1.6
  • Merged pull requests: 10
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • RaphaelS1 (16)
  • bblodfon (3)
  • fkiraly (2)
  • MichaelChirico (1)
  • cbrueffer (1)
  • Rosie23 (1)
Pull Request Authors
  • RaphaelS1 (3)
  • MichalLauer (2)
  • dependabot[bot] (1)
Top Labels
Issue Labels
  • low priority (8)
  • good first issue (6)
  • medium priority (4)
  • analytical (4)
  • on hold (4)
  • enhancement (3)
  • documentation (3)
  • numerical (1)
  • distribution (1)
Pull Request Labels
  • dependencies (1)

Dependencies

DESCRIPTION cran
  • R6 * imports
  • Rcpp * imports
  • checkmate * imports
  • data.table * imports
  • ooplah * imports
  • param6 >= 0.2.4 imports
  • set6 >= 0.2.3 imports
  • stats * imports
  • GoFKernel * suggests
  • R62S3 * suggests
  • actuar * suggests
  • cubature * suggests
  • extraDistr * suggests
  • knitr * suggests
  • plotly * suggests
  • pracma * suggests
  • rmarkdown * suggests
  • testthat * suggests
.github/workflows/make-release.yml actions
  • actions/cache v1 composite
  • actions/checkout v2 composite
  • actions/create-release v1 composite
  • actions/download-artifact v2 composite
  • actions/upload-artifact v2 composite
  • actions/upload-release-asset v1 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
.github/workflows/pkgdown.yml actions
  • actions/cache v1 composite
  • actions/checkout v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
.github/workflows/rcmdcheck.yml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/upload-artifact main composite
  • r-lib/actions/setup-pandoc v1 composite
  • r-lib/actions/setup-r v1 composite
.github/workflows/version-check.yml actions
  • actions/checkout v2.1.1 composite
  • actions/checkout v2 composite