Overlapping

Overlapping: a R package for Estimating Overlapping in Empirical Distributions - Published in JOSS (2018)

https://github.com/masspastore/overlapping

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 5 DOI reference(s) in README and JOSS metadata
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

package r
Last synced: 6 months ago

Repository

Estimation of Overlapping in Empirical Distributions.

Basic Info
  • Host: GitHub
  • Owner: masspastore
  • License: gpl-3.0
  • Language: TeX
  • Default Branch: master
  • Size: 731 KB
Statistics
  • Stars: 8
  • Watchers: 0
  • Forks: 3
  • Open Issues: 2
  • Releases: 0
Topics
package r
Created over 7 years ago · Last pushed over 6 years ago

README.md

overlapping

Estimation of Overlapping in Empirical Distributions

Overlapping can be defined as the area intersected by two or more probability density functions. The main idea of this package is to offer an easy way to quantify the similarity (or the difference) between two or more empirical distributions by using the overlap between their kernel density estimates.
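To make the definition concrete, the following base R sketch estimates the area under the pointwise minimum of two kernel density estimates evaluated on a shared grid. It is only an illustration under stated assumptions: `overlap_sketch` is a hypothetical helper, the grid padding and trapezoidal integration are choices made here, and none of it is claimed to match the package's internal implementation.

```r
# Minimal sketch: overlap of two kernel density estimates on a shared grid.
# `overlap_sketch` is a hypothetical helper, not a function of the package.
overlap_sketch <- function(a, b, n = 1024) {
  # Pad the common range so both curves can decay in the tails
  bw <- max(bw.nrd0(a), bw.nrd0(b))
  lo <- min(a, b) - 3 * bw
  hi <- max(a, b) + 3 * bw

  # Evaluate both densities at the same n equally spaced points
  da <- density(a, from = lo, to = hi, n = n)
  db <- density(b, from = lo, to = hi, n = n)

  # Area under the pointwise minimum, integrated with the trapezoidal rule
  y  <- pmin(da$y, db$y)
  dx <- diff(da$x)
  sum(dx * (head(y, -1) + tail(y, -1)) / 2)
}

set.seed(1)
a <- rnorm(200)
same  <- overlap_sketch(a, a)       # identical samples: close to 1
apart <- overlap_sketch(a, a + 10)  # well-separated samples: close to 0
```

Under this definition the value lies between 0 (no shared area) and 1 (identical densities), up to numerical error from the grid approximation.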

Set up

To install the GitHub version, run the following in R:

```{r}
# if devtools is not installed yet:
# install.packages( "devtools" )
library( devtools )
install_github( "masspastore/overlapping" )
```

Main function

The main function, overlap(), approximates the overlapping area of two or more kernel density estimates computed from empirical data.

  • overlap
    • Input: a list of numerical vectors to be compared; each vector is an element of the list.
    • Output: a list containing a data frame (DD) with the information used for computing the overlap (for graphical purposes only) and the estimated overlapped areas (OV) for each pair of distributions.

Note

The function overlap() calls density() to compute the kernel density estimates, so the estimated overlapping area depends on the method that function uses. The algorithm in density.default disperses the mass of the empirical distribution function over a regular grid of at least 512 points, uses the fast Fourier transform to convolve this approximation with a discretized version of the kernel, and then uses linear approximation to evaluate the density at the specified points (see help(density) for details).
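The dependence on density() can be checked directly in base R. The sketch below uses only documented density() arguments (`n` and `adjust`) and shows that the default grid has 512 points and that changing the bandwidth changes the estimated curve, which in turn would change any overlap computed from it. The variable names are illustrative.

```r
set.seed(42)
x <- rnorm(100)

d_default <- density(x)              # default settings
d_fine    <- density(x, n = 2048)    # finer evaluation grid
d_wide    <- density(x, adjust = 2)  # bandwidth doubled via `adjust`

n_default <- length(d_default$x)       # number of grid points by default
n_fine    <- length(d_fine$x)          # number of grid points requested
bw_ratio  <- d_wide$bw / d_default$bw  # `adjust` multiplies the bandwidth

# A wider bandwidth smooths the curve, lowering its peak
peak_default <- max(d_default$y)
peak_wide    <- max(d_wide$y)
```

Because the smoothed curves differ, overlap estimates obtained with different density() settings are not directly comparable.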

Examples

```{r,results="markup"}
library( overlapping )
library( ggplot2 )  # needed for the ggplot() calls below

set.seed( 20150605 )

## EXAMPLE 1
## creating a list with three different empirical distributions
x <- list( X1 = rnorm(100), X2 = rt(50,8), X3 = rchisq(80,2) )

out <- overlap( x, plot = TRUE )
out$OV # estimated overlapped areas

## EXAMPLE 2
## simulate eight random samples
dataList <- list()
for (j in 1:8) dataList <- c(dataList, list(rnorm(30)))

OV <- overlap(dataList) # compute overlapping for all pairs
head(OV$DD)             # see the first rows of this data set
table(OV$DD$k)          # k indicates the pairs

## plot all pairs
ggplot(OV$DD, aes(x,y1)) + facet_wrap(~k) +
  geom_ribbon(aes(ymin=0,ymax=y1),alpha=.3,fill="red") +
  geom_ribbon(aes(ymin=0,ymax=y2),alpha=.3,fill="blue") +
  xlab("") + ylab("")

## choose a single pair to be represented
K <- "Y1-Y2"
data <- subset(OV$DD, k==K) # create a subset

## plot it
ggplot(data, aes(x,y1)) +
  geom_ribbon(aes(ymin=0,ymax=y1),alpha=.3,fill="red") +
  geom_ribbon(aes(ymin=0,ymax=y2),alpha=.3,fill="blue") +
  ggtitle(paste0("Overlap Y1-Y2 = ",round(OV$OV[K]*100,2),"%")) +
  xlab("") + ylab("")
```

Support/Bug Reports

Users may contact the author at massimiliano.pastore[at]unipd.it for support or to report issues.

References

Pastore, M. (2018). Overlapping: a R package for Estimating Overlapping in Empirical Distributions. The Journal of Open Source Software, 3 (32), 1023. URL: https://doi.org/10.21105/joss.01023

Pastore, M., Calcagnì, A. (2019). Measuring Distribution Similarities Between Samples: A Distribution-Free Overlapping Index. Frontiers in Psychology, 10:1089. URL: https://doi.org/10.3389/fpsyg.2019.01089

Owner

  • Name: Massimiliano Pastore
  • Login: masspastore
  • Kind: user
  • Location: Padova, ITALY
  • Company: Department of Developmental and Social Psychology

JOSS Publication

Overlapping: a R package for Estimating Overlapping in Empirical Distributions
Published
December 05, 2018
Volume 3, Issue 32, Page 1023
Authors
Massimiliano Pastore
Department of Developmental and Social Psychology, University of Padova
Editor
Arfon Smith
Tags
statistics

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 68
  • Total Committers: 4
  • Avg Commits per committer: 17.0
  • Development Distribution Score (DDS): 0.044
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
| Name | Email | Commits |
|---|---|---|
| Massimiliano Pastore | 3****e | 65 |
| Thomas J. Leeper | t****r@g****m | 1 |
| Arfon Smith | a****n | 1 |
| ***** | g****7@g****m | 1 |

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 9
  • Total pull requests: 3
  • Average time to close issues: 9 days
  • Average time to close pull requests: 1 day
  • Total issue authors: 2
  • Total pull request authors: 3
  • Average comments per issue: 3.56
  • Average comments per pull request: 1.0
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • russellpierce (7)
  • soodoku (2)
Pull Request Authors
  • arfon (1)
  • leeper (1)
  • soodoku (1)