Overlapping
Overlapping: a R package for Estimating Overlapping in Empirical Distributions - Published in JOSS (2018)
Science Score: 93.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ✓ DOI references (found 5 DOI references in README and JOSS metadata)
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ✓ JOSS paper metadata (published in Journal of Open Source Software)
Repository
Estimation of Overlapping in Empirical Distributions.
Statistics
- Stars: 8
- Watchers: 0
- Forks: 3
- Open Issues: 2
- Releases: 0
Metadata Files
README.md
overlapping
Estimation of Overlapping in Empirical Distributions
Overlapping can be defined as the area intersected by two or more probability density functions. The main idea of this package is to offer an easy way to quantify the similarity (or the difference) between two or more empirical distributions by using the overlap between their kernel density estimates.
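The definition above can be written compactly for the two-distribution case. This is a sketch of the usual formalization, consistent with the overlapping index described in Pastore & Calcagnì (2019); the symbols $f_A$, $f_B$ and $\eta$ are notation introduced here for illustration:

$$
\eta(A, B) = \int_{-\infty}^{+\infty} \min\left[ f_A(x),\, f_B(x) \right] \, dx
$$

Because both densities integrate to 1, $\eta$ lies between 0 (the distributions share no area) and 1 (the two densities coincide). In practice $f_A$ and $f_B$ are unknown and are replaced by kernel density estimates, so the value obtained is an approximation.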
Set up
To install the GitHub version, type in R:
```{r}
# if devtools is not installed yet:
# install.packages( "devtools" )
library( devtools )
install_github( "masspastore/overlapping" )
```
Main function
The main function, overlap(), provides an approximation of the overlapping area of two or more kernel density estimates computed from empirical data.
- overlap
- Input: a list of numerical vectors to be compared; each vector is an element of the list.
- Output: a list containing a data frame (DD) with the information used for computing overlapping (only for graphical purposes) and the estimated overlapped areas (OV) for each pair of distributions.
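Data often arrive as a data frame rather than as a list of vectors. The following is a minimal sketch of one way to build the expected input with base R's split(); the data frame `scores` and its columns `value` and `group` are hypothetical, and the call to overlap() is guarded so the snippet still runs when the package is not installed.

```r
# Hypothetical long-format data: one measurement column, one grouping column.
set.seed(42)
scores <- data.frame(
  value = c(rnorm(40), rnorm(60, mean = 0.5), rnorm(50, mean = 1)),
  group = rep(c("A", "B", "C"), times = c(40, 60, 50))
)

# split() returns a named list with one numeric vector per group,
# which is the input shape described above.
x <- split(scores$value, scores$group)
str(x)

# Only call overlap() if the package is available.
if (requireNamespace("overlapping", quietly = TRUE)) {
  out <- overlapping::overlap(x)
  print(out$OV)  # estimated overlapped areas
}
```

Naming the list elements is optional, but named elements make the reported pairs easier to read.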
Note
The function overlap() calls density() to compute the kernel density estimates, so the estimated overlapping area depends on the method used by that function. The algorithm in density.default disperses the mass of the empirical distribution function over a regular grid of at least 512 points, uses the fast Fourier transform to convolve this approximation with a discretized version of the kernel, and then uses linear approximation to evaluate the density at the specified points (see help(density) for details).
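To make the dependence on density() concrete, here is a hand-rolled sketch that approximates the overlap of two samples using only base R. It is an illustration of the idea, not the package's internal code: the shared grid, the 3-bandwidth padding, the grid size of 1024 and the Riemann sum are all assumptions chosen for this example.

```r
# Hand-rolled overlap estimate for two samples, using only base R.
set.seed(1)
a <- rnorm(200)
b <- rnorm(200, mean = 1)

# Evaluate both kernel density estimates on the same grid so that they
# can be compared point by point. The padding is an arbitrary choice.
pad <- 3 * max(bw.nrd0(a), bw.nrd0(b))
lo <- min(a, b) - pad
hi <- max(a, b) + pad
da <- density(a, from = lo, to = hi, n = 1024)
db <- density(b, from = lo, to = hi, n = 1024)

# Area under the pointwise minimum of the two curves,
# approximated with a simple Riemann sum.
dx <- diff(da$x)[1]
ov_manual <- sum(pmin(da$y, db$y)) * dx
ov_manual
```

Changing the bandwidth (argument bw) or the kernel in the density() calls changes the estimated curves and therefore the resulting area, which is the sensitivity the note describes.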
Examples
```{r,results="markup"}
library( overlapping )
library( ggplot2 )  # needed for the plots in EXAMPLE 2
set.seed( 20150605 )

## EXAMPLE 1
## creating a list with three different empirical distributions
x <- list( X1 = rnorm(100), X2 = rt(50,8), X3 = rchisq(80,2) )
out <- overlap( x, plot = TRUE )
out$OV # estimated overlapped areas

## EXAMPLE 2
## simulate eight random samples
dataList <- list()
for (j in 1:8) dataList <- c(dataList, list(rnorm(30)))
OV <- overlap(dataList) # compute overlapping for all pairs
head(OV$DD) # see the first rows of this data set
table(OV$DD$k) # k indicates the pairs
## plot all pairs
ggplot(OV$DD, aes(x,y1))+facet_wrap(~k)+
  geom_ribbon(aes(ymin=0,ymax=y1),alpha=.3,fill="red")+
  geom_ribbon(aes(ymin=0,ymax=y2),alpha=.3,fill="blue")+xlab("")+ylab("")
## choose a single pair to be represented
K <- "Y1-Y2"
data <- subset(OV$DD, k==K) # create a subset
## plot it
ggplot(data, aes(x,y1))+
  geom_ribbon(aes(ymin=0,ymax=y1),alpha=.3,fill="red")+
  geom_ribbon(aes(ymin=0,ymax=y2),alpha=.3,fill="blue")+
  ggtitle(paste0("Overlap Y1-Y2 = ",round(OV$OV[K]*100,2),"%"))+xlab("")+ylab("")
```
Support/Bug Reports
Users may contact the author at massimiliano.pastore[at]unipd.it for support or to report issues.
References
Pastore, M. (2018). Overlapping: a R package for Estimating Overlapping in Empirical Distributions. The Journal of Open Source Software, 3 (32), 1023. URL: https://doi.org/10.21105/joss.01023
Pastore, M., Calcagnì, A. (2019). Measuring Distribution Similarities Between Samples: A Distribution-Free Overlapping Index. Frontiers in Psychology, 10:1089. URL: https://doi.org/10.3389/fpsyg.2019.01089
Owner
- Name: Massimiliano Pastore
- Login: masspastore
- Kind: user
- Location: Padova, ITALY
- Company: Department of Developmental and Social Psychology
- Website: https://psicostat.dpss.psy.unipd.it/~massimiliano.pastore/
- Repositories: 1
- Profile: https://github.com/masspastore
JOSS Publication
Overlapping: a R package for Estimating Overlapping in Empirical Distributions
Authors
Department of Developmental and Social Psychology, University of Padova
Tags
statistics
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Massimiliano Pastore | 3****e | 65 |
| Thomas J. Leeper | t****r@g****m | 1 |
| Arfon Smith | a****n | 1 |
| ***** | g****7@g****m | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 9
- Total pull requests: 3
- Average time to close issues: 9 days
- Average time to close pull requests: 1 day
- Total issue authors: 2
- Total pull request authors: 3
- Average comments per issue: 3.56
- Average comments per pull request: 1.0
- Merged pull requests: 3
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- russellpierce (7)
- soodoku (2)
Pull Request Authors
- arfon (1)
- leeper (1)
- soodoku (1)
