mixR
mixR: An R package for Finite Mixture Modeling for Both Raw and Binned Data - Published in JOSS (2022)
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found codemeta.json file)
- ✓ .zenodo.json file (found .zenodo.json file)
- ✓ DOI references (found 14 DOI reference(s) in README)
- ✓ Academic publication links (links to: joss.theoj.org, zenodo.org)
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity (low similarity, 15.5%, to scientific vocabulary)
Repository
R package for fitting finite mixture models for both raw and binned data
Basic Info
- Host: GitHub
- Owner: GaryBAYLOR
- License: gpl-3.0
- Language: R
- Default Branch: master
- Size: 18 MB
Statistics
- Stars: 9
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 2
Metadata Files
README.md
mixR: An R package for finite mixture modeling for both raw and binned data
Why mixR?
The R programming language provides a rich collection of packages for building and analyzing finite mixture models, which are widely used in unsupervised learning tasks such as model-based clustering and density estimation. For example,
- mclust can be used to build Gaussian mixture models with different covariance structures
- mixtools implements parametric and non-parametric mixture models as well as mixtures of Gaussian regressions
- flexmix provides a general framework for finite mixtures of regression models
- mixdist fits mixture models for grouped and conditional data (also called binned data).
To our knowledge, almost all R packages for finite mixture models are designed to take raw data as the modeling input; mixdist is the exception. However, the popular model selection methods based on information criteria or the bootstrap likelihood ratio test (McLachlan, 1987; Feng & McCulloch, 1996; Yu & Harvill, 2019) are not implemented in mixdist.
mixR is a package that aims to bridge this gap and to unify the interface for finite mixture modeling for both raw and binned data.
Installation
For the stable version (pre-compiled for Windows and OS X), please install from CRAN:
```r
install.packages('mixR')
```
To get the latest development version from GitHub:
```r
install.packages('devtools')
devtools::install_github('garybaylor/mixR')
```
Examples
- Fitting a normal mixture model
```r
library(mixR)

# generate data from a Normal mixture model
set.seed(102)
x1 = rmixnormal(1000, c(0.3, 0.7), c(-2, 3), c(2, 1))

# fit a Normal mixture model
mod1 = mixfit(x1, ncomp = 2)

# plot the fitted model
plot(mod1)

# fit a Normal mixture model (equal variance)
mod1_ev = mixfit(x1, ncomp = 2, ev = TRUE)
```
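For readers curious about what fitting a two-component normal mixture involves, the following is a minimal base-R sketch of the EM algorithm, a standard way to fit such models. It is not mixR's implementation: the helper name `em_normal2`, the median-split starting values, and the stopping rule are all assumptions made for illustration, and the data are simulated with base R rather than `rmixnormal`.

```r
# Minimal EM sketch for a two-component normal mixture (base R only).
# `em_normal2` is a hypothetical helper, not part of mixR.
em_normal2 <- function(x, max_iter = 500, tol = 1e-8) {
  # mixture log-likelihood for the current parameter values
  mix_loglik <- function(p, m, s) {
    sum(log(p[1] * dnorm(x, m[1], s[1]) + p[2] * dnorm(x, m[2], s[2])))
  }
  # crude starting values (assumption): split the data at the median
  lo <- x[x <= median(x)]
  hi <- x[x > median(x)]
  pi_k <- c(0.5, 0.5)
  mu   <- c(mean(lo), mean(hi))
  sd_k <- c(sd(lo), sd(hi))
  loglik_old <- mix_loglik(pi_k, mu, sd_k)
  for (iter in seq_len(max_iter)) {
    # E-step: posterior probability that each point belongs to each component
    dens <- cbind(pi_k[1] * dnorm(x, mu[1], sd_k[1]),
                  pi_k[2] * dnorm(x, mu[2], sd_k[2]))
    resp <- dens / rowSums(dens)
    # M-step: weighted updates of proportions, means, and standard deviations
    nk   <- colSums(resp)
    pi_k <- nk / length(x)
    mu   <- colSums(resp * x) / nk
    sd_k <- sqrt(c(sum(resp[, 1] * (x - mu[1])^2),
                   sum(resp[, 2] * (x - mu[2])^2)) / nk)
    # stop once the log-likelihood no longer improves noticeably
    loglik <- mix_loglik(pi_k, mu, sd_k)
    if (abs(loglik - loglik_old) < tol) break
    loglik_old <- loglik
  }
  list(pi = pi_k, mu = mu, sd = sd_k, loglik = loglik, iter = iter)
}

# simulate from a normal mixture: draw component labels, then the values
set.seed(102)
n <- 1000
z <- sample(1:2, n, replace = TRUE, prob = c(0.3, 0.7))
x <- rnorm(n, mean = c(-2, 3)[z], sd = c(2, 1)[z])

fit <- em_normal2(x)
print(fit[c("pi", "mu", "sd")])
```

With well-separated components like these, the estimates should land near the generating values; with heavily overlapping components, EM can be sensitive to the starting values, which is one reason a dedicated package is preferable to a sketch like this.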
- Fitting a Weibull mixture model
```r
# generate data from a Weibull mixture model
x2 = rmixweibull(1000, c(0.4, 0.6), c(0.6, 1.3), c(0.1, 0.1))
mod2_weibull = mixfit(x2, family = 'weibull', ncomp = 2)
```
- Fitting a mixture model with binned data
```r
head(Stamp2)
##     lower  upper freq
## 1  0.0595 0.0605    1
## 5  0.0635 0.0645    2
## 6  0.0645 0.0655    1
## 7  0.0655 0.0665    1
## 9  0.0675 0.0685    1
## 10 0.0685 0.0695    7

mod_binned = mixfit(Stamp2, ncomp = 7, family = 'weibull')
plot(mod_binned)

# data binned from numeric data
x1_binned = bin(x1, seq(min(x1), max(x1), length = 30))
mod1_binned = mixfit(x1_binned, ncomp = 2)
```
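To make the binned input format concrete, here is a base-R sketch that turns a numeric vector into a three-column table of lower bound, upper bound, and frequency, the same layout as `Stamp2`. It only illustrates the data format: `bin_sketch` is a hypothetical helper, not mixR's `bin()`, and dropping empty bins is an assumption suggested by the gaps in the `Stamp2` row names.

```r
# Hypothetical helper illustrating the lower/upper/freq layout (not mixR's bin()).
bin_sketch <- function(x, brks) {
  # assign each value to an interval; include.lowest keeps min(x) in the first bin
  grp  <- cut(x, breaks = brks, include.lowest = TRUE)
  freq <- as.vector(table(grp))
  out  <- data.frame(lower = brks[-length(brks)],
                     upper = brks[-1],
                     freq  = freq)
  out[out$freq > 0, ]  # assumption: bins with no observations are dropped
}

set.seed(1)
x <- rnorm(200)
binned <- bin_sketch(x, seq(min(x), max(x), length.out = 10))
head(binned)
```

Because the breaks span `min(x)` to `max(x)`, every observation falls in some bin, so the frequencies sum to `length(x)`.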
- Mixture model selection by BIC
```r
# Selecting the best g for Normal mixture model
s_normal = select(x2, ncomp = 2:6)

# Selecting the best g for Weibull mixture model
s_weibull = select(x2, ncomp = 2:6, family = 'weibull')

plot(s_weibull)
plot(s_normal)
```
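As background on the criterion itself, the sketch below computes BIC by hand in base R under the common convention BIC = -2 log L + k log(n), where smaller is better. The helper names `n_par_normal` and `bic_sketch` are hypothetical, and the sign convention is an assumption about the general definition rather than a statement about how mixR reports it. The one-component case is used because its maximum likelihood estimates are available in closed form, so the result can be checked against base R's `BIC()`.

```r
# Free parameters in a g-component normal mixture (hypothetical helper):
# (g - 1) mixing proportions + g means + g standard deviations (or 1 if shared)
n_par_normal <- function(g, equal_var = FALSE) {
  (g - 1) + g + (if (equal_var) 1 else g)
}

# BIC under the "smaller is better" convention (an assumption, see lead-in)
bic_sketch <- function(loglik, k, n) -2 * loglik + k * log(n)

set.seed(1)
x <- rnorm(100, mean = 1, sd = 2)
n <- length(x)

# g = 1 has a closed-form MLE, so no iterative fitting is needed here
mu_hat <- mean(x)
sd_hat <- sqrt(mean((x - mu_hat)^2))  # MLE divides by n, not n - 1
ll     <- sum(dnorm(x, mu_hat, sd_hat, log = TRUE))
bic1   <- bic_sketch(ll, n_par_normal(1), n)

# cross-check against base R's BIC() for an intercept-only linear model
stopifnot(isTRUE(all.equal(bic1, BIC(lm(x ~ 1)))))
bic1
```

For g > 1 the log-likelihood has to come from a fitted mixture, and the parameter count grows as 3g - 1 (or 2g with equal variances), which is what penalizes larger models in the comparison.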
- Mixture model selection by bootstrap likelihood ratio test (LRT)
```r
b1 = bs.test(x1, ncomp = c(2, 3))
plot(b1, main = 'Bootstrap LRT for Normal Mixture Models (g = 2 vs g = 3)')
b1$pvalue

b2 = bs.test(x2, ncomp = c(2, 4))
plot(b2, main = 'Bootstrap LRT for Normal Mixture Models (g = 2 vs g = 4)')
b2$pvalue
```
For more examples please check the vignette An Introduction to mixR.
Contributor Code of Conduct
Everyone is welcome to contribute to the project by reporting issues, posting feature requests, updating documentation, submitting pull requests, or contacting the project maintainer directly. To maintain a friendly atmosphere and to collaborate in a fun and productive way, we expect contributors to abide by the Contributor Code of Conduct.
Citation
Yu, Y., (2022). mixR: An R package for Finite Mixture Modeling for Both Raw and Binned Data. Journal of Open Source Software, 7(69), 4031, https://doi.org/10.21105/joss.04031
BibTeX information
@article{Yu2022,
doi = {10.21105/joss.04031},
url = {https://doi.org/10.21105/joss.04031},
year = {2022},
publisher = {The Open Journal},
volume = {7},
number = {69},
pages = {4031},
author = {Youjiao Yu},
title = {mixR: An R package for Finite Mixture Modeling for Both Raw and Binned Data},
journal = {Journal of Open Source Software}
}
Owner
- Name: cookie_monster
- Login: GaryBAYLOR
- Kind: user
- Location: Bay Area
- Repositories: 3
- Profile: https://github.com/GaryBAYLOR
Data Scientist
GitHub Events
Total
- Issues event: 1
Last Year
- Issues event: 1
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Youjiao (Gary) Yu | j****o@g****m | 66 |
| soodoku | g****7@g****m | 1 |
| RetoSchmucki | r****7@g****m | 1 |
| Xiaozhen Han | x****n@X****l | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 10
- Total pull requests: 29
- Average time to close issues: about 1 month
- Average time to close pull requests: 1 day
- Total issue authors: 3
- Total pull request authors: 3
- Average comments per issue: 2.0
- Average comments per pull request: 0.07
- Merged pull requests: 29
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: about 2 months
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- soodoku (8)
- DzmitryGB (1)
- welch16 (1)
Pull Request Authors
- GaryBAYLOR (27)
- RetoSchmucki (2)
- soodoku (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- R >= 3.5.0 depends
- Rcpp >= 1.0.6 imports
- ggplot2 >= 3.3.3 imports
- graphics * imports
- stats * imports
- knitr * suggests
- mockery * suggests
- rmarkdown * suggests
- testthat >= 3.0.0 suggests