gmm_diag and gmm_full

gmm_diag and gmm_full: C++ classes for multi-threaded Gaussian mixture models and Expectation-Maximisation - Published in JOSS (2017)

https://github.com/conradsnicta/armadillo-gmm

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 1 DOI reference(s) in JOSS metadata
✓ Academic publication links: Links to joss.theoj.org
○ Committers with academic emails
○ Institutional organization owner
✓ JOSS paper metadata: Published in Journal of Open Source Software

Keywords

armadillo clustering clustering-algorithm cpp em-algorithm expectation-maximization gaussian-mixture-models gmm k-means k-means-clustering machine-learning mapreduce openmp statistics

Last synced: 6 months ago

Repository

gmm_diag and gmm_full: C++ classes for multi-threaded Gaussian mixture models and Expectation-Maximisation

Basic Info

Host: GitHub
Owner: conradsnicta
License: apache-2.0
Language: C++
Default Branch: main
Homepage:
Size: 1000 Bytes

Statistics

Stars: 11
Watchers: 1
Forks: 4
Open Issues: 0
Releases: 0

Created over 8 years ago · Last pushed over 3 years ago

Metadata Files

Readme

README.md

gmm_diag and gmm_full: C++ classes for multi-threaded Gaussian mixture models and Expectation-Maximisation

The gmm_diag and gmm_full classes are included in recent versions of the Armadillo C++ library.
Documentation for the gmm_diag and gmm_full classes is available online: http://arma.sourceforge.net/docs.html#gmm_diag.
The gmm_diag and gmm_full classes provide multi-threaded (parallelised) implementations of Gaussian mixture models (GMMs) and the associated k-means and Expectation-Maximisation (EM) training algorithms.
The gmm_diag class is specifically tailored for diagonal covariance matrices (all entries outside the main diagonal in each covariance matrix are assumed to be zero), while the gmm_full class is tailored for full covariance matrices. The gmm_diag class is typically much faster to train and use than the gmm_full class, at the potential cost of some reduction in modelling accuracy.
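
To illustrate why the diagonal case is cheaper, the following is a minimal standalone sketch of a diagonal-covariance Gaussian log-density. It uses only the C++ standard library and is not Armadillo's implementation; the function name `log_gauss_diag` and its parameters are hypothetical. With a diagonal covariance the density factorises over dimensions, so one evaluation costs O(d) operations, whereas a full covariance generally needs O(d^2) operations per evaluation even with a precomputed inverse.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Log-density of a Gaussian whose covariance matrix is diagonal.
// `var` holds the diagonal entries (per-dimension variances), assumed positive.
// Hypothetical helper for illustration only; not part of Armadillo.
double log_gauss_diag(const std::vector<double>& x,
                      const std::vector<double>& mean,
                      const std::vector<double>& var)
{
    assert(x.size() == mean.size() && x.size() == var.size());

    const double log_2pi = std::log(2.0 * std::acos(-1.0));

    double acc = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double diff = x[i] - mean[i];
        // Each dimension contributes independently: no off-diagonal terms.
        acc += log_2pi + std::log(var[i]) + diff * diff / var[i];
    }
    return -0.5 * acc;
}
```

Because the dimensions are independent, the log-density of a d-dimensional point equals the sum of d one-dimensional log-densities, which is the modelling restriction that the diagonal assumption imposes.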
The interface for the gmm_diag and gmm_full classes gives the user full control over the parameters for GMM fitting, as well as easy and flexible access to the trained model. Specifically, the two classes contain functions for likelihood evaluation, vector quantisation, histogram generation, data synthesis, and parameter modification, in addition to training (learning) the GMM parameters via the EM algorithm. The classes use several techniques to improve numerical stability and promote convergence of EM-based training, such as keeping as much of the internal computation as possible in the log domain, and ensuring that the covariance matrices stay positive-definite.
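
As an illustration of working in the log domain, the sketch below evaluates a mixture log-likelihood with the standard log-sum-exp trick, which factors out the largest term before exponentiating. It is a standalone example using only the C++ standard library; the names `log_sum_exp` and `mixture_log_p` are hypothetical, and whether the classes use exactly this formulation internally is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Computes log(sum_k exp(v[k])) by factoring out the largest element,
// so that no exponential overflows and at least one term equals exp(0) = 1.
// Hypothetical helper for illustration only.
double log_sum_exp(const std::vector<double>& v)
{
    assert(!v.empty());
    const double vmax = *std::max_element(v.begin(), v.end());
    if (!std::isfinite(vmax)) {
        return vmax;  // non-finite maximum: return it unchanged
    }
    double sum = 0.0;
    for (double value : v) {
        sum += std::exp(value - vmax);
    }
    return vmax + std::log(sum);
}

// Mixture log-likelihood: log p(x) = log sum_k exp(log w_k + log N_k(x)),
// given per-component log-weights and log-densities.
double mixture_log_p(const std::vector<double>& log_weights,
                     const std::vector<double>& log_densities)
{
    assert(log_weights.size() == log_densities.size());
    std::vector<double> terms(log_weights.size());
    for (std::size_t k = 0; k < terms.size(); ++k) {
        terms[k] = log_weights[k] + log_densities[k];
    }
    return log_sum_exp(terms);
}
```

A naive evaluation of two components with log-densities of -1000 would compute exp(-1000), which underflows to zero in double precision and yields a log-likelihood of negative infinity; the version above returns -1000 + log(2).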
To achieve multi-threading, the k-means and EM training algorithms have been reformulated into a MapReduce-like framework and implemented with the aid of OpenMP pragma directives. As a result, training runs considerably faster on multi-core machines when OpenMP is enabled during compilation (for example, via the -fopenmp option in the GCC and Clang compilers).
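
The MapReduce-like structure can be sketched with a single k-means update on one-dimensional data. This is a simplified standalone illustration, not the code used in the classes: the function `kmeans_step_1d` and the fixed chunking scheme are assumptions made for the example. Each chunk accumulates its own partial sums and counts (the map step), and the partial results are then merged serially (the reduce step). Without -fopenmp the pragma is ignored and the loop runs on a single thread with the same result.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One k-means mean update for 1-D data, organised as map + reduce.
// Hypothetical helper for illustration only.
std::vector<double> kmeans_step_1d(const std::vector<double>& data,
                                   const std::vector<double>& means,
                                   int n_chunks)
{
    assert(!means.empty() && n_chunks > 0);
    const std::size_t n_means = means.size();
    const std::size_t n_points = data.size();
    const std::size_t chunks = static_cast<std::size_t>(n_chunks);

    // One private accumulator per chunk, so threads never write to the same slot.
    std::vector<std::vector<double>> sums(chunks, std::vector<double>(n_means, 0.0));
    std::vector<std::vector<std::size_t>> counts(chunks, std::vector<std::size_t>(n_means));

    // Map: assign each point in a chunk to its nearest mean and accumulate.
    #pragma omp parallel for
    for (int c = 0; c < n_chunks; ++c) {
        const std::size_t ci = static_cast<std::size_t>(c);
        const std::size_t begin = n_points * ci / chunks;
        const std::size_t end = n_points * (ci + 1) / chunks;
        for (std::size_t i = begin; i < end; ++i) {
            std::size_t best = 0;
            for (std::size_t k = 1; k < n_means; ++k) {
                if (std::fabs(data[i] - means[k]) < std::fabs(data[i] - means[best])) {
                    best = k;
                }
            }
            sums[ci][best] += data[i];
            counts[ci][best] += 1;
        }
    }

    // Reduce: merge the per-chunk accumulators and form the new means.
    std::vector<double> updated(means);
    for (std::size_t k = 0; k < n_means; ++k) {
        double total = 0.0;
        std::size_t n = 0;
        for (std::size_t c = 0; c < chunks; ++c) {
            total += sums[c][k];
            n += counts[c][k];
        }
        if (n > 0) {
            updated[k] = total / static_cast<double>(n);  // empty clusters keep their old mean
        }
    }
    return updated;
}
```

Keeping the accumulators private to each chunk avoids locks in the parallel region; the serial reduce touches only chunks × means values, which is small compared with the data.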

Owner

Name: conradsnicta
Login: conradsnicta
Kind: user
Location: San Francisco

Repositories: 3
Profile: https://github.com/conradsnicta

https://arma.sourceforge.net https://coot.sourceforge.io

JOSS Publication

gmm_diag and gmm_full: C++ classes for multi-threaded Gaussian mixture models and Expectation-Maximisation

Published

October 16, 2017

DOI

10.21105/joss.00365

Volume 2, Issue 18, Page 365

Authors

Conrad Sanderson

Data61, CSIRO, Australia, University of Queensland, Australia, Arroyo Consortium

Ryan Curtin

Symantec Corporation, USA, Arroyo Consortium

Editor

Arfon Smith

Committers

Last synced: 7 months ago

All Time

Total Commits: 1
Total Committers: 1
Avg Commits per committer: 1.0
Development Distribution Score (DDS): 0.0

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
conrad	c**d@l**n	1

Committer Domains (Top 20 + Academic)

localhost.localdomain: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 0
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 0
Total pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0
