diceR

Diverse Cluster Ensemble in R

https://github.com/alinetalhouk/dicer

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
    2 of 9 committers (22.2%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (22.3%) to scientific vocabulary
Last synced: 7 months ago

Repository

Statistics
  • Stars: 38
  • Watchers: 6
  • Forks: 10
  • Open Issues: 3
  • Releases: 23
Created over 9 years ago · Last pushed 10 months ago
Metadata Files
Readme Changelog License

README.Rmd

---
output: github_document
---



```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
# diceR 


[![R-CMD-check](https://github.com/AlineTalhouk/diceR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/AlineTalhouk/diceR/actions/workflows/R-CMD-check.yaml)
[![Codecov test coverage](https://codecov.io/gh/AlineTalhouk/diceR/graph/badge.svg)](https://app.codecov.io/gh/AlineTalhouk/diceR)
[![CRAN status](https://www.r-pkg.org/badges/version/diceR)](https://CRAN.R-project.org/package=diceR)
[![CRAN RStudio mirror downloads](https://cranlogs.r-pkg.org/badges/grand-total/diceR?color=orange)](https://r-pkg.org/pkg/diceR)



## Overview

The goal of `diceR` is to provide a systematic framework for generating diverse cluster ensembles in R. Cluster analysis involves many nuanced choices. We provide a process and a suite of functions and tools for cluster discovery, guiding the user through the generation of diverse clustering solutions from data, ensemble formation, algorithm selection, and the arrival at a final consensus solution. We have also developed visual and analytical validation tools to help assess the final result. The wrapper function `dice()` lets the user obtain and assess results easily. The package is thus accessible to end users with limited statistical knowledge, while informaticians and statisticians retain full access to the underlying functions, which are easily extended. More details can be found in our companion paper published in [BMC Bioinformatics](https://doi.org/10.1186/s12859-017-1996-y).

## Installation

You can install `diceR` from CRAN with:

```{r install_CRAN, message=FALSE, eval=FALSE}
install.packages("diceR")
```

Or get the latest development version from GitHub:

```{r install_github, message=FALSE, eval=FALSE}
# install.packages("devtools")
devtools::install_github("AlineTalhouk/diceR")
```

## Example

The following example shows how to use the main function of the package, `dice()`. The data matrix `hgsc` contains a subset of gene expression measurements from High Grade Serous Carcinoma ovarian cancer patients, taken from the publicly available datasets of The Cancer Genome Atlas. Samples are in rows and features are in columns. We specify the number (or a range) of clusters `nk`, computed over `reps` subsamples of the data, each containing 80% of the full samples. We also specify the clustering `algorithms` to be used and the ensemble functions used to aggregate them in `cons.funs`.

```{r example, results='hide'}
library(diceR)
data(hgsc)
obj <- dice(
  hgsc,
  nk = 4,
  reps = 5,
  algorithms = c("hc", "diana"),
  cons.funs = c("kmodes", "majority"),
  progress = FALSE,
  verbose = FALSE
)
```

The first few cluster assignments are shown below:

```{r assignments}
knitr::kable(head(obj$clusters))
```

You can also compare the base `algorithms` with the `cons.funs` using internal evaluation indices:

```{r compare}
knitr::kable(obj$indices$ii$`4`)
```
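
To make the subsampling idea behind the example more concrete, here is a minimal base-R sketch of resampling-based consensus clustering. It is not the code `diceR` runs internally. The toy data, the variable names (`p_item`, `ensemble`, `consensus`), and the choice of average-linkage hierarchical clustering with a co-association matrix are all assumptions made for illustration, and the package's own consensus functions may work differently.

```r
# Base-R sketch of resampling-based consensus clustering.
# Illustrative only: this is NOT diceR's internal implementation.
set.seed(1)

# Toy data (assumption): 30 samples in rows, 5 features in columns,
# drawn from two shifted groups
x <- rbind(
  matrix(rnorm(15 * 5, mean = 0), nrow = 15),
  matrix(rnorm(15 * 5, mean = 4), nrow = 15)
)
rownames(x) <- paste0("s", seq_len(nrow(x)))

k <- 2         # number of clusters (plays the role of `nk`)
reps <- 5      # number of subsamples (plays the role of `reps`)
p_item <- 0.8  # hypothetical name: proportion of samples per subsample

n <- nrow(x)

# One column of cluster labels per subsample; NA marks samples left out
ensemble <- matrix(
  NA_integer_, nrow = n, ncol = reps,
  dimnames = list(rownames(x), paste0("R", seq_len(reps)))
)
for (r in seq_len(reps)) {
  idx <- sort(sample(n, size = floor(p_item * n)))
  hc <- hclust(dist(x[idx, ]), method = "average")
  ensemble[idx, r] <- cutree(hc, k = k)
}

# Co-association matrix: among the subsamples containing both samples,
# the fraction in which the two samples share a cluster label
together <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
sampled <- together
for (r in seq_len(reps)) {
  present <- !is.na(ensemble[, r])
  same <- outer(ensemble[, r], ensemble[, r], "==")
  same[is.na(same)] <- FALSE
  together <- together + same
  sampled <- sampled + outer(present, present, "&")
}
consensus <- together / pmax(sampled, 1)

# Final assignment: cluster the samples on the dissimilarity 1 - consensus
final <- cutree(hclust(as.dist(1 - consensus), method = "average"), k = k)
table(final)
```

Working with a co-association matrix sidesteps the label-switching problem: cluster labels from different subsamples never have to be matched to one another, because only the question "were these two samples grouped together?" is counted.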


## Pipeline

This figure is a visual schematic of the pipeline that `dice()` implements.

![Ensemble Clustering pipeline.](man/figures/pipeline.png)


Please visit the [overview](https://alinetalhouk.github.io/diceR/articles/overview.html "diceR overview") page for more detail.

Owner

  • Name: Aline Talhouk
  • Login: AlineTalhouk
  • Kind: user

GitHub Events

Total
  • Create event: 1
  • Issues event: 2
  • Release event: 1
  • Watch event: 5
  • Push event: 10
Last Year
  • Create event: 1
  • Issues event: 2
  • Release event: 1
  • Watch event: 5
  • Push event: 10

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 1,069
  • Total Committers: 9
  • Avg Commits per committer: 118.778
  • Development Distribution Score (DDS): 0.09
Past Year
  • Commits: 21
  • Committers: 1
  • Avg Commits per committer: 21.0
  • Development Distribution Score (DDS): 0.0
Top Committers
| Name | Email | Commits |
|---|---|---|
| Derek Chiu | d****u@b****a | 973 |
| gliu@bccrc.ca | g****u@a****a | 41 |
| Aline Talhouk | a****k@b****a | 35 |
| gliu@bccrc.ca | g****u@c****a | 7 |
| Derek Chiu | d****u@s****a | 5 |
| Billy Chen | b****n@c****a | 3 |
| Dustin Johnson | d****n@c****a | 2 |
| liujohnson118 | g****u@b****a | 2 |
| JakeNel28 | j****8@g****m | 1 |

Issues and Pull Requests

Last synced: 8 months ago

All Time
  • Total issues: 82
  • Total pull requests: 21
  • Average time to close issues: 3 months
  • Average time to close pull requests: 28 days
  • Total issue authors: 34
  • Total pull request authors: 5
  • Average comments per issue: 3.52
  • Average comments per pull request: 0.9
  • Merged pull requests: 17
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: 7 days
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • AlineTalhouk (25)
  • dchiu911 (13)
  • jenj133 (5)
  • billyhpchen (3)
  • mattpaletta (2)
  • ghost (2)
  • bw4sz (2)
  • tiagochst (2)
  • JoeDonBonner (2)
  • angel-bee2018 (1)
  • phiala (1)
  • carcarre11 (1)
  • nobody1999 (1)
  • JakeNel28 (1)
  • bernard-liew (1)
Pull Request Authors
  • dchiu911 (16)
  • Dustin21 (2)
  • romainfrancois (1)
  • JakeNel28 (1)
  • jerryji1993 (1)
Top Labels
Issue Labels
  • discussion (11)
  • bug (10)
  • enhancement (8)
  • feature (6)
  • question (4)
  • task (4)
  • Priority: Low (2)
  • Priority: Medium (1)
  • help wanted (1)
Pull Request Labels
  • feature (1)

Packages

  • Total packages: 2
  • Total downloads:
    • cran 627 last-month
  • Total dependent packages: 1
    (may contain duplicates)
  • Total dependent repositories: 2
    (may contain duplicates)
  • Total versions: 25
  • Total maintainers: 1
cran.r-project.org: diceR

Diverse Cluster Ensemble in R

  • Versions: 24
  • Dependent Packages: 1
  • Dependent Repositories: 2
  • Downloads: 627 Last month
Rankings
Forks count: 6.8%
Stargazers count: 8.4%
Average: 14.5%
Dependent packages count: 18.1%
Dependent repos count: 19.2%
Downloads: 20.1%
Maintainers (1)
Last synced: 8 months ago
conda-forge.org: r-dicer
  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 34.0%
Forks count: 40.9%
Stargazers count: 41.4%
Average: 41.9%
Dependent packages count: 51.2%
Last synced: 8 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.5 depends
  • NMF * imports
  • RankAggreg * imports
  • Rcpp * imports
  • abind * imports
  • assertthat * imports
  • clValid * imports
  • class * imports
  • clue * imports
  • clusterCrit * imports
  • dplyr >= 0.7.5 imports
  • ggplot2 * imports
  • infotheo * imports
  • klaR * imports
  • magrittr * imports
  • mclust * imports
  • methods * imports
  • purrr >= 0.2.3 imports
  • stringr * imports
  • tidyr * imports
  • yardstick * imports
  • RColorBrewer * suggests
  • Rtsne * suggests
  • apcluster * suggests
  • cluster * suggests
  • covr * suggests
  • dbscan * suggests
  • e1071 * suggests
  • kernlab * suggests
  • knitr * suggests
  • kohonen * suggests
  • mixedClust * suggests
  • pander * suggests
  • poLCA * suggests
  • progress * suggests
  • rlang * suggests
  • rmarkdown * suggests
  • sigclust * suggests
  • testthat * suggests