https://github.com/cbg-ethz/graphclust_neurips
Network-Based Clustering of Pan-Cancer Data Accounting for Clinical Covariates
This repository contains the code to reproduce the results of the NeurIPS 2022 LMRL workshop paper "Network-Based Clustering of Pan-Cancer Data Accounting for Clinical Covariates".
Installation
To install the package, it suffices to run
R CMD INSTALL path/to/graphClust
from a terminal, or make install from within the package source folder.
Because the package is hosted on GitHub, it can also be installed from an R session with the install_github function from devtools:
```
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("Rgraphviz", "RBGL"))

library("devtools")
install_github("cbg-ethz/graphClust_NeurIPS")
```
graphClust requires R >= 3.5, and depends on
pcalg, reshape2, BiDAG (>= 2.0.2),
RBGL, clue and grDevices.
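Before installing, it can help to check which of the dependencies listed above are already available. The following is a minimal sketch using only base R; the variable names (deps, installed, missing_deps, r_ok, bidag_ok) are illustrative and not part of graphClust.

```r
# Packages named in the dependency list above.
deps <- c("pcalg", "reshape2", "BiDAG", "RBGL", "clue", "grDevices")

# requireNamespace() returns FALSE instead of raising an error
# when a package is not installed.
installed <- vapply(deps, requireNamespace, logical(1), quietly = TRUE)
missing_deps <- deps[!installed]

# Version requirements stated above: R >= 3.5 and BiDAG >= 2.0.2.
r_ok <- getRversion() >= "3.5"
bidag_ok <- installed[["BiDAG"]] && packageVersion("BiDAG") >= "2.0.2"

if (length(missing_deps) > 0) {
  message("Missing packages: ", paste(missing_deps, collapse = ", "))
}
if (!r_ok) message("R >= 3.5 is required.")
if (!bidag_ok) message("BiDAG >= 2.0.2 is required.")
```

Rgraphviz and RBGL are distributed through Bioconductor, so they would be installed with BiocManager::install rather than install.packages.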
Simulations
Figure 2 can be reproduced by running the script simulations/figure_2-simulation.R. Analogously, Figure 4 in the appendix can be reproduced by running the script simulations/figure_4-simulation.R. The simulations can be modified and executed in the simulations/cluster-scripts folder.
Pan-Cancer Data
Figure 3 can be reproduced by running the script tcga_analysis/figure_3-km_plot.R. The results of Table 1 can be reproduced by running the script tcga_analysis/table_1-cox_analysis.R. A reproducibility analysis for a range of different seeds can be found in tcga_analysis/reproducability_different_seeds. The hyperparameters of the clustering algorithms can be modified, and the clustering rerun, with the scripts in the tcga_analysis/clustering folder.
Example
```{r eval=FALSE}
library(graphClust)

# Simulate binary data from 3 clusters
k_clust <- 3
ss <- c(400, 500, 600) # samples in each cluster
simulation_data <- sampleData(k_clust = k_clust, n_vars = 20, n_samples = ss)
sampled_data <- simulation_data$sampled_data

# Network-based clustering
cluster_res <- get_clusters(sampled_data, k_clust = k_clust)

# Calculate the ARI between the true and the inferred cluster labels
library(mclust)
adjustedRandIndex(simulation_data$cluster_membership, cluster_res$clustermembership)

# Visualize the networks
library(ggplot2)
library(ggraph)
library(igraph)
library(ggpubr)

graphClust::plot_clusters(cluster_res)

# Visualize a single network (the DAG of the first cluster)
my_graph <- igraph::graph_from_adjacency_matrix(cluster_res$DAGs[[1]], mode="directed")
graphClust::nice_DAG_plot(my_graph)
```
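The ARI step in the example compares two label vectors and does not depend on how the clusters are numbered. The sketch below shows this on hypothetical toy labels (truth, relabeled and noisy are made up for illustration and do not come from the package), using only adjustedRandIndex from mclust.

```r
library(mclust)

# Hypothetical toy labels: 6 samples in 3 true clusters.
truth <- c(1, 1, 2, 2, 3, 3)

# Same partition with the cluster labels permuted (1 -> 3, 2 -> 1, 3 -> 2).
relabeled <- c(3, 3, 1, 1, 2, 2)

# A different partition: one sample of cluster 2 is assigned to cluster 1.
noisy <- c(1, 1, 1, 2, 3, 3)

# Identical partitions give an ARI of 1, regardless of label names.
ari_relabeled <- adjustedRandIndex(truth, relabeled)

# Any disagreement between the partitions lowers the ARI below 1.
ari_noisy <- adjustedRandIndex(truth, noisy)

print(c(relabeled = ari_relabeled, noisy = ari_noisy))
```

This is why the inferred cluster numbers in the example need not match the simulated ones for the ARI to be meaningful.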
Owner
- Name: Computational Biology Group (CBG)
- Login: cbg-ethz
- Kind: organization
- Location: Basel, Switzerland
- Website: https://www.bsse.ethz.ch/cbg
- Twitter: cbg_ethz
- Repositories: 91
- Profile: https://github.com/cbg-ethz
Beerenwinkel Lab at ETH Zurich