Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: found 3 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.2%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: DII-LIH-Luxembourg
- License: mit
- Language: R
- Default Branch: main
- Size: 3.03 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 2
- Open Issues: 4
- Releases: 2
Metadata Files
README.md

Cytometry Cluster Annotation and Differential Abundance Suite
Efficient and reproducible cytometry data
Aims:
• facilitating the process of cluster annotation while reducing user bias
• improving reproducibility
Key features:
• defining the threshold of positive/negative marker expression
• interactive inspection of cluster phenotypes
• automatic merging of populations
• differential abundance analysis
Installation Instructions
```r
library(devtools)
# Install all required packages
devtools::install_github("DII-LIH-Luxembourg/cycadas", dependencies = TRUE)
library(cycadas)
# start the cycadas shiny app
cycadas()
```
Demo dataset
To make the tool easy to explore, we provide a demo dataset that can be loaded (Load tab → Demo Data) either as cluster expression data only (Load Cluster Expression Demo Data, which lets the user create the annotation) or as annotated data (Load Annotated Demo Data, which includes the annotation tree).
This demo dataset is generated from the publicly available mass cytometry data of patients with idiopathic Parkinson's disease and healthy controls (Capelle, C.M. et al., Nat Commun, 2023) that were clustered with GigaSOM to generate 1600 clusters.
Loading SingleCellExperiment Data (CATALYST)
This step is optional. If you wish to load data clustered with CATALYST or other tools that use the SingleCellExperiment format, please install:
```r
if (!require("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("CATALYST")
BiocManager::install("SingleCellExperiment")
```
Data input, Single Cell Format
```{r}
# CATALYST workflow using a SingleCellExperiment data object
# Preprocessing ...

# Cluster with CATALYST
sce <- cluster(sce, features = "type", xdim = 10, ydim = 10, maxK = 20, verbose = FALSE, seed = 1)

# Save the object as .rds file
saveRDS(sce, "my_sce.rds")

# Load into CyCadas
# Annotate the desired (meta-)cluster levels
# use the integrated merge function within CyCadas
# save the object

# load it back into your workflow
sce <- readRDS("Annotated_sce.rds")

# continue downstream analysis
# ...
```
Data Input from FlowSOM
Median Expression and Cluster Frequencies:
```r
library(flowCore)  # provides fsApply
library(FlowSOM)

# within your clustering workflow create sample_ids according to the metadata files:
sample_ids <- rep(metadata$sample_id, fsApply(fcs, nrow))

fsom <- ReadInput(fcs, transform = FALSE, scale = FALSE)
set.seed(42)
som <- BuildSOM(fsom, colsToUse = lineage_markers, xdim = 20, ydim = 20, rlen = 40)
expr_median <- som$map$medianValues

# Calculate cluster frequencies (percentage of all cells per cluster)
clustering_table <- as.numeric(table(som$map$mapping[, 1]))
clustering_prop <- round(clustering_table / sum(clustering_table) * 100, 2)
df_prop <- as.data.frame(clustering_prop)
df_prop$cluster <- rownames(df_prop)

write.csv(expr_median, "expr_median.csv", row.names = FALSE)
write.csv(df_prop, "cluster_freq.csv")

# ----------------------------------------------------------------------------
# Generate the Proportion Table
# ----------------------------------------------------------------------------
counts_table <- table(som$map$mapping[, 1], sample_ids)
props_table <- t(t(counts_table) / colSums(counts_table)) * 100
props <- as.data.frame.matrix(props_table)
write.csv(props, "proportion_table.csv")
```
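To illustrate what the proportion-table step computes, here is a small self-contained sketch on invented data. The vectors `cluster_of_cell` and `sample_of_cell` are hypothetical stand-ins for `som$map$mapping[, 1]` and `sample_ids`; no FlowSOM objects are needed to run it.

```r
# Toy data: six cells assigned to clusters 1-3, drawn from two samples.
# All values are invented for illustration.
cluster_of_cell <- c(1, 1, 2, 3, 3, 3)
sample_of_cell  <- c("s1", "s1", "s1", "s2", "s2", "s2")

# Rows are clusters, columns are samples.
counts_table <- table(cluster_of_cell, sample_of_cell)

# Divide each column by its total to get per-sample percentages.
props_table <- t(t(counts_table) / colSums(counts_table)) * 100
props <- as.data.frame.matrix(props_table)

print(round(props, 1))
```

Each column of `props` sums to 100, so every value reads as the percentage of a sample's cells that fall into a given cluster.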
Data Input from GigaSOM.jl
Median Expression and Cluster Frequencies:
```julia
gridSize = 20
nEpochs = 40

# Assume the dataset is loaded in distributed data info di, e.g. using loadFCSSet.
som = initGigaSOM(di, gridSize, gridSize, seed = 42) # set a seed value here
som = trainGigaSOM(som, di, epochs = nEpochs)
mapping_di = mapToGigaSOM(som, di)

num_cluster = gridSize^2

# Get the Cluster Frequencies for CyCadas:
clusterFreq = dcount(num_cluster, mapping_di)
df = DataFrame(cluster = 1:length(clusterFreq), clustering_prop = clusterFreq)
df.clustering_prop = df.clustering_prop ./ sum(df.clustering_prop)
CSV.write("cluster_freq.csv", df)

# Get the count table per fileID (optional - Count Table in CyCadas).
# Assume md is a data frame that describes the data
# (i.e., it contains a row for all filenames loaded in di in the same order,
# together with sample identifiers)
files = distributeFCSFileVector(:fileIDs, md[:, :file_name])
count_tbl = dcount_buckets(num_cluster, mapping_di, size(md, 1), files)
ct = DataFrame(count_tbl, :auto)
rename!(ct, md.sample_id)
CSV.write("cluster_counts.csv", ct)

# Get the median Marker Expressions.
# Assume lineage_markers is a human-readable list of markers used in clustering
# (here used for annotating the median expression table)
# and cols holds the corresponding column indices in the data.
expr_tbl = dmedian_buckets(di, num_cluster, mapping_di, cols)
et = DataFrame(expr_tbl, :auto)
rename!(et, lineage_markers)
CSV.write("median_expr.csv", et)
```
A detailed workflow for each method can be found in the data section.
Data exploration
The UMAP interactive tab allows the preview of marker expression in the clusters selected by the user on the UMAP:

In the UMAP Marker expression tab, the user can inspect the expression level of the selected marker across all clusters.

Thresholds
In the Thresholds tab, the threshold separating negative from positive expression is estimated for each marker using 1-dimensional k-means clustering and Mclust. A silhouette score selects the better of the two estimates for each marker. Bimodality is assessed for every marker, and the bimodality coefficient is reported. A blue threshold line indicates that the data meet the bimodality criterion; otherwise the line is red. The threshold can be adjusted manually by clicking on the scatterplot.
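A rough sketch of this idea, using only base R: a two-group k-means split of one marker's per-cluster expression gives a candidate threshold, and a bimodality coefficient indicates whether the marker looks bimodal. The helper names, the midpoint rule, the plain-moment form of the coefficient and the 5/9 cutoff are assumptions made for illustration. The Mclust estimate and the silhouette-based selection are omitted, and the implementation inside CyCadas may differ.

```r
# Hypothetical helper, not part of the cycadas API: estimate a
# positive/negative threshold for one marker from per-cluster expression.
estimate_threshold <- function(expr) {
  # Start k-means from the two extremes so the result is deterministic.
  km <- kmeans(expr, centers = matrix(range(expr), ncol = 1))
  low_group <- which.min(km$centers[, 1])
  low  <- expr[km$cluster == low_group]
  high <- expr[km$cluster != low_group]
  # Assumption: place the threshold midway between the two groups.
  (max(low) + min(high)) / 2
}

# Hypothetical helper: bimodality coefficient from plain moment estimates
# (not bias-corrected). Values above 5/9 are commonly read as bimodal.
bimodality_coefficient <- function(x) {
  n  <- length(x)
  d  <- x - mean(x)
  m2 <- mean(d^2)
  skewness        <- mean(d^3) / m2^1.5
  excess_kurtosis <- mean(d^4) / m2^2 - 3
  (skewness^2 + 1) / (excess_kurtosis + 3 * (n - 1)^2 / ((n - 2) * (n - 3)))
}

# Invented expression values: 50 "negative" and 50 "positive" clusters.
expr <- c(seq(0, 0.5, length.out = 50), seq(2.5, 3, length.out = 50))

threshold <- estimate_threshold(expr)   # close to 1.5, midway between groups
bc <- bimodality_coefficient(expr)
line_colour <- if (bc > 5 / 9) "blue" else "red"
print(c(threshold = threshold, bimodality = bc))
```

With clearly separated groups the coefficient is well above 5/9, which corresponds to the blue line in the app. A marker with a single broad peak would fall below the cutoff and be drawn in red.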
Expression of CD8a with blue threshold line indicating the bimodal distribution:

Expression of TCRgd with red threshold line indicating that this marker expression does not follow the bimodal distribution:

Annotation
The Annotation tab supports annotation as a tree-based hierarchical process: the main cell types are defined first, followed by their subtypes, down to the level of detail chosen by the user.
All clusters are initially labelled "unassigned". When the user selects the positive and negative markers that define a population, the clusters matching that expression pattern are re-assigned from the parent node to the child node.
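The re-assignment rule described above can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the CyCadas implementation: `assign_child` is a hypothetical helper, and the expression values, thresholds and node names are invented.

```r
# Toy median expression: rows are clusters, columns are markers.
expr <- data.frame(
  CD3  = c(3.1, 2.9, 0.2, 0.1),
  CD8a = c(2.7, 0.3, 0.2, 2.5),
  row.names = paste0("cluster_", 1:4)
)
thresholds <- c(CD3 = 1.5, CD8a = 1.4)

# Every cluster starts in the root node.
annotation <- setNames(rep("unassigned", nrow(expr)), rownames(expr))

# Hypothetical helper: move clusters from `parent` to `child` when all
# positive markers are above and all negative markers are not above their
# thresholds.
assign_child <- function(annotation, expr, thresholds, parent, child,
                         positive = character(), negative = character()) {
  # Logical matrix (clusters x markers): is the marker above its threshold?
  above <- t(t(as.matrix(expr)) > thresholds[colnames(expr)])
  matches <- annotation == parent &
    rowSums(above[, positive, drop = FALSE]) == length(positive) &
    rowSums(above[, negative, drop = FALSE]) == 0
  annotation[matches] <- child
  annotation
}

# First level: T cells; second level: a CD8+ subtype of T cells.
annotation <- assign_child(annotation, expr, thresholds,
                           parent = "unassigned", child = "T cells",
                           positive = "CD3")
annotation <- assign_child(annotation, expr, thresholds,
                           parent = "T cells", child = "CD8+ T cells",
                           positive = "CD8a")
print(annotation)
```

Because a child node only draws from its parent, `cluster_4` stays "unassigned" even though it is CD8a-positive: it never entered the "T cells" node.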
Scheme depicting the process of building the annotation tree:

Cropped fragment of the completed annotation tree:

When a node is selected, a heatmap displays the expression of all markers in all clusters belonging to that node.
Heatmap depicting phenotype of clusters annotated as CD8+ TEM cells:

Differential abundance analysis
In the Differential Abundance tab, a pairwise Wilcoxon test is performed on all nodes once the desired multiple testing correction method has been selected:

The DA Interactive Tree allows the abundance of every defined subpopulation to be explored across conditions by selecting a node on the annotation tree.
Upon clicking on the desired node...

... the proportion of the selected cell type across conditions is plotted.
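The per-node test from the start of this section can be approximated with base R as follows. This is a minimal sketch on invented proportions with two conditions; the object names are hypothetical, "BH" is just one of the methods listed in `p.adjust.methods`, and the exact options used by CyCadas are not shown here.

```r
# Toy proportion table: rows are annotated nodes, columns are samples.
# All numbers are invented for illustration.
props <- rbind(
  "T cells" = c(40, 42, 45, 41, 55, 58, 60, 57),
  "B cells" = c(10, 12, 11, 13, 11, 10, 12, 13)
)
colnames(props) <- paste0("s", 1:8)
condition <- rep(c("control", "patient"), each = 4)

# Wilcoxon rank-sum test per node, comparing the two conditions.
# exact = FALSE uses the normal approximation, which tolerates ties.
p_values <- apply(props, 1, function(node_props) {
  wilcox.test(node_props[condition == "control"],
              node_props[condition == "patient"],
              exact = FALSE)$p.value
})

# Correct for multiple testing across nodes.
da_results <- data.frame(
  node = rownames(props),
  p_value = p_values,
  p_adjusted = p.adjust(p_values, method = "BH"),
  row.names = NULL
)
print(da_results)
```

With more than two conditions, `pairwise.wilcox.test()` from the stats package runs the same test for every pair of groups and applies the chosen `p.adjust.method`.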

Data export
Differential abundance results, as well as the proportion table (% of each defined cell population in every sample), can be exported in the Differential Abundance tab.

The files needed to resume an analysis, namely the modified threshold values and the annotation tree structure, can be exported from the Thresholds and Annotation tabs, respectively, and re-loaded later in the Load tab.
Exporting annotation tree:

Exporting threshold values:

Owner
- Name: Luxembourg Institute of Health
- Login: DII-LIH-Luxembourg
- Kind: organization
- Location: 29, rue Henri Koch, L-4354 Esch-sur-Alzette
- Website: www.lih.lu
- Repositories: 2
- Profile: https://github.com/DII-LIH-Luxembourg
Department of Infection and Immunity
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it using the following:"
title: "CyCadas"
authors:
  - family-names: "Hunewald"
    given-names: "Oliver"
    orcid: "https://orcid.org/0000-0001-5402-5084" # optional
    affiliation: "Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
      Bioinformatics & AI, Department of Medical Informatics, Luxembourg Institute of Health, Strassen, Luxembourg"
  - family-names: "Demczuk"
    given-names: "Agnieszka"
    orcid: "https://orcid.org/0000-0001-9868-7653" # optional
    affiliation: "Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
      Faculty of Science, Technology and Medicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg"
date-released: 2024-10-07
version: "1.0.0"
doi: "10.1234/bioinformatics/btae595"
url: "https://github.com/DII-LIH-Luxembourg/cycadas"
license: "MIT"
GitHub Events
Total
- Issues event: 1
- Issue comment event: 1
- Push event: 9
- Pull request review event: 2
- Pull request event: 4
- Create event: 4
Last Year
- Issues event: 1
- Issue comment event: 1
- Push event: 9
- Pull request review event: 2
- Pull request event: 4
- Create event: 4
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 2
- Total pull requests: 0
- Average time to close issues: 20 days
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 0.5
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 2
- Pull requests: 0
- Average time to close issues: 20 days
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 0.5
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- oHunewald (1)
- exaexa (1)
Pull Request Authors
- exaexa (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- testthat >= 3.0.0 suggests