CaDrA

Candidate Drivers Analysis: Multi-Omic Search for Candidate Drivers of Functional Signatures

https://github.com/montilab/cadra

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (17.3%) to scientific vocabulary

Last synced: 7 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: montilab
License: gpl-3.0
Language: R
Default Branch: master
Homepage: http://montilab.github.io/CaDrA/
Size: 51.8 MB

Statistics

Stars: 24
Watchers: 6
Forks: 1
Open Issues: 8
Releases: 1

Created over 10 years ago · Last pushed over 1 year ago

Metadata Files

Readme Changelog License

README.Rmd

---
output: rmarkdown::github_document
---



```{r, include=FALSE, echo=FALSE, message=FALSE, warning=FALSE}
knitr::opts_chunk$set(fig.path="./man/figures/", message=FALSE, collapse = TRUE, comment="")

# Load SummarizedExperiment
library(SummarizedExperiment)

# Load CaDrA
library(devtools)
load_all()
```

# CaDrA

![build](https://github.com/montilab/cadra/workflows/rcmdcheck/badge.svg)
![Gitter](https://img.shields.io/gitter/room/montilab/cadra)
![GitHub issues](https://img.shields.io/github/issues/montilab/cadra)
![GitHub last commit](https://img.shields.io/github/last-commit/montilab/cadra)
  
**Ca**ndidate **Dr**ivers **A**nalysis: Multi-Omic Search for Candidate Drivers of Functional Signatures

**CaDrA** is an R package that supports a heuristic search framework aimed at identifying candidate drivers of a molecular phenotype of interest. 

The main function takes two inputs:

1. A binary multi-omics dataset, represented either as a matrix of binary features or as a **SummarizedExperiment** class object, in which the rows are 1/0 vectors indicating the presence/absence of ‘omics’ features (e.g. somatic mutations, copy number alterations, epigenetic marks, etc.) and the columns are the samples.
2. A molecular phenotype of interest, represented as a vector of continuous scores with one value per sample (e.g. protein expression, pathway activity, etc.).

Given these two inputs, **CaDrA** runs a forward and/or backward search to find a set of features that, taken together, is maximally associated with the observed input scores. The association is measured by one of several scoring functions (*Kolmogorov-Smirnov*, *Conditional Mutual Information*, *Wilcoxon*, or a *custom-defined scoring function*). This makes it useful for finding complementary omics features that are likely to drive the input molecular phenotype.
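
To make the idea of a stepwise search concrete, the following is a minimal base R sketch of a greedy forward search scored with a one-sided Kolmogorov-Smirnov test. It is an illustration under stated assumptions, not CaDrA's implementation: the toy data and the helper names `ks_score()` and `forward_search()` are hypothetical, and only `stats::ks.test()` is used for scoring.

```r
# Toy illustration only: this is NOT CaDrA's implementation.
n <- 40
# Hypothetical continuous scores, one per sample, sorted from high to low
input_score <- as.numeric(rev(seq_len(n)))
names(input_score) <- paste0("S", seq_len(n))

# Hypothetical binary feature matrix: rows = features, columns = samples
FS <- matrix(0, nrow = 4, ncol = n,
             dimnames = list(paste0("feat", 1:4), names(input_score)))
FS["feat1", 1:6] <- 1            # enriched among the highest scores
FS["feat2", 7:12] <- 1           # complementary to feat1
FS["feat3", c(3, 20, 35)] <- 1   # scattered
FS["feat4", 30:40] <- 1          # enriched among the lowest scores

# One-sided KS p-value: alternative = "less" tests whether the CDF of the
# altered samples lies below that of the rest (altered scores tend to be higher)
ks_score <- function(x, score) {
  if (sum(x) == 0 || sum(x) == length(x)) return(1)
  stats::ks.test(score[x == 1], score[x == 0], alternative = "less")$p.value
}

# Greedy forward search: start from the best single feature, then keep adding
# the feature whose logical OR with the current meta-feature improves the score
forward_search <- function(FS, score, max_size = 3) {
  single <- apply(FS, 1, ks_score, score = score)
  selected <- names(which.min(single))
  meta <- FS[selected, ]
  best <- min(single)
  while (length(selected) < max_size) {
    candidates <- setdiff(rownames(FS), selected)
    if (length(candidates) == 0) break
    cand <- vapply(candidates,
                   function(f) ks_score(pmax(meta, FS[f, ]), score),
                   numeric(1))
    if (min(cand) >= best) break  # stop when no candidate improves the score
    add <- names(which.min(cand))
    selected <- c(selected, add)
    meta <- pmax(meta, FS[add, ])
    best <- min(cand)
  }
  list(features = selected, score = best)
}

res <- forward_search(FS, input_score)
res$features
```

In this toy example the search picks up the two complementary features (`feat1`, then `feat2`) and stops, because adding either remaining feature would make the combined score worse.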

Please see our [documentation](https://montilab.github.io/CaDrA/) for additional examples.

# Web Interface

We developed an R Shiny dashboard that lets users interact with **CaDrA** directly, without having to install or maintain the package.

See our web portal at [https://cadra.bu.edu/](https://cadra.bu.edu/)

# Installation

- Using `devtools` package

```r
library(devtools)
devtools::install_github("montilab/CaDrA")
```

- Using `BiocManager` package

```r
# Install BiocManager
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

# Install CaDrA
BiocManager::install("CaDrA")

# Install SummarizedExperiment
BiocManager::install("SummarizedExperiment")
```

# Usage

Here, we use a dataset of somatic mutations and copy number alterations (CNAs) extracted from the TCGA Breast Cancer (BrCa) dataset. We query this feature set with an input score that measures the per-sample activity of YAP/TAZ (two important regulators of the Hippo pathway). The score is the projection onto the TCGA BrCa dataset of a gene expression signature of YAP/TAZ knockdown derived in breast cancer cell lines. Our question of interest: which combination of genetic features (mutations and copy number alterations) best “explains” the YAP/TAZ activity?

## (i) Load R packages

```r
library(CaDrA)
library(SummarizedExperiment)
```

## (ii) Format and filter data inputs

```{r load.data}
## Read in BRCA GISTIC+Mutation object
utils::data(BRCA_GISTIC_MUT_SIG)
eset_mut_scna <- BRCA_GISTIC_MUT_SIG

## Read in input score
utils::data(TAZYAP_BRCA_ACTIVITY)
input_score <- TAZYAP_BRCA_ACTIVITY

## Samples to keep based on the overlap between the two inputs
overlap <- base::intersect(base::names(input_score), base::colnames(eset_mut_scna))
eset_mut_scna <- eset_mut_scna[, overlap]
input_score <- input_score[overlap]

## Binarize FS to only have 0's and 1's
SummarizedExperiment::assay(eset_mut_scna)[SummarizedExperiment::assay(eset_mut_scna) > 1] <- 1.0

## Pre-filter FS based on occurrence frequency
eset_mut_scna_flt <- CaDrA::prefilter_data(
  FS = eset_mut_scna,
  max_cutoff = 0.6,  # max event frequency (60%)
  min_cutoff = 0.03  # min event frequency (3%)
)  
```
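
The frequency-based pre-filtering step can be pictured with a small base R sketch on a plain matrix. This is only an approximation of the idea: the toy matrix is hypothetical, and treating both cutoffs as inclusive is an assumption rather than a documented property of `prefilter_data()`.

```r
# Hypothetical binary matrix: 4 features (rows) x 10 samples (columns)
mat <- rbind(
  rare     = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  moderate = c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0),
  common   = c(1, 1, 1, 1, 1, 1, 1, 1, 0, 0),
  half     = c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0)
)

# Fraction of samples in which each feature is present
freq <- rowMeans(mat)

# Keep features whose frequency lies within [min_cutoff, max_cutoff]
# (inclusive bounds are an assumption made for this sketch)
min_cutoff <- 0.03
max_cutoff <- 0.6
keep <- freq >= min_cutoff & freq <= max_cutoff
mat_flt <- mat[keep, , drop = FALSE]

rownames(mat_flt)
```

Here `rare` (never observed) and `common` (present in 80% of samples) are dropped, while `moderate` and `half` are kept.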

## (iii) Run CaDrA

Here, we repeat the candidate search starting from each of the top 'N' features and report the combined results as a heatmap (to summarize the number of times each feature is selected across repeated runs). 

**IMPORTANT NOTE**: The legacy function `topn_eval()` is equivalent to the new recommended `candidate_search()` function.

```{r cadra}
topn_res <- CaDrA::candidate_search(
  FS = eset_mut_scna_flt,
  input_score = input_score,
  method = "ks_pval",          # Use Kolmogorov-Smirnov scoring function
  method_alternative = "less", # Use one-sided hypothesis testing
  weights = NULL,              # If weights is provided, perform a weighted-KS test
  search_method = "both",      # Apply both forward and backward search
  top_N = 7,                   # Evaluate top 7 starting points for each search
  max_size = 7,                # Maximum size a meta-feature matrix can extend to
  do_plot = FALSE,             # Plot after finding the best features
  best_score_only = FALSE      # Return all results from the search
)
```

## (iv) Visualize the results

### Meta-feature plot

This plot stacks three graphics on top of each other:

1. A density diagram of the observed input scores, sorted from highest to lowest
2. A tile plot of the top meta-features that are associated with the molecular phenotype of interest (e.g. input_score)
3. A KS enrichment plot of the meta-feature set (this corresponds to the logical OR of the features)

```{r visualize.best}
## Fetch the meta-feature set with the best score across the top N feature searches
topn_best_meta <- CaDrA::topn_best(topn_res)

# Visualize the best results with the meta-feature plot
CaDrA::meta_plot(topn_best_list = topn_best_meta, input_score_label = "YAP/TAZ Activity")
```
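
As a rough illustration of what "the logical OR of the features" means and of the kind of running statistic a KS enrichment plot traces, here is a base R sketch on hypothetical toy data. The variable names are made up for this example and the computation is a simplified stand-in, not the code behind `meta_plot()`.

```r
# Hypothetical input scores for 8 samples
score <- c(S1 = 2.1, S2 = 1.7, S3 = 1.2, S4 = 0.4,
           S5 = -0.3, S6 = -0.9, S7 = -1.5, S8 = -2.0)

# Hypothetical binary features (rows) across the same samples (columns)
features <- rbind(
  featA = c(1, 0, 0, 0, 0, 0, 0, 0),
  featB = c(0, 1, 1, 0, 0, 0, 0, 0),
  featC = c(0, 0, 1, 0, 0, 1, 0, 0)
)
colnames(features) <- names(score)

# Order samples from highest to lowest score
ord <- order(score, decreasing = TRUE)
features <- features[, ord]

# Meta-feature: a sample is "hit" if ANY of the features is present (logical OR)
meta <- as.integer(colSums(features) > 0)

# Running difference between the fraction of hits and the fraction of misses
# seen so far while walking down the ranked samples
running <- cumsum(meta) / sum(meta) - cumsum(1 - meta) / sum(1 - meta)

max(running)        # largest deviation along the ranking
which.max(running)  # position in the ranking where it occurs
```

A meta-feature whose hits are concentrated among the top-ranked samples produces a large early peak in `running`, which is the pattern the enrichment panel is meant to reveal.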

### Top-N plot

This heatmap shows the overlap among the meta-features obtained by repeating `candidate_search` from each of the top N starting features.

```{r summarize}
# Evaluate results across top N features you started from
CaDrA::topn_plot(topn_res) 
```
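
The kind of summary such a heatmap conveys (how often each feature is selected across runs) can be sketched in base R as a feature-by-run membership matrix. The `runs` list below is hypothetical example data, not the structure returned by `candidate_search()`, and this is not how `topn_plot()` is implemented.

```r
# Hypothetical meta-feature sets from three searches with different starting features
runs <- list(
  start_feat1 = c("feat1", "feat2", "feat5"),
  start_feat2 = c("feat2", "feat1"),
  start_feat3 = c("feat3", "feat2")
)

# All features that were selected at least once
all_feats <- sort(unique(unlist(runs)))

# Membership matrix: rows = features, columns = runs, 1 = feature selected in that run
membership <- vapply(runs,
                     function(f) as.integer(all_feats %in% f),
                     integer(length(all_feats)))
rownames(membership) <- all_feats

# Number of runs in which each feature was selected
selection_count <- rowSums(membership)
selection_count

# The matrix can be drawn with base graphics if desired, for example:
# stats::heatmap(membership, Rowv = NA, Colv = NA, scale = "none")
```

Features that recur across many starting points (here `feat2`) are the ones a reader would typically look at first.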

# Additional Guides

- [How to run CaDrA within a Docker environment](https://montilab.github.io/CaDrA/articles/docker.html)

# Acknowledgements

This project is funded in part by the [NIH/NIDCR](https://www.nidcr.nih.gov/) (3R01DE030350-01A1S1, R01DE031831), [Find the Cause Breast Cancer Foundation](https://findthecausebcf.org), and [NIH/NIA](https://www.nia.nih.gov/) (UH3 AG064704).

Owner

Name: Monti Lab
Login: montilab
Kind: organization
Email: montilab@bu.edu

Repositories: 21
Profile: https://github.com/montilab

Issues and Pull Requests

All Time

Total issues: 46
Total pull requests: 0
Average time to close issues: 4 months
Average time to close pull requests: N/A
Total issue authors: 7
Total pull request authors: 0
Average comments per issue: 0.87
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 2
Pull requests: 0
Average time to close issues: 13 days
Average time to close pull requests: N/A
Issue authors: 2
Pull request authors: 0
Average comments per issue: 0.5
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

tetomonti (25)
vkartha (15)
katgit (2)
RC-88 (1)
naarkhoo (1)
bioliyezhang (1)
mmkhan19 (1)

Pull Request Authors

Top Labels

Issue Labels

enhancement (6) bug (1)

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- bioconductor 4,464 total

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 6
Total maintainers: 1

bioconductor.org: CaDrA

Candidate Driver Analysis

Homepage: https://github.com/montilab/CaDrA/
Documentation: https://bioconductor.org/packages/release/bioc/vignettes/CaDrA/inst/doc/CaDrA.pdf
License: GPL-3 + file LICENSE
Latest release: 1.6.0
published 11 months ago

Versions: 6
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 4,464 Total

Rankings

Dependent repos count: 0.0%

Stargazers count: 7.3%

Forks count: 22.4%

Average: 31.7%

Dependent packages count: 31.7%

Downloads: 97.1%

Maintainers (1)

rchau88@bu.edu

Dependencies

DESCRIPTION cran

R >= 4.1.0 depends
MASS * imports
R.cache * imports
SummarizedExperiment * imports
doParallel * imports
ggplot2 * imports
gplots * imports
graphics * imports
grid * imports
gtable * imports
methods * imports
misc3d * imports
plyr * imports
ppcor * imports
reshape2 * imports
stats * imports
BiocStyle * suggests
devtools * suggests
knitr * suggests
rmarkdown * suggests
testthat >= 3.0.0 suggests

.github/workflows/R-CMD-CHECK.yaml actions

actions/checkout v3 composite
r-lib/actions/check-r-package v2 composite
r-lib/actions/setup-r v2 composite
r-lib/actions/setup-r-dependencies v2 composite

.github/workflows/document.yaml actions

actions/checkout v3 composite
r-lib/actions/setup-r v2 composite
r-lib/actions/setup-r-dependencies v2 composite

.github/workflows/pkgdown.yaml actions

JamesIves/github-pages-deploy-action v4.4.1 composite
actions/checkout v3 composite
r-lib/actions/setup-pandoc v2 composite
r-lib/actions/setup-r v2 composite
r-lib/actions/setup-r-dependencies v2 composite

.github/workflows/render-news-rmd.yml actions

actions/checkout v3 composite
r-lib/actions/setup-pandoc v2 composite
r-lib/actions/setup-r v2 composite
r-lib/actions/setup-r-dependencies v2 composite

.github/workflows/render-readme-rmd.yml actions

actions/checkout v3 composite
r-lib/actions/setup-pandoc v2 composite
r-lib/actions/setup-r v2 composite
r-lib/actions/setup-r-dependencies v2 composite
