cube

Intuitive Nonparametric Gene Network Search Algorithm

https://github.com/connerlambden/cube

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: found
✓ codemeta.json file: found
✓ .zenodo.json file: found
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (10.9%) to scientific vocabulary

Keywords

gene-network gene-relationships network-analysis scanpy single-cell-rna-seq


Repository

Basic Info

Host: GitHub
Owner: connerlambden
Language: Python
Default Branch: main
Homepage:
Size: 9.72 MB

Statistics

Stars: 3
Watchers: 1
Forks: 3
Open Issues: 0
Releases: 0

Created about 5 years ago · Last pushed over 4 years ago

Cubé: Intuitive Gene Network Search Algorithm

How It Works

Given a single-cell dataset and one or more input genes, Cubé looks for simple & nonlinear gene-gene relationships to construct a regulation network informed by prior gene signatures. For example, Cubé might report that GeneA * GeneB ~= GeneC, potentially meaning that genes A & B coregulate to produce C, or that there is some other nonlinear relationship. Cubé then recursively feeds its outputs back into itself to create a gene network.
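The recursive expansion described above can be sketched as follows. This is only an illustration of the control flow, written level by level rather than with literal recursion, and it is not Cubé's actual implementation. The function `expand_network`, the callback `find_children`, and the toy lookup table (including the placeholder names `geneC` to `geneF` and all weights) are hypothetical; only the parameter names `num_search_children` and `search_depth` mirror Cubé's documented inputs.

```python
def expand_network(seed, find_children, num_search_children=2, search_depth=2):
    """Grow a network by repeatedly searching from each newly added gene.

    find_children(gene, k) is a caller-supplied stand-in for the real search.
    It returns up to k (child_gene, weight) pairs.
    """
    nodes = {seed}
    edges = {}  # frozenset({parent, child}) -> weight
    frontier = [seed]
    for _ in range(search_depth):
        next_frontier = []
        for gene in frontier:
            for child, weight in find_children(gene, num_search_children):
                edges[frozenset((gene, child))] = weight
                if child not in nodes:
                    # Only genes seen for the first time are searched next round
                    nodes.add(child)
                    next_frontier.append(child)
        frontier = next_frontier
    return nodes, edges


# Made-up lookup table standing in for a real search over expression data
toy_hits = {
    "ifng": [("tbx21", 0.1), ("geneC", 0.3), ("geneD", 0.4)],
    "tbx21": [("geneE", 0.2), ("ifng", 0.1)],
    "geneC": [("geneF", 0.2)],
}


def toy_search(gene, k):
    return toy_hits.get(gene, [])[:k]


nodes, edges = expand_network("ifng", toy_search, num_search_children=2, search_depth=2)
print(sorted(nodes))
print(len(edges), "edges")
```

With `num_search_children=2`, the third hit for `ifng` is never added, which is why the network size grows roughly as the number of children raised to the search depth.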

Install

$ python3 -m pip install git+https://github.com/connerlambden/Cube.git

Running Cubé

```
from sccube import cube
import scanpy as sc

adata = sc.read_h5ad('myexpressiondata.h5ad')  # Load AnnData Object containing logged expression matrix
go_files = ['BioPlanet2019.tsv', 'GeneSigDB.tsv']  # Load Gene Signatures to Search In

cube.run_cube(adata=adata, seed_gene_1='ifng', seed_gene_2='tbx21', go_files=go_files, out_directory='CubéResults', num_search_children=4, search_depth=2)
```

Example Outputs

Inputs

adata: AnnData Object with logged expression matrix

seed_gene_1: Starting search gene of interest

seed_gene_2: Optional. Additional seed gene of interest (to search for seed_gene_1 * seed_gene_2)

go_files: List of pathway files to search in. Each edge in Cubé requires all connected genes to be present in at least 2 pathways. Example files are available to download, and more can be downloaded from Enrichr.

out_directory: Folder to put results in

num_search_children: How many search children to add to the network on each iteration. For example, a value of 2 will add two children to each node.

search_depth: Recursive search depth. Values above 2 may take a long time to run
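To make the pathway requirement on `go_files` concrete, here is a minimal sketch of one way to read it: an edge is kept only if all of its genes appear together in at least two gene sets. The helper names `shared_pathways` and `edge_is_supported`, the case-insensitive matching, and the toy gene sets are assumptions for illustration. The gene sets are made up and are not taken from BioPlanet or GeneSigDB, and Cubé's own check may differ.

```python
def shared_pathways(genes, pathways):
    """Return the names of pathways that contain every gene in `genes`."""
    wanted = {g.upper() for g in genes}
    return sorted(
        name
        for name, members in pathways.items()
        if wanted <= {m.upper() for m in members}
    )


def edge_is_supported(genes, pathways, min_pathways=2):
    """True if all genes co-occur in at least `min_pathways` pathways."""
    return len(shared_pathways(genes, pathways)) >= min_pathways


# Toy gene sets, invented for this example only
toy_pathways = {
    "toy pathway 1": {"IFNG", "TBX21", "GENEC", "GENED"},
    "toy pathway 2": {"IFNG", "GENEE", "TBX21"},
    "toy pathway 3": {"GENEF", "GENEG"},
}

print(edge_is_supported(["ifng", "tbx21"], toy_pathways))
print(shared_pathways(["ifng", "geneE"], toy_pathways))
```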

Outputs

Cubé_data_table.csv: Table showing the genes, pathways, and weight for each edge in the network. Positive correlations will have small edge weights and negative correlations will have large edge weights.

*.graphml: Network file that can be visualized in programs like Cytoscape

Cubé_network.png: Network visualization where green edges are positive correlation & red edges are negative correlation. For better visualizations, we recommend loading the .graphml file into Cytoscape
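Besides Cytoscape, a .graphml file can also be inspected in Python with networkx, which is already one of the package's dependencies. The sketch below writes a tiny stand-in network to a temporary file and reads it back, since the attribute names in a real Cubé output file are not documented here. The edge attribute name `weight`, the gene `geneX`, and the weight values are assumptions.

```python
import os
import tempfile

import networkx as nx

# Build a tiny stand-in network; gene names and weights are made up
toy = nx.Graph()
toy.add_edge("ifng", "tbx21", weight=0.2)  # small weight: positive correlation
toy.add_edge("ifng", "geneX", weight=1.7)  # large weight: negative correlation

path = os.path.join(tempfile.mkdtemp(), "toy_network.graphml")
nx.write_graphml(toy, path)

# Reading works the same way for any .graphml file on disk
graph = nx.read_graphml(path)

# List edges from smallest to largest weight
edges_by_weight = sorted(graph.edges(data="weight"), key=lambda edge: edge[2])
for source, target, weight in edges_by_weight:
    print(source, target, weight)
```

Sorting by weight puts the positively correlated edges first, following the weight convention described for the data table.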

Visualizing The Product of 2 Genes Using Scanpy

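```
import numpy as np
import scanpy as sc

# Visualizing Product of 2 Genes using Scanpy (assuming adata.X is logged and sparse)
gene1 = 'ifng'
gene2 = 'tbx21'

# Keep only the cells that express both genes; copy so .obs can be modified safely
expresses_both = (adata[:, gene1].X.toarray().flatten() > 0) & (adata[:, gene2].X.toarray().flatten() > 0)
adata_expressing_both = adata[expresses_both, :].copy()

# exp(log(a) + log(b)) = a * b, so summing logged values gives the product
log_sum = adata_expressing_both[:, gene1].X.toarray() + adata_expressing_both[:, gene2].X.toarray()
adata_expressing_both.obs[gene1 + ' * ' + gene2] = np.exp(log_sum).flatten()
sc.pl.umap(adata_expressing_both, color=[gene1 + ' * ' + gene2])
```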

Why Cubé?

Single-cell RNA sequencing offers unprecedented resolution into the transcriptomes of individual cells. However, the sheer complexity of the data and high rates of dropout make it computationally and interpretively challenging to extract biological meaning and gene relationships. Many methods have been proposed for inferring gene regulatory networks, and their results can differ dramatically depending on the initial assumptions made 😬. Even in the case of unsupervised learning (UMAP) or clustering (Leiden), it’s not clear how to balance local and global structure or which data features are most important. Additionally, these “black-box” machine learning methods are closed to scrutiny of their inner workings, cannot explicate logical, understandable steps, and tend to be sensitive to model parameters.

Cubé addresses the dropout issue by comparing a set of genes only in cells that have nonzero expression of every gene in the set. This removes the need for biased imputation methods and focuses each relationship on the relevant cells. Cubé addresses the interpretability problem by presenting solutions in the form expression(gene1) ~= expression(gene2) * expression(gene3), which succinctly expresses a nonlinear relationship between specific genes in an understandable way without any pesky parameters. Since Cubé samples from the space of all possible nonlinear gene-gene pairs, results have high representational capacity and low ambiguity. Cubé is a descriptive search algorithm that optimizes for biologically & statistically informed gene patterns.
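As a rough illustration of the two ideas above (restricting each comparison to cells that express every gene involved, and searching over gene pairs), the sketch below ranks pairs by how well their product tracks a target gene on synthetic data. Because the values are logged, the product becomes a sum. The function `rank_gene_pairs`, the use of Pearson correlation as the score, the `min_cells` threshold, and the synthetic data are all assumptions. This is not Cubé's actual scoring or search procedure.

```python
from itertools import combinations

import numpy as np


def rank_gene_pairs(log_expr, gene_names, target, min_cells=10):
    """Rank pairs (g1, g2) by how well log(g1) + log(g2) tracks log(target).

    log_expr: cells x genes array of logged expression (0 = not detected).
    Hypothetical helper; Pearson correlation is an assumed score.
    """
    index = {name: i for i, name in enumerate(gene_names)}
    t = log_expr[:, index[target]]
    candidates = [g for g in gene_names if g != target]
    results = []
    for g1, g2 in combinations(candidates, 2):
        a, b = log_expr[:, index[g1]], log_expr[:, index[g2]]
        keep = (a > 0) & (b > 0) & (t > 0)  # cells expressing all three genes
        if keep.sum() < min_cells:
            continue
        r = np.corrcoef(a[keep] + b[keep], t[keep])[0, 1]
        results.append((g1, g2, float(r), int(keep.sum())))
    return sorted(results, key=lambda row: abs(row[2]), reverse=True)


# Synthetic data: geneC is built to follow geneA * geneB (a sum in log space)
rng = np.random.default_rng(1)
genes = ["geneA", "geneB", "geneC", "geneD"]
log_expr = rng.gamma(2.0, 1.0, size=(400, 4))
log_expr[:, 2] = log_expr[:, 0] + log_expr[:, 1] + rng.normal(0.0, 0.3, 400)
log_expr[rng.random(log_expr.shape) < 0.2] = 0.0  # simulated dropout
log_expr = np.clip(log_expr, 0.0, None)

ranked = rank_gene_pairs(log_expr, genes, target="geneC")
best = ranked[0]
print(best[0], "*", best[1], "~= geneC", "r =", round(best[2], 2), "cells =", best[3])
```

Note that no imputation is needed: cells with a zero for any of the three genes are simply left out of that particular comparison.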

How It Works Under The Hood

Special Thanks to Vijay Kuchroo, Ana Anderson, Lloyd Bod, & Aviv Regev

Contact: conner@connerpro.com

Owner

Login: connerlambden
Kind: user

Twitter: _conner12
Repositories: 1
Profile: https://github.com/connerlambden

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Lambden
    given-names: Conner
    orcid:  https://orcid.org/0000-0003-0162-6622
title: "Cubé Intuitive Nonparametric Gene Network Search Algorithm"
version: 1.0.1
date-released: 2020-06-10

Dependencies

setup.py pypi

jit *
networkx *
numpy *
pandas *
scipy *
