sjaracne

Scalable Tool for Gene Network Reverse Engineering

https://github.com/jyyulab/sjaracne

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 10 committers (10.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.1%) to scientific vocabulary

Keywords

gene-network inference mutual-information

Keywords from Contributors

dimensionality-reduction

Repository

Basic Info
  • Host: GitHub
  • Owner: jyyulab
  • License: other
  • Language: C++
  • Default Branch: master
  • Homepage:
  • Size: 244 MB
Statistics
  • Stars: 24
  • Watchers: 7
  • Forks: 16
  • Open Issues: 9
  • Releases: 3
Created about 8 years ago · Last pushed over 1 year ago

README.md

SJARACNe

SJARACNe is a scalable implementation of ARACNe that substantially reduces computational cost, especially memory usage, so that researchers with modest computing resources can generate networks from thousands of samples. The algorithm uses adaptive-partitioning mutual information to measure the dependence between all pairs of genes and reconstruct the regulatory network.
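To make the idea of a pairwise mutual information score concrete, here is a minimal sketch. It is not SJARACNe's adaptive-partitioning estimator: it swaps in a simpler plug-in estimate over equal-frequency bins, and the function names and the `n_bins` parameter are illustrative assumptions.

```python
import math
from collections import Counter

def equal_frequency_bins(values, n_bins):
    """Assign each value a bin index so that bins hold roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins

def mutual_information(x, y, n_bins=2):
    """Plug-in MI estimate (in nats) from the binned joint distribution.

    This is a stand-in for illustration, not adaptive partitioning.
    """
    bx = equal_frequency_bins(x, n_bins)
    by = equal_frequency_bins(y, n_bins)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # sum over cells of p(a,b) * log(p(a,b) / (p(a) * p(b)))
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )

# Two toy expression profiles: one perfectly dependent pair, one unrelated pair.
mi_dependent = mutual_information([1, 2, 3, 4], [2, 4, 6, 8])
mi_unrelated = mutual_information([1, 2, 3, 4], [1, 4, 2, 3])
```

With two bins, the dependent pair scores log 2 and the unrelated pair scores 0. Real expression data needs far more samples and a finer, data-driven partition, which is the role adaptive partitioning plays.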

Download

git clone https://github.com/jyyulab/SJARACNe # Clone the repo

Prerequisites

Create a virtual environment (recommended)

Using conda to create a virtual environment

The recommended method of setting up the required Python environment and dependencies is to use the conda dependency manager:

$ conda create -n py392 python=3.9.2
$ source activate py392

Installation

Depending on the runtime environment, node.js may need to be installed manually to run cwltool locally, and cwlexec may need to be installed manually to run on the IBM LSF platform.

There are two options to install SJARACNe and its dependencies:

(Option 1) Install via pip

$ pip install SJARACNe

(Option 2) Install from source

$ git clone https://github.com/jyyulab/SJARACNe
$ cd SJARACNe
$ python setup.py build    # build SJARACNe binary
$ python setup.py install

Install optional packages depending on the runtime platform

The SJARACNe workflow is implemented in the Common Workflow Language (CWL). Install node.js to run it locally with cwltool; install cwlexec to run it on the IBM LSF platform. To run on other platforms, check the Common Workflow Language site for available workflow engines, e.g., Toil.

Usage

```
$ sjaracne
usage: sjaracne [-h] {local,lsf} ...

SJARACNe is a scalable tool for gene network reverse engineering.

optional arguments:
  -h, --help   show this help message and exit

Subcommands:
  {local,lsf}  platforms
    local      run cwltool in a local workstation
    lsf        run cwlexec as in a IBM LSf cluster
```

The sjaracne workflow is implemented with [CWL](https://www.commonwl.org/) and supports multiple computing platforms. We have tested it locally using [cwltool](https://github.com/common-workflow-language/cwltool) and on an IBM LSF cluster using [cwlexec](https://github.com/IBMSpectrumComputing/cwlexec). For convenience, a Python wrapper lets you choose the computing platform with a subcommand.

The local mode (sjaracne local) runs in parallel by default using cwltool's --parallel option. To run it serially, use the --serial option.

To use LSF mode, you must edit the LSF-specific configuration file SJARACNe/config/config_cwlexec.json to change the default queue and adjust the memory reservation for each step. Consider increasing the memory reservation for the bootstrap and consensus steps if your expression matrix is large.

Inputs

The main input for SJARACNe is a tab-separated expression matrix with genes/proteins as rows and cells/samples as columns, in which the first two columns are the ID and the symbol. The second required input is a file listing the IDs of the significant genes/proteins to be treated as hubs in the reconstructed network (the most recent curated lists of transcription factors and signaling proteins are in ./SJARACNe/config/TFlist.txt and ./SJARACNe/config/SIGlist.txt, respectively). An output directory is also required for storing output files. Some platforms require additional parameters (e.g., the LSF queue); these are described in the help message of the corresponding subcommand, e.g., sjaracne lsf -h.
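The matrix layout described above can be sketched with Python's standard csv module. The header labels `ID` and `symbol`, the gene and sample names, and the helper name `write_expression_matrix` are illustrative assumptions; check the bundled test_data inputs for the exact header the tool expects.

```python
import csv
import io

def write_expression_matrix(handle, genes, samples, values):
    """Write a tab-separated matrix: ID, symbol, then one column per sample."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["ID", "symbol", *samples])  # header labels are assumed
    for (gene_id, symbol), row in zip(genes, values):
        if len(row) != len(samples):
            raise ValueError(
                f"{gene_id}: expected {len(samples)} values, got {len(row)}"
            )
        writer.writerow([gene_id, symbol, *row])

# Toy data: two genes measured in three samples.
genes = [("G1", "TP53"), ("G2", "MYC")]
samples = ["S1", "S2", "S3"]
values = [[1.5, 2.0, 0.0], [3.25, 0.5, 4.0]]

buffer = io.StringIO()
write_expression_matrix(buffer, genes, samples, values)
matrix_text = buffer.getvalue()
```

To produce a file for the -e option, pass an open file handle (for example `open("my_matrix.exp", "w", newline="")`, a hypothetical file name) instead of the in-memory buffer.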

Outputs

The main output of SJARACNe is a network file: a tab-delimited text file with the following columns: source, target, mutual information, Pearson and Spearman correlation coefficients, regression line slope, and p-value. SJARACNe also outputs two meta-information files, parameterinfo.txt and bootstrapinfo.txt, which store the SJARACNe input parameters and the bootstrap parameters, respectively.
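As a rough sketch of how such a network file might be consumed, the snippet below groups targets by hub and keeps only edges above a mutual information cutoff. The header names (`source`, `target`, `MI`, and so on), the sample rows, and the `min_mi` parameter are assumptions made for illustration; adjust them to the header of your actual output file.

```python
import csv
import io

# Hypothetical header and rows mirroring the columns listed above.
sample_network = (
    "source\ttarget\tMI\tpearson\tspearman\tslope\tp.value\n"
    "TF1\tGENE_A\t0.42\t0.81\t0.79\t1.10\t1e-06\n"
    "TF1\tGENE_B\t0.05\t-0.12\t-0.10\t-0.20\t0.3\n"
    "TF2\tGENE_A\t0.31\t-0.66\t-0.70\t-0.90\t2e-05\n"
)

def targets_by_hub(handle, min_mi=0.0):
    """Map each hub (source) to its targets whose MI is at least min_mi."""
    regulons = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        if float(row["MI"]) >= min_mi:
            regulons.setdefault(row["source"], []).append(row["target"])
    return regulons

regulons = targets_by_hub(io.StringIO(sample_network), min_mi=0.1)
```

Here the weak TF1 to GENE_B edge is dropped, leaving one target per hub.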

Examples to create a transcription factor network

Note: for testing purposes, the following examples set the number of bootstraps (-n) to 2 and the consensus p-value threshold (-pc) to 1.0. For real applications, -n 100 and -pc 1e-5 are recommended. Also note that the -o path has no trailing / while the -tmp path ends with a /.

Running on a single machine (Linux/OSX)

sjaracne local -e ./test_data/inputs/BRCA100.exp -g ./test_data/inputs/tf.txt -n 2 -o ./test_data/outputs/cwl/cwltool/SJARACNE_out.final -pc 1.0 -tmp ./test_data/outputs/cwl/cwltool/tmp/

Running on an IBM LSF cluster

sjaracne lsf -j ./SJARACNe/config/config_cwlexec.json -e ./test_data/inputs/BRCA100.exp -g ./test_data/inputs/tf.txt -n 2 -o ./test_data/outputs/cwl/cwltool/SJARACNE_out.final -pc 1.0

Reference

Alireza Khatamian, Evan O. Paull, Andrea Califano* & Jiyang Yu*. SJARACNe: a scalable software tool for gene network reverse engineering from big data. Bioinformatics (2018). *Corresponding authors.

Owner

  • Name: Yu Laboratory @ St. Jude
  • Login: jyyulab
  • Kind: organization
  • Location: Memphis, TN

Yu Lab in the Department of Computational Biology at St. Jude Children's Research Hospital

GitHub Events

Total
  • Issues event: 1
  • Watch event: 1
  • Member event: 1
  • Issue comment event: 3
Last Year
  • Issues event: 1
  • Watch event: 1
  • Member event: 1
  • Issue comment event: 3

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 182
  • Total Committers: 10
  • Avg Commits per committer: 18.2
  • Development Distribution Score (DDS): 0.555
Top Committers
Name Email Commits
Liang Ding a****g@g****m 81
Jimmy Veloso j****0@y****r 44
Alireza Khatamian a****5@g****m 30
Khatamian a****i@d****l 12
Keith Hughitt k****t@g****m 8
Jiyang Yu j****u@u****m 3
Alireza a****i@l****l 1
Alireza Khatamian a****k@u****u 1
Qingfei Pan 3****n@u****m 1
Lei Yan l****n@s****g 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 45
  • Total pull requests: 9
  • Average time to close issues: 9 months
  • Average time to close pull requests: 10 days
  • Total issue authors: 30
  • Total pull request authors: 3
  • Average comments per issue: 1.91
  • Average comments per pull request: 0.44
  • Merged pull requests: 8
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 3
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 3
  • Pull request authors: 0
  • Average comments per issue: 2.33
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • adamdingliang (11)
  • ERIGR (3)
  • khughitt (3)
  • guillaumecharbonnier (2)
  • albertop210 (1)
  • karlie002 (1)
  • fossbert (1)
  • MathildaStigenberg (1)
  • zjgt (1)
  • saisaitian (1)
  • decarlin (1)
  • Tangke98 (1)
  • albertoriva (1)
  • MingBit (1)
  • jasonchian (1)
Pull Request Authors
  • ZebYulon (6)
  • jimmyv9 (3)
  • khughitt (3)
Top Labels
Issue Labels
enhancement (5) bug (2)
Pull Request Labels
enhancement (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 4 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 2
  • Total maintainers: 1
pypi.org: sjaracne

Gene network reverse engineering from big data

  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 4 Last month
Rankings
Forks count: 8.9%
Dependent packages count: 10.0%
Stargazers count: 13.6%
Dependent repos count: 21.7%
Average: 28.7%
Downloads: 89.1%
Maintainers (1)
Last synced: 6 months ago

Dependencies

requirements.txt pypi
  • cwltool >=3.0.20201117141248
  • numpy ==1.20.1
  • pandas ==1.2.3
  • scipy ==1.6.1
setup.py pypi
  • cwltool *
  • numpy *
  • pandas *
  • scipy *