netactivitytrain
Nextflow pipeline to train models for NetActivity
Statistics
- Stars: 0
- Watchers: 2
- Forks: 1
- Open Issues: 0
- Releases: 0
README.md
Introduction
NetActivityTrain is a bioinformatics pipeline to encode gene expression measurements into gene set activity scores. NetActivityTrain uses sparsely connected autoencoders to perform the encoding.
The pipeline is built using Nextflow, a workflow tool that runs tasks across multiple compute infrastructures in a portable manner. It uses Docker/Singularity containers, which makes installation straightforward and results highly reproducible.
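To illustrate what "sparsely connected" means for the encoding described above, here is a minimal sketch in plain Python. It assumes the sparsity is expressed as a binary gene-by-gene-set mask that zeroes the weight of every gene outside a gene set. The function name `masked_encoder` and the toy values are hypothetical and are not taken from the pipeline's code, which may implement the layer differently (for example with TensorFlow).

```python
def masked_encoder(expression, weights, gene_mask):
    """Encode samples into gene set scores using masked (sparse) connections.

    expression: list of samples, each a list of n_genes values
    weights:    n_genes x n_gene_sets list of weights
    gene_mask:  n_genes x n_gene_sets list of 0/1, 1 where the gene is in the set
    (All three layouts are assumptions made for this sketch.)
    """
    n_sets = len(gene_mask[0])
    scores = []
    for sample in expression:
        # A gene contributes to a gene set only where the mask entry is 1.
        scores.append([
            sum(x * w[j] * m[j] for x, w, m in zip(sample, weights, gene_mask))
            for j in range(n_sets)
        ])
    return scores


# Hypothetical toy data: 2 samples, 5 genes, 2 gene sets.
expression = [[1.0, 2.0, 0.5, -1.0, 0.0],
              [0.0, -0.5, 1.5, 2.0, 1.0]]
weights = [[0.5, 0.1], [0.2, 0.3], [-0.4, 0.6], [0.7, -0.2], [0.1, 0.9]]
gene_mask = [[1, 0], [1, 0], [1, 1], [0, 1], [0, 1]]

scores = masked_encoder(expression, weights, gene_mask)
```

Each row of `scores` holds one value per gene set. Changing the expression of a gene only affects the gene sets that the mask links it to.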
Functionalities
NetActivityTrain can be used to train a model or to compute the gene set activity scores from a pre-trained model (under development). The training of a model with NetActivityTrain has the following steps:
- Gene expression standardization
- Split of input data in training and test datasets
- Model training
- Model export for use with NetActivity
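The first two steps above can be sketched in a few lines of Python. This is an illustration only: it assumes that standardization is done per gene across samples, and the 20% test fraction, the fixed seed and the function names are hypothetical choices, not the pipeline's actual settings.

```python
import random
import statistics


def standardize_genes(expression):
    """Scale each gene (column) to mean 0 and standard deviation 1 across samples."""
    columns = list(zip(*expression))
    means = [statistics.fmean(col) for col in columns]
    # Population standard deviation; constant genes fall back to 1.0
    # so that the division below never divides by zero.
    sds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [
        [(value - mean) / sd for value, mean, sd in zip(sample, means, sds)]
        for sample in expression
    ]


def split_train_test(samples, test_fraction=0.2, seed=0):
    """Randomly assign samples to a training set and a test set."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(samples) * test_fraction))
    test_idx = set(indices[:n_test])
    train = [s for i, s in enumerate(samples) if i not in test_idx]
    test = [s for i, s in enumerate(samples) if i in test_idx]
    return train, test


# Hypothetical toy matrix: 5 samples (rows) x 2 genes (columns).
expression = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0], [4.0, 10.0], [5.0, 10.0]]
standardized = standardize_genes(expression)
train, test = split_train_test(standardized)
```

The sketch follows the order of the steps listed above (standardize, then split).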
Quick Start
- Install Nextflow (>=22.0.3).
- Install any of Docker, Singularity (you can follow this tutorial). See docs.
- Download the pipeline and test it on a minimal dataset with a single command:
```bash
nextflow run yocra3/NetActivityTrain -profile test,YOURPROFILE --outdir <OUTDIR>
```
Note that some form of configuration will be needed so that Nextflow knows how to fetch the required software. This is usually done in the form of a config profile (YOURPROFILE in the example command above). You can chain multiple config profiles in a comma-separated string.
- The pipeline comes with config profiles called `docker` and `singularity` which instruct the pipeline to use the named tool for software management. For example, `-profile test,docker`.
- Please check nf-core/configs to see if a custom config file to run nf-core pipelines already exists for your Institute. If so, you can simply use `-profile <institute>` in your command. This will enable either `docker` or `singularity` and set the appropriate execution settings for your local compute environment.
- If you are using `singularity`, please use the `nf-core download` command to download images first, before running the pipeline. Setting the `NXF_SINGULARITY_CACHEDIR` or `singularity.cacheDir` Nextflow options enables you to store and re-use the images from a central location for future pipeline runs.
- Start running your own analysis!
```bash
nextflow run yocra3/NetActivityTrain --data_prefix SE_h5 --gene_mask gene_mask.txt --network network.py --network_params params.py --outdir <OUTDIR> -profile <docker/singularity/podman/shifter/charliecloud/conda/institute>
```
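If you launch the pipeline from a script, the commands above can be assembled programmatically. The sketch below is one possible way to do it in Python: it joins several config profiles into the comma-separated string that `-profile` expects and adds pipeline parameters as `--name value` pairs. The helper `build_nextflow_command` and the `results` output directory are hypothetical and are not part of the pipeline.

```python
import shlex


def build_nextflow_command(profiles, params, pipeline="yocra3/NetActivityTrain"):
    """Return the argument list for a `nextflow run` call."""
    command = ["nextflow", "run", pipeline]
    # Pipeline parameters take a double dash, e.g. --outdir results.
    for name, value in params.items():
        command += [f"--{name}", str(value)]
    # Nextflow's own options take a single dash; profiles are chained with commas.
    command += ["-profile", ",".join(profiles)]
    return command


command = build_nextflow_command(
    profiles=["test", "docker"],
    params={"outdir": "results"},  # hypothetical output directory
)
print(shlex.join(command))
# prints: nextflow run yocra3/NetActivityTrain --outdir results -profile test,docker
```

The resulting list can be passed to `subprocess.run(command, check=True)` to start the run, assuming `nextflow` is on the `PATH`.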
Documentation
The yocra3/NetActivityTrain pipeline comes with documentation about the pipeline usage, parameters and output.
Credits
yocra3/NetActivityTrain was originally written by @yocra3.
Support
For further information or help, don't hesitate to contact Carlos Ruiz at cruizarenas@unav.es.
Citations
An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.
You can cite the nf-core publication as follows:
The nf-core framework for community-curated bioinformatics pipelines.
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.
Owner
- Login: yocra3
- Kind: user
- Repositories: 36
- Profile: https://github.com/yocra3
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use `nf-core tools` in your work, please cite the `nf-core` publication"
authors:
  - family-names: Ewels
    given-names: Philip
  - family-names: Peltzer
    given-names: Alexander
  - family-names: Fillinger
    given-names: Sven
  - family-names: Patel
    given-names: Harshil
  - family-names: Alneberg
    given-names: Johannes
  - family-names: Wilm
    given-names: Andreas
  - family-names: Garcia
    given-names: Maxime Ulysse
  - family-names: Di Tommaso
    given-names: Paolo
  - family-names: Nahnsen
    given-names: Sven
title: "The nf-core framework for community-curated bioinformatics pipelines."
version: 2.4.1
doi: 10.1038/s41587-020-0439-x
date-released: 2022-05-16
url: https://github.com/nf-core/tools
preferred-citation:
  type: article
  authors:
    - family-names: Ewels
      given-names: Philip
    - family-names: Peltzer
      given-names: Alexander
    - family-names: Fillinger
      given-names: Sven
    - family-names: Patel
      given-names: Harshil
    - family-names: Alneberg
      given-names: Johannes
    - family-names: Wilm
      given-names: Andreas
    - family-names: Garcia
      given-names: Maxime Ulysse
    - family-names: Di Tommaso
      given-names: Paolo
    - family-names: Nahnsen
      given-names: Sven
  doi: 10.1038/s41587-020-0439-x
  journal: nature biotechnology
  start: 276
  end: 278
  title: "The nf-core framework for community-curated bioinformatics pipelines."
  issue: 3
  volume: 38
  year: 2020
  url: https://dx.doi.org/10.1038/s41587-020-0439-x
Dependencies
- tensorflow/tensorflow 2.7.0 build
- tensorflow/tensorflow 2.7.0-gpu build
- bioconductor/bioconductor_docker RELEASE_3_15 build